...
- Cell Manager - a DPOD server (virtual or physical) that manages all Federated Cell Members (FCMs) and provides central DPOD services such as the Web Console, reports, alerts, etc.
- Federated Cell Member (FCM) - a DPOD server (usually physical with local high speed storage) that includes Store data nodes and agents (Syslog and WS-M) for collecting, parsing and storing data. There could be one or more federated cell members per cell.
The following diagram describes the Cell Environment:
The following procedure describes the process of establishing a DPOD cell environment.
...
From | To | Ports (Defaults) | Protocol | Usage |
---|---|---|---|---|
DPOD Cell Manager | Each Monitored Device | 5550 (TCP) | HTTP/S | Monitored device administration management interface |
DPOD Cell Manager | DNS Server | TCP and UDP 53 | DNS | DNS services. Alternatively, static IP addresses may be used. |
DPOD Cell Manager | NTP Server | 123 (UDP) | NTP | Time synchronization |
DPOD Cell Manager | Organizational mail server | 25 (TCP) | SMTP | Send reports by email |
DPOD Cell Manager | LDAP Server | TCP 389 / 636 (SSL), TCP 3268 / 3269 (SSL) | LDAP | Authentication & authorization. Can be over SSL. |
DPOD Cell Manager | Each DPOD Federated Cell Member | 9300-9305 (TCP) | ElasticSearch | ElasticSearch Communication (data + management) |
NTP Server | DPOD Cell Manager | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | DPOD Cell Manager | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | DPOD Cell Manager | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
Users IPs | DPOD Cell Manager | 443 (TCP) | HTTP/S | DPOD's Web Console |
Admins IPs | DPOD Cell Manager | 22 (TCP) | TCP | SSH |
Each DPOD Federated Cell Member | DPOD Cell Manager | 9200, 9300-9400 | ElasticSearch | ElasticSearch Communication (data + management) |
Each DPOD Federated Cell Member | DNS Server | TCP and UDP 53 | DNS | DNS services |
Each DPOD Federated Cell Member | NTP Server | 123 (UDP) | NTP | Time synchronization |
NTP Server | Each DPOD Federated Cell Member | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Each DPOD Federated Cell Member | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Each DPOD Federated Cell Member | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
Admins IPs | Each DPOD Federated Cell Member | 22 (TCP) | TCP | SSH |
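The inbound rules above must also be allowed by the local OS firewall on each server (see also the Preparing Local OS Based Firewall section below). As an illustration only, assuming firewalld (the default on RHEL/CentOS) and leaving zone and source restrictions to your network policy, the Cell Manager's inbound ports could be opened like this:

```bash
firewall-cmd --permanent --add-port=443/tcp           # Web Console (users)
firewall-cmd --permanent --add-port=22/tcp            # SSH (admins)
firewall-cmd --permanent --add-port=60000-60003/tcp   # Syslog data from monitored devices
firewall-cmd --permanent --add-port=60020-60023/tcp   # WS-M payloads from monitored devices
firewall-cmd --permanent --add-port=9200/tcp          # ElasticSearch from cell members
firewall-cmd --permanent --add-port=9300-9400/tcp     # ElasticSearch from cell members
firewall-cmd --reload                                 # apply the permanent rules
```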
Cell Manager Installation
...
- DPOD cell manager should be installed in Non-Appliance Mode with the Medium Load architecture type, as detailed in the Hardware and Software Requirements. The manager server can be either virtual or physical.
- Install the following software package (RPM): bc
Installation
Install DPOD as described in one of the following installation procedures:
...
Note: As described in the prerequisites section, the cell manager should have two network interfaces. When installing DPOD, the user is prompted to choose the IP address for the Web Console - this should be the IP address of the external network interface.
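Before choosing, you can list the server's interfaces and their addresses to confirm which one holds the external IP. A minimal check using the standard iproute2 tooling:

```bash
# Show every network interface with its assigned IP addresses
ip addr show
```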
- After DPOD installation is complete, the user should execute the following operating system performance optimization script:
```
/app/scripts/tune-os-parameters.sh
```
Federated Cell Member Installation
...
- DPOD federated cell member (FCM) should be installed in Non-Appliance Mode with the High_20dv architecture type, as detailed in the Hardware and Software Requirements.
- Install the following software packages (RPMs): bc, pciutils, nvme-cli, numactl
- The following software packages (RPMs) are recommended for system maintenance and troubleshooting, but are not required: telnet, net-tools, iftop, tcpdump (a combined install command is sketched below).
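As a convenience, the packages above can be installed in one step. A minimal sketch, assuming a yum-based distribution (RHEL/CentOS) with the standard repositories enabled:

```bash
# Required packages for an FCM
yum install -y bc pciutils nvme-cli numactl
# Optional troubleshooting tools (not required)
yum install -y telnet net-tools iftop tcpdump
```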
Installation
DPOD Installation
- Install DPOD in Non-Appliance Mode, as described in the Installation procedure.
- The four-letter Installation Environment Name should be identical to the one that was chosen for the Cell Manager.
Note: As described in the prerequisites section, the federated cell member should have two network interfaces. When installing DPOD, the user is prompted to choose the IP address for the Web Console - this should be the IP address of the external network interface (although the FCM does not run the Web Console service).
...
- To identify which of the server's NVMe disk bays is bound to which of the CPUs, use the hardware manufacturer's documentation.
Also, write down each disk's serial number by visually observing the disk. To identify the disk OS path (e.g. /dev/nvme0n1), disk serial and disk NUMA node, use the following commands:
Identify all NVMe Disks installed on the server
```
lspci -nn | grep NVM
```

Expected output:

```
5d:00.0 Non-Volatile memory controller [0108]: Intel Corporation Express Flash NVMe P4500 [8086:0a54]
5e:00.0 Non-Volatile memory controller [0108]: Intel Corporation Express Flash NVMe P4500 [8086:0a54]
ad:00.0 Non-Volatile memory controller [0108]: Intel Corporation Express Flash NVMe P4500 [8086:0a54]
ae:00.0 Non-Volatile memory controller [0108]: Intel Corporation Express Flash NVMe P4500 [8086:0a54]
c5:00.0 Non-Volatile memory controller [0108]: Intel Corporation Express Flash NVMe P4500 [8086:0a54]
c6:00.0 Non-Volatile memory controller [0108]: Intel Corporation Express Flash NVMe P4500 [8086:0a54]
```
Locate disk's NUMA node
Use the disk PCI slot listed in the previous command to identify the NUMA node (the first disk PCI slot is 5d:00.0):

```
lspci -s 5d:00.0 -v
```

Expected output:

```
5d:00.0 Non-Volatile memory controller: Intel Corporation Express Flash NVMe P4500 (prog-if 02 [NVM Express])
        Subsystem: Lenovo Device 4712
        Physical Slot: 70
        Flags: bus master, fast devsel, latency 0, IRQ 93, NUMA node 1
        Memory at e1310000 (64-bit, non-prefetchable) [size=16K]
        Expansion ROM at e1300000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI-X: Enable+ Count=129 Masked-
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Virtual Channel
        Capabilities: [180] Power Budgeting <?>
        Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [270] Device Serial Number 55-cd-2e-41-4f-89-0f-43
        Capabilities: [2a0] #19
        Capabilities: [2d0] Latency Tolerance Reporting
        Capabilities: [310] L1 PM Substates
        Kernel driver in use: nvme
        Kernel modules: nvme
```
From the Flags line of the command output we can identify the NUMA node (Flags: bus master, fast devsel, latency 0, IRQ 93, NUMA node 1).
Identify the NVMe disk's path
Use the disk PCI slot listed in the previous command to identify the disk's block device path:

```
ls -la /sys/dev/block | grep 5d:00.0
```

Expected output:

```
lrwxrwxrwx. 1 root root 0 Nov  5 08:06 259:4 -> ../../devices/pci0000:58/0000:58:00.0/0000:59:00.0/0000:5a:02.0/0000:5d:00.0/nvme/nvme0/nvme0n1
```
Use the last part of the device path (nvme0n1) as input for the following command:

```
nvme list | grep nvme0n1
```

Expected output:

```
/dev/nvme0n1     PHLE822101AN3P2EGN     SSDPE2KE032T7L     1     3.20 TB / 3.20 TB     512 B + 0 B     QDV1LV45
```
The disk's path is /dev/nvme0n1
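Repeating the three steps above for every disk is tedious. As an optional shortcut, a small loop over sysfs can print the same mapping for all NVMe controllers at once. This is a minimal sketch, assuming the sysfs layout of a typical RHEL/CentOS kernel (the serial and numa_node attributes are present):

```bash
# For each NVMe controller: print its name, serial number and NUMA node
for ctrl in /sys/class/nvme/nvme*; do
    name=$(basename "$ctrl")                # e.g. nvme0
    serial=$(cat "$ctrl/serial")            # controller serial number
    numa=$(cat "$ctrl/device/numa_node")    # NUMA node of the underlying PCI device
    echo "$name  serial=$serial  numa_node=$numa"
done
```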
...
```
# lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1                259:0    0  2.9T  0 disk
└─vg_data2-lv_data     253:6    0  2.9T  0 lvm  /data2
nvme1n1                259:5    0  2.9T  0 disk
└─vg_data22-lv_data    253:3    0  2.9T  0 lvm  /data22
nvme2n1                259:1    0  2.9T  0 disk
└─vg_data3-lv_data     253:2    0  2.9T  0 lvm  /data3
nvme3n1                259:2    0  2.9T  0 disk
└─vg_data33-lv_data    253:5    0  2.9T  0 lvm  /data33
nvme4n1                259:4    0  2.9T  0 disk
└─vg_data44-lv_data    253:7    0  2.9T  0 lvm  /data44
nvme5n1                259:3    0  2.9T  0 disk
└─vg_data4-lv_data     253:8    0  2.9T  0 lvm  /data4
```
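For reference, a volume such as vg_data2-lv_data above could be created along these lines. This is a hedged sketch, not the product's provisioning procedure; the XFS file system type and the 100%FREE sizing are assumptions:

```bash
pvcreate /dev/nvme0n1                       # initialize the disk for LVM
vgcreate vg_data2 /dev/nvme0n1              # one volume group per disk
lvcreate -n lv_data -l 100%FREE vg_data2    # single logical volume spanning the disk
mkfs.xfs /dev/vg_data2/lv_data              # file system type is an assumption
mkdir -p /data2
mount /dev/vg_data2/lv_data /data2
```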
Install NUMA Software
```
yum install numactl
```
Preparing Local OS Based Firewall
...
```
numactl -s
```

Example output for a 4 CPU server:

```
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12
cpubind: 0 1 2 3
nodebind: 0 1 2 3
membind: 0 1 2 3
```
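To see the full node/CPU/memory topology that these bindings refer to, numactl can also print the hardware layout:

```bash
# Print NUMA topology: nodes, their CPUs, and per-node memory
numactl --hardware
```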
Alter
...
The service files are located in the directory /etc/init.d/ with the name prefix MonTier-SyslogAgent- (there should be 4 service files).
Search each service file for the string "numa" and make sure the numa variable definition is as follows:
```
numa="/usr/bin/numactl --membind=0 --cpunodebind=0"
/bin/su -s /bin/bash -c "/bin/bash -c 'echo \$\$ >${FLUME_PID_FILE} && exec ${numa} ${exec}......
```
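To check all four agent service files at once, a quick grep can help (a minimal sketch; paths as above):

```bash
# Show the numa variable definition in every Syslog agent service file
grep -H 'numa=' /etc/init.d/MonTier-SyslogAgent-*
```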
...
Store's Node 3 and 4 (OPTIONAL - only if the server has 4 CPUs)
The service files are located in the directory /etc/init.d/ with the names MonTier-es-raw-trans-Node-3 and MonTier-es-raw-trans-Node-4.
```
For node MonTier-es-raw-trans-Node-3
OLD VALUE : numa="/usr/bin/numactl --membind=1 --cpunodebind=1"
NEW VALUE : numa="/usr/bin/numactl --membind=2 --cpunodebind=2"

For node MonTier-es-raw-trans-Node-4
OLD VALUE : numa="/usr/bin/numactl --membind=1 --cpunodebind=1"
NEW VALUE : numa="/usr/bin/numactl --membind=3 --cpunodebind=3"
```
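If you prefer not to edit the files by hand, the same change can be applied with sed. A minimal sketch (back up the service files first; the .bak suffix keeps a copy):

```bash
sed -i.bak 's/--membind=1 --cpunodebind=1/--membind=2 --cpunodebind=2/' /etc/init.d/MonTier-es-raw-trans-Node-3
sed -i.bak 's/--membind=1 --cpunodebind=1/--membind=3 --cpunodebind=3/' /etc/init.d/MonTier-es-raw-trans-Node-4
```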
- Restart DPOD's services using the app-utils.sh script.
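After the restart, you can spot-check that a Store node process was actually pinned to the intended NUMA node. A sketch, assuming pgrep can match the service name in the process command line:

```bash
# Hypothetical check: CPU and memory affinity of the Node-3 Store process
pid=$(pgrep -f 'MonTier-es-raw-trans-Node-3' | head -1)
grep -E 'Cpus_allowed_list|Mems_allowed_list' "/proc/$pid/status"
```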
Cell Member Federation Verification
...