A federated architecture best fits customers that run a high load (thousands of transactions per second or more) on their gateways, where the vast majority of transactions are executed on-premise.
The cell environment implements the federated architecture by distributing DPOD's Store and DPOD's processing (using DPOD's agents) across different federated servers.
The cell environment has two main components: the cell manager and the Federated Cell Members (FCMs).
The following diagram describes the Cell Environment:
The following procedure describes the process of establishing a DPOD cell environment.
From | To | Ports (Defaults) | Protocol | Usage |
---|---|---|---|---|
DPOD Cell Manager | Each Monitored Device | 5550 (TCP) | HTTP/S | Monitored device administration management interface |
DPOD Cell Manager | DNS Server | TCP and UDP 53 | DNS | DNS services. Static IP address may be used. |
DPOD Cell Manager | NTP Server | 123 (UDP) | NTP | Time synchronization |
DPOD Cell Manager | Organizational mail server | 25 (TCP) | SMTP | Send reports by email |
DPOD Cell Manager | LDAP | TCP 389 / 636 (SSL). TCP 3268 / 3269 (SSL) | LDAP | Authentication & authorization. Can be over SSL. |
DPOD Cell Manager | Each DPOD Federated Cell Member | 443 (TCP) | HTTP/S | Communication (data + management) |
DPOD Cell Manager | Each DPOD Federated Cell Member | 9300-9305 (TCP) | ElasticSearch | ElasticSearch Communication (data + management) |
DPOD Cell Manager | Each DPOD Federated Cell Member | 22 (TCP) | SSH | Root access is required for the cell installation and for occasional admin operations.
NTP Server | DPOD Cell Manager | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | DPOD Cell Manager | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | DPOD Cell Manager | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
Users IPs | DPOD Cell Manager | 443 (TCP) | HTTP/S | DPOD's Web Console |
Admins IPs | DPOD Cell Manager | 22 (TCP) | TCP | SSH |
Each DPOD Federated Cell Member | DPOD Cell Manager | 443 (TCP) | HTTP/S | Communication (data + management) |
Each DPOD Federated Cell Member | DPOD Cell Manager | 9200, 9300-9400 (TCP) | ElasticSearch | ElasticSearch Communication (data + management)
Each DPOD Federated Cell Member | DNS Server | TCP and UDP 53 | DNS | DNS services |
Each DPOD Federated Cell Member | NTP Server | 123 (UDP) | NTP | Time synchronization |
NTP Server | Each DPOD Federated Cell Member | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Each DPOD Federated Cell Member | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Each DPOD Federated Cell Member | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
Admins IPs | Each DPOD Federated Cell Member | 22 (TCP) | TCP | SSH |
DPOD Cell Manager | Each DPOD Federated Cell Member | 60000-60003 (TCP) | TCP | Syslog keep-alive data |
DPOD Cell Manager | Each DPOD Federated Cell Member | 60020-60023 (TCP) | HTTP/S | WS-M keep-alive data
Install DPOD as described in one of the following installation procedures:
After DPOD installation is complete, execute the following operating system performance optimization script and reboot the server:
/app/scripts/tune-os-parameters.sh reboot
The following section describes the installation process of a single Federated Cell Member (FCM). Please repeat the procedure for every FCM installation.
After DPOD installation is complete, execute the following operating system performance optimization script and reboot the server:
/app/scripts/tune-os-parameters.sh reboot
The cell member is usually a bare-metal server with NVMe disks to maximize server I/O throughput.
Each of the Store's logical nodes (services) is bound to a specific physical processor, disks, and memory using NUMA (Non-Uniform Memory Access) technology.
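For orientation, the NUMA binding in a Store node's service file looks similar to the following excerpt. This is only a sketch based on the service files adjusted later in this procedure; the actual start command of the Store node is omitted here and differs per installation:

```
# Illustrative excerpt from a Store node init script such as /etc/init.d/MonTier-es-raw-trans-Node-2:
# pin the node's memory allocation and CPU execution to NUMA node 1
/usr/bin/numactl --membind=1 --cpunodebind=1 <Store node start command>
```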
The following table contains the list of OS mount points that should be configured along with additional information that must be gathered before federating the DPOD cell member to the cell environment.
Please copy this table, use it during the procedure, and complete the information in the empty cells as you follow the procedure:
Store Node | Mount Point Path | Disk Bay | Disk Serial | Disk OS Path | PCI Slot Number | NUMA Node (CPU #) |
---|---|---|---|---|---|---|
2 | /data2 | |||||
2 | /data22 | |||||
2 * | /data222 | |||||
3 | /data3 | |||||
3 | /data33 | |||||
3 * | /data333 | |||||
4 | /data4 | |||||
4 | /data44 | |||||
4 * | /data444 |
* Rows marked with an asterisk (*) are relevant only if the DPOD sizing team recommends 9 disks instead of 6 disks per cell member. You may remove these rows if you have only 6 disks per cell member.
To identify which of the server's NVMe disk bays is connected to which CPU, use the hardware manufacturer's documentation.
Write down the disk bay as well as the disk's serial number by visually observing the disk.
To list the OS path of each disk, execute the following command and write down the disk OS path (e.g.: /dev/nvme0n1) according to the disk's serial number (e.g.: PHLE8XXXXXXC3P2EGN):
nvme list

Expected output:

Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     PHLE8XXXXXXC3P2EGN   SSDPE2KE032T7L                           1         3.20 TB / 3.20 TB          512 B + 0 B      QDV1LV46
/dev/nvme1n1     PHLE8XXXXXXM3P2EGN   SSDPE2KE032T7L                           1         3.20 TB / 3.20 TB          512 B + 0 B      QDV1LV46
/dev/nvme2n1     PHLE8XXXXXX83P2EGN   SSDPE2KE032T7L                           1         3.20 TB / 3.20 TB          512 B + 0 B      QDV1LV46
/dev/nvme3n1     PHLE8XXXXXXN3P2EGN   SSDPE2KE032T7L                           1         3.20 TB / 3.20 TB          512 B + 0 B      QDV1LV46
/dev/nvme4n1     PHLE8XXXXXX63P2EGN   SSDPE2KE032T7L                           1         3.20 TB / 3.20 TB          512 B + 0 B      QDV1LV46
/dev/nvme5n1     PHLE8XXXXXXJ3P2EGN   SSDPE2KE032T7L                           1         3.20 TB / 3.20 TB          512 B + 0 B      QDV1LV46
To list the PCI slot for each disk OS path, execute the following command and write down the PCI slot (e.g.: 0c:00.0) according to the last part of the disk OS path (e.g.: nvme0n1):
lspci -nn | grep NVM | awk '{print $1}' | xargs -Innn bash -c "printf 'PCI Slot: nnn '; ls -la /sys/dev/block | grep nnn"

Expected output:

PCI Slot: 0c:00.0 lrwxrwxrwx. 1 root root 0 May 16 10:26 259:2 -> ../../devices/pci0000:07/0000:07:00.0/0000:08:00.0/0000:09:02.0/0000:0c:00.0/nvme/nvme0/nvme0n1
PCI Slot: 0d:00.0 lrwxrwxrwx. 1 root root 0 May 16 10:26 259:5 -> ../../devices/pci0000:07/0000:07:00.0/0000:08:00.0/0000:09:03.0/0000:0d:00.0/nvme/nvme1/nvme1n1
PCI Slot: ad:00.0 lrwxrwxrwx. 1 root root 0 May 16 10:26 259:1 -> ../../devices/pci0000:ac/0000:ac:02.0/0000:ad:00.0/nvme/nvme2/nvme2n1
PCI Slot: ae:00.0 lrwxrwxrwx. 1 root root 0 May 16 10:26 259:0 -> ../../devices/pci0000:ac/0000:ac:03.0/0000:ae:00.0/nvme/nvme3/nvme3n1
PCI Slot: c5:00.0 lrwxrwxrwx. 1 root root 0 May 16 10:26 259:3 -> ../../devices/pci0000:c4/0000:c4:02.0/0000:c5:00.0/nvme/nvme4/nvme4n1
PCI Slot: c6:00.0 lrwxrwxrwx. 1 root root 0 May 16 10:26 259:4 -> ../../devices/pci0000:c4/0000:c4:03.0/0000:c6:00.0/nvme/nvme5/nvme5n1

Tip: you may execute the following command to list the details of all PCI slots with NVMe disks installed in the server:

lspci -nn | grep -i nvme | awk '{print $1}' | xargs -Innn lspci -v -s nnn

Tip: you may execute the following command to list all disk OS paths in the server:

ls -la /sys/dev/block
To list the NUMA node of each PCI slot, execute the following command and write down the NUMA node (e.g.: 1) according to the PCI slot (e.g.: 0c:00.0):
lspci -nn | grep -i nvme | awk '{print $1}' | xargs -Innn bash -c "printf 'PCI Slot: nnn'; lspci -v -s nnn | grep NUMA"

Expected output:

PCI Slot: 0c:00.0 Flags: bus master, fast devsel, latency 0, IRQ 45, NUMA node 1
PCI Slot: 0d:00.0 Flags: bus master, fast devsel, latency 0, IRQ 52, NUMA node 1
PCI Slot: ad:00.0 Flags: bus master, fast devsel, latency 0, IRQ 47, NUMA node 2
PCI Slot: ae:00.0 Flags: bus master, fast devsel, latency 0, IRQ 49, NUMA node 2
PCI Slot: c5:00.0 Flags: bus master, fast devsel, latency 0, IRQ 51, NUMA node 3
PCI Slot: c6:00.0 Flags: bus master, fast devsel, latency 0, IRQ 55, NUMA node 3
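Optionally, you can cross-check the NUMA node numbers reported for the PCI slots against the server's overall NUMA topology. For example (lscpu is part of util-linux and is available on most Linux distributions):

```
# list the NUMA nodes and the CPUs assigned to each of them
lscpu | grep -i numa
```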
This is an example of how a row of the table should look:
Store Node | Mount Point Path | Disk Bay | Disk Serial | Disk OS Path | PCI Slot Number | NUMA node (CPU #) |
---|---|---|---|---|---|---|
2 | /data2 | Bay 1 | PHLE8XXXXXXC3P2EGN | /dev/nvme0n1 | 0c:00.0 | 1 |
Execute the following command and verify all NVMe disks have the same speed (e.g.: 8GT/s):
lspci -nn | grep -i nvme | awk '{print $1}' | xargs -Innn bash -c "printf 'PCI Slot: nnn'; lspci -vvv -s nnn | grep LnkSta:"

Expected output:

PCI Slot: 0c:00.0 LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
PCI Slot: 0d:00.0 LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
PCI Slot: ad:00.0 LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
PCI Slot: ae:00.0 LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
PCI Slot: c5:00.0 LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
PCI Slot: c6:00.0 LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Configure the mount points according to the table with all gathered information.
It is highly recommended to use LVM (Logical Volume Manager) to allow flexibility for future storage needs.
The following example uses LVM. Repeat it for each mount point, replacing /dev/nvme0n1 with the disk OS path from the table and vg_data2 with the matching volume group name (vg_data22, vg_data222, vg_data3, etc.):
pvcreate -ff /dev/nvme0n1
vgcreate vg_data2 /dev/nvme0n1
lvcreate -l 100%FREE -n lv_data vg_data2
mkfs.xfs -f /dev/vg_data2/lv_data
The following example is the line that should be added to /etc/fstab for each mount point (replace vg_data2 and /data2 with the appropriate values from the table):
/dev/vg_data2/lv_data /data2 xfs defaults 0 0
Create a directory for each mount point (replace /data2 with the appropriate values from the table):
mkdir -p /data2
This example is for 6 disks per cell member and does not include other mount points that should exist, as described in Hardware and Software Requirements.
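After the mount point directories and /etc/fstab entries are in place, the new file systems can be mounted. For example (a minimal sketch; it assumes all new entries were added to /etc/fstab as shown above):

```
# mount every file system listed in /etc/fstab that is not mounted yet
mount -a

# confirm each data mount point is backed by the expected logical volume
df -h /data2 /data22 /data3 /data33 /data4 /data44
```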
Execute the following command and verify mount points:
lsblk

Expected output:

NAME                 MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1              259:2    0  2.9T  0 disk
└─vg_data2-lv_data   253:0    0  2.9T  0 lvm  /data2
nvme1n1              259:5    0  2.9T  0 disk
└─vg_data22-lv_data  253:11   0  2.9T  0 lvm  /data22
nvme2n1              259:1    0  2.9T  0 disk
└─vg_data3-lv_data   253:9    0  2.9T  0 lvm  /data3
nvme3n1              259:0    0  2.9T  0 disk
└─vg_data33-lv_data  253:10   0  2.9T  0 lvm  /data33
nvme4n1              259:3    0  2.9T  0 disk
└─vg_data44-lv_data  253:8    0  2.9T  0 lvm  /data44
nvme5n1              259:4    0  2.9T  0 disk
└─vg_data4-lv_data   253:7    0  2.9T  0 lvm  /data4
Install the NUMA utility by executing the following command:
yum install numactl
Most Linux-based operating systems use a local firewall service (e.g.: iptables or firewalld). Since the OS of a Non-Appliance Mode DPOD installation is provided by the user, it is the user's responsibility to allow the needed connectivity to and from the server.
Configure the local firewall service to allow connectivity as described in the prerequisites section at the top of this page.
When using a DPOD Appliance Mode installation for the cell manager, the local OS firewall configuration is handled by the cell member federation script. When using a DPOD Non-Appliance Mode installation for the cell manager, the local OS firewall of the cell manager must be configured by the user, in addition to configuring the local OS firewall of the cell member.
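For illustration only, the following commands show how the cell member ports from the prerequisites table above could be opened when firewalld is the active local firewall service (an assumption; adapt the commands to your firewall tooling, zones, and security policy):

```
# allow DPOD communication (443), ElasticSearch (9300-9305), SSH (22),
# SYSLOG data (60000-60003) and WS-M payloads (60020-60023) on a cell member
firewall-cmd --permanent --add-port=443/tcp
firewall-cmd --permanent --add-port=9300-9305/tcp
firewall-cmd --permanent --add-port=22/tcp
firewall-cmd --permanent --add-port=60000-60003/tcp
firewall-cmd --permanent --add-port=60020-60023/tcp
firewall-cmd --reload
```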
To federate and configure the cell member, run the following script on the cell manager once per cell member.
For instance, to federate two cell members, the script should be run twice (on the cell manager): first with the IP address of the first cell member, and then with the IP address of the second cell member.
Important: The script should be executed using the OS root user.
/app/scripts/configure_cell_manager.sh -a <internal IP address of the cell member> -g <external IP address of the cell member>

For example:

/app/scripts/configure_cell_manager.sh -a 172.18.100.34 -g 172.17.100.33
Note that the script writes two log files, one on the cell manager and one on the cell member. The log file names are mentioned in the script's output.
In case of a failure, the script will try to roll back the configuration changes it made, so the problem can be fixed before running it again.
The DPOD cell member uses NUMA (Non-Uniform Memory Access) technology.
The default cell member configuration binds DPOD's agent to NUMA node 0 and the Store's nodes to NUMA node 1.
If the server has 4 CPUs, update the service files of Store nodes 2 and 3 and change the bound NUMA nodes to 2 and 3 respectively.
To identify the number of CPUs installed on the server, use the NUMA utility:
numactl -s

Expected output:

policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
cpubind: 0 1 2 3
nodebind: 0 1 2 3
membind: 0 1 2 3
Note: If the cell member has only 2 CPUs, skip this step.
To update the service files, execute the following command:
sed -i 's#/usr/bin/numactl --membind=1 --cpunodebind=1#/usr/bin/numactl --membind=2 --cpunodebind=2#g' /etc/init.d/MonTier-es-raw-trans-Node-2
sed -i 's#/usr/bin/numactl --membind=1 --cpunodebind=1#/usr/bin/numactl --membind=3 --cpunodebind=3#g' /etc/init.d/MonTier-es-raw-trans-Node-3
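One way to verify the change is to inspect the numactl invocation that remains in each updated service file, for example:

```
# show the numactl binding line in each updated Store node service file
grep -H 'numactl' /etc/init.d/MonTier-es-raw-trans-Node-2 /etc/init.d/MonTier-es-raw-trans-Node-3
```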
After a successful federation, you will be able to see the new federated cell member in the Manage → System → Nodes page.
Also, the new agents will be shown in the agents list in the Manage → Internal Health → Agents page:
Configure the monitored gateways to use the federated cell members' agents. Please follow the instructions in Adding Monitored Devices.