Overview
The federated architecture best fits customers whose gateways process a high load (thousands of transactions per second or more).
The cell environment implements the federated architecture by distributing DPOD's Store and DPOD's agents across different federated servers.
The cell environment has two main components:
- Cell Manager - a DPOD server (usually virtual) that manages all Federated Cell Members (FCMs), as well as providing central DPOD services such as the Web Console, reports, alerts, resource monitoring, etc.
- Federated Cell Members (FCMs) - DPOD servers (usually physical with very fast local storage) that include Store data nodes and agents (Syslog and WS-M) for collecting, parsing and storing data.
The following diagram describes the cell environment:
Prerequisites
- Before installing a cell environment, make sure to complete the sizing process with IBM Support Team to get recommendations for the hardware and architecture suitable for your requirements.
- DPOD cell manager and federated cell members must be of the same version (minimum version is 1.0.8.6).
- DPOD cell manager is usually virtual and can be installed in either Appliance Mode or Non-appliance Mode with Medium Load architecture type, as detailed in the Hardware and Software Requirements.
- DPOD federated cell members (FCMs) can be one of the following:
- Physical servers installed in Non-appliance Mode (based on RHEL) with High_20dv architecture type, as detailed in the Hardware and Software Requirements. Physical servers are used when the cell is required to process high transactions per second (TPS) load.
- Virtual servers installed in Non-appliance Mode with Medium architecture type or higher, as detailed in the Hardware and Software Requirements. Virtual servers are used when the cell is required to process moderate transactions per second (TPS) load, or when the cell is part of a non-production environment where the production cell uses physical servers (to keep environments architecture similar).
- All DPOD cell members must be identical - only physical or only virtual (cannot mix physical and virtual cell members in the same cell), and with the same resources (CPUs, RAM, disk type and storage capacity).
- Physical federated cell members with 4 CPU sockets and NVMe disks require special disks and mount points configuration to ensure performance. See Configuring Cell Members with 4 CPU Sockets and NVMe Disks.
- Each cell component (manager / FCM) should have two network interfaces:
- External network interface - for DPOD users to access the Web Console (on the cell manager) and for communication between DPOD and Monitored Gateways (on both the cell manager and the members).
- Internal network interface - for internal DPOD components inter-communication (should be a 10Gb Ethernet interface).
- This design separates the two types of communication, which may be used to enhance security (e.g.: preventing end users from accessing the inter-cell communication).
- We recommend having 2 different VLANs with different subnets as this makes it easier to configure the servers without using static routing and to configure the network firewall rules.
- Network ports should be opened in the network firewall as detailed below:
From | To | Ports (Defaults) | Protocol | Usage |
---|---|---|---|---|
DPOD Cell Manager (external IP address) | Each Monitored Device | 5550 (TCP) | HTTP/S | Monitored device administration management interface. If the SOMA port is different than 5550 - the port should be changed accordingly. |
DPOD Cell Manager (external IP address) | DNS Server | TCP and UDP 53 | DNS | DNS services. Static IP address may be used. |
DPOD Cell Manager (external IP address) | NTP Server | 123 (UDP) | NTP | Time synchronization |
DPOD Cell Manager (external IP address) | Organizational mail server | 25 (TCP) | SMTP | Send reports by email |
DPOD Cell Manager (external IP address) | LDAP | TCP 389 / 636 (SSL). TCP 3268 / 3269 (SSL) | LDAP | Authentication & authorization. Can be over SSL. |
DPOD Cell Manager (internal IP address) | Each DPOD Federated Cell Member (internal IP address) | 443 (TCP) | HTTP/S | Communication (data + management) |
DPOD Cell Manager (internal IP address) | Each DPOD Federated Cell Member (internal IP address) | 22 (TCP) | TCP | SSH root access is needed for the cell installation and for admin operations from time to time. |
DPOD Cell Manager (internal IP address) | Each DPOD Federated Cell Member (internal IP address) | 9300-9305 (TCP) | ElasticSearch | ElasticSearch Communication (data + management) |
DPOD Cell Manager (external IP address) | Each DPOD Federated Cell Member (external IP address) | 60000-60003 (TCP) | TCP | Syslog keep-alive data |
DPOD Cell Manager (external IP address) | Each DPOD Federated Cell Member (external IP address) | 60020-60023 (TCP) | TCP | HTTP/S WS-M keep-alive data |
NTP Server | DPOD Cell Manager (external IP address) | 123 (UDP) | NTP | Time synchronization |
Users IPs | DPOD Cell Manager (external IP address) | 443 (TCP) | HTTP/S | DPOD's Web Console |
Admins IPs | DPOD Cell Manager (external IP address) | 22 (TCP) | TCP | SSH |
Each DPOD Federated Cell Member (internal IP address) | DPOD Cell Manager (internal IP address) | 443 (TCP) | HTTP/S | Communication (data + management) |
Each DPOD Federated Cell Member (internal IP address) | DPOD Cell Manager (internal IP address) | 9200, 9300-9400 | ElasticSearch | ElasticSearch Communication (data + management) |
Each DPOD Federated Cell Member (external IP address) | DNS Server | TCP and UDP 53 | DNS | DNS services |
Each DPOD Federated Cell Member (external IP address) | NTP Server | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Each DPOD Federated Cell Member (external IP address) | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Each DPOD Federated Cell Member (external IP address) | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
NTP Server | Each DPOD Federated Cell Member (external IP address) | 123 (UDP) | NTP | Time synchronization |
Admins IPs | Each DPOD Federated Cell Member (external IP address) | 22 (TCP) | TCP | SSH |
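Once the firewall rules are in place, it can help to probe the listed TCP ports from the source server of each row. The following is a minimal sketch, not part of DPOD: the helper names (expand_ports, check_port, check_host) are illustrative, and it assumes bash and the coreutils timeout command are installed on the server that runs it.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative and not part of DPOD.

# Expand a port spec ("443" or "60000-60003") into one port per line.
expand_ports() (
  first=${1%-*}
  last=${1#*-}
  port=$first
  while [ "$port" -le "$last" ]; do
    echo "$port"
    port=$((port + 1))
  done
)

# Succeed if a TCP connection to host $1, port $2 opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
check_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print the state of every port in the given specs for one host.
check_host() (
  host=$1
  shift
  for spec in "$@"; do
    for port in $(expand_ports "$spec"); do
      if check_port "$host" "$port"; then
        echo "$host:$port reachable"
      else
        echo "$host:$port NOT reachable"
      fi
    done
  done
)
```

For example, running check_host with the internal IP address of a cell member and the specs 443 22 9300-9305 on the cell manager covers the internal rows of the table. A port is also reported as not reachable when no service is listening on it yet, so results are only meaningful after DPOD is installed on the target.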
Cell Manager Installation
Prerequisites
- Make sure to meet the prerequisites listed at the top of this page.
- For Non-appliance Mode, follow the procedure: Prepare Pre-Installed Operating System.
DPOD Installation
- For Appliance Mode, follow the procedure: Appliance Installation.
- For Non-appliance Mode, follow the procedure: Non-Appliance Installation.
- During installation, when prompted to choose the data disk type (SSD / non SSD), choose the cell members disk type (should be SSD) instead of the cell manager disk type.
- During installation, when prompted to choose the IP address for the Web Console, choose the IP address of the external network interface.
- Install the following software package (RPM): bc
- Execute the following operating system performance optimization commands and reboot the server:
sed -i 's/^NODE_HEAP_SIZE=.*/NODE_HEAP_SIZE="2G"/g' /etc/init.d/MonTier-es-raw-trans-Node-1
/app/scripts/tune-os-parameters.sh
reboot
Federated Cell Member Installation
The following section describes the installation process of a single Federated Cell Member (FCM). Please repeat the procedure for every FCM installation.
Prerequisites
- Make sure to meet the prerequisites listed at the top of this page.
- Follow the procedure: Prepare Pre-Installed Operating System.
- The cell member server should contain disks according to the recommendations made in the sizing process with IBM Support Team, which includes disks for OS, install, and data (one for /data and 6 to 9 additional disks for /data2/3/4...).
- Physical federated cell members with 4 CPU sockets and NVMe disks require special disks and mount points configuration to ensure performance. See Configuring Cell Members with 4 CPU Sockets and NVMe Disks.
Most Linux-based operating systems use a local firewall service (e.g.: iptables / firewalld). Since the OS of the Non-Appliance Mode DPOD installation is provided by the user, it is the user's responsibility to allow the needed connectivity to and from the server.
Configure the local firewall service to allow connectivity as described in the prerequisites section at the top of this page.
The following software packages (RPMs) are recommended for system maintenance and troubleshooting, but are not required: telnet client, net-tools, iftop, tcpdump
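For the local firewall configuration, the inbound ports of a cell member listed in the prerequisites table could be opened along the following lines. This is a sketch under assumptions: it assumes firewalld is the local firewall service and that the default zone is used, it opens each port for any source address (restricting sources as in the table is left out), and the function name fcm_firewall_commands is illustrative. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: prints firewalld commands for the inbound ports of a
# federated cell member, taken from the prerequisites table (defaults).
fcm_firewall_commands() {
  # 22 SSH, 443 HTTP/S, 9300-9305 Store, 60000-60003 Syslog,
  # 60020-60023 WS-M, 123/udp NTP
  for port in 22/tcp 443/tcp 9300-9305/tcp 60000-60003/tcp 60020-60023/tcp 123/udp; do
    echo "firewall-cmd --permanent --add-port=$port"
  done
  # Permanent rules take effect only after a reload.
  echo "firewall-cmd --reload"
}

fcm_firewall_commands
```

After reviewing the printed commands, they can be executed as root, for example by piping the output to sh.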
DPOD Installation
- Physical servers should use RHEL as the operating system (and not CentOS).
- Use Non-appliance Mode and follow the procedure: Non-Appliance Installation
- The four-letter Installation Environment Name should be identical to the one that was chosen during the Cell Manager installation.
- During installation, when prompted to choose the IP address for the Web Console, choose the IP address of the external network interface.
- Install the following software packages (RPMs): numactl, pciutils, nvme-cli
- Execute the following operating system performance optimization commands and reboot the server:
/app/scripts/tune-os-parameters.sh
reboot
After the server reboots, make sure the httpd service is running and can be restarted successfully. If an error is displayed during the service restart, please see if the following information helps in resolving it: https://access.redhat.com/solutions/1180103
systemctl restart httpd
Configuring Mount Points of Cell Member
List of Mount Points
The cell member server should contain disks according to the recommendations made in the sizing process with IBM Support Team, which includes disks for OS, install, and data (one for /data and 6 to 9 additional disks for /data2/3/4...). The data disks should be mounted to different mount points. The required mount points are:
- In case the server has 6 disks: /data2, /data22, /data3, /data33, /data4, /data44
- In case the server has 9 disks: /data2, /data22, /data222, /data3, /data33, /data333, /data4, /data44, /data444
Mapping Mount Points to Disks
Map the mount points to disks:
- In case of physical federated cell members with 4 CPU sockets and NVMe disks - use the information gathered at Configuring Cell Members with 4 CPU Sockets and NVMe Disks to map the mount point with the proper disk:
Mount Points | Disks |
---|---|
/data2, /data22 and /data222 (if exists) | Disks connected to NUMA node 1 |
/data3, /data33 and /data333 (if exists) | Disks connected to NUMA node 2 |
/data4, /data44 and /data444 (if exists) | Disks connected to NUMA node 3 |
- For all other types of federated cell members servers - you may map the mount points to any disk.
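The mapping table above follows a simple naming rule, which can be sketched as a lookup. The function name numa_node_for_mount is illustrative and not part of DPOD; it only encodes the table and does not inspect the hardware.

```shell
#!/bin/sh
# Sketch only: return the NUMA node whose disks should back a mount point,
# following the mapping table for 4-socket NVMe cell members.
numa_node_for_mount() {
  case $1 in
    /data2*) echo 1 ;;   # /data2, /data22, /data222
    /data3*) echo 2 ;;   # /data3, /data33, /data333
    /data4*) echo 3 ;;   # /data4, /data44, /data444
    *) echo "no NUMA mapping for mount point: $1" >&2; return 1 ;;
  esac
}
```

For example, numa_node_for_mount /data33 prints 2, so /data33 should be created on a disk connected to NUMA node 2.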
Creating Mount Points
Use LVM (Logical Volume Manager) to create the mount points. You may use the following commands as an example of how to configure a single mount point (/data2 on disk nvme0n1 in this case):
pvcreate -ff /dev/nvme0n1
vgcreate vg_data2 /dev/nvme0n1
lvcreate -l 100%FREE -n lv_data vg_data2
mkfs.xfs -f /dev/vg_data2/lv_data
echo "/dev/vg_data2/lv_data /data2 xfs defaults 0 0" >> /etc/fstab
mkdir -p /data2
mount /data2
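Since the same commands are repeated for every data disk, they can be generated per mount point. The sketch below only prints the commands for review and does not touch any disk. The function name lvm_commands is illustrative, and the second pairing in the example call (/data22 on /dev/nvme1n1) is an assumption that must be replaced with the mapping chosen for your server.

```shell
#!/bin/sh
# Sketch only: print the LVM commands for one mount point ($1) on one
# disk ($2), following the example for /data2 on nvme0n1.
lvm_commands() (
  mount_point=$1
  disk=$2
  vg="vg_${mount_point#/}"   # /data2 -> vg_data2
  cat <<EOF
pvcreate -ff $disk
vgcreate $vg $disk
lvcreate -l 100%FREE -n lv_data $vg
mkfs.xfs -f /dev/$vg/lv_data
echo "/dev/$vg/lv_data $mount_point xfs defaults 0 0" >> /etc/fstab
mkdir -p $mount_point
mount $mount_point
EOF
)

# Example pairing (hypothetical), one "mount point, disk" pair per call:
lvm_commands /data2 /dev/nvme0n1
lvm_commands /data22 /dev/nvme1n1
```

Note that pvcreate -ff and mkfs.xfs -f destroy any existing data on the disk, so the printed commands should be checked against the intended disk names before they are executed.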
Inspecting final configuration
Execute the following command and verify mount points (this example is for 6 disks per cell member and does not include other mount points that should exist):
lsblk

Expected output:
NAME                 MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1              259:2    0  2.9T  0 disk
└─vg_data2-lv_data   253:0    0  2.9T  0 lvm  /data2
nvme1n1              259:5    0  2.9T  0 disk
└─vg_data22-lv_data  253:11   0  2.9T  0 lvm  /data22
nvme2n1              259:1    0  2.9T  0 disk
└─vg_data3-lv_data   253:9    0  2.9T  0 lvm  /data3
nvme3n1              259:0    0  2.9T  0 disk
└─vg_data33-lv_data  253:10   0  2.9T  0 lvm  /data33
nvme4n1              259:3    0  2.9T  0 disk
└─vg_data44-lv_data  253:8    0  2.9T  0 lvm  /data44
nvme5n1              259:4    0  2.9T  0 disk
└─vg_data4-lv_data   253:7    0  2.9T  0 lvm  /data4
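Instead of comparing the lsblk output by eye, the required mount points can be checked with a short script. This is a sketch, and the function names required_mounts and missing_mounts are illustrative. It reads a list of mount points on standard input and prints every required mount point that is absent.

```shell
#!/bin/sh
# Sketch only: list the required data mount points for 6 or 9 disks.
required_mounts() {
  case $1 in
    6) echo "/data2 /data22 /data3 /data33 /data4 /data44" ;;
    9) echo "/data2 /data22 /data222 /data3 /data33 /data333 /data4 /data44 /data444" ;;
    *) echo "unsupported disk count: $1" >&2; return 1 ;;
  esac
}

# Read mount points (one per line) on stdin and print each required
# mount point for the given disk count ($1) that is not among them.
missing_mounts() (
  actual=$(cat)
  for mp in $(required_mounts "$1"); do
    printf '%s\n' "$actual" | grep -qx "$mp" || echo "$mp"
  done
)
```

On a cell member with 6 data disks, a possible invocation is lsblk -n -o MOUNTPOINT | missing_mounts 6, where empty output means that all required mount points are present.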
Cell Member Federation
To federate and configure a cell member, run the following script on the cell manager, once per cell member.
Important: The script should be executed using the OS root user, and also requires remote root access over SSH from the cell manager to the cell member.
Execute the script suitable for your environment:
In case of a physical federated cell member with 4 CPU sockets and NVMe disks:
/app/scripts/configure_cell_manager.sh -a <internal IP address of the cell member> -g <external IP address of the cell member> -i physical
In case of a physical federated cell member with 2 CPU sockets or SSD disks:
/app/scripts/configure_cell_manager.sh -a <internal IP address of the cell member> -g <external IP address of the cell member> -i physical -n true
In case of a virtual federated cell member:
/app/scripts/configure_cell_manager.sh -a <internal IP address of the cell member> -g <external IP address of the cell member> -i virtual
The script writes two log files - one on the cell manager and one on the cell member. The log file names are mentioned in the script's output.
In case of a failure, the script will try to roll back the configuration changes it made, so the problem can be fixed before running it again.
If the rollback fails, and the cell member services do not start successfully, it might be required to uninstall DPOD from the cell member, reinstall and federate it again.
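When a cell has several members, the choice between the three command variants above can be captured in a small helper that prints the command for each member. This is a sketch: the function name federation_command and the member kind labels (physical-numa, physical, virtual) are illustrative, and the IP addresses in the example are placeholders from documentation ranges. Only the options shown above (-a, -g, -i, -n) are used.

```shell
#!/bin/sh
# Sketch only: print the federation command for one cell member.
#   $1 internal IP address, $2 external IP address, $3 member kind:
#   physical-numa  physical member with 4 CPU sockets and NVMe disks
#   physical       physical member with 2 CPU sockets or SSD disks
#   virtual        virtual member
federation_command() (
  base="/app/scripts/configure_cell_manager.sh -a $1 -g $2"
  case $3 in
    physical-numa) echo "$base -i physical" ;;
    physical)      echo "$base -i physical -n true" ;;
    virtual)       echo "$base -i virtual" ;;
    *) echo "unknown member kind: $3" >&2; exit 1 ;;
  esac
)

# Example with placeholder addresses:
federation_command 192.0.2.11 198.51.100.11 physical
```

The printed command is meant to be reviewed and then run as root on the cell manager, one member at a time, so that a failure can be investigated before the next member is federated.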
Updating Configuration for Physical Federated Cell Members with 4 CPU Sockets and NVMe Disks
Note: If the cell member server does not have 4 CPU sockets or does not have NVMe disks - skip this step.
To update the service files, execute the following commands:
sed -i 's#/usr/bin/numactl --membind=1 --cpunodebind=1#/usr/bin/numactl --membind=2 --cpunodebind=2#g' /etc/init.d/MonTier-es-raw-trans-Node-3
sed -i 's#/usr/bin/numactl --membind=1 --cpunodebind=1#/usr/bin/numactl --membind=3 --cpunodebind=3#g' /etc/init.d/MonTier-es-raw-trans-Node-4
To verify the NUMA configuration for all services, execute the following command:
grep numactl /etc/init.d/*
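To make the result of the verification easier to read, the binding can be extracted from each service file. The following is a sketch with an illustrative function name (numa_binding). It assumes the numactl options appear in the same order as in the sed commands above.

```shell
#!/bin/sh
# Sketch only: print the NUMA binding found in one service file ($1),
# for example "membind=2 cpunodebind=2". Prints nothing if the file
# contains no matching numactl call.
numa_binding() {
  sed -n 's/.*numactl --membind=\([0-9][0-9]*\) --cpunodebind=\([0-9][0-9]*\).*/membind=\1 cpunodebind=\2/p' "$1"
}

# Possible usage on the cell member:
# for f in /etc/init.d/MonTier-es-raw-trans-Node-*; do
#   echo "$f: $(numa_binding "$f")"
# done
```

After the update, the commands above imply that Node-3 should report node 2 and Node-4 should report node 3.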
Updating Configuration for Federated Cell Members with at least 384GB RAM
Note: If the cell member server has less than 384GB RAM - skip this step.
To update the service files, execute the following commands:
sed -i 's/^NODE_HEAP_SIZE=.*/NODE_HEAP_SIZE="64G"/g' /etc/init.d/MonTier-es-raw-trans-Node-2
sed -i 's/^NODE_HEAP_SIZE=.*/NODE_HEAP_SIZE="64G"/g' /etc/init.d/MonTier-es-raw-trans-Node-3
sed -i 's/^NODE_HEAP_SIZE=.*/NODE_HEAP_SIZE="64G"/g' /etc/init.d/MonTier-es-raw-trans-Node-4
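Before editing the service files in place, the effect of the substitution can be previewed. The sketch below uses the same sed expression without the -i option, so the file is left unchanged. The function name preview_heap_size is illustrative.

```shell
#!/bin/sh
# Sketch only: show the current NODE_HEAP_SIZE line of a service file ($1)
# and the line the substitution would write for the requested size ($2).
# The file itself is not modified (sed runs without -i).
preview_heap_size() {
  echo "current:  $(grep '^NODE_HEAP_SIZE=' "$1")"
  echo "proposed: $(sed -n "s/^NODE_HEAP_SIZE=.*/NODE_HEAP_SIZE=\"$2\"/p" "$1")"
}

# Possible usage on the cell member:
# preview_heap_size /etc/init.d/MonTier-es-raw-trans-Node-2 64G
```

The same grep expression can be run again after the update to confirm that each service file carries the new value.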
Cell Member Federation Verification
After a successful federation, you will be able to see the new federated cell member in the Manage → System → Nodes page. For example:
Also, the new agents will be shown in the agents list in the Manage → Internal Health → Agents page:
Configure the Monitored Gateways to Use the Federated Cell Member Agents
Configure the monitored gateways to use the federated cell member agents. Please follow the instructions in Adding Monitored Devices.