TODO - diagram
Overview - TODO
The remote collector deployment addresses two scenarios:
- Data must be collected across several deployments, but a single consolidated view is required (only one Local Node is required).
- A Local Node is reaching its CPU limit and work must be offloaded (a remote collector can offload up to 20% CPU under high load).
To set up a new Remote Collector server, install an additional DPOD server that meets the prerequisites below. The node that holds the data and the console is called the "Local Node", and the second installation (which contains only the Syslog and WS-M agents) is called the "Remote Collector".
Prerequisites
- The DPOD cell manager and the cell FCMs must run the same version (minimum version is 1.0.10.0).
- The DPOD cell manager can be either an "Appliance Mode" or a "Non Appliance Mode" installation with the "medium" architecture type, as detailed in the Hardware and Software Requirements. The manager server can be either virtual or physical.
- The DPOD cell member (FCM) should be a "Non Appliance Mode" installation with the "High_20dv with High Load" architecture type, as detailed in the Hardware and Software Requirements.
- Each cluster component (manager / FCM) should have two network interfaces:
  - External interface - for DPOD users to access the UI and for communication between DPOD and the Monitored Gateways.
  - Internal interface - for communication between internal DPOD components (should be a 10 Gb Ethernet interface; for more information see configuring FCM).
- Each installation requires different ports to be opened in the firewall - see Table 1.
Table 1
From | To | Ports (Defaults) | Protocol | Usage |
---|---|---|---|---|
Cell Manager DPOD Appliance | Each Monitored Device | 5550 (TCP) | HTTP/S | Monitored Device administration management interface |
Cell Manager DPOD Appliance | DNS Server | TCP and UDP 53 | DNS | DNS services. Static IP address may be used. |
Cell Manager DPOD Appliance | NTP Server | 123 (UDP) | NTP | Time synchronization |
Cell Manager DPOD Appliance | Organizational mail server | 25 (TCP) | SMTP | Send reports by email |
Cell Manager DPOD Appliance | LDAP | TCP 389 / 636 (SSL). TCP 3268 / 3269 (SSL) | LDAP | Authentication & authorization. Can be over SSL |
Cell Manager DPOD Appliance | Each of the Cell Member DPOD Appliance | 9300-9305 (TCP) | Elasticsearch | Elasticsearch Communication (data + management) |
NTP Server | Cell Manager DPOD Appliance | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Cell Manager DPOD Appliance | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Cell Manager DPOD Appliance | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
Users' IPs | Cell Manager DPOD Appliance | 443 (TCP) | HTTP/S | Access to the IBM DataPower Operations Dashboard Console |
Admins' IPs | Cell Manager DPOD Appliance | 22 (TCP) | TCP | SSH |
Cell Member DPOD Appliance | Cell Manager DPOD Appliance | 9200, 9300-9400 (TCP) | Elasticsearch | Elasticsearch Communication (data + management) |
Cell Member DPOD Appliance | DNS Server | TCP and UDP 53 | DNS | DNS services |
Cell Member DPOD Appliance | NTP Server | 123 (UDP) | NTP | Time synchronization |
NTP Server | Cell Member DPOD Appliance | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Cell Member DPOD Appliance | 60000-60003 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Cell Member DPOD Appliance | 60020-60023 (TCP) | HTTP/S | WS-M Payloads |
Admins' IPs | Cell Member DPOD Appliance | 22 (TCP) | TCP | SSH |
Manager Installation
The DPOD cell manager can be either an "Appliance Mode" or a "Non Appliance Mode" installation with the "medium" architecture type, as detailed in the Hardware and Software Requirements. The manager server can be either virtual or physical.
- "Appliance Mode" installation procedure
- "Non Appliance Mode" installation procedure
As described in the prerequisites section, the cell topology requires two network interfaces. When installing the cell manager (the standard DPOD installation, before federating it into the cell), the user is prompted to choose the IP address for the UI console. This should be the IP address of the "External Interface".
Federated Cluster Member Installation
The following section describes the installation process of a single Federated Cluster Member (FCM). The user should repeat the procedure for every FCM installation.
Prerequisites
- The DPOD cell member (FCM) should be a "Non Appliance Mode" installation with the "High_20dv with High Load" architecture type, as detailed in the Hardware and Software Requirements.
- In addition to the "Non Appliance Mode" software requirements, the user should install the following software packages (RPM):
iptables
iptables-services
numactl
Installation
DPOD installation
Install DPOD in "Non Appliance Mode" as described in the following installation procedure.
As described in the prerequisites section, the cell topology requires two network interfaces. When installing the cell member (the standard DPOD installation, before federating it into the cell), the user is prompted to choose the IP address for the UI console.
Although the cell member does not run the UI service, the user should choose the IP address of the "External Interface".
After the DPOD installation is complete, the user should execute the following operating system performance optimization script:
/app/scripts/tune-os-parameters.sh
The user should then reboot the server for the new performance optimizations to take effect.
Prepare Cell Member for Federation
The cell member is usually a "bare metal" server with NVMe disks to maximize server throughput.
Each of the Store's logical nodes (services) is bound to a specific physical processor, set of disks and memory region using NUMA (non-uniform memory access).
The default cell member configuration assumes 6 NVMe disks that serve 3 Store data nodes (2 disks per node).
The user should configure the following OS mount points before federating the DPOD installation as a "cell member".
We highly recommend using LVM (Logical Volume Manager) to allow flexible storage for future storage needs.
Note - the empty table cells should be completed by the user based on their specific hardware.
Store Node | mount point path | Disk Bay | Disk Serial | Disk Path | CPU No |
---|---|---|---|---|---|
2 | /data2 | | | | |
2 | /data22 | | | | |
3 | /data3 | | | | |
3 | /data33 | | | | |
4 | /data4 | | | | |
4 | /data44 | | | | |
How to identify Disk OS path and Disk serial
- To identify which of the server's NVMe disk bays is bound to which CPU, use the hardware manufacturer's documentation.
Also, write down each disk's serial number by visually inspecting the disk. To identify the disk OS path (for example, /dev/nvme0n1) and the disk serial, the user should install the NVMe disk utility software provided by the hardware supplier. For example, for Intel-based NVMe SSD disks install the "Intel® SSD Data Center Tool" (isdct).
Example output of the Intel SSD DC tool:
isdct show -intelssd
- Intel SSD DC P4500 Series PHLE822101AN3PXXXX -
Bootloader : 0133
DevicePath : /dev/nvme0n1
DeviceStatus : Healthy
Firmware : QDV1LV45
FirmwareUpdateAvailable : Please contact your Intel representative about firmware update for this drive.
Index : 0
ModelNumber : SSDPE2KE032T7L
ProductFamily : Intel SSD DC P4500 Series
SerialNumber : PHLE822101AN3PXXXX
- Use the disk bay number and the disk serial number (visually identified) and correlate them with the output of the disk tool to identify the disk OS path.
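The correlation between serial number and OS path can be scripted. The following is a minimal sketch, assuming the tool prints "Key : Value" lines as in the sample output above and that DevicePath appears before SerialNumber within each drive block. The function name map_serial_to_path is hypothetical and is not part of DPOD or isdct.

```shell
#!/usr/bin/env bash
# Print "<serial> <device path>" for every drive block read from stdin.
# Assumption: lines look like "Key : Value", and DevicePath precedes
# SerialNumber inside each block (as in the sample output).
map_serial_to_path() {
  awk -F' : ' '
    $1 == "DevicePath"   { path = $2 }
    $1 == "SerialNumber" { print $2, path; path = "" }
  '
}

# Demo using fields copied from the sample output (trimmed).
sample_output='- Intel SSD DC P4500 Series PHLE822101AN3PXXXX -
Bootloader : 0133
DevicePath : /dev/nvme0n1
DeviceStatus : Healthy
SerialNumber : PHLE822101AN3PXXXX'

printf '%s\n' "$sample_output" | map_serial_to_path
```

On a real server the same function could be fed directly from the tool, for example: isdct show -intelssd | map_serial_to_path. The resulting pairs can then be matched against the serial numbers written down from the disk bays.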
Examples for Mount Points and Disk Configurations
Store Node | mount point path | Disk Bay | Disk Serial | Disk Path | CPU No |
---|---|---|---|---|---|
2 | /data2 | 1 | PHLE822101AN3PXXXX | /dev/nvme0n1 | 1 |
2 | /data22 | 2 | | /dev/nvme1n1 | 1 |
3 | /data3 | 4 | | /dev/nvme2n1 | 2 |
3 | /data33 | 5 | | /dev/nvme3n1 | 2 |
4 | /data4 | 12 | | /dev/nvme4n1 | 3 |
4 | /data44 | 13 | | /dev/nvme5n1 | 3 |
Example for LVM Configuration
pvcreate -ff /dev/nvme0n1
vgcreate vg_data2 /dev/nvme0n1
lvcreate -l 100%FREE -n lv_data vg_data2
mkfs.xfs -f /dev/vg_data2/lv_data
pvcreate -ff /dev/nvme1n1
vgcreate vg_data22 /dev/nvme1n1
lvcreate -l 100%FREE -n lv_data vg_data22
mkfs.xfs /dev/vg_data22/lv_data
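The LVM example covers only the first two disks. The sketch below prints (and deliberately does not execute) the same four commands for all six disks, so the list can be reviewed before anything is written to disk. The disk-to-mount-point pairs are taken from the example table and must be replaced with the user's own mapping. The function name print_lvm_commands is hypothetical, and the sketch passes -f to mkfs.xfs for every disk, which is an assumption.

```shell
#!/usr/bin/env bash
# Print (do not run) the LVM commands for one disk.
# $1 = disk path (e.g. /dev/nvme0n1)
# $2 = mount point name without the leading slash (e.g. data2)
print_lvm_commands() {
  local disk="$1" name="$2"
  echo "pvcreate -ff ${disk}"
  echo "vgcreate vg_${name} ${disk}"
  echo "lvcreate -l 100%FREE -n lv_data vg_${name}"
  echo "mkfs.xfs -f /dev/vg_${name}/lv_data"
}

# "disk:name" pairs from the example table; adjust to the actual hardware.
pairs="/dev/nvme0n1:data2 /dev/nvme1n1:data22 /dev/nvme2n1:data3 /dev/nvme3n1:data33 /dev/nvme4n1:data4 /dev/nvme5n1:data44"

for pair in $pairs; do
  print_lvm_commands "${pair%%:*}" "${pair##*:}"
done
```

Printing the commands first is a deliberate choice: pvcreate -ff and mkfs.xfs -f destroy existing data, so the output should be checked against the disk table before it is run as root.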
The /etc/fstab file:
/dev/vg_data2/lv_data /data2 xfs defaults 0 0
/dev/vg_data22/lv_data /data22 xfs defaults 0 0
/dev/vg_data3/lv_data /data3 xfs defaults 0 0
/dev/vg_data33/lv_data /data33 xfs defaults 0 0
/dev/vg_data4/lv_data /data4 xfs defaults 0 0
/dev/vg_data44/lv_data /data44 xfs defaults 0 0
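The /etc/fstab entries follow a single pattern, so they can be generated rather than typed. This is a minimal sketch that only prints the lines; the function name fstab_line is hypothetical, and the naming scheme (vg_<name> and lv_data) is the one used in the LVM example.

```shell
#!/usr/bin/env bash
# Print the /etc/fstab entry for one mount point.
# $1 = mount point name without the leading slash (e.g. data2)
fstab_line() {
  printf '/dev/vg_%s/lv_data /%s xfs defaults 0 0\n' "$1" "$1"
}

for name in data2 data22 data3 data33 data4 data44; do
  fstab_line "$name"
done
```

After the lines are added to /etc/fstab, the mount point directories still need to exist (mkdir -p) before the file systems can be mounted (mount -a).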
Example of the final configuration for 3 Store nodes
This does not include the other mount points needed, as described in the DPOD Hardware and Software Requirements.
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 2.9T 0 disk
└─vg_data2-lv_data 253:6 0 2.9T 0 lvm /data2
nvme1n1 259:5 0 2.9T 0 disk
└─vg_data22-lv_data 253:3 0 2.9T 0 lvm /data22
nvme2n1 259:1 0 2.9T 0 disk
└─vg_data3-lv_data 253:2 0 2.9T 0 lvm /data3
nvme3n1 259:2 0 2.9T 0 disk
└─vg_data33-lv_data 253:5 0 2.9T 0 lvm /data33
nvme4n1 259:4 0 2.9T 0 disk
└─vg_data44-lv_data 253:7 0 2.9T 0 lvm /data44
nvme5n1 259:3 0 2.9T 0 disk
└─vg_data4-lv_data 253:8 0 2.9T 0 lvm /data4
Cell Member Federation
To federate and configure a cell member, run the following script in the cell manager once per cell member. For example, to add two cell members, run the script twice (in the cell manager): the first time with the IP addresses of the first cell member, and the second time with the IP addresses of the second cell member.
Important: the command should be executed as the OS "root" user.
/app/scripts/configure_federated_cluster_manager.sh -a <internal IP address of the cell member> -g <external IP address of the cell member>
For example:
/app/scripts/configure_federated_cluster_manager.sh -a 172.18.100.34 -g 172.17.100.33
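When several cell members are added, the per-member invocations can be prepared in one place. The sketch below only prints the command for each member so it can be reviewed and then run as root in the cell manager. The function name federation_command is hypothetical, and the second address pair is a made-up placeholder; only the first pair comes from the example above.

```shell
#!/usr/bin/env bash
# Build the federation command for one cell member.
# $1 = internal IP address, $2 = external IP address of the cell member
federation_command() {
  echo "/app/scripts/configure_federated_cluster_manager.sh -a $1 -g $2"
}

# "internal:external" pairs. The second pair is a hypothetical placeholder.
members="172.18.100.34:172.17.100.33 172.18.100.35:172.17.100.34"

for m in $members; do
  federation_command "${m%%:*}" "${m##*:}"
done
```

Running the members one at a time, and checking the script's log files after each run, keeps a failure on one member from being masked by the next invocation.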
Example of a successful execution - note that the script writes two log files, one in the cell manager and one in the cell member. The log file names are mentioned in the script's output. - TODO
Example of a failed execution - you will need to check the log file for further information.
In case of a failure, the script will try to roll back the configuration changes it made, so you can fix the problem and run it again. - TODO
Cell Member Federation Post Steps
NUMA configuration
The DPOD cell member uses NUMA (non-uniform memory access).
The default cell member configuration binds DPOD's agents to CPU 0 and the Store's nodes to CPU 1.
If the server has 4 CPUs, the user should edit the service files of nodes 2 and 3 and change the bound CPU to 2 and 3 respectively.
Identify NUMA configuration
To identify the number of CPUs installed on the server, use the NUMA utility:
numactl -s
Example output for a 4 CPU server:
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12
cpubind: 0 1 2 3
nodebind: 0 1 2 3
membind: 0 1 2 3
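The number of CPUs can be read off the nodebind line of the output. The following is a minimal sketch, assuming one NUMA node per physical CPU (as this page does) and the output layout shown in the example. The function name count_numa_nodes is hypothetical.

```shell
#!/usr/bin/env bash
# Count the NUMA nodes listed on the "nodebind:" line of
# `numactl -s` output, read from stdin.
count_numa_nodes() {
  awk '$1 == "nodebind:" { print NF - 1 }'
}

# Demo using the example output for a 4 CPU server.
sample='policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12
cpubind: 0 1 2 3
nodebind: 0 1 2 3
membind: 0 1 2 3'

printf '%s\n' "$sample" | count_numa_nodes
```

On the server itself the function would be used as: numactl -s | count_numa_nodes. A result of 4 indicates that the optional per-node binding described in this section applies.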
Alter store's node 3-4
OPTIONAL - if the server has 4 CPUs, alter the Store nodes' service files to bind each node to a different CPU.
The service files are located in the directory /etc/init.d/ and are named MonTier-es-raw-trans-Node-2 and MonTier-es-raw-trans-Node-3.
For node MonTier-es-raw-trans-Node-2
OLD VALUE : numa="/usr/bin/numactl --membind=1 --cpunodebind=1"
NEW VALUE : numa="/usr/bin/numactl --membind=2 --cpunodebind=2"
For node MonTier-es-raw-trans-Node-3
OLD VALUE : numa="/usr/bin/numactl --membind=1 --cpunodebind=1"
NEW VALUE : numa="/usr/bin/numactl --membind=3 --cpunodebind=3"
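The change from the old value to the new value can also be made with sed instead of a text editor. This is a sketch only, assuming the service file contains a single line that starts (after optional whitespace) with numa=. The function name set_numa_node is hypothetical, and the demo runs against a temporary file rather than the real files under /etc/init.d/.

```shell
#!/usr/bin/env bash
# Rewrite the numa= line of a service file so the node is bound
# to the given NUMA node, keeping any leading indentation.
# $1 = service file path, $2 = NUMA node number
set_numa_node() {
  sed -i -e "s|^\([[:space:]]*\)numa=.*|\1numa=\"/usr/bin/numactl --membind=$2 --cpunodebind=$2\"|" "$1"
}

# Demo on a temporary file that holds only the old value.
demo_file="$(mktemp)"
echo 'numa="/usr/bin/numactl --membind=1 --cpunodebind=1"' > "$demo_file"
set_numa_node "$demo_file" 2
cat "$demo_file"
rm -f "$demo_file"
```

It is advisable to back up each service file before editing it, and to confirm the result with grep numa= on the edited file.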
Cell Member Federation Verification
After a successful execution, the new remote collectors appear in the Manage → System → Nodes page.
For example, if we added two remote collectors:
Also, the new agents appear in the Manage → Internal Health → Agents page.
For example, if we have one local node with two agents and two remote collectors with two agents each, the page shows six agents:
Configure the Monitored Device to Use the Remote Collector's Agents
It is possible to assign an entire monitored device, or just a specific domain, to the remote collector's agents.
To configure a monitored device or a specific domain, follow the instructions in Adding Monitored Devices.
Manual Setup Steps
We recommend using the script described in the previous section.
There is no need to take any manual steps if you have already run the script.
The following communication and ports are used in a remote collector deployment scenario (Table 1). Run the following commands on each DPOD local firewall:
Run in the Local Node -
Change XXXX to the IP of the Remote Collector:
iptables -I INPUT -p tcp -s XXXX/24 --dport 9300:9309 -j ACCEPT
service iptables save
service iptables restart
After running the commands, run the following command and search the output for two entries showing port 9300 (shown in red in the screenshot below):
iptables -L -n
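Searching the listing by eye can be replaced with a count. The sketch below counts the lines of its input that mention a given port as a whole word. The function name count_port_entries is hypothetical, and the sketch makes no assumption about the listing's column layout.

```shell
#!/usr/bin/env bash
# Count the input lines that contain the given port number as a whole word.
# $1 = port number; the listing is read from stdin.
count_port_entries() {
  grep -cw -- "$1"
}

# Intended usage on the Local Node (run as root):
#   iptables -L -n | count_port_entries 9300
```

A result of 2 corresponds to the two entries this step expects to find.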
Table 1
From | To | Ports (Defaults) | Protocol | Usage |
---|---|---|---|---|
Local Node DPOD Appliance | Each Monitored Device | 5550 (TCP) | HTTP/S | Monitored Device administration management interface |
Local Node DPOD Appliance | DNS Server | TCP and UDP 53 | DNS | DNS services |
Local Node DPOD Appliance | NTP Server | 123 (UDP) | NTP | Time synchronization |
Local Node DPOD Appliance | Organizational mail server | 25 (TCP) | SMTP | Send reports by email |
Local Node DPOD Appliance | LDAP | TCP 389 / 636 (SSL). TCP 3268 / 3269 (SSL) | LDAP | Authentication & authorization. Can be over SSL |
NTP Server | Local Node DPOD Appliance | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Local Node DPOD Appliance | 60000-60009 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Local Node DPOD Appliance | 60020-60029 (TCP) | HTTP/S | WS-M Payloads |
Users' IPs | Local Node DPOD Appliance | 443 (TCP) | HTTP/S | Access to the IBM DataPower Operations Dashboard Console |
Admins' IPs | Local Node DPOD Appliance | 22 (TCP) | TCP | SSH |
Remote Collector DPOD Appliance | Each Monitored Device | 5550 (TCP) | HTTP/S | Monitored Device administration management interface |
Remote Collector DPOD Appliance | DNS Server | TCP and UDP 53 | DNS | DNS services |
Remote Collector DPOD Appliance | NTP Server | 123 (UDP) | NTP | Time synchronization |
Remote Collector DPOD Appliance | Organizational mail server | 25 (TCP) | SMTP | Send reports by email |
Remote Collector DPOD Appliance | LDAP | TCP 389 / 636 (SSL). TCP 3268 / 3269 (SSL) | LDAP | Authentication & authorization. Can be over SSL |
NTP Server | Remote Collector DPOD Appliance | 123 (UDP) | NTP | Time synchronization |
Each Monitored Device | Remote Collector DPOD Appliance | 60000-60009 (TCP) | TCP | SYSLOG Data |
Each Monitored Device | Remote Collector DPOD Appliance | 60020-60029 (TCP) | HTTP/S | WS-M Payloads |
Users' IPs | Remote Collector DPOD Appliance | 443 (TCP) | HTTP/S | Access to the IBM DataPower Operations Dashboard Console |
Admins' IPs | Remote Collector DPOD Appliance | 22 (TCP) | TCP | SSH |
- From the Local Node's UI, go to the Manage menu, select "Nodes" under "System" and click "Edit".
Enter the IP address of the Remote Collector device and click "Update". You can leave the "Agents DNS Address" field empty.
- In the Local Node:
Connect to the Local Node DPOD via SSH as the root user (using PuTTY or any other SSH client).
Using the Command Line Interface, choose option 2 - "Stop All" and wait until all the services are stopped. This may take a few minutes to complete.
- In the Local Node:
Using PuTTY or any other SSH client, issue the following command:
sed -i -e "s/^SERVICES_SIXTH_GROUP=\".*MonTier-SyslogAgent-1 MonTier-HK-WdpServiceResources MonTier-HK-WdpDeviceResources/SERVICES_SIXTH_GROUP=\"MonTier-HK-WdpServiceResources MonTier-HK-WdpDeviceResources/g" /etc/sysconfig/MonTier
- In the Local Node:
Using PuTTY or any other SSH client, issue the following commands:
mv /etc/init.d/MonTier-SyslogAgent-1 /etc/init.d/Disabled-MonTier-SyslogAgent-1
mv /etc/init.d/MonTier-SyslogAgent-2 /etc/init.d/Disabled-MonTier-SyslogAgent-2
mv /etc/init.d/MonTier-SyslogAgent-3 /etc/init.d/Disabled-MonTier-SyslogAgent-3
mv /etc/init.d/MonTier-SyslogAgent-4 /etc/init.d/Disabled-MonTier-SyslogAgent-4
mv /etc/init.d/MonTier-SyslogAgent-5 /etc/init.d/Disabled-MonTier-SyslogAgent-5
mv /etc/init.d/MonTier-SyslogAgent-6 /etc/init.d/Disabled-MonTier-SyslogAgent-6
mv /etc/init.d/MonTier-SyslogAgent-7 /etc/init.d/Disabled-MonTier-SyslogAgent-7
mv /etc/init.d/MonTier-SyslogAgent-8 /etc/init.d/Disabled-MonTier-SyslogAgent-8
mv /etc/init.d/MonTier-SyslogAgent-9 /etc/init.d/Disabled-MonTier-SyslogAgent-9
mv /etc/init.d/MonTier-SyslogAgent-10 /etc/init.d/Disabled-MonTier-SyslogAgent-10
mv /etc/init.d/MonTier-WsmAgent-1 /etc/init.d/Disabled-MonTier-WsmAgent-1
mv /etc/init.d/MonTier-WsmAgent-2 /etc/init.d/Disabled-MonTier-WsmAgent-2
mv /etc/init.d/MonTier-WsmAgent-3 /etc/init.d/Disabled-MonTier-WsmAgent-3
mv /etc/init.d/MonTier-WsmAgent-4 /etc/init.d/Disabled-MonTier-WsmAgent-4
mv /etc/init.d/MonTier-WsmAgent-5 /etc/init.d/Disabled-MonTier-WsmAgent-5
Note: some errors might appear for services that do not exist in your specific deployment architecture type - for example "mv: cannot stat ‘/etc/init.d/MonTier-SyslogAgent-10’: No such file or directory"
- In the Local Node:
Using any text editor (such as vi), edit the /etc/hosts file (e.g. vi /etc/hosts).
Change the following entries:
montier-es from 127.0.0.1 to the IP of the Local Node device
montier-syslog and montier-wsm to the IP of the Remote Collector device
Save the changes when exiting (e.g. :wq).
- In the Local Node:
Using the Command Line Interface, select option 1 - "Start All". This may take a few minutes to complete.
- Connect to the Remote Collector DPOD via SSH as the root user (using PuTTY or any other SSH client).
Using the Command Line Interface, choose option 2 - "Stop All" and wait until all the services are stopped. This may take a few minutes to complete.
- In the Remote Collector:
Using PuTTY or any other SSH client, issue the following commands:
mv /etc/init.d/MonTier-es-raw-trans-Node-1 /etc/init.d/Disabled-MonTier-es-raw-trans-Node-1
mv /etc/init.d/MonTier-es-raw-trans-Node-2 /etc/init.d/Disabled-MonTier-es-raw-trans-Node-2
mv /etc/init.d/MonTier-es-raw-trans-Node-3 /etc/init.d/Disabled-MonTier-es-raw-trans-Node-3
mv /etc/init.d/MonTier-es-raw-trans-Node-4 /etc/init.d/Disabled-MonTier-es-raw-trans-Node-4
mv /etc/init.d/MonTier-Derby /etc/init.d/Disabled-MonTier-Derby
mv /etc/init.d/MonTier-HK-ESRetention /etc/init.d/Disabled-MonTier-HK-ESRetention
mv /etc/init.d/MonTier-HK-SyslogKeepalive /etc/init.d/Disabled-MonTier-HK-SyslogKeepalive
mv /etc/init.d/MonTier-HK-WsmKeepalive /etc/init.d/Disabled-MonTier-HK-WsmKeepalive
mv /etc/init.d/MonTier-HK-WdpDeviceResources /etc/init.d/Disabled-MonTier-HK-WdpDeviceResources
mv /etc/init.d/MonTier-HK-WdpServiceResources /etc/init.d/Disabled-MonTier-HK-WdpServiceResources
mv /etc/init.d/MonTier-Reports /etc/init.d/Disabled-MonTier-Reports
mv /etc/init.d/MonTier-UI /etc/init.d/Disabled-MonTier-UI
sed -i -e "s/^SERVICES_FIRST_GROUP=\".*/SERVICES_FIRST_GROUP=\"\"/g" /etc/sysconfig/MonTier
sed -i -e "s/^SERVICES_SECOND_GROUP=\".*/SERVICES_SECOND_GROUP=\"\"/g" /etc/sysconfig/MonTier
sed -i -e "s/^SERVICES_THIRD_GROUP=\".*/SERVICES_THIRD_GROUP=\"\"/g" /etc/sysconfig/MonTier
sed -i -e "s/\MonTier-HK-WdpServiceResources MonTier-HK-WdpDeviceResources//g" /etc/sysconfig/MonTier
sed -i -e "s/^SERVICES_SEVENTH_GROUP=\".*/SERVICES_SEVENTH_GROUP=\"\"/g" /etc/sysconfig/MonTier
Note: some errors might appear for services that do not exist in your specific deployment architecture type - for example "mv: cannot stat ‘/etc/init.d/MonTier-es-raw-trans-Node-4’: No such file or directory"
- In the Remote Collector:
Using any text editor (such as vi), edit the /etc/hosts file (e.g. vi /etc/hosts).
Change the following entries:
montier-es from 127.0.0.1 to the IP of the Local Node device
- In the Remote Collector:
Using the Command Line Interface, choose option 1 - "Start All" and wait until all the services are started. This may take a few minutes to complete.
- Verify in the console, in Manage → Internal Health → Agents, that all agents are in a green state.
- Run the following two scripts (you will need to obtain them from IBM support):
in the Local Node - configure_local_node.sh
in the Remote Collector - configure_remote_collector.sh
- In the Local Node - !! Only if DPOD was already attached to DataPower Gateways !!
You will need to reconfigure all the attached devices.
After the setup is complete, DPOD's web console will no longer be available on the Remote Collector. The only way to connect to the Remote Collector will be via an SSH client.
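The rename steps in the procedure above print "cannot stat" errors for services that are not installed. A loop that checks for each file first avoids that noise and makes real failures easier to spot. This is a minimal sketch, not the method used by the supported script. The function name disable_services is hypothetical, and the demo runs in a scratch directory rather than /etc/init.d/.

```shell
#!/usr/bin/env bash
# Rename each listed service script to Disabled-<name>,
# skipping names that do not exist in the directory.
# $1 = directory holding the service scripts; remaining args = service names
disable_services() {
  local dir="$1" name
  shift
  for name in "$@"; do
    if [ -e "${dir}/${name}" ]; then
      mv "${dir}/${name}" "${dir}/Disabled-${name}"
    else
      echo "skipping ${name}: not present"
    fi
  done
}

# Demo in a scratch directory with two of the three services present.
demo_dir="$(mktemp -d)"
touch "${demo_dir}/MonTier-SyslogAgent-1" "${demo_dir}/MonTier-WsmAgent-1"
disable_services "$demo_dir" MonTier-SyslogAgent-1 MonTier-SyslogAgent-2 MonTier-WsmAgent-1
ls "$demo_dir"
rm -rf "$demo_dir"
```

On a real node the first argument would be /etc/init.d, followed by the service names listed in the relevant step, with all services stopped beforehand as the procedure requires.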