DPOD automatically runs several self-diagnostic tests (checks) on the following internal components:
- Config Database availability
- Config Database size
- ElasticSearch (the Big Data Store) availability
- Store Nodes status
- File system free space
- Internal Big Data Retention process
- Dropped syslog messages
- Dropped WS-M messages
- Syslog and WS-M agent status
- Log Target Misconfiguration
- Log Target Enabled and Down
- Critical Service
Info |
---|
You can turn off some or all of the checks by logging in to your DPOD server (via ssh), editing the file /app/hk_keepalive/MonTier-HK-SyslogKeepalive/conf/MonTierHousekeeping.conf, and restarting the keepalive service via app-util.sh or from the admin console. |
The internal alerts can be published in three ways:
- Via Email: on the system parameters page, change the value of "Internal Alerts - Send Email on Alert" to "true", and enter the destination email addresses (comma separated) in "Internal Alerts - Email Destination Address for Alerts". You need to restart the keepalive service via app-util.sh for the change to take effect.
- Via Syslog: on the system parameters page, change the value of "Internal Alerts - Send Syslog on Alert" to "true". The syslog destination is identical to the destination of the normal system alerts and can be configured with the parameters "Hostname of the target server for syslog alerts" and "Port of the target server for syslog alerts". You need to restart the keepalive service via app-util.sh for the changes to take effect.
- As a notification in the web console: you can change the interval of the alerts on the user preferences page.
Internal Alerts Page
The internal alerts page shows current and historical alerts.
Latest Checks Status Section
Each box represents one check type. Clicking a box filters the grid below so that it shows only the internal alerts of that type.
The boxes are color coded as follows:
Green - No problems detected
Red - One or more issues were detected, or the most recent scheduled check did not run (because of a problem with DPOD, for example) and a previous check detected a problem.
Grey - The check has not run yet. Some checks only start a few minutes to an hour after DPOD starts up. This is normal and does not indicate a problem.
The bottom section shows a table with all the alerts (current and historical).
Filters
Start Time: Show internal alerts by time range in the grid below.
Check Type: Show internal alerts by type in the grid below.
Internal Alerts Grid:
Name | Description |
---|---|
Time | The time when the internal alert was generated |
Alert | Description of the internal alert - for example "Syslog agent is down" |
Additional info | Any other diagnostic information - for example "Agent Name: SyslogAgent-1" |
UUID | An internal UUID. You can search for this UUID in the logs when instructed to do so by support personnel |
Publishing Internal Alerts
Publishing Internal Alerts to Web Console
See the internal alerts field in the User Preferences page.
Publishing Internal Alerts via E-Mail
Navigate to the System Parameters page.
Update the following parameters:
- Set "Internal Alerts - Send Email on Alert" to true.
- Set "Internal Alerts - Email Destination Address for Alerts" to the required e-mail address.
- Fill in the "Email SMTP" category parameters with your organization's SMTP e-mail server details.
Publishing Internal Alerts via Syslog Messages
Internal alerts can be published to other systems via Syslog messages by following these steps:
Navigate to the System Parameters page.
Update the following parameters:
- Set "Internal Alerts - Send Syslog on Alert" to true.
- Set "Alerts Syslog Server Hostname" to the required host name or IP address.
- Set "Alerts Syslog Server Port" to the required port number.
- Set "Syslog Severity Field Value" to the required severity level (info, warn, error, etc).
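To check that alerts actually arrive at the configured target, a small listener on the receiving host can help. The following is a minimal sketch, assuming the alerts are sent as UDP datagrams (this page does not state the transport). The function names and the port 5514 are hypothetical and not part of DPOD.

```python
import socket


def open_udp_listener(host: str = "0.0.0.0", port: int = 5514) -> socket.socket:
    """Open a UDP socket on the given address (port 5514 is only an example)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one_syslog_message(sock: socket.socket, timeout: float = 5.0) -> str:
    """Wait for a single datagram and return it as text.

    Raises socket.timeout (an alias of TimeoutError on recent Python
    versions) if nothing arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    data, _sender = sock.recvfrom(65535)
    return data.decode("utf-8", errors="replace")
```

For a quick test, call open_udp_listener() with the port set in "Alerts Syslog Server Port", trigger or wait for an internal alert, and print the result of receive_one_syslog_message().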
Internal Alerts Syslog Message Format
Code Block |
---|
<16>Mar 14 13:26:22 ${hostname} [0x00a0002a][DPOD-internal-alert][info] AlertContent:(A critical directory was not found in filesystem) AlertUUID:(d44b291e-efda-4236-bd2c-8de0ab1d4e3d) AdditionalInfo:(Mount Point: /logs) |
Field | Description |
---|---|
${hostname} | DPOD server hostname |
0x00a0002a | The message ID, which is always 0x00a0002a for all alerts |
info | The message level, which may be set via the system parameter |
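A receiving system usually needs to split the message into its fields. The following is a minimal sketch of a parser, with the pattern derived only from the single sample message above. The function name and the group names are illustrative assumptions, and real messages may vary in ways the sample does not show.

```python
import re
from typing import Optional

# Pattern derived from the one sample message on this page.
# Group names are illustrative and not defined by DPOD.
ALERT_PATTERN = re.compile(
    r"^<(?P<pri>\d+)>"
    r"(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"\[(?P<message_id>0x[0-9a-fA-F]+)\]"
    r"\[DPOD-internal-alert\]"
    r"\[(?P<level>\w+)\] "
    r"AlertContent:\((?P<content>.*?)\) "
    r"AlertUUID:\((?P<uuid>[0-9a-fA-F-]+)\) "
    r"AdditionalInfo:\((?P<additional_info>.*)\)$"
)


def parse_internal_alert(line: str) -> Optional[dict]:
    """Return the fields of an internal alert message, or None if the
    line does not look like the sample format."""
    match = ALERT_PATTERN.match(line.strip())
    return match.groupdict() if match else None
```

The AlertContent group is matched lazily up to the ") AlertUUID:(" separator, so values that contain parentheses themselves, such as "Found log target(s) in a down state...", are still captured whole.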
List of Possible AlertContent Values
Cannot connect to DB
Query failed from DB
Not enough space in mount point
A critical directory was not found in filesystem
Store cluster status is red
Could not connect to the Store
Syslog agent is down
WS-M agent is down
Messaging agent is down
Agent dropped Syslog messages
Agent dropped WS-M messages
The Store retention process is not working
Database table exceeding threshold size
Publishing a message to mail/syslog failed
A post upgrade configuration failed - please re-setup log targets on all monitored devices from the Manage menu
Found a misconfigured log target, please setup Syslog again for the device
Found log target(s) in a down state, setting up Syslog for the device might provide more information
A critical service is not working
Reindex execution failed
Cloud agent interact failure
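A system that consumes these alerts may want to notice when an AlertContent value is not one of the values listed above, for example after a DPOD upgrade adds new ones. The following is a minimal sketch of such a check. The constant and function names are hypothetical, and the strings are copied from the list above on the assumption that the alert text matches them exactly.

```python
# AlertContent values copied from the list on this page.
KNOWN_ALERT_CONTENT = frozenset({
    "Cannot connect to DB",
    "Query failed from DB",
    "Not enough space in mount point",
    "A critical directory was not found in filesystem",
    "Store cluster status is red",
    "Could not connect to the Store",
    "Syslog agent is down",
    "WS-M agent is down",
    "Messaging agent is down",
    "Agent dropped Syslog messages",
    "Agent dropped WS-M messages",
    "The Store retention process is not working",
    "Database table exceeding threshold size",
    "Publishing a message to mail/syslog failed",
    "A post upgrade configuration failed - please re-setup log targets "
    "on all monitored devices from the Manage menu",
    "Found a misconfigured log target, please setup Syslog again for the device",
    "Found log target(s) in a down state, setting up Syslog for the device "
    "might provide more information",
    "A critical service is not working",
    "Reindex execution failed",
    "Cloud agent interact failure",
})


def is_known_alert_content(content: str) -> bool:
    """Return True if the value matches one of the documented strings exactly
    (ignoring leading and trailing whitespace)."""
    return content.strip() in KNOWN_ALERT_CONTENT
```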