The Agents screen is accessible by logging into the Web Console and navigating to [Manage→Internal Health→Agents].
...
Detail | Description | Desired State |
---|---|---|
General Health Status | The general health status of the agent is relayed by the color of the box wrapping its details. | Green - the agent is healthy. |
Date / Time | The timestamp of the last successful Keepalive check. A delay of over a minute may suggest a performance problem. Note: It is important to verify that system time is synced correctly when reading these values. | < 3 minutes |
Msg. Rate | The total number of messages of all types (syslog, WS-M, and keepalive messages) processed by this agent in the last 10 minutes. Verify that the value falls within the desired range. You may redirect syslog traffic from one agent to another by assigning a domain to a specific agent. For more details, see Adding Monitored Devices. | 1 < value < 500,000 (if there are any monitored devices) |
Dropped Msgs (for Syslog and WS-M agents) | The total number of syslog or WS-M messages that were sent from the monitored devices but were not processed by the DPOD agent in the last 10 minutes. Dropped messages usually indicate that the agent cannot keep up with the load. Consider redistributing traffic to other agents. | 0 |
If you encounter any problems, see how to Troubleshoot the agent status.
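The desired states in the table above can be expressed as a small checklist. The following is a minimal sketch, assuming the values are read manually from the Agents screen; the function name `check_agent` and its parameters are hypothetical and are not part of any DPOD API.

```python
from datetime import datetime, timedelta

# Thresholds copied from the "Desired State" column of the table above.
MAX_KEEPALIVE_AGE = timedelta(minutes=3)
MIN_MSG_RATE = 1
MAX_MSG_RATE = 500_000


def check_agent(last_keepalive, msg_rate, dropped_msgs, now, has_monitored_devices=True):
    """Return a list of warnings; an empty list means every check passed.

    All arguments are values read from the Agents screen (hypothetical helper).
    """
    warnings = []
    # Date / Time: the last successful keepalive should be under 3 minutes old.
    if now - last_keepalive >= MAX_KEEPALIVE_AGE:
        warnings.append("keepalive is older than 3 minutes")
    # Msg. Rate: only meaningful when there are monitored devices.
    if has_monitored_devices and not (MIN_MSG_RATE < msg_rate < MAX_MSG_RATE):
        warnings.append("message rate is outside 1 < value < 500,000")
    # Dropped Msgs: the desired state is exactly 0.
    if dropped_msgs != 0:
        warnings.append("agent dropped messages in the last 10 minutes")
    return warnings


now = datetime(2024, 1, 1, 12, 0, 0)
print(check_agent(now - timedelta(seconds=30), 12_000, 0, now))  # []
```

Because the keepalive check compares timestamps, it is only reliable when the system time is synced, as noted in the table.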
...
Graph | Details | Desired State |
---|---|---|
Files Pending Processing | The graph depicts the number of large payloads waiting to be processed. A value higher than 1000 indicates a high load on the WS-M subscription. WS-M usage should be avoided until this folder is cleared by the system or you clear it manually. | Only a few files displayed |
Channel Fill % Utilization | The graph depicts stream processing usage as a percentage. Each colored graph denotes a different agent. Verify that all agents use less than 80% of their stream processing capacity. If usage goes above 80%, data coming in from collector agents might be lost. | Under 80% for all agents |
Agents Free Memory | The graph depicts the collector agents’ free memory over time, where each agent is denoted in a different color. When an agent's free memory is too low, you might encounter performance problems. See how to Troubleshoot the issue. | Verify that each agent |
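The two numeric limits mentioned for the graphs (1000 pending files and 80% channel fill) can be checked together. This is a minimal sketch, assuming the readings are copied by hand from the graphs; `review_graphs` and the agent names in the example are hypothetical.

```python
# Limits taken from the graph descriptions above.
FILES_PENDING_LIMIT = 1000   # more than this indicates a high WS-M load
CHANNEL_FILL_LIMIT = 80.0    # percent; the desired state is under 80% for all agents


def review_graphs(files_pending, channel_fill_by_agent):
    """Return a list of findings for values outside the desired state."""
    findings = []
    if files_pending > FILES_PENDING_LIMIT:
        findings.append(f"{files_pending} files pending: avoid WS-M usage until cleared")
    # Report agents in a stable (alphabetical) order.
    for agent, percent in sorted(channel_fill_by_agent.items()):
        if percent >= CHANNEL_FILL_LIMIT:
            findings.append(f"{agent}: channel fill at {percent}%, data might be lost")
    return findings


# Hypothetical readings for two agents.
for finding in review_graphs(1500, {"agent-a": 85.0, "agent-b": 20.0}):
    print(finding)
```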
...
The agent details widget displays the following information for the agent:
Detail | Description |
---|---|
Host IP | The host IP where this agent runs |
DNS | The DNS of the agent (if set) |
Port | The port the agent is listening on |
Keepalive | On / Off state of the keepalive service for this agent |
Dropped Msgs (10 mins) | How many messages were lost by the agent |
Message Rate (10 min) | Number of messages handled by the agent in the last 10 minutes |
Newest Message | Timestamp of the latest message received on this agent |
...
Column | Description |
---|---|
Device | The device emitting the message. Click on the device name to view the device in the Raw Messages view. |
Domain | The domain for this message |
Category | Always montier-ka |
Severity | Always debug |
Time | Timestamp of the Keep-Alive message |
Direction | N/A |
Object Type | N/A |
Object Name | N/A |
Trans. ID | N/A |
Client IP | Always the originating host, so 0.0.0.0 |
Message | Keep-Alive message text. |
...
Reporting Domains (24 hrs.)
This widget lists the domains that reported in the preceding 24-hour period.
The list may be used to identify a device that has dropped off the monitoring list.
Column | Description |
---|---|
Device Name | Name of reporting device |
Domain Name | Name of reporting domain |
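One way to spot a device that has dropped off is to compare the widget's rows against the devices you expect to be monitored. The sketch below assumes the rows are copied from the widget as (device name, domain name) pairs; `missing_devices` and the device names are hypothetical.

```python
def missing_devices(expected_devices, reporting_rows):
    """Return expected devices that have no reporting domain in the last 24 hours.

    reporting_rows: iterable of (device_name, domain_name) pairs from the widget.
    """
    reporting = {device for device, _domain in reporting_rows}
    return sorted(set(expected_devices) - reporting)


# Hypothetical example: dp2 is expected but does not appear in the widget.
rows = [("dp1", "default"), ("dp3", "prod"), ("dp3", "test")]
print(missing_devices(["dp1", "dp2", "dp3"], rows))  # ['dp2']
```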
Recent Resources Messages
This widget is displayed only for the Device and Service Resources agents. It lists recent resource messages in a table.
Resource messages are status messages in which the resource relays the status of its resource consumption.
For each resource message, the table displays the following details:
Column | Description |
---|---|
Device ID | ID of the device this resource message relates to |
Device Name | Name of the device this resource message relates to |
Load Time | Timestamp when the load sampling was taken |
Load | Load sampling value |
Memory Time | Timestamp when the memory sampling was taken |
Used Memory | Memory used sampling value |
Total Memory | Total memory for the device |
Total Memory % | Percentage of total memory used at sampling time |
CPU Time | Timestamp when the CPU usage sampling was taken |
CPU | CPU usage (%) at sampling time |
Monitored Devices (24 hrs.)
...