The System Health dashboard provides an overview of the health state of every device, based on metrics.
The dashboard is composed of multiple cards, optionally divided into groups, each displaying the health of a device.
In addition, the dashboard displays the DPOD health state.
Assumptions:
- The existing alerts infrastructure (with a few additional fields) will provide all the data and logic needed to decide whether a sample should raise an alert and whether it is considered an error, a warning, or good.
- The alerts mechanism will be the only source of current and future metrics. New fields: whether the alert is used as a health metric, warning threshold, damage points
- Any health metric must be based on a detailed investigation screen
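The new alert fields and the good/warning/error decision could be modeled roughly as follows. This is a minimal sketch, not the actual alert schema: the class name `AlertDefinition`, all field names, and the rule that a value at or above a threshold triggers the corresponding state (higher values are worse) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HealthState(Enum):
    GOOD = "good"
    WARNING = "warning"
    ERROR = "error"


@dataclass(frozen=True)
class AlertDefinition:
    """Hypothetical alert record extended with the new health fields."""
    name: str
    error_threshold: float                     # existing alert threshold (assumed name)
    is_health_metric: bool = False             # new field: used as a health metric
    warning_threshold: Optional[float] = None  # new field
    damage_points: int = 0                     # new field


def classify_sample(alert: AlertDefinition, value: float) -> HealthState:
    """Classify a single sample, assuming higher values are worse."""
    if value >= alert.error_threshold:
        return HealthState.ERROR
    if alert.warning_threshold is not None and value >= alert.warning_threshold:
        return HealthState.WARNING
    return HealthState.GOOD
```

For example, with a hypothetical CPU alert using a warning threshold of 70 and an error threshold of 90, a sample of 75 would be classified as a warning.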
...
- List of default metrics:
- The user can define whether a metric is part of the system health by selecting the "System Health Metric" option
- The metric settings can be edited from [Manage → Alert → Setup Alerts → Edit Alert]
Device Health Settings:
- For each device, the user can define whether the device is displayed in the System Health dashboard, as well as its Damage Points Threshold and Total Warnings Threshold
- For each device, the user can set thresholds and damage points per health metric. - TODO add link to device settings
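One way the two device-level thresholds could combine per-metric results into a single device state is sketched below. The aggregation rule is not spelled out here, so the rule used (sum the damage points of metrics in error and compare the total against the Damage Points Threshold, then count warnings and compare against the Total Warnings Threshold) is an assumption, as are all class, function, and field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceHealthSettings:
    """Hypothetical per-device settings mirroring the options listed above."""
    show_in_dashboard: bool
    damage_points_threshold: int
    total_warnings_threshold: int


def device_state(settings, metric_results):
    """Aggregate per-metric results into one device state.

    metric_results: iterable of (state, damage_points) pairs, one per health
    metric, where state is "good", "warning" or "error".
    """
    results = list(metric_results)  # allow generators; we iterate twice
    damage = sum(points for state, points in results if state == "error")
    warnings = sum(1 for state, _ in results if state == "warning")
    if damage >= settings.damage_points_threshold:
        return "error"
    if warnings >= settings.total_warnings_threshold:
        return "warning"
    return "good"
```

Under these assumptions, a device with a Damage Points Threshold of 10 would be reported as in error once its failing metrics accumulate 10 or more damage points, regardless of how many individual metrics failed.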
Device Group Settings:
- For each device, the user can define the device group and the display order - TODO add link to device groups
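Arranging device cards by group and display order could look like the sketch below. The dictionary keys `name`, `group`, and `display_order` are hypothetical, and treating a group of `None` as "ungrouped" is an assumption based on groups being optional.

```python
from collections import defaultdict


def group_devices(devices):
    """Return {group: [device names sorted by display_order]}.

    devices: list of dicts with hypothetical keys "name", "group"
    (None when the device is ungrouped) and "display_order".
    """
    grouped = defaultdict(list)
    # Sort once by display order; insertion order within each group is kept.
    for device in sorted(devices, key=lambda d: d["display_order"]):
        grouped[device["group"]].append(device["name"])
    return dict(grouped)
```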
System Parameters (DB) - default values for:
- "System Health Dashboard Sample Time Range (min.)" - defaults to 5 minutes.
...
- A single device card - includes:
- Health states of the past hour:
Icon | Description | Last X minutes (System Parameter) | Past hour health |
---|---|---|---|
(icon) | Good | | |
(icon) | Warning | | |
(icon) | Error | | |
(icon) | No metrics samples | - | |
(icon) + background color of card is red | Critical | | - |
- Clicking a device card should open the device health dashboard
...