Streams Replication Manager Health Tests
Streams Replication Manager SRM Driver Health
This is a Streams Replication Manager service-level health test that checks that enough of the SRM Drivers in the cluster are healthy. The test returns "Concerning" health if the number of healthy SRM Drivers falls below a warning threshold, expressed as a percentage of the total number of SRM Drivers. The test returns "Bad" health if the combined number of healthy and "Concerning" SRM Drivers falls below a critical threshold, also expressed as a percentage of the total number of SRM Drivers. For example, if this test is configured with a warning threshold of 95% and a critical threshold of 90% for a cluster of 100 SRM Drivers, the test returns "Good" health if 95 or more SRM Drivers have "Good" health. It returns "Concerning" health if fewer than 95 SRM Drivers have "Good" health but at least 90 SRM Drivers have either "Good" or "Concerning" health. It returns "Bad" health if more than 10 SRM Drivers have "Bad" health. A failure of this health test indicates unhealthy SRM Drivers. Check the status of the individual SRM Drivers for more information. This test can be configured using the Healthy SRM Driver Monitoring Thresholds Streams Replication Manager service-wide monitoring setting.
Short Name: SRM Driver Health
Healthy SRM Driver Monitoring Thresholds
- Description
- The health test thresholds for the overall SRM Driver health. The check returns "Concerning" health if the percentage of "Healthy" SRM Drivers falls below the warning threshold. The check returns "Bad" health if the combined percentage of "Healthy" and "Concerning" SRM Drivers falls below the critical threshold.
- Template Name
- STREAMS_REPLICATION_MANAGER_STREAMS_REPLICATION_MANAGER_DRIVER_healthy_thresholds
- Default Value
- critical:49.99, warning:94.99
- Unit(s)
- PERCENT
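The threshold logic described for the SRM Driver Health test can be sketched as follows. This is only an illustrative model, not Cloudera Manager's implementation: the function name `driver_health`, the `Health` enum, the handling of an empty cluster, and the strict "falls below" comparison at the boundary are all assumptions made for the example. The default arguments mirror the default value listed above (critical:49.99, warning:94.99).

```python
from enum import Enum


class Health(Enum):
    GOOD = "Good"
    CONCERNING = "Concerning"
    BAD = "Bad"


def driver_health(good: int, concerning: int, bad: int,
                  warning: float = 94.99, critical: float = 49.99) -> Health:
    """Hypothetical model of the service-level SRM Driver health check.

    good, concerning, bad: number of SRM Drivers in each health state.
    warning, critical: thresholds as percentages of all SRM Drivers.
    """
    total = good + concerning + bad
    if total == 0:
        # Assumption: the source does not say what happens with no drivers.
        raise ValueError("at least one SRM Driver is required")

    pct_good = 100.0 * good / total
    pct_good_or_concerning = 100.0 * (good + concerning) / total

    # "Bad" when Good plus Concerning drivers fall below the critical threshold.
    if pct_good_or_concerning < critical:
        return Health.BAD
    # "Concerning" when Good drivers alone fall below the warning threshold.
    if pct_good < warning:
        return Health.CONCERNING
    return Health.GOOD


# The worked example from the test description: 100 drivers,
# warning threshold 95%, critical threshold 90%.
print(driver_health(95, 0, 5, warning=95, critical=90).value)
print(driver_health(80, 10, 10, warning=95, critical=90).value)
print(driver_health(80, 9, 11, warning=95, critical=90).value)
```

The critical threshold is evaluated first so that a cluster with too many "Bad" drivers is never reported as merely "Concerning".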
Streams Replication Manager SRM Service Health
This Streams Replication Manager service-level health test checks for the presence of a running, healthy SRM Service. The test returns "Bad" health if the Streams Replication Manager service is running and the SRM Service role is not running. In all other cases it returns the health of the SRM Service. A failure of this health test indicates a stopped or unhealthy SRM Service. Check the status of the SRM Service for more information. This test can be enabled or disabled using the SRM Service Role Health Test Streams Replication Manager service-wide monitoring setting.
Short Name: SRM Service Health
SRM Service Role Health Test
- Description
- When computing the overall STREAMS_REPLICATION_MANAGER health, consider the SRM Service's health.
- Template Name
- STREAMS_REPLICATION_MANAGER_STREAMS_REPLICATION_MANAGER_SERVICE_health_enabled
- Default Value
- true
- Unit(s)
- no unit
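The decision rule described for the SRM Service Health test can be sketched in the same way. Again, this is a hypothetical model rather than Cloudera Manager's code: the function name, its parameters, and the choice to return `None` when the test is disabled are assumptions made for illustration.

```python
from typing import Optional


def srm_service_health_test(service_running: bool,
                            role_running: bool,
                            role_health: str,
                            enabled: bool = True) -> Optional[str]:
    """Hypothetical model of the SRM Service Health test.

    service_running: whether the Streams Replication Manager service is running.
    role_running: whether the SRM Service role is running.
    role_health: the SRM Service role's own health, e.g. "Good".
    enabled: models the SRM Service Role Health Test setting (default true).
    """
    if not enabled:
        # Assumption: a disabled test contributes no result.
        return None
    # "Bad" when the service is running but the SRM Service role is not.
    if service_running and not role_running:
        return "Bad"
    # In all other cases the test reports the role's own health.
    return role_health


print(srm_service_health_test(True, False, "Good"))
print(srm_service_health_test(True, True, "Concerning"))
```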