Streams Replication Manager Health Tests

Streams Replication Manager SRM Driver Health

This is a Streams Replication Manager service-level health test that checks that enough of the SRM Drivers in the cluster are healthy. The test returns "Concerning" health if the number of healthy SRM Drivers falls below a warning threshold, expressed as a percentage of the total number of SRM Drivers. The test returns "Bad" health if the number of healthy and "Concerning" SRM Drivers falls below a critical threshold, expressed as a percentage of the total number of SRM Drivers. For example, if this test is configured with a warning threshold of 95% and a critical threshold of 90% for a cluster of 100 SRM Drivers, this test would return "Good" health if 95 or more SRM Drivers have good health. This test would return "Concerning" health if at least 90 SRM Drivers have either "Good" or "Concerning" health. If more than 10 SRM Drivers have bad health, this test would return "Bad" health. A failure of this health test indicates unhealthy SRM Drivers. Check the status of the individual SRM Drivers for more information. This test can be configured using the Streams Replication Manager Streams Replication Manager service-wide monitoring setting.

Short Name: SRM Driver Health

Healthy SRM Driver Monitoring Thresholds

Description
The health test thresholds of the overall SRM Driver health. The check returns "Concerning" health if the percentage of "Healthy" SRM Drivers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" SRM Drivers falls below the critical threshold.
Template Name
STREAMS_REPLICATION_MANAGER_STREAMS_REPLICATION_MANAGER_DRIVER_healthy_thresholds
Default Value
critical:49.99, warning:94.99
Unit(s)
PERCENT

Streams Replication Manager SRM Service Health

This Streams Replication Manager service-level health test checks for the presence of a running, healthy SRM Service. The test returns "Bad" health if the service is running and the SRM Service is not running. In all other cases it returns the health of the SRM Service. A failure of this health test indicates a stopped or unhealthy SRM Service. Check the status of the SRM Service for more information. This test can be enabled or disabled using the SRM Service Role Health Test SRM Service service-wide monitoring setting.

Short Name: SRM Service Health

SRM Service Role Health Test

Description
When computing the overall STREAMS_REPLICATION_MANAGER health, consider SRM Service's health
Template Name
STREAMS_REPLICATION_MANAGER_STREAMS_REPLICATION_MANAGER_SERVICE_health_enabled
Default Value
true
Unit(s)
no unit