This is a MapReduce service-level health test that checks that enough of the TaskTrackers in the cluster are healthy. The test returns "Concerning" health if the number of healthy TaskTrackers falls below a warning threshold, expressed as a percentage of the total number of TaskTrackers. The test returns "Bad" health if the number of healthy and "Concerning" TaskTrackers falls below a critical threshold, also expressed as a percentage of the total number of TaskTrackers.

For example, suppose this test is configured with a warning threshold of 95% and a critical threshold of 90% for a cluster of 100 TaskTrackers. The test returns "Good" health if 95 or more TaskTrackers have "Good" health. It returns "Concerning" health if fewer than 95 TaskTrackers have "Good" health but at least 90 TaskTrackers have either "Good" or "Concerning" health. It returns "Bad" health if more than 10 TaskTrackers have "Bad" health.

A failure of this health test indicates unhealthy TaskTrackers. Check the status of the individual TaskTrackers for more information. This test can be configured using the Healthy TaskTracker Monitoring Thresholds MapReduce service-wide monitoring setting.
Short Name: TaskTracker Health
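The threshold logic described above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not the actual implementation of the health test; the function name `tasktracker_health` and its parameters are hypothetical, and the defaults mirror the documented default thresholds (warning 95.0, critical 90.0).

```python
def tasktracker_health(good, concerning, total, warning=95.0, critical=90.0):
    """Illustrative sketch of the documented threshold rules.

    good, concerning: counts of TaskTrackers in each health state.
    total: total number of TaskTrackers (the remainder are assumed "Bad").
    warning, critical: thresholds as percentages of the total.
    All names here are hypothetical and chosen for illustration only.
    """
    if total <= 0:
        raise ValueError("total must be positive")

    good_pct = 100.0 * good / total
    good_or_concerning_pct = 100.0 * (good + concerning) / total

    # "Bad" if Good + Concerning TaskTrackers fall below the critical threshold.
    if good_or_concerning_pct < critical:
        return "Bad"
    # "Concerning" if Good TaskTrackers alone fall below the warning threshold.
    if good_pct < warning:
        return "Concerning"
    return "Good"


# The 100-TaskTracker example from the description:
print(tasktracker_health(good=95, concerning=0, total=100))  # Good
print(tasktracker_health(good=89, concerning=1, total=100))  # Concerning
print(tasktracker_health(good=85, concerning=4, total=100))  # Bad
```

The critical check runs first so that a cluster below both thresholds reports "Bad" rather than "Concerning".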
Healthy TaskTracker Monitoring Thresholds
- Description
- The health test thresholds for overall TaskTracker health. The check returns "Concerning" health if the percentage of "Healthy" TaskTrackers falls below the warning threshold. The check returns "Bad" health if the combined percentage of "Healthy" and "Concerning" TaskTrackers falls below the critical threshold.
- Template Name
- mapreduce_tasktrackers_healthy_thresholds
- Default Value
- critical:90.0, warning:95.0
- Unit(s)
- PERCENT