Ozone DataNode Health
This is a Ozone service-level health test that checks that enough of the Ozone DataNodes in the cluster are healthy. The test returns "Concerning" health if the number of healthy Ozone DataNodes falls below a warning threshold, expressed as a percentage of the total number of Ozone DataNodes. The test returns "Bad" health if the number of healthy and "Concerning" Ozone DataNodes falls below a critical threshold, expressed as a percentage of the total number of Ozone DataNodes. For example, if this test is configured with a warning threshold of 95% and a critical threshold of 90% for a cluster of 100 Ozone DataNodes, this test would return "Good" health if 95 or more Ozone DataNodes have good health. This test would return "Concerning" health if at least 90 Ozone DataNodes have either "Good" or "Concerning" health. If more than 10 Ozone DataNodes have bad health, this test would return "Bad" health. A failure of this health test indicates unhealthy Ozone DataNodes. Check the status of the individual Ozone DataNodes for more information. This test can be configured using the Ozone Ozone service-wide monitoring setting.
Short Name: Ozone DataNode Health
Healthy Ozone DataNode Monitoring Thresholds
- Description
- The health test thresholds of the overall Ozone DataNode health. The check returns "Concerning" health if the percentage of "Healthy" Ozone DataNodes falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" Ozone DataNodes falls below the critical threshold.
- Template Name
- OZONE_OZONE_DATANODE_healthy_thresholds
- Default Value
- critical:90.0, warning:99.0
- Unit(s)
- PERCENT