Atlas Health Tests

Atlas Server Health

This is a Atlas service-level health test that checks that enough of the Atlas Servers in the cluster are healthy. The test returns "Concerning" health if the number of healthy Atlas Servers falls below a warning threshold, expressed as a percentage of the total number of Atlas Servers. The test returns "Bad" health if the number of healthy and "Concerning" Atlas Servers falls below a critical threshold, expressed as a percentage of the total number of Atlas Servers. For example, if this test is configured with a warning threshold of 95% and a critical threshold of 90% for a cluster of 100 Atlas Servers, this test would return "Good" health if 95 or more Atlas Servers have good health. This test would return "Concerning" health if at least 90 Atlas Servers have either "Good" or "Concerning" health. If more than 10 Atlas Servers have bad health, this test would return "Bad" health. A failure of this health test indicates unhealthy Atlas Servers. Check the status of the individual Atlas Servers for more information. This test can be configured using the Atlas Atlas service-wide monitoring setting.

Short Name: Atlas Server Health

Healthy Atlas Server Monitoring Thresholds

Description
The health test thresholds of the overall Atlas Server health. The check returns "Concerning" health if the percentage of "Healthy" Atlas Servers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" Atlas Servers falls below the critical threshold.
Template Name
ATLAS_ATLAS_SERVER_healthy_thresholds
Default Value
critical:90.0, warning:99.0
Unit(s)
PERCENT