Apache Ambari Operations

Ambari Metrics Alerts

Alert

Description

Potential Causes

Possible Remedies

Metrics Collector Process

This alert is triggered if the Metrics Collector cannot be confirmed to be up and listening on the configured port for a number of seconds equal to the configured threshold.

The Metrics Collector process is not running.

Check whether the Metrics Collector is running.

Metrics Collector – ZooKeeper Server Process

This host-level alert is triggered if the Metrics Collector ZooKeeper Server Process cannot be determined to be up and listening on the network.

The Metrics Collector process is not running.

Check whether the Metrics Collector is running.

Metrics Collector – HBase Master Process

This alert is triggered if the Metrics Collector HBase Master process cannot be confirmed to be up and listening on the network for the configured critical threshold, given in seconds.

The Metrics Collector process is not running.

Check whether the Metrics Collector is running.

Metrics Collector – HBase Master CPU Utilization

This host-level alert is triggered if CPU utilization of the Metrics Collector exceeds the configured warning or critical threshold.

Unusually high CPU utilization is generally a sign of an issue in the daemon configuration.

Tune the Ambari Metrics Collector.

Metrics Monitor Status

This host-level alert is triggered if the Metrics Monitor process cannot be confirmed to be up and running on the network.

The Metrics Monitor is down.

Check whether the Metrics Monitor is running on the given host.

Percent Metrics Monitors Available

This is an AGGREGATE alert of the Metrics Monitor Status.

Metrics Monitors are down.

Check whether the Metrics Monitors are running.

Metrics Collector – Auto-Restart Status

This alert is triggered if the Metrics Collector has been auto-started a number of times equal to the start threshold within a one-hour timeframe. By default, you receive a Warning alert if it is restarted 2 times in an hour, and a Critical alert if it is restarted 4 or more times in an hour.

The Metrics Collector is running but is unstable and causing restarts. This could be due to improper tuning.

Tune the Ambari Metrics Collector.

Grafana Web UI

This host-level alert is triggered if the AMS Grafana Web UI is unreachable.

The Grafana process is not running.

Check whether the Grafana process is running. Restart if it has gone down.
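
Several of the alerts above reduce to two checks: whether a process is listening on its port, and how many restarts occurred within an hour. The following is a minimal sketch of both checks for manual troubleshooting, not Ambari's own alert implementation. The function names `is_listening` and `restart_severity` are hypothetical, the host and port must be taken from your own cluster configuration, and treating 3 restarts as a Warning is an assumption, since the defaults above specify only 2 (Warning) and 4 or more (Critical).

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def restart_severity(restart_times, now, warning=2, critical=4, window=3600):
    """Classify restarts observed in the last `window` seconds.

    restart_times: iterable of restart timestamps, in seconds.
    now: current timestamp, in seconds.
    The defaults mirror the thresholds described for the Auto-Restart
    Status alert (2 restarts for Warning, 4 or more for Critical, per hour).
    Assumption: any count from `warning` up to `critical - 1` is a Warning.
    """
    recent = [t for t in restart_times if 0 <= now - t <= window]
    if len(recent) >= critical:
        return "CRITICAL"
    if len(recent) >= warning:
        return "WARNING"
    return "OK"
```

For example, calling `is_listening` with the Metrics Collector host and its configured port indicates whether the Metrics Collector Process alert is likely to clear, and the same probe applies to the Grafana Web UI. A closed port says only that nothing is accepting connections; the component's own logs are still needed to find the cause.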

More Information

Tuning Ambari Metrics