Managing and Monitoring a Cluster

ZooKeeper alerts

Descriptions, potential causes, and possible remedies for alerts triggered by ZooKeeper.

Table 1. ZooKeeper Alerts
Alert: Percent ZooKeeper Servers Available
Alert Type: AGGREGATE
Description: This service-level alert is triggered if the configured percentage of ZooKeeper processes cannot be determined to be up and listening on the network for the configured critical threshold, given in seconds. It aggregates the results of ZooKeeper process checks.
Potential Causes:

The majority of your ZooKeeper servers are down and not responding.

Possible Remedies:

Check the dependent services to make sure they are operating correctly.

Check the ZooKeeper logs (/var/log/hadoop/zookeeper.log) for further information.

If the failure was associated with a particular workload, try to understand the workload better.

Restart the ZooKeeper servers from the Ambari UI.

Alert: ZooKeeper Server Process
Alert Type: PORT
Description: This host-level alert is triggered if the ZooKeeper server process cannot be determined to be up and listening on the network for the configured critical threshold, given in seconds.
Potential Causes:

The ZooKeeper server process is down on the host.

The ZooKeeper server process is up and running but not listening on the correct network port (default 2181).

Possible Remedies:

Check for any errors in the ZooKeeper logs (/var/log/hbase/) and restart the ZooKeeper process using Ambari Web.

Run the netstat -tuplpn command to check whether the ZooKeeper server process is bound to the correct network port.
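
To confirm from another machine what the two alerts report, the port check and its aggregation can be approximated with a short script. The following is a minimal sketch in Python, not Ambari's own alert implementation: the function names is_port_listening and percent_unavailable are hypothetical, and the 1.5-second timeout is an assumed value rather than a documented default.

```python
import socket


def is_port_listening(host, port=2181, timeout=1.5):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it approximates a PORT-style check by attempting
    a plain TCP connect to the ZooKeeper client port (default 2181).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def percent_unavailable(results):
    """Return the percentage of failed checks in a list of booleans.

    Hypothetical helper: each entry is True when the server was reachable.
    An empty list yields 0.0 so that "no servers checked" never divides by zero.
    """
    if not results:
        return 0.0
    failed = sum(1 for ok in results if not ok)
    return 100.0 * failed / len(results)
```

For example, calling is_port_listening once per ZooKeeper host and passing the collected results to percent_unavailable gives a figure that can be compared against whatever percentage is configured for the aggregate alert. A successful connect only shows that something is listening on the port, so the netstat check on the host is still needed to confirm that the listener is the ZooKeeper process.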