5.2.2. NameNode HA Healthy process

This service-level alert is triggered if either the Active NameNode or the Standby NameNode is not running.

 5.2.2.1. Potential causes
  • The Active NameNode process, the Standby NameNode process, or both are down.

  • The Nagios Server cannot connect to one or both NameNode hosts.

 5.2.2.2. Possible remedies
  • On each host running NameNode, check the logs (/var/log/hadoop/hdfs/) for errors, then restart the NameNode host or process using Ambari Web.

  • On each host running NameNode, run the netstat -tuplpn command to check whether the NameNode process is bound to the correct network port.

  • Use ping to check the network connection between the Nagios server and the hosts running NameNode.
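The port and connectivity checks above can also be scripted from the Nagios server. The following is a minimal sketch, not part of Ambari or Nagios: the function names are hypothetical, and the default port 8020 is an assumption (a common NameNode RPC port; use the port configured for your cluster). It tests whether a TCP connection to each NameNode host succeeds, which complements ping because it confirms that something is actually listening on the port.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_namenodes(hosts, port=8020):
    """Map each host to whether the given port accepts connections.

    The default port 8020 is an assumption; substitute your cluster's value.
    """
    return {host: port_is_open(host, port) for host in hosts}
```

Calling check_namenodes with the two NameNode host names returns a dictionary such as one entry per host with True or False. A False value for a host that answers ping suggests that the NameNode process is down or bound to a different port, so the log and netstat checks on that host are the next step.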

