This alert is triggered if the NameNode process cannot be confirmed to be up and
listening on the network for the configured critical threshold, given in seconds. It uses
the Nagios check_tcp plugin [1].
Potential causes:
The NameNode process is down on the HDFS master host.
The NameNode process is up and running but not listening on the correct network port (default 8201).
The Nagios server cannot connect to the HDFS master through the network.
Possible remedies:
Check for any errors in the logs (/var/log/hadoop/hdfs/) and restart the NameNode host/process using the HMC Manage Services tab.
Run the netstat -tuplpn command to check if the NameNode process is bound to the correct network port.
Use ping to check the network connection between the Nagios server and the NameNode.
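The causes listed above can often be told apart by how a TCP connection attempt fails. The following is a minimal Python sketch of that idea, not part of Ambari or Nagios; the function name diagnose and its return labels are hypothetical and chosen only for illustration. It assumes it is run from the Nagios server, so that it exercises the same network path as the alert.

```python
import socket


def diagnose(host, port, timeout=5.0):
    """Classify a TCP connection attempt to host:port.

    Hypothetical helper for illustration. Returns one of:
      "listening"   - something accepted the connection on that port
      "refused"     - host reachable, but nothing is bound to the port
                      (process down, or listening on a different port)
      "unreachable" - timeout, DNS failure, or another network error
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # Covers timeouts, name-resolution failures, and routing errors.
        return "unreachable"
```

For example, calling diagnose with the NameNode host name and its configured port and getting "refused" points toward the first two causes, while "unreachable" points toward a network problem between the Nagios server and the HDFS master. A firewall that rejects rather than drops packets can also produce "refused", so the result is a hint, not a diagnosis.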
[1] The check_tcp plugin tests if a process is up and listening on a
specified socket (host/port) address. Ambari uses this check to determine the run time
status of various Hadoop services. With future Ambari releases this functionality will
be improved to include more robust tests such as running some service operations to
make sure the service is healthy.
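To make the footnote concrete, here is a rough Python sketch of the kind of test it describes: open a TCP connection to a host/port and report a status. This is an illustration only, not the check_tcp source. The function name tcp_status, the message text, and the way the critical threshold is applied are assumptions; the numeric codes follow the standard Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import socket
import time

# Standard Nagios plugin return codes.
OK = 0
CRITICAL = 2


def tcp_status(host, port, critical_seconds):
    """Return (code, message) for a TCP connect test against host:port.

    Hypothetical sketch: the connection attempt is given at most
    critical_seconds to succeed, otherwise the result is CRITICAL.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=critical_seconds):
            elapsed = time.monotonic() - start
    except OSError as exc:
        # Refused, timed out, or unreachable: the socket is not confirmed up.
        return CRITICAL, f"CRITICAL - {host}:{port}: {exc}"
    return OK, f"OK - {host}:{port} answered in {elapsed:.3f}s"
```

A check of this kind only confirms that something accepts connections on the port. It says nothing about whether the service behind it is healthy, which is the limitation the footnote notes.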