5.1.4. DataNode process

This host-level alert is triggered when an individual DataNode process cannot be confirmed to be up and listening on the network within the configured critical threshold, given in seconds. It uses the Nagios check_tcp plugin.
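As a rough illustration of what a TCP connect check amounts to, the following Python sketch tries to open a connection to a host and port within a time limit. It is not the check_tcp plugin's implementation. The host name datanode01.example.com is hypothetical, and port 50010 is an assumption based on the common Hadoop 2.x default for dfs.datanode.address; substitute the values configured for your cluster.

```python
import socket


def tcp_port_open(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical host and assumed port; replace with your own values.
    host, port = "datanode01.example.com", 50010
    state = "listening" if tcp_port_open(host, port, timeout_s=10.0) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

Here the timeout plays the role of the critical threshold: if no connection is established within that many seconds, the check counts as failed.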

 5.1.4.1. Potential causes
  • DataNode process is down or not responding

  • DataNode is not down but is not listening on the correct network port/address

  • Nagios server cannot connect to the DataNodes

 5.1.4.2. Possible remedies
  • Check for dead DataNodes in Ambari Web.

  • Check for any errors in the DataNode logs (/var/log/hadoop/hdfs) and restart the DataNode, if necessary

  • Run the netstat -tuplpn command to check whether the DataNode process is bound to the correct network port

  • Use ping to check the network connection between the Nagios server and the DataNode
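
When narrowing down which of the causes above applies, it can help to look at how the connection attempt fails rather than only whether it fails. The sketch below, meant to be run from the Nagios server, is one possible way to do that in Python. The function name diagnose_tcp, the host datanode01.example.com, and port 50010 are all illustrative assumptions, not values taken from this guide.

```python
import socket


def diagnose_tcp(host: str, port: int, timeout_s: float = 5.0) -> str:
    """Classify the outcome of a single TCP connection attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "listening"
    except ConnectionRefusedError:
        # Host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all within timeout_s.
        return "timeout"
    except socket.gaierror:
        # Host name could not be resolved.
        return "dns"
    except OSError:
        # Any other network error, such as no route to host.
        return "unreachable"


if __name__ == "__main__":
    # Hypothetical host and assumed port; replace with your own values.
    print(diagnose_tcp("datanode01.example.com", 50010))
```

Treat the result as a hint only. A "refused" result usually suggests the DataNode process is down or bound to a different port or address, while "timeout" or "unreachable" more often points to a network problem between the Nagios server and the DataNode. A firewall can produce either result.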

