5.3.1. ResourceManager process

This host-level alert is triggered if the ResourceManager process cannot be confirmed to be up and listening on the network within the configured critical threshold, given in seconds. It uses the Nagios check_tcp plugin.
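As a rough illustration of what a TCP check of this kind does, the following Python sketch attempts a single connection and reports whether it succeeded within a time limit. It is not the check_tcp plugin itself; the function name `tcp_check`, the host name, the port 8050, and the 10-second limit are assumptions chosen for the example and should be replaced with the values configured for your cluster.

```python
import socket
import time


def tcp_check(host, port, critical_seconds=10.0):
    """Try one TCP connection; return (ok, elapsed_seconds, detail).

    critical_seconds plays the role of the critical threshold: if no
    connection is established within that time, the check fails.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=critical_seconds):
            pass
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution errors.
        return False, time.monotonic() - start, str(exc)
    return True, time.monotonic() - start, "connected"


if __name__ == "__main__":
    # Hypothetical host and port; substitute your ResourceManager address.
    ok, elapsed, detail = tcp_check("resourcemanager.example.com", 8050)
    state = "OK" if ok else "CRITICAL"
    print(f"{state}: {detail} ({elapsed:.3f}s)")
```

Running the script from the Nagios Server host gives a quick indication of whether the alert condition can be reproduced manually.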

 5.3.1.1. Potential causes
  • The ResourceManager process is down or not responding.

  • The ResourceManager is not down but is not listening on the correct network port/address.

  • The Nagios Server cannot connect to the ResourceManager.

 5.3.1.2. Possible remedies
  • Check for a dead ResourceManager process.

  • Check for any errors in the ResourceManager logs (/var/log/hadoop/yarn) and restart the ResourceManager, if necessary.

  • Use ping to check the network connection between the Nagios Server and the ResourceManager host.
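To help decide which of the remedies above to try first, the kind of connection failure can be a useful hint. The sketch below is one possible way to classify it in Python; the function name `diagnose`, the category labels, and the suggested next steps are illustrative assumptions, not part of Nagios or Hadoop, and the mapping from failure type to cause is a heuristic rather than a definitive diagnosis.

```python
import socket

# Heuristic next steps for each outcome (illustrative wording only).
HINTS = {
    "listening": "Port is reachable; the alert may have been transient.",
    "refused": "Host is up but nothing accepts connections on this port; "
               "check the ResourceManager process, its logs, and its "
               "configured port/address.",
    "timeout": "No response in time; check the network path between the "
               "Nagios Server and the ResourceManager host.",
    "unresolved": "Host name could not be resolved; check DNS or /etc/hosts.",
    "unreachable": "Other network error; check routing and the host itself.",
}


def diagnose(host, port, timeout=5.0):
    """Return a short label describing the result of one TCP connect."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except socket.gaierror:
        return "unresolved"
    except OSError:
        return "unreachable"


if __name__ == "__main__":
    # Hypothetical host and port; substitute your ResourceManager address.
    result = diagnose("resourcemanager.example.com", 8050)
    print(f"{result}: {HINTS[result]}")
```

A "refused" result usually points toward the process or its port configuration, while a "timeout" points toward the network, which is where a ping test is most informative.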

