Verify the state of each NameNode, using one of the following methods:
Open the web page for each NameNode in a browser, using the configured URL.
The HA state of the NameNode should appear next to the configured address label. For example: NameNode 'example.com:8020' (standby).
Note: The NameNode state may be "standby" or "active". After bootstrapping, the HA NameNode state is initially "standby".
Query the state of a NameNode using JMX (tag.HAState).
Query the service state, using the following command, where <serviceId> is the ID of the NameNode to check:
hdfs haadmin -getServiceState <serviceId>
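The JMX check can also be scripted. The following is a minimal sketch, assuming the NameNode serves its JMX servlet over HTTP at /jmx and that the Hadoop:service=NameNode,name=FSNamesystem bean carries the tag.HAState attribute. The helper names are hypothetical, and the base URL in the usage note is a placeholder for the configured NameNode web address.

```python
import json
import urllib.request

# Assumed bean name; adjust it if your Hadoop version reports HAState elsewhere.
FS_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"


def extract_ha_state(jmx_json):
    """Return the tag.HAState value from a JMX servlet response, or None."""
    doc = json.loads(jmx_json)
    for bean in doc.get("beans", []):
        if "tag.HAState" in bean:
            return bean["tag.HAState"]
    return None


def fetch_ha_state(base_url, timeout=10):
    """Query <base_url>/jmx for the FSNamesystem bean and return its HA state."""
    url = "{}/jmx?qry={}".format(base_url.rstrip("/"), FS_BEAN)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_ha_state(resp.read().decode("utf-8"))
```

Calling fetch_ha_state("http://example.com:50070") against a running NameNode should return either "active" or "standby"; the port shown here is an assumption and must match the configured HTTP address.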
Verify automatic failover.
Locate the Active NameNode.
Use the NameNode web UI to check the status for each NameNode host machine.
Cause a failure on the Active NameNode host machine.
Turn off automatic restart of the service.
In the Windows Services pane, locate the Apache Hadoop NameNode service, right-click it, and choose Properties.
On the Recovery tab, select Take No Action for First, Second, and Subsequent Failures, then choose Apply.
Simulate a JVM crash.
For example, you can use the following command to simulate a JVM crash:
taskkill.exe /t /f /im namenode.exe
Alternatively, power-cycle the machine or unplug its network interface to simulate an outage.
The Standby NameNode state should become Active within several seconds.
Note: The time required to detect a failure and trigger a failover depends on the configuration of the ha.zookeeper.session-timeout.ms property. The default value is 5 seconds (5000 milliseconds).
Verify that the Standby NameNode state is Active.
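Because failover is not instantaneous, this verification is easier to automate as a poll than as a single check. The following is a minimal sketch, not a definitive implementation: wait_for_active is a hypothetical helper, get_state is any caller-supplied function that returns the HA state string (for example, one that wraps hdfs haadmin -getServiceState or a JMX query), and the 30-second default is an assumption chosen to comfortably exceed the 5-second default session timeout.

```python
import time


def wait_for_active(get_state, timeout_s=30.0, interval_s=1.0, sleep=time.sleep):
    """Poll get_state() until it returns "active" or timeout_s elapses.

    get_state is any zero-argument callable returning the HA state string.
    Returns True if "active" was observed, False on timeout.
    """
    waited = 0.0
    while True:
        if get_state() == "active":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

The sleep parameter exists only so the loop can be exercised without real delays. In normal use, leave it at its default and raise timeout_s if ha.zookeeper.session-timeout.ms has been increased.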
If a standby NameNode does not activate, verify that HA settings are configured correctly.
Check the log files for the zkfc daemons and NameNode daemons to diagnose issues.
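When the logs are long, a quick filter for severe entries can narrow the search. The following is a minimal sketch, assuming log4j-style lines in which the level (ERROR or FATAL) appears as a standalone word. find_problem_lines is a hypothetical helper, and the location of the log files depends on your installation.

```python
import re

# Assumes log4j-style lines that contain the level as a standalone word.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL)\b")


def find_problem_lines(lines):
    """Return (line_number, text) pairs for lines logged at ERROR or FATAL."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if LEVEL_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open a zkfc or NameNode log file and pass the file object to find_problem_lines, then read the surrounding lines in the log for context.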