Preparing the hardware resources for HDFS High Availability
Make sure that you prepare the following hardware resources for High Availability:
- NameNode machines: The machines where you run the Active and Standby NameNodes should have exactly the same hardware.
- JournalNode machines: The machines where you run the JournalNodes. The JournalNode daemon is relatively lightweight, so these daemons can reasonably be co-located on machines with other Hadoop daemons, for example the NameNodes or the YARN ResourceManager.
- ZooKeeper machines: For automated failover functionality, an existing ZooKeeper cluster must be available. The ZooKeeper service nodes can be co-located with other Hadoop daemons.
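As an illustration of how the ZooKeeper requirement typically surfaces in configuration, the following minimal sketch shows the two standard Hadoop properties involved in automated failover. The host names and port are hypothetical placeholders, not values from this guide; substitute the addresses of your own ZooKeeper ensemble.

```xml
<!-- core-site.xml: the ZooKeeper ensemble used for automated failover.
     zk1/zk2/zk3.example.com are hypothetical host names. -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>

<!-- hdfs-site.xml: turn on automated failover for the NameNodes. -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```

These fragments cover only the ZooKeeper-related settings. A working HA deployment needs further configuration that is outside the scope of hardware preparation.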
In an HA cluster, the Standby NameNode also performs checkpoints of the namespace state. Therefore, do not deploy a Secondary NameNode in an HA cluster.