Configuring HDFS High Availability

The HDFS NameNode High Availability (HA) feature enables you to run redundant NameNodes in the same cluster in an Active/Passive configuration with a hot standby. This eliminates the NameNode as a potential single point of failure (SPOF) in an HDFS cluster.

In a standard (non-HA) configuration, each cluster has a single NameNode. If that machine or process becomes unavailable, the entire cluster is unavailable until the NameNode is either restarted or brought up on a separate machine. This reduces the total availability of the HDFS cluster in two major ways:

  • After an unplanned event such as a machine crash, the cluster is unavailable until an operator restarts the NameNode.

  • Planned maintenance events such as software or hardware upgrades on the NameNode machine result in periods of cluster downtime.

HDFS NameNode HA addresses both cases: it allows a fast failover to the standby NameNode after a machine crash, and a graceful administrator-initiated failover during planned maintenance.
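
As a rough illustration of an Active/Passive NameNode pair, the following hdfs-site.xml fragment is a minimal sketch, not a complete HA configuration. The nameservice ID (mycluster), the NameNode IDs (nn1, nn2), and the host names are hypothetical placeholders. A working setup also needs shared edits storage, a fencing method, and client failover settings, none of which are shown here.

```xml
<!-- hdfs-site.xml (fragment). All IDs and host names below are placeholders. -->
<configuration>
  <!-- Logical name for the HA nameservice (assumed: mycluster) -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>

  <!-- The two redundant NameNodes in the nameservice (assumed: nn1, nn2) -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>

  <!-- RPC address of each NameNode (hypothetical hosts) -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>nn-host1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>nn-host2.example.com:8020</value>
  </property>

  <!-- Enable automatic failover to the standby after a crash -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
```

With NameNode IDs like these, a graceful failover for planned maintenance would typically be initiated with a command along the lines of hdfs haadmin -failover nn1 nn2. The exact options available depend on the Hadoop version and on whether automatic failover is enabled.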