Ensure that you complete the following hardware prerequisites:
Shared Storage
Power Fencing device
IP failover with a floating IP
Hardware requirements for the RHEL HA cluster
Shared storage is required for storing the NameNode metadata. Use a highly available shared storage NFS device.
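Before handing the NFS export to the cluster, it is worth confirming manually that it is reachable and writable. The following is a minimal sketch; the server address, export path, and mount point are taken from the sample cluster.conf shown later in this section and must be replaced with your own values. The script only composes and prints the commands so you can review them before running them as root:

```shell
#!/bin/sh
# Example values only -- substitute your NFS server, export, and mount point.
NFS_HOST=10.10.10.88
NFS_EXPORT=/hdp/nfs
MOUNT_POINT=/hdp/hadoop/hdfs/nn

# Build the mount command with the same options the cluster's netfs
# resource will later use (rw,soft,nolock).
nfs_mount_cmd() {
  printf 'mount -t nfs -o rw,soft,nolock %s:%s %s\n' "$1" "$2" "$3"
}

# Print the commands for review; run them manually as root, then write a
# test file and unmount to confirm the export is usable.
echo "mkdir -p ${MOUNT_POINT}"
nfs_mount_cmd "$NFS_HOST" "$NFS_EXPORT" "$MOUNT_POINT"
echo "touch ${MOUNT_POINT}/.nn-write-test && rm ${MOUNT_POINT}/.nn-write-test"
echo "umount ${MOUNT_POINT}"
```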
Ensure that you use a Power fencing device.
Note: Red Hat HA cluster utilizes power fencing to deal with network split-brain events. Fencing guarantees the integrity of NameNode metadata. For more information, see: Fencing Topology.
Ensure that an additional static IP is available for the cluster.
The IP must be a static reserved entry in your network DNS table. This IP will act as the public IP for the NameNode Service.
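One way to confirm the reservation is to check that the DNS entry resolves to the expected static address. A sketch, assuming a hypothetical hostname and the floating IP used in the sample configuration later in this section (substitute the name and IP reserved in your DNS table):

```shell
#!/bin/sh
# Hypothetical values -- use the hostname and static IP reserved for the
# NameNode service in your DNS table.
FLOATING_NAME=nn-service.example.com
EXPECTED_IP=10.10.10.89

# Pull the first address from a getent-style "ADDR NAME" line.
first_addr() { awk '{print $1; exit}'; }

RESOLVED=$(getent hosts "$FLOATING_NAME" | first_addr)
if [ "$RESOLVED" = "$EXPECTED_IP" ]; then
  echo "DNS entry matches the reserved static IP"
else
  echo "DNS mismatch or missing entry (resolved: '${RESOLVED:-none}')" >&2
fi
```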
Note: Red Hat HA clustering utilizes a floating IP address for the NameNode service across the HA cluster. More details on using a floating IP for RHEL are available here.
The RHEL HA cluster must have a minimum of two nodes.
The number of nodes in your HA cluster depends on the number of concurrent node failures you want the HDP platform to withstand. The RHEL HA cluster can be configured to include a maximum of 16 nodes. Choose hardware specs for the RHEL HA Cluster nodes according to the NameNode hardware recommendations available here.
Ensure that you complete the following software prerequisites:
Step 1: Complete the prerequisites for the High Availability Add-On package for RHEL.
Use the instructions available here (RHEL v5.x, RHEL v6.x).
Step 2: Install the HA Add-On package for RHEL.
Use the instructions available here (RHEL v5.x, RHEL v6.x).
Important: You can use the graphical user interface (GUI) to configure a RHEL v6.x cluster configuration until you specify a Hadoop service configuration (Deploy HDP HA Configurations). You must use the
Step 3: Ensure that the following cluster configurations are available on all the machines in your RHEL HA cluster:
Cluster domain that specifies all the nodes in the RHEL HA cluster. See instructions here (RHEL v5.x, RHEL v6.x).
Failover domain. See instructions here (RHEL v5.x, RHEL v6.x).
Power Fencing device. See instructions here (RHEL v5.x, RHEL v6.x).
Add the cluster service and resources (floating IP and NFS mount). Ensure that you add the <service domain> configurations and add resources to the service group. See instructions here (RHEL v5.x, RHEL v6.x).

When the above are configured, you will have a cluster.conf file similar to the following sample configuration. (Note that this sample configuration does not declare a true fencing device because that is specific to the environment. Modify the configuration values to match your infrastructure environment.)

```xml
<?xml version="1.0"?>
<cluster config_version="8" name="rhel6ha">
  <clusternodes>
    <clusternode name="rhel6ha01" nodeid="1">
      <fence>
        <method name="1">
          <device name="BinTrue"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rhel6ha02" nodeid="2">
      <fence>
        <method name="1">
          <device name="BinTrue"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_bin_true" name="BinTrue"/>
  </fencedevices>
  <rm log_level="7">
    <failoverdomains>
      <failoverdomain name="HANameNode" ordered="1" restricted="1">
        <failoverdomainnode name="rhel6ha01" priority="1"/>
        <failoverdomainnode name="rhel6ha02" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <service domain="HANameNode" name="NameNodeService" recovery="relocate">
      <ip address="10.10.10.89" sleeptime="10"/>
      <netfs export="/hdp/nfs" force_unmount="1" fstype="nfs" host="10.10.10.88"
             mountpoint="/hdp/hadoop/hdfs/nn" name="HDFS data" options="rw,soft,nolock"/>
    </service>
  </rm>
</cluster>
```
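Before starting the cluster services, it helps to sanity-check the file. On RHEL v6.x, the ccs_config_validate utility validates cluster.conf against the cluster schema; the grep loop below is only a rough supplementary sketch that checks the elements configured in the steps above are present:

```shell
#!/bin/sh
# On RHEL 6, validate cluster.conf against the cluster schema first:
#   ccs_config_validate
# check_conf is only a rough supplementary check that the elements
# configured in this section appear in the file.
check_conf() {
  conf=$1
  for element in clusternode fencedevice failoverdomain service; do
    if grep -q "<${element}[ >]" "$conf"; then
      echo "ok: found <${element}>"
    else
      echo "missing: <${element}>" >&2
    fi
  done
}

if [ -f "${1:-/etc/cluster/cluster.conf}" ]; then
  check_conf "${1:-/etc/cluster/cluster.conf}"
fi
```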
Use the following instructions to validate the configuration of the RHEL HA cluster.
Step 1: Validate that the floating IP address is available on the primary machine. (The primary machine is the machine where the NameNode process is currently running.)

```
ip addr show eth1
```

If the IP address is available, you should see output similar to the following example. In this example, the IP address is configured on rheln1.hortonworks.local:

```
[root@rheln1 ~]# ip addr show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:cb:ca:76 brd ff:ff:ff:ff:ff:ff
    inet 172.16.204.10/24 brd 172.16.204.255 scope global eth1
    inet 172.16.204.12/24 scope global secondary eth1
    inet6 fe80::20c:29ff:fecb:ca76/64 scope link
       valid_lft forever preferred_lft forever
```
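The check in Step 1 can also be scripted. A minimal sketch, assuming the interface and secondary (floating) address shown in the example output above; both are environment-specific:

```shell
#!/bin/sh
# Succeeds when the given address appears as an inet entry in the
# supplied `ip addr` output (e.g. the secondary address on eth1).
has_floating_ip() {
  echo "$1" | grep -q "inet $2/"
}

# Example: look for the floating IP from the sample output above.
FLOATING_IP=172.16.204.12
if has_floating_ip "$(ip addr show eth1 2>/dev/null)" "$FLOATING_IP"; then
  echo "floating IP ${FLOATING_IP} is configured"
else
  echo "floating IP ${FLOATING_IP} not found"
fi
```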
Step 2: Validate that the NameNode service starts on the secondary machine.

```
ip addr show eth3
```
Step 3: Validate failover for the IP address.
Shut down alternate host machines.
Ensure that the IP address fails over properly.
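Besides shutting machines down, rgmanager can relocate the service deliberately, which exercises the same failover path in a controlled way. A sketch using the service and node names from the sample cluster.conf in this section (substitute your own); the script prints the commands for review rather than executing them:

```shell
#!/bin/sh
# Service and node names come from the sample cluster.conf in this
# section; replace them with your own. Commands are printed for review --
# run them manually as root on a cluster node.
SERVICE=NameNodeService
TARGET_NODE=rhel6ha02

failover_cmds() {
  printf 'clusvcadm -r %s -m %s\n' "$1" "$2"  # relocate the service
  printf 'clustat\n'                          # confirm where it now runs
  printf 'ip addr show eth1\n'                # confirm the floating IP moved
}

failover_cmds "$SERVICE" "$TARGET_NODE"
```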