Follow the tasks listed below to configure NameNode HA:
Stop all the HDP services that are currently running on your cluster using the instructions provided here.
Stop the RHEL cluster NameNodeService service using the RHEL Cluster administration tools (RHEL v5.x, RHEL v6.x).
Install the NameNode monitoring component on all the nodes in your RHEL HA cluster.
Ensure that you have set up the HDP repository on the RHEL HA cluster nodes as part of the HDP installation using the instructions provided here.
Install the RPMs.
For RHEL/CentOS 5
yum install hmonitor*.rpm
yum install hmonitor-resource-agent*.rpm
For RHEL/CentOS 6
yum install hmonitor*.rpm
yum install hmonitor-resource-agent*.rpm
Edit the /etc/cluster/cluster.conf file to add the service domain specifications. You can use the following sample configuration. (Note that this sample configuration is for a small cluster and the timeouts for booting, probing, and stopping have been reduced to a minimum.)
<service domain="HANameNode" name="NameNodeService" recovery="restart">
  <ip address="10.10.10.89" sleeptime="10"/>
  <netfs export="/hdp/hadoop-nfs" force_unmount="1" fstype="nfs" host="10.10.10.88" mountpoint="/hdp/hadoop/hdfs/nn" name="HDFS data" options="rw,soft,nolock"/>
  <hadoop __independent_subtree="1" __max_restarts="10" __restart_expire_time="600" name="NameNode Process" daemon="namenode" boottime="10000" probetime="10000" stoptime="10000" url="http://10.0.0.30:50070/dfshealth.jsp" pid="/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid" path="/"/>
</service>
The following table explains the parameters used in the above configuration:
Name | Description | Mandatory/Optional |
daemon | Name of the hadoop service which will be started by | Mandatory |
url | URL to check for the service web page. This should be on the floating IP address. | Mandatory |
pid | Process ID of the NameNode process. (Default: “”) | Optional |
path | Path under DFS to probe. (Default: /) | Optional |
boottime | Time (in milliseconds) to allow for the service to boot up. This must include any activities that take place before the service web pages and IPC services are reachable. For the NameNode, it must include the time for the edit log to be replayed. (Default: 180000) | Optional |
probetime | Time (in milliseconds) to allow for a process to respond to liveness probes. This duration must be longer than the maximum expected GC pause. (Default: 120000) | Optional |
stoptime | Time (in milliseconds) to allow for a clean shutdown before forcibly killing a process. (Default: 60000) | Optional |
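Before restarting the cluster services, it can help to confirm that the hadoop element in your service definition carries the mandatory attributes and to see which defaults would apply to the optional ones. The following Python sketch is only an illustration and is not part of the HDP or RHEL tooling. It assumes the defaults listed in the table above; the helper name resolve_hadoop_settings and the shortened SAMPLE fragment are hypothetical, and how the actual resource agent applies its defaults is not shown here.

```python
import xml.etree.ElementTree as ET

# Defaults copied from the parameter table (times are in milliseconds).
OPTIONAL_DEFAULTS = {
    "pid": "",
    "path": "/",
    "boottime": "180000",
    "probetime": "120000",
    "stoptime": "60000",
}
MANDATORY = ("daemon", "url")
TIME_ATTRS = ("boottime", "probetime", "stoptime")


def resolve_hadoop_settings(service_xml):
    """Return the effective settings of the first <hadoop> element.

    Hypothetical helper: raises ValueError if the element or a mandatory
    attribute is missing, or if a timeout is not an integer.
    """
    root = ET.fromstring(service_xml)
    hadoop = root if root.tag == "hadoop" else root.find(".//hadoop")
    if hadoop is None:
        raise ValueError("no <hadoop> element found")

    missing = [name for name in MANDATORY if not hadoop.get(name)]
    if missing:
        raise ValueError("missing mandatory attribute(s): " + ", ".join(missing))

    settings = {name: hadoop.get(name) for name in MANDATORY}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = hadoop.get(name, default)
    for name in TIME_ATTRS:
        settings[name] = int(settings[name])  # ValueError if not an integer
    return settings


# Shortened, illustrative fragment: only probetime is overridden here.
SAMPLE = """
<service domain="HANameNode" name="NameNodeService" recovery="restart">
  <hadoop name="NameNode Process" daemon="namenode"
          url="http://10.0.0.30:50070/dfshealth.jsp" probetime="10000"/>
</service>
"""

if __name__ == "__main__":
    for key, value in sorted(resolve_hadoop_settings(SAMPLE).items()):
        print(key, "=", repr(value))
```

To check a real configuration, you could pass the service element from your own /etc/cluster/cluster.conf to the helper instead of SAMPLE. A ValueError indicates that daemon or url is missing, or that a timeout value is not an integer.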