Follow the tasks listed below to configure JobTracker HA:
Stop all the HDP services that are currently running on your cluster using the instructions provided here.
Stop the RHEL cluster JobTrackerService service using the RHEL Cluster administration tools (RHEL v5.x, RHEL v6.x).
Install the JobTracker monitoring component on all the nodes in your RHEL HA cluster.
Ensure that you have set up the HDP repository on the RHEL HA cluster nodes as part of the HDP installation.
Use the instructions provided here.
Install the RPMs.
yum install hmonitor*.rpm
yum install hmonitor-resource-agent*.rpm
Edit the /etc/cluster/cluster.conf file to add the service domain specifications. You can use the following sample configuration. (Note that this sample configuration is for a small cluster, and the timeouts for booting, probing, and stopping have been reduced to a minimum.)
<service domain="HAJobTracker" name="JobTrackerService" recovery="restart">
    <ip address="10.0.0.30" sleeptime="10"/>
    <hadoop __independent_subtree="1" __max_restarts="20" __restart_expire_time="600"
            name="JobTracker Process" daemon="jobtracker"
            boottime="60000" probetime="20000" stoptime="30000"
            url="http://10.0.0.30:50030/" waitfs="true" />
    <hadoop __independent_subtree="1" __max_restarts="20" __restart_expire_time="600"
            name="HistoryServer Process" daemon="historyserver"
            boottime="60000" probetime="20000" stoptime="30000"
            url="http://10.0.0.30:51111/" waitfs="true" />
</service>
The following table explains the parameters used in the above configuration:
Name | Description | Mandatory/Optional |
daemon | Name of the hadoop service which will be started by | Mandatory |
url | URL to check for the service web page. This should be on the floating IP address. | Mandatory |
pid | Process ID of the Master Service process. (Default: “”) | Optional |
boottime | Time (in milliseconds) to allow for the service to boot up. This must include any activities that take place before the service web pages and IPC services are reachable. (Default: 180000) | Optional |
probetime | Time (in milliseconds) to allow for a process to respond to liveness probes. This duration must be longer than the maximum expected GC (Garbage Collection) pause. (Default: 120000) | Optional |
stoptime | Time (in milliseconds) to allow for a clean shutdown before forcibly killing a process. (Default: 60000) | Optional |
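Before you load an edited service block into the cluster, it can help to check it against the table above. The following Python sketch is illustrative only and is not part of HDP or the RHEL cluster tools. It parses a service fragment, reports any hadoop resource that lacks the mandatory daemon or url attribute, and fills in the documented defaults for timeouts that are omitted. The function name check_service and the SAMPLE fragment are hypothetical. SAMPLE is a shortened variant of the sample configuration in which the second resource deliberately omits its timeouts so that the defaults become visible.

```python
import xml.etree.ElementTree as ET

# Defaults (in milliseconds) as listed in the parameter table.
DEFAULTS_MS = {"boottime": 180000, "probetime": 120000, "stoptime": 60000}
# Attributes the table marks as Mandatory.
MANDATORY = ("daemon", "url")

# Hypothetical fragment, adapted from the sample configuration.
# The second <hadoop> element omits its timeouts on purpose.
SAMPLE = """
<service domain="HAJobTracker" name="JobTrackerService" recovery="restart">
  <ip address="10.0.0.30" sleeptime="10"/>
  <hadoop name="JobTracker Process" daemon="jobtracker"
          boottime="60000" probetime="20000" stoptime="30000"
          url="http://10.0.0.30:50030/" waitfs="true"/>
  <hadoop name="HistoryServer Process" daemon="historyserver"
          url="http://10.0.0.30:51111/" waitfs="true"/>
</service>
"""


def check_service(xml_text):
    """Return (settings, problems) for every <hadoop> element in a service block."""
    root = ET.fromstring(xml_text)
    settings, problems = {}, []
    for res in root.iter("hadoop"):
        name = res.get("name", "<unnamed>")
        for attr in MANDATORY:
            if not res.get(attr):
                problems.append(f"{name}: missing mandatory attribute '{attr}'")
        # Use the documented default when a timeout attribute is absent.
        settings[name] = {
            key: int(res.get(key, default)) for key, default in DEFAULTS_MS.items()
        }
    return settings, problems


if __name__ == "__main__":
    settings, problems = check_service(SAMPLE)
    for name, timeouts in settings.items():
        print(name, timeouts)
    for problem in problems:
        print("PROBLEM:", problem)
```

The sketch only checks attribute presence and effective timeout values. It does not validate the file against the cluster schema, and it cannot tell whether probetime really exceeds the longest GC pause on your JobTracker, which still has to be judged from your own measurements.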