Use hostname -f to identify the FQDN for all the host machines.
Note: If you are deploying on Amazon EC2, use the Internal FQDN.
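As a quick sanity check on the value that hostname -f returns, a minimal sketch is shown here. The helper name is_fqdn is hypothetical (it is not part of gsInstaller), and the test is only a heuristic: it assumes that a fully qualified name contains at least one dot.

```shell
#!/bin/sh
# Hypothetical helper, not part of gsInstaller.
# Heuristic only: treats a name as fully qualified if it contains a dot.
is_fqdn() {
    case $1 in
        *.*) return 0 ;;   # e.g. host1.example.com
        *)   return 1 ;;   # e.g. a short name such as host1
    esac
}
```

On each host, something like `is_fqdn "$(hostname -f)" || echo "hostname -f did not return a fully qualified name"` would flag machines whose name resolution still needs attention.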
On the master-install-location, change directory to master-install-location/gsInstaller.
Create the following flat text files:
Note: The mandatory files are required for a minimal install (Apache Hadoop core components). Each optional file is needed only if you want to install the corresponding component (for example, HBase, Hive, or WebHCat) in your cluster.
Mandatory files: gateway, namenode, snamenode, jobtracker, nodes
Note: The nodes file is used to define the DataNodes and TaskTrackers.
Optional files: hbasemaster, hivemetastore, webhcatnode, nagiosserver, gangliaserver, oozieserver, hbasenodes, zknodes
Note: The hbasenodes file is used to define the RegionServers for your HBase cluster.
Provide the FQDN of your host machines in each of these text files:
Option I (single node installations): Provide the FQDN of the same host machine for all of the text files.
Option II (multi node installations):
For the following files, provide the FQDN of EXACTLY one host machine: gateway, namenode, snamenode, jobtracker, hbasemaster, hivemetastore, oozieserver, webhcatnode, nagiosserver, gangliaserver.
For the following files, provide the FQDNs (separated by a new-line character) of a MINIMUM of three host machines: nodes, hbasenodes.
For the zknodes file, provide the FQDN of a MINIMUM of one host machine.
Note: Multiple host machines must follow the Zookeeper ensemble rule.
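The host-count rules above can be sketched as a small shell helper. This is an illustration under stated assumptions, not part of gsInstaller: the function names (count_hosts, write_single_node, check_multi_node) are hypothetical, each file is assumed to hold one FQDN per line, and the Zookeeper ensemble rule is not checked.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not shipped with gsInstaller.

# Files that must name exactly one host in a multi node installation.
SINGLE_HOST_FILES="gateway namenode snamenode jobtracker hbasemaster hivemetastore oozieserver webhcatnode nagiosserver gangliaserver"
# Files required for a minimal install.
MANDATORY_FILES="gateway namenode snamenode jobtracker nodes"

# count_hosts FILE: print the number of non-blank lines in FILE.
count_hosts() {
    grep -c '[^[:space:]]' "$1" || true
}

# write_single_node DIR FQDN: Option I, same FQDN in every host file.
write_single_node() {
    dir=$1; fqdn=$2
    for f in $SINGLE_HOST_FILES nodes hbasenodes zknodes; do
        printf '%s\n' "$fqdn" > "$dir/$f"
    done
}

# check_multi_node DIR: Option II, report files that break the count rules.
# Optional files are skipped when absent; returns non-zero on any problem.
check_multi_node() {
    dir=$1; status=0
    for f in $MANDATORY_FILES; do
        [ -f "$dir/$f" ] || { echo "$f: mandatory file is missing"; status=1; }
    done
    for f in $SINGLE_HOST_FILES; do
        [ -f "$dir/$f" ] || continue
        n=$(count_hosts "$dir/$f")
        [ "$n" -eq 1 ] || { echo "$f: expected exactly 1 host, found $n"; status=1; }
    done
    for f in nodes hbasenodes; do
        [ -f "$dir/$f" ] || continue
        n=$(count_hosts "$dir/$f")
        [ "$n" -ge 3 ] || { echo "$f: expected at least 3 hosts, found $n"; status=1; }
    done
    if [ -f "$dir/zknodes" ]; then
        n=$(count_hosts "$dir/zknodes")
        [ "$n" -ge 1 ] || { echo "zknodes: expected at least 1 host, found $n"; status=1; }
    fi
    return $status
}
```

For a single node installation, running `write_single_node . "$(hostname -f)"` from the gsInstaller directory would populate every file, including the optional ones, so delete the files for components you do not plan to install. For a multi node installation, `check_multi_node .` prints one line per file that breaks a rule and returns a non-zero status if any do.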