Use the following instructions to manually add a slave node:
On each new slave node, configure the remote repository as described in Installing Zookeeper in Installing HDP Manually.
On each new slave node, install HDFS.
On each new slave node, install compression libraries.
On each new slave node, create the DataNode and YARN NodeManager local directories.
Copy the Hadoop configurations to the new slave nodes and set appropriate permissions.
Option I: Copy Hadoop config files from an existing slave node.
On an existing slave node, make a copy of the current configurations:
tar zcvf hadoop_conf.tgz /etc/hadoop/conf
Copy this file to each of the new nodes, then run the following commands on each new node to replace the existing configuration:
rm -rf /etc/hadoop/conf
cd /
tar zxvf $location_of_copied_conf_tar_file/hadoop_conf.tgz
chmod -R 755 /etc/hadoop/conf
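The copy step in Option I can be scripted when several nodes are added at once. The following is a minimal sketch, assuming passwordless SSH as root to the new nodes and that /tmp is an acceptable staging directory. The function name push_conf, the DRY_RUN variable, and the host names are illustrative and not part of HDP.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the configuration archive to each new node.
# Assumes passwordless SSH as root; set DRY_RUN=1 to print the commands
# instead of running them.
push_conf() {
    local tarball="$1"; shift
    local host
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scp $tarball root@$host:/tmp/"
        else
            # Stop at the first node that cannot be reached.
            scp "$tarball" "root@$host:/tmp/" || return 1
        fi
    done
}

# Example (host names are placeholders):
# DRY_RUN=1 push_conf hadoop_conf.tgz newnode1.example.com newnode2.example.com
```

With this staging directory, $location_of_copied_conf_tar_file in the extraction commands would be /tmp.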
Option II: Manually set up the Hadoop configuration.
On each of the new slave nodes, start the DataNode:
su -l hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode"
On each of the new slave nodes, start the NodeManager:
su -l yarn -c "/usr/hdp/current/hadoop-yarn-nodemanager/sbin/yarn-daemon.sh start nodemanager"
Optional - If you use an HDFS or YARN/ResourceManager .include file in your cluster, add the new slave nodes to the .include file, then run the applicable refreshNodes command.
To add new DataNodes to the dfs.include file:
On the NameNode host machine, edit the /etc/hadoop/conf/dfs.include file and add the list of the new slave node host names (separated by newline character).
Note: If no dfs.include file is specified, all DataNodes are considered to be included in the cluster (unless excluded in the dfs.exclude file). The dfs.hosts and dfs.hosts.exclude properties in hdfs-site.xml are used to specify the dfs.include and dfs.exclude files.
On the NameNode host machine, execute the following command:
su -l hdfs -c "hdfs dfsadmin -refreshNodes"
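Editing the include file by hand is easy to get wrong when many hosts are added. The following is a minimal sketch of an idempotent append, assuming the include file holds one host name per line. The function name add_hosts and the host names are hypothetical and not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append host names to an include file, one per line,
# skipping any host that is already listed.
add_hosts() {
    local include_file="$1"; shift
    local host
    touch "$include_file" || return 1
    # If the file is non-empty and lacks a trailing newline, add one so the
    # first appended host does not run into the last existing entry.
    if [ -s "$include_file" ] && [ -n "$(tail -c1 "$include_file")" ]; then
        echo >> "$include_file"
    fi
    for host in "$@"; do
        # -x matches the whole line, -F treats the host name as a fixed string.
        if ! grep -qxF -- "$host" "$include_file"; then
            printf '%s\n' "$host" >> "$include_file"
        fi
    done
}

# Example (run on the NameNode host; host names are placeholders):
# add_hosts /etc/hadoop/conf/dfs.include newnode1.example.com newnode2.example.com
# su -l hdfs -c "hdfs dfsadmin -refreshNodes"
```

Because the helper only takes a file path and host names, the same function could be pointed at /etc/hadoop/conf/yarn.include on the ResourceManager host.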
To add new NodeManagers to the yarn.include file:
On the ResourceManager host machine, edit the /etc/hadoop/conf/yarn.include file and add the list of the slave node host names (separated by newline character).
Note: If no yarn.include file is specified, all NodeManagers are considered to be included in the cluster (unless excluded in the yarn.exclude file). The yarn.resourcemanager.nodes.include-path and yarn.resourcemanager.nodes.exclude-path properties in yarn-site.xml are used to specify the yarn.include and yarn.exclude files.
On the ResourceManager host machine, execute the following command:
su -l yarn -c "yarn rmadmin -refreshNodes"
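After refreshing, it can be useful to confirm that the new nodes actually registered. The following is a minimal sketch, assuming the node listing printed by yarn node -list contains each node's host name. The function name missing_hosts and the host names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a node listing on stdin and print every expected
# host name that does not appear in it. Returns non-zero if any are missing.
# Matching is a plain substring search, so "node1" also matches "node10";
# pass fully qualified host names to avoid false positives.
missing_hosts() {
    local listing host rc=0
    listing="$(cat)"
    for host in "$@"; do
        if ! printf '%s\n' "$listing" | grep -qF -- "$host"; then
            echo "$host"
            rc=1
        fi
    done
    return "$rc"
}

# Example (run on the ResourceManager host; host names are placeholders):
# su -l yarn -c "yarn node -list" | missing_hosts newnode1.example.com newnode2.example.com
```

The same check could be applied to the output of hdfs dfsadmin -report for DataNodes, under the same assumption about host names appearing in the report.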