Use the following instructions to decommission DataNodes in your cluster:
On the NameNode host machine, edit the <HADOOP_CONF_DIR>/dfs.exclude file and add the list of DataNode hostnames (separated by a newline character),
where <HADOOP_CONF_DIR> is the directory for storing the Hadoop configuration files. For example, /etc/hadoop/conf.
Update the NameNode with the new set of excluded DataNodes. On the NameNode host machine, execute the following command:
su <HDFS_USER>
hdfs dfsadmin -refreshNodes
where <HDFS_USER> is the user owning the HDFS services. For example, hdfs.
Open the NameNode web UI (http://<NameNode_FQDN>:50070) and navigate to the DataNodes page. Check to see whether the state has changed to Decommission In Progress for the DataNodes being decommissioned.
When all the DataNodes report their state as Decommissioned (on the DataNodes page, or on the Decommissioned Nodes page at http://<NameNode_FQDN>:8088/cluster/nodes/decommissioned), all of the blocks have been replicated. You can then shut down the decommissioned nodes.
If your cluster utilizes a dfs.include file, remove the decommissioned nodes from the <HADOOP_CONF_DIR>/dfs.include file on the NameNode host machine, then execute the following command:
su <HDFS_USER>
hdfs dfsadmin -refreshNodes
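The file edits and the status check in the steps above can also be scripted. The following is a minimal sketch, not part of the original procedure. The function names (add_to_exclude, remove_from_include, decommission_status) and the example hostname are hypothetical. The status parser assumes that hdfs dfsadmin -report prints a "Hostname:" line followed by a "Decommission Status :" line for each DataNode, as Hadoop 2.x releases do; verify this against the output of your version before relying on it.

```shell
#!/bin/sh
# Hypothetical helpers; adjust file paths and hostnames for your cluster.

# Append a hostname to the exclude file unless it is already listed.
add_to_exclude() {
    exclude_file=$1
    host=$2
    # -x: whole-line match, -F: fixed string, -q: no output
    if ! grep -qxF "$host" "$exclude_file" 2>/dev/null; then
        # Make sure the existing content ends with a newline before appending.
        if [ -s "$exclude_file" ] && [ -n "$(tail -c 1 "$exclude_file")" ]; then
            printf '\n' >> "$exclude_file"
        fi
        printf '%s\n' "$host" >> "$exclude_file"
    fi
}

# Remove a hostname (exact line match) from the include file.
remove_from_include() {
    include_file=$1
    host=$2
    tmp=$(mktemp) || return 1
    grep -vxF "$host" "$include_file" > "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$include_file"
    rm -f "$tmp"
}

# Read `hdfs dfsadmin -report` output on stdin and print the
# decommission status of the given host (assumed report format).
decommission_status() {
    awk -v host="$1" '
        $1 == "Hostname:" { current = $2 }
        /^Decommission Status/ && current == host {
            sub(/^Decommission Status *: */, "")
            print
        }
    '
}

# Example flow (run as the HDFS service user on the NameNode host):
#   add_to_exclude /etc/hadoop/conf/dfs.exclude dn1.example.com
#   hdfs dfsadmin -refreshNodes
#   hdfs dfsadmin -report | decommission_status dn1.example.com
```

The helpers only edit files and parse text. The hdfs dfsadmin -refreshNodes call is left as an explicit step so that the NameNode is never refreshed as a side effect of editing a file.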
Note: If no dfs.include file is specified, all DataNodes are considered to be included in the cluster (unless excluded in the dfs.exclude file). The dfs.hosts and dfs.hosts.exclude properties in hdfs-site.xml are used to specify the dfs.include and dfs.exclude files.
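To confirm which files the cluster configuration actually points at, the two properties can be read back from the command line. This is a small sketch that assumes the hdfs getconf -confKey subcommand is available in your Hadoop release. The function name show_host_files is hypothetical.

```shell
#!/bin/sh
# Print the paths configured for the include and exclude files.
# An empty value suggests the property is unset in hdfs-site.xml.
show_host_files() {
    for key in dfs.hosts dfs.hosts.exclude; do
        value=$(hdfs getconf -confKey "$key" 2>/dev/null)
        printf '%s=%s\n' "$key" "$value"
    done
}

# Example (run on the NameNode host):
#   show_host_files
```

If dfs.hosts comes back empty, no include file is in effect and every DataNode that is not listed in the exclude file is treated as part of the cluster.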