Decommission DataNodes
Edit the configuration files and execute commands on the NameNode host machine.
- On the NameNode host machine, edit the <HADOOP_CONF_DIR>/dfs.exclude file and add the list of DataNode hostnames, separated by newline characters, where <HADOOP_CONF_DIR> is the directory for storing the Hadoop configuration files (for example, /etc/hadoop/conf). An end-to-end example follows this procedure.
- Update the NameNode with the new set of excluded DataNodes. On the NameNode host machine, execute the following commands:

      su <HDFS_USER>
      hdfs dfsadmin -refreshNodes

  where <HDFS_USER> is the user owning the HDFS services (for example, hdfs).
- Open the NameNode web UI (http://<NameNode_FQDN>:50070) and navigate to the DataNodes page. Check whether the state has changed to Decommission In Progress for the DataNodes being decommissioned.
- When all the DataNodes report their state as Decommissioned (on the DataNodes page, or on the Decommissioned Nodes page at http://<NameNode_FQDN>:8088/cluster/nodes/decommissioned), all of the blocks have been replicated. You can then shut down the decommissioned nodes.
- If your cluster utilizes a dfs.include file, remove the decommissioned nodes from the <HADOOP_CONF_DIR>/dfs.include file on the NameNode host machine, then execute the following commands:

      su <HDFS_USER>
      hdfs dfsadmin -refreshNodes
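As a concrete illustration of the steps above, here is a minimal shell sketch. The hostnames, the /etc/hadoop/conf path, and the hdfs service user are assumptions for illustration only; substitute the values for your cluster.

    # Assumed values: two DataNodes to retire, <HADOOP_CONF_DIR> at
    # /etc/hadoop/conf, and "hdfs" as the HDFS service user.

    # Step 1: list the DataNodes to decommission, one hostname per line.
    printf 'datanode3.example.com\ndatanode4.example.com\n' \
        > /etc/hadoop/conf/dfs.exclude

    # Step 2: as the HDFS service user, make the NameNode re-read the
    # include/exclude files.
    su - hdfs -c "hdfs dfsadmin -refreshNodes"

    # Steps 3-4: check progress without the web UI; the report shows a
    # "Decommission Status" line for each DataNode.
    su - hdfs -c "hdfs dfsadmin -report"

Until every excluded node reports Decommissioned, HDFS is still re-replicating its blocks, so do not shut the nodes down early.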
Note: If no dfs.include file is specified, all DataNodes are considered to be included in the cluster (unless excluded in the dfs.exclude file). The dfs.hosts and dfs.hosts.exclude properties in hdfs-site.xml are used to specify the dfs.include and dfs.exclude files, as in the sample below.
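For reference, here is a hedged hdfs-site.xml fragment wiring up these two properties. The file paths are illustrative assumptions and must match the actual locations of your include/exclude files on the NameNode host.

    <!-- Illustrative paths; dfs.hosts and dfs.hosts.exclude are the
         standard HDFS property names for the include/exclude files. -->
    <property>
      <name>dfs.hosts</name>
      <value>/etc/hadoop/conf/dfs.include</value>
    </property>
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/dfs.exclude</value>
    </property>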