When a DataNode is decommissioned, the NameNode ensures that every block from the
DataNode will still be available across the cluster as dictated by the replication factor. This
procedure involves copying blocks from the DataNode in small batches. If a DataNode has
thousands of blocks, decommissioning can take several hours. Before decommissioning hosts with
DataNodes, you should first tune HDFS:
Minimum Required Role:
Configurator (also provided by
Cluster Administrator,
Limited Cluster Administrator , and
Full Administrator)
-
Run the following command to identify any problems in the HDFS file system:
hdfs fsck / -list-corruptfileblocks -openforwrite -files -blocks -locations 2>&1 > /tmp/hdfs-fsck.txt
- Fix any issues reported by the
fsck
command. If the command output
lists corrupted files, use the fsck
command to move them to the
lost+found
directory or delete them:
hdfs fsck file_name -move
or
hdfs fsck file_name -delete
- Raise the heap size of the DataNodes. DataNodes should be configured with at least
4 GB heap size to allow for the increase in iterations and max streams.
- Go to the HDFS service page.
- Click the Configuration tab.
- Select
.
- Select .
- Set the Java Heap Size of DataNode in Bytes property
as recommended.
To apply this configuration property to other role groups as
needed, edit the value for the appropriate role group.
- Increase the replication work multiplier per iteration to a larger number (the
default is 2, however 10 is recommended).
- Select
.
- Expand the
category.
- Configure the Replication Work Multiplier Per
Iteration property to a value such as 10.
To apply this configuration property to other role groups as
needed, edit the value for the appropriate role group.
- Increase the replication maximum threads and maximum replication thread hard
limits.
- Select
.
- Expand the
category.
- Configure the Maximum number of replication threads on a
DataNode and Hard limit on the number of replication threads
on a DataNode properties to 50 and 100 respectively. You can decrease
the number of threads (or use the default values) to minimize the impact of
decommissioning on the cluster, but the trade off is that decommissioning will take
longer.
To apply this configuration property to other role groups as
needed, edit the value for the appropriate role group.
- Restart the HDFS service.