Recommended configurations for the Balancer
The HDFS Balancer can run in either Background or Fast modes. Depending on the mode in which you want the Balancer to run, you can set various properties to recommended values.
Background and fast modes
-
HDFS Balancer runs as a background process. The cluster serves other jobs and applications at the same time.
- Fast Mode
-
HDFS Balancer runs at maximum (fast) speed.
Property | Default | Background Mode | Fast Mode |
---|---|---|---|
dfs.datanode.balance.max.concurrent.moves |
50 | 4 x (# of disks) | 4 x (# of disks) |
dfs.datanode.balance.bandwidthPerSec | 10485760 (10 MB) | use default | 10737418240 (10 GB) |
Property | Default | Background Mode | Fast Mode |
---|---|---|---|
dfs.datanode.balance.max.concurrent.moves | 50 | # of disks | 4 x (# of disks) |
dfs.balancer.moverThreads | 1000 | use default | 20,000 |
dfs.balancer.max-size-to-move | 10737418240 (10 GB) | 1073741824 (1GB) | 107374182400 (100 GB) |
dfs.balancer.getBlocks.min-block-size | 10485760 (10 MB) | use default | 104857600 (100 MB) |
Troubleshooting tip
When the block report length exceeds the configured limit, you might receive the
following exception in the NameNode
logs:
java.io.IOException: Requested data length 141557760 is longer than maximum configured RPC length 134217728
- Causes
- When there are too many blocks in one or more Datanodes, for example in the range of several millions or 10M+, it leads to a large size of block report from those particular Datanodes, and the block report length exceeds the default value of 128M. In that case the NameNode only receives a part of block report instead of the entire bock report as the block report gets truncated at the default configured length of 128M. Since the HDFS Balancer receives block information from NameNode, with wrong information of block number the balancer might not move the blocks from the Datanode with the above issue.
- Solutions
- Increase the value of
ipc.maximum.data.length
to 256M or above the reported data length from the above exception, so that the NameNode can receive a large size of block report. The balancer would then be able to progress further.