Parameters to configure the Disk Balancer
To plan and execute the Disk Balancer effectively, you must configure the following parameters through the HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml.
Parameter | Description |
---|---|
dfs.disk.balancer.enabled | Controls whether the Disk Balancer is enabled for the cluster. The Disk Balancer executes only if this parameter is set to True. The default value is False. |
dfs.disk.balancer.max.disk.throughputInMBperSec | The maximum disk bandwidth that Disk Balancer consumes while transferring data between disks. The default value is 10 MB/s. |
dfs.disk.balancer.max.disk.errors | The maximum number of errors to ignore for a move operation between two disks before abandoning the move. The default value is 5. For example, if a plan specifies data moves between three pairs of disks, and the move between the first pair encounters more than five errors, that move is abandoned and the move between the second pair of disks starts. |
dfs.disk.balancer.block.tolerance.percent | Specifies, as a percentage, how close a move operation must come to its target before it is considered successful and stops moving further data. For example, setting this value to 20% for an operation that requires 10 GB of data movement means that the operation is considered successful once 8 GB of data has been moved. |
dfs.disk.balancer.plan.threshold.percent | The ideal storage value for a set of disks in a DataNode is the amount of data each disk should hold to achieve perfect data distribution across those disks. The threshold percentage defines the deviation from this value at which a disk starts participating in data redistribution or balancing operations. Minor imbalances are ignored because normal operations automatically correct some of them. The default threshold percentage is 10%, meaning that a disk is used in balancing operations only if it contains 10% more or less data than the ideal storage value. |
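
As an illustration, the parameters in the table could be entered in the safety valve as hdfs-site.xml properties along the following lines. This is a minimal sketch, not a recommended configuration: the throughput, error, and plan threshold values are the defaults stated in the table, `dfs.disk.balancer.enabled` is switched to `true` so that the Disk Balancer can run, and the tolerance value of 20 is taken from the example in the table rather than from a documented default.

```xml
<!-- Sketch only: values are the defaults listed in the table above,
     except where a comment says otherwise. -->
<property>
  <name>dfs.disk.balancer.enabled</name>
  <!-- Changed from the default (false) so the Disk Balancer can execute. -->
  <value>true</value>
</property>
<property>
  <name>dfs.disk.balancer.max.disk.throughputInMBperSec</name>
  <!-- Maximum bandwidth, in MB/s, used while moving data between disks. -->
  <value>10</value>
</property>
<property>
  <name>dfs.disk.balancer.max.disk.errors</name>
  <!-- Errors tolerated for one disk-to-disk move before it is abandoned. -->
  <value>5</value>
</property>
<property>
  <name>dfs.disk.balancer.block.tolerance.percent</name>
  <!-- Illustrative value from the table's example: a 10 GB move
       counts as successful once 8 GB has been moved. -->
  <value>20</value>
</property>
<property>
  <name>dfs.disk.balancer.plan.threshold.percent</name>
  <!-- A disk takes part in balancing only if it deviates from the
       ideal storage value by this percentage. -->
  <value>10</value>
</property>
```

Raising the throughput value may shorten balancing time at the cost of disk bandwidth available to regular workloads, so any change from the default is best tested against the cluster's normal load.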