HDFS OverviewPDF version

Configuring storage balancing for DataNodes

You can configure HDFS to distribute writes on each DataNode in a manner that balances out available storage among that DataNode's disk volumes.

By default a DataNode writes new block replicas to disk volumes solely on a round-robin basis. You can configure a volume-choosing policy that causes the DataNode to take into account how much space is available on each volume when deciding where to place a new replica.

You can configure
  • how much DataNode volumes are allowed to differ in terms of bytes of free disk space before they are considered imbalanced, and
  • what percentage of new block allocations will be sent to volumes with more available disk space than others.

We want your opinion

How can we improve this page?

What kind of feedback do you have?