Scaling Namespaces and Optimizing Data Storage

Balance data in a federation

Depending on your requirements, you can use the HDFS Balancer to balance data either at the level of the DataNodes or the block pools in a cluster with federated NameNodes.

The Balancer balances only the data across the cluster; it does not balance the namespace.
Run the Balancer using the hadoop-daemon.sh start command.
hadoop-daemon.sh start balancer [-policy <policy>]
Specify either of the following values for policy:
  • datanode: The default policy that balances data at the level of the DataNode.
  • blockpool: Balances data at the level of the block pool.
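
As an illustration of the two policy values above, the following minimal sketch wraps the command in a small shell function. The function name balancer_cmd is hypothetical and not part of Hadoop; it only prints the command line shown earlier, so nothing is started until you run the printed command on a host where hadoop-daemon.sh is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): prints the Balancer start command
# for one of the two documented policy values.
balancer_cmd() {
  policy="${1:-datanode}"   # datanode is the documented default policy
  case "$policy" in
    datanode|blockpool)
      echo "hadoop-daemon.sh start balancer -policy $policy"
      ;;
    *)
      echo "unknown policy: $policy (expected datanode or blockpool)" >&2
      return 1
      ;;
  esac
}

# Print the command for balancing at the level of the block pool.
balancer_cmd blockpool
```

Printing the command instead of executing it makes it easy to review the policy before the Balancer starts; to run it directly, pass the output to eval.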