Managing Data StoragePDF version

Storage group classification

The HDFS Balancer first invokes the getLiveDatanodeStorageReport rpc to the Namenode to the storage report for all the storages in all Datanodes. The storage report contains storage utilization information such as capacity, dfs used space, remaining space, and so forth, for each storage in each DataNode.

A Datanode can contain multiple storages and the storages can have different storage types. A storage group Gi,T is defined to be the group of all the storages with the same storage type T in Datanode i. For example, Gi,DISK is the storage group of all the DISK storages in Datanode i. For each storage type T in each DataNode i, HDFS Balancer computes Storage Group Utilization (%)

Ui,T = 100% (storage group used space)/(storage group capacity),

and Average Utilization (%)

Uavg,T = 100% * (sum of all used spaces)/(sum of all capacities).

Let △ be the threshold parameter (default is 10%) and GI,T be the storage group with storage type T in DataNode I.

 
                    Over-Utilized:       {Gi,T :  Uavg,T + Δ < Ui,T},
Average + Threshold ---------------------------------------------------------------------------
                    Above-Average:	{Gi,T :  Uavg,T < Ui,T <= Uavg,T + Δ},
Average ---------------------------------------------------------------------------------------
                    Below-Average:	{Gi,T :  Uavg,T - Δ <= Ui,T <= Uavg,T},
Average - Threshold ---------------------------------------------------------------------------
                    Under-Utilized:      {Gi,T :  Ui,T < Uavg,T - Δ }.

A storage group is over-utilized or under-utilized if its utilization is larger or smaller than the difference between the average and the threshold. A storage group is above-average or below-average if its utilization is larger or smaller than average but within the threshold.

If there are no over-utilized storages and no under-utilized storages, the cluster is said to be balanced. The HDFS Balancer terminates with a SUCCESS state. Otherwise, it continues with storage group pairing.