Increasing storage capacity with HDFS erasure coding

HDFS Erasure Coding (EC) can be used to reduce the amount of storage space required for replication.

The default 3x replication scheme in HDFS adds 200% overhead in storage space and other resources such as network bandwidth. For warm and cold datasets with relatively low I/O activities, additional block replicas are rarely accessed during normal operations, but still consume the same amount of resources as the first replica.

Erasure coding provides the same level of fault-tolerance as 3x replication, but uses much less storage space. In a typical erasure coding setup, the storage overhead is not more than 50%.

In storage systems, the most notable usage of EC is in a Redundant Array of Independent Disks (RAID). RAID implements EC through striping, which divides logically sequential data such as a file into smaller units (such as a bit, byte, or block) and stores consecutive units on different disks. This unit of striping distribution is termed as a striping cell. EC uses an encoding process to calculate and store a certain number of parity cells for each stripe of original data cells. An error on any striping cell can be recovered using a decoding calculation based on the surviving data and the parity cells.

Integrating EC with HDFS can improve storage efficiency while still providing similar data durability as traditional replication-based HDFS deployments. As an example, a 3x replicated file with 6 blocks consumes 6*3 = 18 blocks of disk space. But with EC (6 data, 3 parity) deployment, the file consumes only 9 blocks of disk space.