Erasure coding examples

You can use the hdfs ec command with its various options to set erasure coding policies on directories.

Viewing the list of erasure coding policies

The following example shows how you can view the list of available erasure coding policies:

hdfs ec -listPolicies
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=3], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=DISABLED

Enabling an erasure coding policy

In the previous example, the list of erasure coding policies indicates that RS-6-3-1024k is already enabled. If required, you can enable additional policies as mentioned in the following example:

hdfs ec -enablePolicy -policy RS-3-2-1024k
Erasure coding policy RS-3-2-1024k is enabled

Setting an erasure coding policy

The following example shows how you can set the erasure coding policy RS-6-3-1024k on a particular directory:

hdfs ec -setPolicy -path /data/dir1 -policy RS-6-3-1024k
Set erasure coding policy RS-6-3-1024k on /data/dir1

To confirm whether the specified directory has the erasure coding policy applied, run the hdfs ec -getPolicy command:

hdfs ec -getPolicy -path /data/dir1
RS-6-3-1024k

Checking the block status on an erasure-coded directory

After enabling erasure coding on a directory, you can check the block status by running the hdfs fsck command. The following example output shows the status of the erasure-coded blocks:

hdfs fsck /data/dir1
.
.
.
Erasure Coded Block Groups:
Total size:	434424 B
Total files:	1
Total block groups (validated):	1 (avg. block group size 434424 B)
Minimally erasure-coded block groups:	1 (100.0 %)
Over-erasure-coded block groups:	0 (0.0 %)
Under-erasure-coded block groups:	0 (0.0 %)
Unsatisfactory placement block groups:	0 (0.0 %)
Average block group size:	4.0
Missing block groups:	0
Corrupt block groups:	0
Missing internal blocks:	0 (0.0 %)
FSCK ended at Fri Mar 21 19:39:11 UTC 2018 in 1 milliseconds

The filesystem under path '/data/dir1' is HEALTHY

Changing the erasure coding policy

You can use the hdfs ec setPolicy command to change the erasure coding policy applied on a particular directory.
hdfs ec -setPolicy -path /data/dir1 -policy RS-3-2-1024k
Set erasure coding policy RS-3-2-1024k on /data/dir1

You can check the check the block status after applying the new policy. The following example output shows the status of the erasure-coded blocks for a directory that has the RS-3-2-1024k policy:

hdfs fsck /data/dir1
.
.
.
Erasure Coded Block Groups:
Total size:	68644 B
Total files:	2
Total block groups (validated):	2 (avg. block group size 34322 B)
Minimally erasure-coded block groups:	2 (100.0 %)
Over-erasure-coded block groups:	0 (0.0 %)
Under-erasure-coded block groups:	0 (0.0 %)
Unsatisfactory placement block groups:	0 (0.0 %)
Average block group size:	2.5
Missing block groups:		0
Corrupt block groups:		0
Missing internal blocks:	0 (0.0 %)
FSCK ended at Mon Apr 09 10:11:06 UTC 2018 in 3 milliseconds


The filesystem under path '/data/dir1' is HEALTHY

You can apply the default 3x replication policy and check the block status as specified in the following examples:

hdfs ec -setPolicy -path /data/dir1 -replicate
Set erasure coding policy replication on /tmp/data1/
Warning: setting erasure coding policy on a non-empty directory will not automatically convert existing files to replication
hdfs fsck /data/dir1
.
.
.
Erasure Coded Block Groups:
Total size:	34322 B
Total files:	1
Total block groups (validated):	1 (avg. block group size 34322 B)
Minimally erasure-coded block groups:	1 (100.0 %)
Over-erasure-coded block groups:	0 (0.0 %)
Under-erasure-coded block groups:	0 (0.0 %)
Unsatisfactory placement block groups:	0 (0.0 %)
Average block group size:	2.0
Missing block groups:		0
Corrupt block groups:		0
Missing internal blocks:	0 (0.0 %)
FSCK ended at Tue Apr 10 04:34:14 UTC 2018 in 2 milliseconds

The filesystem under path '/data/dir1' is HEALTHY