Erasure coding examples
You can use the hdfs ec command with its various options to set erasure coding policies on directories.
Viewing the list of erasure coding policies
The following example shows how you can view the list of available erasure coding policies:
hdfs ec -listPolicies
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=3], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=DISABLED
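When a script needs only the policy names and their states, the listing above can be reduced to one pair per line. The following is a minimal sketch, not part of the hdfs tooling: the function name list_policy_states is hypothetical, and it assumes the CLI prints one ErasureCodingPolicy=[...] record per line, in the format shown in the example.

```shell
# list_policy_states (hypothetical helper): read `hdfs ec -listPolicies`
# output on stdin and print one "NAME STATE" pair per policy record.
# Assumes one ErasureCodingPolicy=[...] record per line.
list_policy_states() {
  awk '
    /^ErasureCodingPolicy=\[/ {
      name = ""; state = ""
      # "Name=" is 5 characters long, "State=" is 6; strip those prefixes.
      if (match($0, /Name=[^,]+/))   name  = substr($0, RSTART + 5, RLENGTH - 5)
      if (match($0, /State=[A-Z]+/)) state = substr($0, RSTART + 6, RLENGTH - 6)
      if (name != "") print name, state
    }'
}
```

On a cluster, this could be used as hdfs ec -listPolicies | list_policy_states. The header line "Erasure Coding Policies:" is skipped because it does not start with a policy record.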
Enabling an erasure coding policy
In the previous example, the list of erasure coding policies indicates that RS-6-3-1024k is already enabled. If required, you can enable additional policies, as shown in the following example:
hdfs ec -enablePolicy -policy RS-3-2-1024k
Erasure coding policy RS-3-2-1024k is enabled
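In a provisioning script, it can be useful to look up the current state of a policy before deciding whether to enable it. The sketch below is one possible approach under stated assumptions: policy_state is a hypothetical helper, and it relies on the State= field appearing on the same line as the policy name, as in the listing shown earlier.

```shell
# policy_state NAME (hypothetical helper): read `hdfs ec -listPolicies`
# output on stdin and print the State= value of the policy called NAME.
# Prints nothing if no such policy is listed.
policy_state() {
  # Match "Name=<NAME>," exactly so that one policy name cannot be
  # mistaken for a longer name that merely begins with it.
  awk -v want="Name=$1," '
    index($0, want) && match($0, /State=[A-Z]+/) {
      print substr($0, RSTART + 6, RLENGTH - 6)
      exit
    }'
}

# Usage sketch (needs a running cluster, so it is left commented out):
#   if [ "$(hdfs ec -listPolicies | policy_state RS-3-2-1024k)" = "DISABLED" ]; then
#     hdfs ec -enablePolicy -policy RS-3-2-1024k
#   fi
```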
Setting an erasure coding policy
The following example shows how you can set the erasure coding policy RS-6-3-1024k on a particular directory:
hdfs ec -setPolicy -path /data/dir1 -policy RS-6-3-1024k
Set erasure coding policy RS-6-3-1024k on /data/dir1
To confirm whether the specified directory has the erasure coding policy applied, run the hdfs ec -getPolicy command:
hdfs ec -getPolicy -path /data/dir1
RS-6-3-1024k
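The confirmation step can also be automated. The following sketch assumes, based on the example output above, that hdfs ec -getPolicy prints the policy name alone on its first line when a policy is set; has_policy is a hypothetical helper name.

```shell
# has_policy EXPECTED (hypothetical helper): read `hdfs ec -getPolicy`
# output on stdin and succeed only if its first line is exactly EXPECTED.
has_policy() {
  first_line=$(head -n 1)
  [ "$first_line" = "$1" ]
}

# Usage sketch (needs a running cluster, so it is left commented out):
#   hdfs ec -getPolicy -path /data/dir1 | has_policy RS-6-3-1024k \
#     || echo "unexpected policy on /data/dir1" >&2
```

Any other first line, including whatever message the command prints for a directory without a policy, is treated as a mismatch.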
Checking the block status on an erasure-coded directory
After enabling erasure coding on a directory, you can check the block status by running the hdfs fsck command. The following example output shows the status of the erasure-coded blocks:
hdfs fsck /data/dir1
. . .
Erasure Coded Block Groups:
Total size: 434424 B
Total files: 1
Total block groups (validated): 1 (avg. block group size 434424 B)
Minimally erasure-coded block groups: 1 (100.0 %)
Over-erasure-coded block groups: 0 (0.0 %)
Under-erasure-coded block groups: 0 (0.0 %)
Unsatisfactory placement block groups: 0 (0.0 %)
Average block group size: 4.0
Missing block groups: 0
Corrupt block groups: 0
Missing internal blocks: 0 (0.0 %)
FSCK ended at Fri Mar 21 19:39:11 UTC 2018 in 1 milliseconds
The filesystem under path '/data/dir1' is HEALTHY
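For monitoring, the fields of the fsck report that matter most can be checked mechanically. The sketch below is an illustration only: fsck_ec_ok is a hypothetical helper, and it assumes the report prints each statistic on its own line with the labels shown in the example ("Missing block groups:", "Corrupt block groups:", and a final "is HEALTHY" line).

```shell
# fsck_ec_ok (hypothetical helper): read `hdfs fsck PATH` output on stdin.
# Succeeds only when the report contains an erasure-coded section with zero
# missing and zero corrupt block groups and ends with a HEALTHY verdict.
fsck_ec_ok() {
  awk '
    BEGIN { missing = -1; corrupt = -1; healthy = 0 }
    /Missing block groups:/ { missing = $NF }   # last field is the count
    /Corrupt block groups:/ { corrupt = $NF }
    /is HEALTHY/            { healthy = 1 }
    END { exit !(healthy && missing == 0 && corrupt == 0) }'
}

# Usage sketch (needs a running cluster, so it is left commented out):
#   hdfs fsck /data/dir1 | fsck_ec_ok || echo "check /data/dir1" >&2
```

The counters start at -1 so that a report without the erasure-coded section is treated as a failure rather than silently passing.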
Changing the erasure coding policy
You can use the hdfs ec -setPolicy command to change the erasure coding policy applied on a particular directory.
hdfs ec -setPolicy -path /data/dir1 -policy RS-3-2-1024k
Set erasure coding policy RS-3-2-1024k on /data/dir1
You can check the block status after applying the new policy. The following example output shows the status of the erasure-coded blocks for a directory that has the RS-3-2-1024k policy:
hdfs fsck /data/dir1
. . .
Erasure Coded Block Groups:
Total size: 68644 B
Total files: 2
Total block groups (validated): 2 (avg. block group size 34322 B)
Minimally erasure-coded block groups: 2 (100.0 %)
Over-erasure-coded block groups: 0 (0.0 %)
Under-erasure-coded block groups: 0 (0.0 %)
Unsatisfactory placement block groups: 0 (0.0 %)
Average block group size: 2.5
Missing block groups: 0
Corrupt block groups: 0
Missing internal blocks: 0 (0.0 %)
FSCK ended at Mon Apr 09 10:11:06 UTC 2018 in 3 milliseconds
The filesystem under path '/data/dir1' is HEALTHY
You can apply the default 3x replication policy and check the block status as specified in the following examples:
hdfs ec -setPolicy -path /data/dir1 -replicate
Set erasure coding policy replication on /data/dir1
Warning: setting erasure coding policy on a non-empty directory will not automatically convert existing files to replication
hdfs fsck /data/dir1
. . .
Erasure Coded Block Groups:
Total size: 34322 B
Total files: 1
Total block groups (validated): 1 (avg. block group size 34322 B)
Minimally erasure-coded block groups: 1 (100.0 %)
Over-erasure-coded block groups: 0 (0.0 %)
Under-erasure-coded block groups: 0 (0.0 %)
Unsatisfactory placement block groups: 0 (0.0 %)
Average block group size: 2.0
Missing block groups: 0
Corrupt block groups: 0
Missing internal blocks: 0 (0.0 %)
FSCK ended at Tue Apr 10 04:34:14 UTC 2018 in 2 milliseconds
The filesystem under path '/data/dir1' is HEALTHY
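The examples in this section switch a directory between RS-6-3-1024k, RS-3-2-1024k, and 3x replication. To compare what these choices cost in raw storage, the data and parity unit counts can be read from the policy name. The following is a rough sketch: ec_overhead is a hypothetical helper, it assumes the name pattern CODEC-DATA-PARITY-CELLSIZE seen in the policy listing, and its figure holds for full stripes only, so small files can cost more.

```shell
# ec_overhead POLICY (hypothetical helper): print the raw bytes stored per
# byte of user data, computed as (data units + parity units) / data units.
# The unit counts are taken from the second-to-last and third-to-last
# hyphen-separated fields, so names such as RS-LEGACY-6-3-1024k also work.
ec_overhead() {
  printf '%s\n' "$1" | awk -F- '
    NF >= 4 && $(NF-2) ~ /^[0-9]+$/ && $(NF-1) ~ /^[0-9]+$/ && $(NF-2) > 0 {
      printf "%.2f\n", ($(NF-2) + $(NF-1)) / $(NF-2)
      ok = 1
    }
    END { exit !ok }'   # fail if the name did not match the assumed pattern
}

ec_overhead RS-6-3-1024k   # prints 1.50
ec_overhead RS-3-2-1024k   # prints 1.67
```

For comparison, 3x replication stores 3.00 raw bytes per byte of user data.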