Erasure Coding data

The following write path, read path, and container replication behavior applies to Ozone EC data.

Write path

  • You can configure the placement of open EC containers using the ozone.scm.container.placement.ec.impl configuration key.
  • The pipeline placement policy is available in the org.apache.hadoop.hdds.scm.container.placement.algorithms package.
    • By default, SCMContainerPlacementRackScatter is used for topology awareness. It is currently the only pipeline placement policy implemented for EC in Ozone.
    • To change to a user-customized implementation, set the following property:
      <property>
         <name>ozone.scm.container.placement.ec.impl</name>
         <value>full_class_name_of_the_customized_implementation</value>
      </property>
The SCMContainerPlacementRackScatter placement policy tries to distribute the replicas of an EC container across datanodes on as many racks as possible. For example, if the EC policy is RS-3-2-1024k, the policy tries to place the 5 (3 data + 2 parity) replicas of an EC container on 5 datanodes, each in a different rack, when enough racks are available.
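
The rack-scattering idea can be sketched as follows. This is an illustration only, not the SCMContainerPlacementRackScatter code: the function name rack_scatter, the (datanode, rack) input tuples, and the round-robin selection are all assumptions made for the example. It takes one datanode per rack in each round, so replicas land on as many distinct racks as the cluster allows.

```python
from collections import defaultdict
from itertools import zip_longest


def rack_scatter(datanodes, replicas):
    """Pick `replicas` datanodes spread over as many racks as possible.

    `datanodes` is a list of (datanode, rack) tuples (hypothetical input
    shape chosen for this sketch).
    """
    by_rack = defaultdict(list)
    for node, rack in datanodes:
        by_rack[rack].append(node)

    chosen = []
    # Each round takes at most one datanode from every rack, so a rack is
    # reused only after all other racks have contributed a node.
    for round_nodes in zip_longest(*by_rack.values()):
        for node in round_nodes:
            if node is not None and len(chosen) < replicas:
                chosen.append(node)

    if len(chosen) < replicas:
        raise ValueError("not enough datanodes for the requested replicas")
    return chosen


# RS-3-2-1024k needs 5 replicas; only 3 racks exist here, so two racks
# receive 2 replicas each and one rack receives 1.
nodes = [("dn1", "/r1"), ("dn2", "/r1"), ("dn3", "/r2"),
         ("dn4", "/r2"), ("dn5", "/r3"), ("dn6", "/r3")]
placement = rack_scatter(nodes, 5)
```

With five or more racks available, the same call would return one datanode per rack.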

Read path

For an EC container, each replica contains different pieces of data. Data is read from the replicas as requested, and no topology configuration applies to the read path.
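
To illustrate why each replica holds different data, the sketch below maps a byte offset in a block group to a data replica. It assumes a striped layout for RS-3-2-1024k (1024 KB chunks written in turn to 3 data replicas) and 1-based replica indexes. The function locate and the constants are hypothetical names for this example and are not part of any Ozone API.

```python
CHUNK_SIZE = 1024 * 1024  # 1024k chunk size from RS-3-2-1024k
DATA_REPLICAS = 3         # data replicas in RS-3-2 (parity is not modeled)


def locate(offset):
    """Return (replica_index, offset_in_replica) for a logical byte offset.

    Assumes chunks are striped round-robin over the data replicas.
    """
    chunk_index = offset // CHUNK_SIZE
    stripe = chunk_index // DATA_REPLICAS
    replica_index = chunk_index % DATA_REPLICAS + 1  # assumed 1-based
    offset_in_replica = stripe * CHUNK_SIZE + offset % CHUNK_SIZE
    return replica_index, offset_in_replica


first = locate(0)                         # start of the first chunk
second = locate(CHUNK_SIZE)               # start of the second chunk
wrapped = locate(3 * CHUNK_SIZE + 10)     # second stripe, first replica
```

Under these assumptions, consecutive chunks of the same key land on different replicas, so a read touches only the replicas that hold the requested range.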

Container replication

Currently, replication and balancing of closed EC containers use the same placement policy described in the Write path section. That is, the ozone.scm.container.placement.ec.impl property, with its default implementation SCMContainerPlacementRackScatter, applies both to writes to open containers and to replication and balancing of closed containers.