Replicating Iceberg tables stored in FSO buckets

Learn how to replicate Iceberg tables stored in Ozone File System Optimized (FSO) buckets.

  1. If you are using a secure source cluster, authenticate using the Ozone Manager (OM) keytab.
    kinit -kt /cdep/keytabs/om.keytab om
  2. Using the Ozone shell on the source cluster, create a volume and bucket on the target cluster.
    ozone sh volume create [ ***VOLUME NAME*** ]
    ozone sh bucket create [ ***VOLUME NAME*** ]/[ ***BUCKET NAME*** ]
  3. On the source cluster, go to Cloudera Manager > [***CORE_SETTINGS***] > Configuration > Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml and add the mandatory property that matches the Cloudera Base on premises version of the source cluster:
    For 7.1.9, add fs.ofs.impl = org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem

    For 7.3.2, add fs.ofs.impl = org.apache.hadoop.fs.ozone.RootedOzoneFileSystem

  4. Save and refresh the stale configuration.
  5. On the target cluster, go to Cloudera Manager > [***CORE_SETTINGS***] > Configuration > Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml and add the mandatory property that matches the Cloudera Base on premises version of the target cluster:
    For 7.1.9, add fs.ofs.impl = org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem

    For 7.3.2, add fs.ofs.impl = org.apache.hadoop.fs.ozone.RootedOzoneFileSystem

  6. Save and refresh the stale configuration.
  7. Create an Iceberg table on the source cluster.
    Example:
    create table tb1(id int, val int) stored by iceberg location 'ofs://[*** OM SERVICE ID ***]/[*** VOLUME ***]/[*** BUCKET ***]/[*** KEY ***]';
  8. Enable the ‘Iceberg on Ozone replication’ feature flag.
  9. Add the source cluster as a peer before creating the Iceberg replication policy.
  10. Create the Iceberg replication policy by providing the following mandatory details in the Create Iceberg replication policy wizard:
    On the General tab, set the Source Storage Filter field to OZONE and configure the rest of the fields as required.

    On the Advanced tab, set the Location Mapping field to:

    ofs://[*** SOURCE OM SERVICE ID ***]/[*** SOURCE VOLUME ***]/[*** SOURCE_BUCKET ***] ---> ofs://[*** TARGET OM SERVICE ID ***]/[*** TARGET VOLUME ***]/[*** TARGET_BUCKET ***]
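
The values used in the steps above can be assembled with a small shell sketch like the following. It is illustrative only: the function names (fs_ofs_impl, ofs_bucket_uri, location_mapping) and the sample identifiers (ozone-src, ozone-tgt, vol1, bucket1) are hypothetical and not part of any Cloudera or Ozone tooling. The script only builds strings and does not contact a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for preparing the values entered in the steps above.
# All function names and sample identifiers are assumptions for illustration.

# Print the fs.ofs.impl value that the procedure lists for a given version.
fs_ofs_impl() {
  case "$1" in
    7.1.9) echo "org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem" ;;
    7.3.2) echo "org.apache.hadoop.fs.ozone.RootedOzoneFileSystem" ;;
    *)     echo "version not listed in this procedure: $1" >&2; return 1 ;;
  esac
}

# Build an ofs:// bucket URI from an OM service ID, a volume, and a bucket.
ofs_bucket_uri() {
  printf 'ofs://%s/%s/%s' "$1" "$2" "$3"
}

# Build the Location Mapping value: source bucket URI ---> target bucket URI.
# Arguments: src service ID, src volume, src bucket,
#            tgt service ID, tgt volume, tgt bucket.
location_mapping() {
  printf '%s ---> %s' \
    "$(ofs_bucket_uri "$1" "$2" "$3")" \
    "$(ofs_bucket_uri "$4" "$5" "$6")"
}

# Example usage with made-up names.
echo "fs.ofs.impl = $(fs_ofs_impl 7.1.9)"
echo "$(location_mapping ozone-src vol1 bucket1 ozone-tgt vol1 bucket1)"
```

With the sample names, the second line printed is ofs://ozone-src/vol1/bucket1 ---> ofs://ozone-tgt/vol1/bucket1, which has the same shape as the Location Mapping value entered on the Advanced tab. Replace the sample names with the OM service IDs, volumes, and buckets of your own clusters.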