How Iceberg replication policy works
Replication Manager performs several steps to replicate the Iceberg tables when you create or run an Iceberg replication policy.
The following list shows a few high-level steps that are completed during the replication process:
How Atlas metadata replication for Iceberg tables works
Atlas metadata for the chosen Iceberg tables can be replicated using Iceberg replication policies.
During the Iceberg replication policy creation process, if you:
- choose the Replicate Atlas metadata option, Replication Manager:
  - runs a bootstrap replication for all the chosen Iceberg tables and their Atlas metadata during the first replication policy run. Bootstrap replication replicates all the available Iceberg data and its associated Atlas metadata.
  - runs incremental replication on the Iceberg data and its Atlas metadata during subsequent replication runs. Here, only the delta data and metadata are replicated during each run.
- choose to replicate an Iceberg table that was created using 'create table as select (CTAS)', Replication Manager sets the Skip lineage option to false and the Fetch type option to CONNECTED during the Iceberg replication policy run.
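The difference between a bootstrap run and an incremental run can be illustrated with a small toy model. This is only a sketch: the function name `plan_run` and the idea of tracking replicated items in a plain list are assumptions made for illustration, and do not describe how Replication Manager tracks state internally.

```python
def plan_run(source_items, replicated_items):
    """Decide what a single policy run would copy (illustrative only).

    source_items: everything currently available on the source.
    replicated_items: everything copied by earlier runs (empty on the first run).
    Both names are hypothetical and not part of any Replication Manager API.
    """
    if not replicated_items:
        # First run: bootstrap replication copies everything available.
        return "bootstrap", list(source_items)
    # Later runs: incremental replication copies only the delta.
    already_copied = set(replicated_items)
    delta = [item for item in source_items if item not in already_copied]
    return "incremental", delta


first_mode, first_copy = plan_run(["a", "b"], [])
next_mode, next_copy = plan_run(["a", "b", "c"], ["a", "b"])
```

In this model, the first call copies both available items, and the second call copies only the item that was added after the first run.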
Use case
You have an original or base table named T1. You create table T2 using CTAS from T1.
Similarly, you create T3 from T2, and T4 from T3. During the Iceberg replication policy
creation process, you choose T2 as the source table, and then choose Replicate Atlas
metadata. In this scenario, Replication Manager performs the following tasks
during the replication policy run:
- Sets Skip lineage to false, and Fetch type to CONNECTED during the Atlas replication step.
- Replicates T2 and all the Atlas entities connected to it, which includes the hdfs_path.
- Replicates the T1 and T3 Iceberg tables.
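The outcome of this use case can be sketched as a lookup on a small lineage graph. The sketch assumes that the CONNECTED fetch type selects the tables directly linked to the chosen table through CTAS lineage, which is consistent with T1 and T3 (but not T4) being replicated above. The function `connected_tables` and the edge list are hypothetical, and the model simplifies Atlas lineage by linking tables directly.

```python
from collections import defaultdict


def connected_tables(ctas_edges, start):
    """Return the tables directly linked to `start` by a CTAS edge.

    ctas_edges: (parent, child) pairs, meaning child was created from parent.
    This is a simplified model, not the Atlas implementation.
    """
    neighbors = defaultdict(set)
    for parent, child in ctas_edges:
        # Lineage is followed in both directions: upstream and downstream.
        neighbors[parent].add(child)
        neighbors[child].add(parent)
    return sorted(neighbors[start])


# Lineage from the use case: T1 -> T2 -> T3 -> T4
edges = [("T1", "T2"), ("T2", "T3"), ("T3", "T4")]
result = connected_tables(edges, "T2")  # ['T1', 'T3']
```

T4 is two steps away from T2, so this model does not select it when T2 is the source table.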