Using Apache Iceberg

Branching feature

Branches are named references to snapshots, and each branch has a lifecycle of its own. You can base a branch on a snapshot ID, a timestamp, or the current state of your table. Using the SNAPSHOT RETENTION clause, you can limit the number of snapshots that a branch retains.

Iceberg branching is available in Hive only. Iceberg branches are not supported in Impala or Spark.

The following syntax lists timestamps and IDs of snapshots of an Iceberg table. You can use the list of snapshots to create branches and tags.

SELECT * FROM <database>.<table name>.HISTORY

The following syntax lists the branches and tags of a table.

SELECT * FROM <database>.<table name>.REFS
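As a minimal sketch, the two metadata queries above might look like this for a concrete table. The database name sales_db and the table name orders are hypothetical and stand in for your own names.

```sql
-- List the timestamps and IDs of the snapshots of the hypothetical table sales_db.orders.
SELECT * FROM sales_db.orders.HISTORY;

-- List the branches and tags that currently reference snapshots of the same table.
SELECT * FROM sales_db.orders.REFS;
```

A snapshot ID or timestamp returned by the first query can then be supplied when you create a branch.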

From Hive, use one of the following statements to create a branch from a snapshot ID (SYSTEM_VERSION) or from a timestamp (SYSTEM_TIME). Tables must be Iceberg V2 tables.

ALTER TABLE <table name> CREATE BRANCH <branch name> FOR SYSTEM_VERSION AS OF <SNAPSHOT_ID>

ALTER TABLE <table name> CREATE BRANCH <branch name> FOR SYSTEM_TIME AS OF 'time_stamp' [expression]

If you do not have the ID or timestamp of a snapshot, you can also create a branch using the table name only, omitting the FOR clause:

ALTER TABLE <table name> CREATE BRANCH <branch name>

This syntax creates a branch having the same state as the table.
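The three ways of creating a branch described above can be sketched as follows. The table name orders, the branch names, the snapshot ID, and the timestamp are all hypothetical; substitute values taken from the HISTORY output of your own table.

```sql
-- Create a branch from a specific snapshot ID (hypothetical ID).
ALTER TABLE orders CREATE BRANCH audit_by_id FOR SYSTEM_VERSION AS OF 3916264430406871253;

-- Create a branch from the table state as of a timestamp (hypothetical timestamp).
ALTER TABLE orders CREATE BRANCH audit_by_time FOR SYSTEM_TIME AS OF '2024-01-15 10:00:00';

-- Create a branch from the current state of the table by omitting the FOR clause.
ALTER TABLE orders CREATE BRANCH audit_current;
```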

When you create a branch, you can limit the number of snapshots a branch retains.

ALTER TABLE <table name> CREATE BRANCH <branch name> FOR SYSTEM_VERSION AS OF <snapshot ID> WITH SNAPSHOT RETENTION <integer limit> SNAPSHOTS;
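For illustration, a branch that retains a limited number of snapshots might be created as shown below. The table name orders, the branch name audit, the snapshot ID, and the limit of 5 are assumptions chosen for the example.

```sql
-- Create the hypothetical branch audit from a snapshot ID
-- and limit the branch to retaining 5 snapshots.
ALTER TABLE orders CREATE BRANCH audit
  FOR SYSTEM_VERSION AS OF 3916264430406871253
  WITH SNAPSHOT RETENTION 5 SNAPSHOTS;
```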

From Hive, you can insert data into an Iceberg branch by addressing the branch with dot notation, as you would a regular SQL table. The branch name prefix branch_ must be lowercase.

INSERT INTO TABLE <database name>.<table name>.branch_<branch name> VALUES (<value>[, <value> ...])
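A concrete insert into a branch could look like the following sketch. It assumes a hypothetical table sales_db.orders with two columns (an integer ID and a string status) and an existing branch named audit.

```sql
-- Write a row to the branch audit; the branch_ prefix is lowercase.
-- The two values assume a hypothetical (id INT, status STRING) schema.
INSERT INTO TABLE sales_db.orders.branch_audit VALUES (1001, 'pending');
```

The row is written to the branch only; the current state of the table itself is not changed by this statement.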

You can use SQL syntax to read, update, and delete data in a branch.

SELECT <column name 1>, <column name 2>, ... FROM <database name>.<table name>.branch_<branch name>

UPDATE <database name>.<table name>.branch_<branch name> SET <column name>=<new value>, <column name>=<new value> ... WHERE <condition>

DELETE FROM <database name>.<table name>.branch_<branch name> WHERE <condition>
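The read, update, and delete operations on a branch can be sketched as follows. The names sales_db, orders, and audit, and the columns id and status, are hypothetical. The UPDATE statement is written in the usual Hive form, UPDATE followed directly by the table reference.

```sql
-- Read rows from the branch audit of the hypothetical table sales_db.orders.
SELECT id, status FROM sales_db.orders.branch_audit;

-- Change a value in the branch only.
UPDATE sales_db.orders.branch_audit SET status = 'shipped' WHERE id = 1001;

-- Remove rows from the branch only.
DELETE FROM sales_db.orders.branch_audit WHERE status = 'cancelled';
```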

Fast-forwarding a branch updates its state to match that of another branch within its hierarchy. For example, you can fast-forward branch x to branch z as shown in the following example:

ALTER TABLE <table name> EXECUTE FAST-FORWARD 'x' 'z';

Branch x must be an ancestor of branch z. If you omit the second branch name, the named branch is fast-forwarded to the current branch.
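One possible workflow that ends in a fast-forward is sketched below. It assumes a hypothetical table orders in database sales_db with an (id INT, status STRING) schema, and it assumes the table's default branch is named main, following the Iceberg convention. Because main receives no new writes after audit is created, main remains an ancestor of audit.

```sql
-- Create a branch from the current state of the table.
ALTER TABLE orders CREATE BRANCH audit;

-- Write new data to the branch only.
INSERT INTO TABLE sales_db.orders.branch_audit VALUES (1002, 'pending');

-- Fast-forward main (the ancestor) to the state of audit.
ALTER TABLE orders EXECUTE FAST-FORWARD 'main' 'audit';
```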

You can delete a branch of a table using the following syntax:

ALTER TABLE <table name> DROP BRANCH [IF EXISTS] <branch name>
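As a brief illustration, dropping the hypothetical branch audit from the hypothetical table orders might look like this. IF EXISTS is the optional clause shown in the syntax above.

```sql
-- Drop the branch audit; with IF EXISTS, a missing branch does not cause an error.
ALTER TABLE orders DROP BRANCH IF EXISTS audit;
```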