Migrating a Hive table to Iceberg

You see how to use a simple ALTER TABLE statement to migrate an external Hive table to an Iceberg table. You see how to configure table input and output by setting table properties.

When you migrate an external Hive table to Iceberg, Hive makes the following changes:

  • Converts the storage_handler, serde, inputformat and outputformat properties of the table in HMS to use the Iceberg specific classes.

  • Reads the footers of the existing data files and generates the necessary Iceberg metadata files based on the footers.

  • Commits all the data files to the Iceberg table in a single commit.

  • You must meet the prerequisites for using Iceberg mentioned earlier.
  • .
  1. Log in to the CDP web interface and navigate to the Data Warehouse service.
  2. In the Data Warehouse service, in the Overview page, locate your Hive Virtual Warehouse, and click Hue.
    Instead of using Hue, you can connect over JDBC to the Hive Virtual Warehouse, and run the query.
  3. Enter a query to use a database.
    For example:
    USE mydb;
  4. Enter a Hive query to migrate an existing external Hive table to an Iceberg v2 table.
    For example:
    ALTER TABLE tbl 
    SET TBLPROPERTIES ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler',
     'format-version' = '2');
    Do not drop the table as explained above unless you set the 'external.table.purge' table property to false.
  5. Click to run the queries.
    An Iceberg V2 table is created. The Hive table remains intact.