Creating an Iceberg partitioned table

The ease of use of the Iceberg partitioning is clear from an example of how to partition a table using the backward compatible, identity-partition syntax. Alternatively, you can partition an Iceberg table by column values from Hive or Impala.

Iceberg transparently partitions when creating an Iceberg table even if you do not specify partitioning as you would when creating a Hive or Impala table. Iceberg hides information about transparently created partitions. The Hive/Impala columns represent partitions that do not normally appear in an Iceberg table.

You can specify partitioning that is backward compatible with Iceberg V1 using the PARTITION BY clause. This type of table is called an identity-partitioned table. Such a table includes the columns that represent each partition. When you specify partitioning on an Iceberg table, the HMS metadata is unpartitioned and the underlying Iceberg table is partitioned.

  • You must meet the prerequisites to query Iceberg tables from a Virtual Warehouse mentioned earlier.
  1. Create a Hive or Impala Virtual Warehouse.
  2. In Hue, select a database.
  3. Create an identity-partitioned table.
    Hive:
    CREATE EXTERNAL TABLE ice_ext1 (i int, s string, ts timestamp, d date) PARTITIONED BY (state string)
    STORED BY ICEBERG 
    STORED AS ORC;
    Impala:
    CREATE TABLE ice_ext2 (i int, s string, ts timestamp, d date) PARTITIONED BY (state string)
    STORED AS ICEBERG;
  4. Click to run the query.
  5. Create a table and specify an identity transform, such as bucket, truncate, or date, using the Iceberg V2 PARTITION BY SPEC clause.
    Hive:
    CREATE EXTERNAL TABLE ice_orc (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC(date(ts), truncate(4, state))
    STORED BY ICEBERG 
    STORED AS ORC;
    Impala:
    CREATE TABLE ice_par (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC(date(ts), truncate(4, state))
    STORED AS ICEBERG;
  6. Click to run the query.