Creating a new Iceberg table from Spark 3

In Cloudera Data Engineering (CDE), you can create a Spark job that creates a new Iceberg table or imports an existing Hive table. Once created, the table can be used for subsequent operations.

The following Spark SQL command creates a new Iceberg table:

spark.sql("""CREATE EXTERNAL TABLE ice_t (idx int, name string, state string)
USING iceberg
PARTITIONED BY (state)""")

For information about creating tables, see the Iceberg documentation.
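As a minimal sketch of the "subsequent operations" mentioned above, the following Python snippet runs an insert and a query against the ice_t table from the example. The sample rows, the FOLLOW_UP_STATEMENTS list, and the run_statements helper are illustrative assumptions, not part of CDE or Iceberg; spark is assumed to be the SparkSession already available in the job.

```python
# Hypothetical follow-up statements for the ice_t table created above.
# The inserted rows are made-up sample data.
FOLLOW_UP_STATEMENTS = [
    "INSERT INTO ice_t VALUES (1, 'alice', 'CA'), (2, 'bob', 'NY')",
    "SELECT idx, name FROM ice_t WHERE state = 'CA'",
]


def run_statements(spark, statements):
    """Run each statement through spark.sql and return the results in order."""
    return [spark.sql(stmt) for stmt in statements]
```

In a job, run_statements(spark, FOLLOW_UP_STATEMENTS)[-1].show() would display the result of the final query.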

Creating an Iceberg table format v2

To use the Iceberg table format v2, set the format-version property to 2 as shown below:

CREATE TABLE logs (app string, lvl string, message string, event_ts timestamp) USING iceberg TBLPROPERTIES ('format-version' = '2')

You can specify <delete-mode>, <update-mode>, and <merge-mode> during table creation to set the mode of the respective operation. If unspecified, each mode defaults to merge-on-read.
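The sketch below shows one way the modes might be supplied at table creation. It assumes the placeholders correspond to the Apache Iceberg table properties write.delete.mode, write.update.mode, and write.merge.mode, each accepting copy-on-write or merge-on-read; that mapping is an assumption, as the text above does not name the properties. The create_v2_table_sql helper is hypothetical and only builds the statement text.

```python
# Mode values defined by Apache Iceberg for row-level operations.
VALID_MODES = {"copy-on-write", "merge-on-read"}


def create_v2_table_sql(table, columns, delete_mode=None, update_mode=None, merge_mode=None):
    """Build a CREATE TABLE statement for a v2 table with optional row-level modes."""
    props = {"format-version": "2"}
    # Assumed mapping from the <...-mode> placeholders to Iceberg property names.
    for key, mode in (
        ("write.delete.mode", delete_mode),
        ("write.update.mode", update_mode),
        ("write.merge.mode", merge_mode),
    ):
        if mode is None:
            continue  # leave the property unset so the default applies
        if mode not in VALID_MODES:
            raise ValueError(f"unsupported mode for {key}: {mode}")
        props[key] = mode
    prop_sql = ", ".join(f"'{k}' = '{v}'" for k, v in props.items())
    return f"CREATE TABLE {table} ({columns}) USING iceberg TBLPROPERTIES ({prop_sql})"


ddl = create_v2_table_sql(
    "logs",
    "app string, lvl string, message string, event_ts timestamp",
    delete_mode="merge-on-read",
    update_mode="copy-on-write",
)
```

The resulting string can be passed to spark.sql(ddl). Properties left as None are omitted from the statement.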

Unsupported Feature: Create table … like

The create table … like feature is not supported in Spark:

CREATE TABLE <target> LIKE <source> USING iceberg

Here, <source> is an existing Iceberg table. This operation may appear to succeed, displaying only warnings and no errors, but the resulting table is not usable.
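One possible alternative, offered as an assumption rather than a documented CDE recommendation, is a CREATE TABLE ... AS SELECT statement that selects no rows from the source. The empty_copy_sql helper below is hypothetical and only builds the statement text. This approach copies the column schema only: partitioning must be restated explicitly, and table properties are not carried over.

```python
def empty_copy_sql(target, source, partitioned_by=None):
    """Build a CTAS statement that creates an empty Iceberg table with the source's columns."""
    # Partitioning is not inherited through CTAS, so it is restated when given.
    partition_clause = f" PARTITIONED BY ({partitioned_by})" if partitioned_by else ""
    return (
        f"CREATE TABLE {target} USING iceberg{partition_clause} "
        f"AS SELECT * FROM {source} LIMIT 0"
    )


# Example: an empty copy of ice_t, partitioned by state like the original.
ctas = empty_copy_sql("ice_copy", "ice_t", partitioned_by="state")
```

The statement can then be run with spark.sql(ctas). The name ice_copy is illustrative.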
