Creating Iceberg tables

Apache Iceberg is an open, high-performance table format for organizing datasets that can contain petabytes of data. Iceberg can be used to add tables to computing engines, such as Apache Hive and Apache Flink, from which the data can be queried using SQL.

Because Iceberg is integrated into Flink as a connector, you can use the table format the same way in SQL Stream Builder (SSB). To create an Iceberg table in SSB, you must also have a Hive catalog set up and registered.

After setting up Hive for SSB, you can define Iceberg as the connector in the CREATE TABLE statement, as shown in the following example:
CREATE TABLE `ssb`.`ssb_default`.`iceberg_hive` (
  `column_int` INT,
  `column_str` STRING
) WITH (
  'connector' = 'iceberg',
  'catalog-database' = 'test_db',
  'catalog-type' = 'hive',
  'catalog-name' = 'iceberg_hive_catalog',
  'catalog-table' = 'iceberg_hive_table',
  'ssb-hive-catalog' = 'ssb_hive_catalog'
)
The following properties are mandatory when using the Iceberg connector:
| Property | Example | Description |
| catalog-database | test_db | The Iceberg database name in the backend catalog. Defaults to the current Flink database name. The database is created automatically if it does not exist when records are written to the Flink table. |
| catalog-type | hive | The type of the catalog. For Iceberg, this must be hive. |
| catalog-name | iceberg_hive_catalog | The name of the user-specified, internal catalog that is used by the connector. It is required because the connector does not have a default value. |
| catalog-table | iceberg_hive_table | The name of the Iceberg table in the backend catalog. |
| ssb-hive-catalog | ssb_hive_catalog | The name of the Hive catalog that you provided when adding Hive as a catalog. |
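
Once the table is defined, it can be written to and queried like any other Flink table. The following is a minimal sketch that assumes the iceberg_hive table from the CREATE TABLE example above already exists; the inserted values are hypothetical sample data, not part of the original example. As described for catalog-database, the first write would be expected to create test_db in the backend catalog if it does not exist yet.

```sql
-- Hypothetical sample rows; the values are illustrative only.
INSERT INTO `ssb`.`ssb_default`.`iceberg_hive`
VALUES (1, 'first'), (2, 'second');

-- Read the rows back from the same table.
SELECT `column_int`, `column_str`
FROM `ssb`.`ssb_default`.`iceberg_hive`;
```

Run the two statements as separate jobs, since the INSERT has to complete before the SELECT can return the new rows.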