Schema inference feature

You can base a new Iceberg table on the schema of a Parquet file by using CREATE TABLE LIKE FILE PARQUET in Hive or CREATE TABLE LIKE PARQUET in Impala.

Hive or Impala infers the column definitions of the Iceberg table from the Parquet data file when you create the table. To control how the Parquet binary type is inferred, you can set the following table property when you create the table:
hive.parquet.infer.binary.as = <value>
Where <value> is binary (the default) or string.

This property determines how the unannotated Parquet binary type is interpreted. Some systems expect binary to be interpreted as string; for data written by such systems, set the value to string.
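
As a minimal sketch, assuming the property is passed through the TBLPROPERTIES clause shown in the syntax below and that the name and value are quoted as string literals, a Hive statement that infers unannotated binary columns as strings might look like this. The table name ctlf_table_str is hypothetical.

```sql
-- Sketch only: assumes the property is accepted in TBLPROPERTIES, as the text above describes.
-- ctlf_table_str is a hypothetical table name; the path is the one used in the examples.
CREATE TABLE ctlf_table_str LIKE FILE PARQUET 's3a://testbucket/files/schema.parq'
STORED BY ICEBERG
TBLPROPERTIES ('hive.parquet.infer.binary.as'='string');
```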

Hive syntax

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name LIKE FILE PARQUET 'object_storage_path_of_parquet_file'
[PARTITIONED BY [SPEC]([col_name][, spec(value)][, spec(value)]...)]
[STORED AS file_format]
STORED BY ICEBERG
[TBLPROPERTIES (property_name=property_value, ...)] 

Impala syntax

CREATE TABLE [IF NOT EXISTS] [db_name.]table_name LIKE PARQUET 'object_storage_path_of_parquet_file'
[PARTITIONED BY [SPEC]([col_name][, spec(value)][, spec(value)]...)]
STORED (AS | BY) ICEBERG
[TBLPROPERTIES (property_name=property_value, ...)]

Hive example

CREATE TABLE ctlf_table LIKE FILE PARQUET 's3a://testbucket/files/schema.parq'
STORED BY ICEBERG;
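
The optional PARTITIONED BY SPEC clause from the Hive syntax can be combined with the same statement. The following is a sketch under stated assumptions: the table name ctlf_part_table is hypothetical, and the columns event_ts and id are hypothetical names that would have to exist in the schema of the Parquet file.

```sql
-- Sketch only: event_ts and id are hypothetical columns assumed to be present in the Parquet file.
CREATE TABLE ctlf_part_table LIKE FILE PARQUET 's3a://testbucket/files/schema.parq'
PARTITIONED BY SPEC (month(event_ts), bucket(16, id))
STORED BY ICEBERG;
```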

Impala example

CREATE TABLE ctlf_table LIKE PARQUET 's3a://testbucket/files/schema.parq'
STORED BY ICEBERG;
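
To check what was inferred, you can list the columns of the new table. DESCRIBE is available in both Hive and Impala; the column names and types it returns depend entirely on the contents of the Parquet file.

```sql
-- Lists the columns and types that were inferred from the Parquet file.
DESCRIBE ctlf_table;
```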