Schema inference feature
You can create an Iceberg table whose schema is taken from an existing Parquet file by using CREATE TABLE LIKE FILE PARQUET from Hive or CREATE TABLE LIKE PARQUET from Impala.
When you run the statement from Hive or Impala, the column definitions of the
Iceberg table are inferred from the Parquet data file. To control how binary
columns are inferred, set the following table property when you create the table:
hive.parquet.infer.binary.as = <value>
where <value> is binary (the default) or string. This property determines how the unannotated Parquet binary type is interpreted. Some systems expect binary to be interpreted as string.
Hive syntax
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name LIKE FILE PARQUET 'object_storage_path_of_parquet_file'
[PARTITIONED BY [SPEC]([col_name][, spec(value)][, spec(value)]...)]
[STORED AS file_format]
STORED BY ICEBERG
[TBLPROPERTIES (property_name=property_value, ...)]
Impala syntax
CREATE TABLE [IF NOT EXISTS] [db_name.]table_name LIKE PARQUET 'object_storage_path_of_parquet_file'
[PARTITIONED BY [SPEC]([col_name][, spec(value)][, spec(value)]...)]
STORED (AS | BY) ICEBERG
[TBLPROPERTIES (property_name=property_value, ...)]
Hive example
CREATE TABLE ctlf_table LIKE FILE PARQUET 's3a://testbucket/files/schema.parq'
STORED BY ICEBERG;
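The optional partition clause in the Hive syntax can be combined with schema inference. The following is a minimal sketch only: the table name ctlf_part_table and the columns event_ts and id are hypothetical and are assumed to exist in the Parquet file with a timestamp type and an integer type, respectively. Adjust the transforms to match the columns your file actually contains.

```sql
-- Hypothetical example: event_ts and id are assumed to be columns
-- inferred from the Parquet file; they are not part of the original example.
CREATE TABLE ctlf_part_table LIKE FILE PARQUET 's3a://testbucket/files/schema.parq'
PARTITIONED BY SPEC (month(event_ts), bucket(16, id))
STORED BY ICEBERG;
```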
Impala example
CREATE TABLE ctlf_table LIKE PARQUET 's3a://testbucket/files/schema.parq'
STORED BY ICEBERG;
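Because the columns are inferred rather than declared, it can be worth checking the result before you load data. As a simple sketch, assuming the ctlf_table from the examples above was created successfully, you can list the inferred columns and their types from either Hive or Impala:

```sql
-- List the column names and types that were inferred from the Parquet file.
DESCRIBE ctlf_table;
```

If a column that you expected to be a string appears as binary, the file probably stores it as an unannotated Parquet binary type, which is the case that hive.parquet.infer.binary.as addresses.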