Creating a table from Parquet data
In Unified Analytics, you can base a new table on a schema in a Parquet file using the CREATE TABLE LIKE FILE PARQUET statement. You are likely familiar with this statement if you have been an Impala user. The Impala counterpart CREATE TABLE LIKE PARQUET lacks the keyword LIKE, but is functionally similar.
The column definitions in the new table are inferred from the Parquet data file when
you create a table like Parquet in Unified Analytics. Set the following table
property for creating the table:
hive.parquet.infer.binary.as = <value>
Where <value> is
binary (the default) or string.This property determines the interpretation of the unannotated Parquet binary type. Some systems expect binary to be interpreted as string.
Use the following syntax for creating a table like Parquet.CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name LIKE PARQUET FILE 'object_storage_path_of_parquet_file'
[PARTITIONED BY (col_name data_type [COMMENT 'col_comment'], ...)]
[SORT BY ([column [, column ...]])] [COMMENT 'table_comment'] [ROW FORMAT row_format]
[WITH SERDEPROPERTIES ('key1'='value1', 'key2'='value2', ...)] [STORED AS file_format]
[LOCATION 'object_storage_path']
[TBLPROPERTIES ('key1'='value1', 'key2'='value2', ...)]
In this task, you create a table like parquet that uses the default value
binary
for hive.parquet.infer.binary.as
.
Create a table based on a parquet table.
CREATE TABLE ua_table LIKE FILE PARQUET 's3a://testbucket/files/schema.parq';