Load data inpath feature

From Hive or Impala, you can load a Parquet or ORC data file, or a directory of such files, from your file system or object store into an Iceberg table. For Impala, you might need to set the MEM_LIMIT query option or the resource pool configuration (max-query-mem-limit, min-query-mem-limit) to accommodate the load.

Hive syntax

LOAD DATA [LOCAL] INPATH '<path to file>' [OVERWRITE] INTO TABLE tablename;

Hive example

LOAD DATA LOCAL INPATH '/tmp/some_db/files/part.orc' INTO TABLE ice_orc;

LOAD DATA LOCAL INPATH '/tmp/some_db/files/part.orc' OVERWRITE INTO TABLE ice_orc;
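
When the statement is generated from a script rather than typed by hand, the optional LOCAL and OVERWRITE keywords have to land in the positions the syntax prescribes. The following is a minimal sketch in Python of one way to assemble the statement text. The function name build_hive_load_statement and its parameters are hypothetical and not part of Hive; the helper only formats a string and does not connect to a cluster.

```python
import re

# Hypothetical helper: not part of Hive; it only formats the statement text.
# Accepts a plain table name or a db-qualified one such as some_db.ice_orc.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_hive_load_statement(path, table, local=False, overwrite=False):
    """Return a Hive LOAD DATA statement following the syntax shown above."""
    if "'" in path:
        # Refuse rather than guess at quoting rules for the path literal.
        raise ValueError("path must not contain a single quote")
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError("table must be a plain or db-qualified identifier")
    parts = ["LOAD DATA"]
    if local:
        parts.append("LOCAL")
    parts.append(f"INPATH '{path}'")
    if overwrite:
        parts.append("OVERWRITE")
    parts.append(f"INTO TABLE {table};")
    return " ".join(parts)


# Reproduces the two Hive example statements from this section.
print(build_hive_load_statement("/tmp/some_db/files/part.orc", "ice_orc", local=True))
print(build_hive_load_statement("/tmp/some_db/files/part.orc", "ice_orc",
                                local=True, overwrite=True))
```

Rejecting paths that contain a single quote, instead of escaping them, is a choice made for this sketch so that it does not have to assume anything about how the path literal is quoted.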

Impala syntax

LOAD DATA INPATH '<path to file>' INTO TABLE tablename;

Impala example

In this example, you create an Iceberg table using the LIKE clause, which points to an existing table stored as Parquet. This is required for Iceberg to infer the schema. You then load data from a directory of Parquet files and from a directory of ORC files.

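-- Create an Iceberg table whose schema is taken from an existing Parquet table.
CREATE TABLE test_iceberg LIKE my_parquet_table STORED AS ICEBERG;
-- Set the query memory limit for the session before loading.
SET MEM_LIMIT=1MB;

-- Load the Parquet files in the directory into the table created above.
LOAD DATA INPATH '/tmp/some_db/parquet_files/' INTO TABLE test_iceberg;

-- Load ORC files into a second Iceberg table (iceberg2_tbl is assumed to exist already).
LOAD DATA INPATH '/tmp/some_db/orc_files/' INTO TABLE iceberg2_tbl;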
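
In an automated job, the memory setting and the load are typically issued as an ordered pair of statements in the same session. The following is a minimal sketch in Python of how such a pair might be assembled. The function name build_impala_load_script, its parameters, and the narrow MEM_LIMIT check are assumptions made for illustration and are not part of Impala; the helper only returns statement text and does not connect to a cluster.

```python
import re

# Deliberately narrow check for this sketch; not Impala's full MEM_LIMIT grammar.
_MEM_LIMIT = re.compile(r"\d+(MB|GB)", re.IGNORECASE)


def build_impala_load_script(path, table, mem_limit=None):
    """Return the statements to run, in order, for one Impala load."""
    if "'" in path:
        # Refuse rather than guess at quoting rules for the path literal.
        raise ValueError("path must not contain a single quote")
    statements = []
    if mem_limit is not None:
        if not _MEM_LIMIT.fullmatch(mem_limit):
            raise ValueError("mem_limit must look like 512MB or 2GB")
        # The query option must be set before the LOAD DATA statement runs.
        statements.append(f"SET MEM_LIMIT={mem_limit};")
    statements.append(f"LOAD DATA INPATH '{path}' INTO TABLE {table};")
    return statements


# Uses the path, table name, and limit that appear in the example above.
for statement in build_impala_load_script(
    "/tmp/some_db/parquet_files/", "test_iceberg", mem_limit="1MB"
):
    print(statement)
```

Returning a list rather than one joined string keeps the statements separate, so a caller can submit them one at a time through whatever client it uses.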