Insert table data feature
From Hive and Impala, you can insert data into Iceberg tables using the standard INSERT INTO a single table.
You can replace data in the table with the result of a query. To replace data, Hive and Impala dynamically overwrite partitions that have rows returned by the SELECT query. Partitions that do not have rows returned by the SELECT query, are not replaced. Using INSERT OVERWRITE on tables that use the BUCKET partition transform is not recommended. Results are unpredictable because dynamic overwrite behavior would be too random in this case.
From Hive, CDP also supports inserting into multiple tables as a technical preview; however, this operation is not atomic, so data consistency of Iceberg tables is equivalent to that of Hive external tables. Changes within a single table will remain atomic.
Inserting, deleting, or updating table data generates a snapshot. A new snapshot corresponds to a new manifest list. Manifest lists are named snap-*.avro.
Iceberg specification defines sort orders. At this point, Hive doesn’t support defining sort orders. But if there are sort orders defined by using other engines Hive can utilize them on write operations. For more information about sorting, see sort orders specification.
Hive or Impala syntax
INSERT INTO TABLE tablename VALUES values_row [, values_row ...]
INSERT INTO TABLE tablename1 select_statement1 FROM tablename2
INSERT OVERWRITE TABLE tablename1 select_statement1 FROM tablename2
Hive or Impala examples
CREATE TABLE ice_10 (i INT, s STRING, b BOOLEAN) STORED BY ICEBERG;
INSERT INTO ice_10 VALUES (1, 'asf', true);
CREATE TABLE ice_11 (i INT, s STRING, b BOOLEAN) STORED BY ICEBERG;
INSERT INTO ice_11 VALUES (2, 'apache', false);
INSERT INTO ice_11 SELECT * FROM ice_10;
SELECT * FROM ice_11;
INSERT OVERWRITE ice_11 SELECT * FROM ice_10;
Hive example
FROM customers
INSERT INTO target1 SELECT customer_id, first_name;
INSERT INTO target2 SELECT last_name, customer_id;