Update data feature

From Hive or Impala, you can update data from a V2 Iceberg table.

The Iceberg V2 format allows row-level modifictions using delete files. Row-level updates are supported. Hive and Impala write position to denote deleted or updated records. For information about position delete files, see the Delete data feature and Row-level operations topics.

Updating table data generates a new snapshot for the table.

You can use a WHERE clause in your UPDATE statement. For example:

update tbl_ice set b='Changed' where b in (select b from tbl_ice where a < 4);

Hive or Impala evaluates rows from the target table against a WHERE clause, and updates all the rows that match the WHERE condition. The WHERE condition is similar to the WHERE condition used in SELECT.

Impala prerequisites

The target table should be an Iceberg V2 table. Therefore set the following table property during table creation, or use ALTER TABLE SET TBLPROPERTIES on existing Iceberg V1 tables: 'format-version' = '2'

You must write data and delete files in Parquet format. Set the following table properties to meet this prerequisite:
  • 'write.format.default' = 'PARQUET'
  • 'write.delete.format.default' = 'PARQUET'

Hive or Impala syntax

update tablename set column = value [, column = value ...] [where expression]

Hive examples

create external table tbl_ice(a int, b string, c int) stored by iceberg tblproperties ('format-version'='2');
insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), (4, 'four', 53), (5, 'five', 54), (111, 'one', 55), (333, 'two', 56);
        
update tbl_ice set b='Changed' where b in (select b from tbl_ice where a < 4);

Updating a table from a non-Iceberg source

From Impala, you can use UPDATE FROM to update anIceberg table based on a source table or view that is another Iceberg or non-Iceberg table. For example:
UPDATE tbl_ice SET tbl_ice.one = t.one, tbl_ice.two = t.two, FROM tbl_ice, tbl_hive one where tbl_ice.pk = one.pk;