Making row-level changes on V2 tables only
Learn the types of workloads best suited for Iceberg and when using V2 tables might improve query response.
Iceberg atomic DELETE and UPDATE operations resemble traditional RDBMS systems, but are not suitable for OLTP workloads. Iceberg is not designed to handle high frequency transactions. To handle very large datasets and frequent updates, use Apache Kudu.
Use Iceberg for managing large, infrequently changing datasets. You can update and delete Iceberg V2 tables at the row-level and not incur the overhead of rewriting the data files of V1 tables. Iceberg stores information about the deleted records in position delete files. These files store the file paths and positions of the deleted records, eliminating the need to rewrite the files. Iceberg performs a DELETE plus an INSERT operation in a single transaction. This technique speeds up queries. Query engines scan the data files and delete files associated with a snapshot and merge them, removing the deleted rows. For example, to remove all data belonging to a single customer:
DELETE FROM ice_tbl WHERE user_id = 1234;
To update a column value in a specific record:
UPDATE ice_tbl SET col_v = col_v + 1 WHERE id = 4321;
You can convert an Iceberg v1 table to v2 by setting a table property as follows:'format-version' = '2'.