Known Issues Iceberg
Learn about the known issues in Iceberg, the impact or changes to the functionality, and the workaround.
- CDPD-57551: Performance issue can occur on reads after writes of Iceberg tables
- Hive might generate too many small files, which causes performance degradation.
- Maintain a relatively small number of data files under the
iceberg table/partition directory to have efficient reads. To alleviate poor performance
caused by too many small files, run the following queries:
TRUNCATE TABLE target; INSERT OVERWRITE TABLE target select * from target FOR SYSTEM_VERSION AS OF <preTruncateSnapshotId>;
- CDPD-55243: Spark can read stale data under certain conditions
Using inconsistent cases for database and table names of Iceberg tables in queries can lead to Spark reading stale data. In particular, after any write to the table (append, update, delete) in the same session, an old snapshot may still be read, unless all queries use consistent casing for database and table names.
- Use consistent casing for database and table names in all queries.