Iceberg changelog for Cloudera Data Warehouse on premises

Review the changes introduced in Iceberg for Cloudera Data Warehouse on premises.

2025.0.20.2-26

Hive - Iceberg changes
  • CDPD-88086: HIVE-25948: Optimize Iceberg writes by directing records either to Clustered or Fanout writer
  • DWX-16464: HIVE-28987: [Iceberg] A faulty query predicate can compromise transaction isolation
  • CDPD-87658: HIVE-28607: Iceberg: Syntax sugar for Iceberg Branching.
  • CDPD-85711: HIVE-28421: Iceberg: mvn test can not run UTs in iceberg-catalog
  • DWX-21107: HIVE-28935: Iceberg: Fix partition filtering condition in compaction query
  • HIVE-28837: Iceberg: PartitionsTable#partitions returns incomplete list in case of partition evolution and NULL partition values
  • HIVE-28727: Iceberg: Refactor IcebergTableUtil.toPartitionData
  • DWX-19764: HIVE-28644: Iceberg: Add support for SMART OPTIMIZE feature
  • DWX-20801: HIVE-28858: Iceberg: WriterRegistry is not thread-safe after HIVE-26319
  • CDPD-80575: HIVE-28817: Rebuilding a materialized view stored in Iceberg fails when schema has varchar column
  • DWX-19102: HIVE-28590: Iceberg: Add support for FILE_SIZE_THRESHOLD to compaction command
  • DWX-20366: HIVE-28759: Hive Query History - records are failed to be written due to iceberg worker pools shut down
  • CDPD-79520: HIVE-28762: Iceberg: Add support for partitions with transforms in Drop partition.
  • CDPD-59186: HIVE-28586: Iceberg: Support for write order in iceberg tables at CREATE TABLE
  • DWX-18940: HIVE-28764: Iceberg: Throw Exception in case of Drop Partition on transformed column.
  • DWX-16716: HIVE-28763: Iceberg: Support functions while expiring snapshots.
Impala - Iceberg changes
  • CDPD-93162: CDPD-91087: IMPALA-14358: Event processing can invalidate Iceberg tables.
  • IMPALA-11512: Add tests for BINARY type support in Iceberg
  • CDPD-87935: Skip test_nested_array_from_iceberg_with_delete in FENG
  • IMPALA-13888: LEFT ANTI JOIN is not working with Iceberg V2 tables on the right side
  • IMPALA-14017: Add Ranger tests to Iceberg REST Catalog
  • IMPALA-14185: Error unnesting nested array from Iceberg with DELETE files
  • IMPALA-12337: Implement delete orphan files for Iceberg table
  • IMPALA-14123: Allow forcing predicate push down to Iceberg
  • IMPALA-14154: IllegalStateException with Iceberg table with DELETE
  • IMPALA-14142: Fix TestIcebergV2Table.test_compute_stats_table_sampling
  • IMPALA-14075: Add CatalogOpExecutor.icebergExecutorService_
  • IMPALA-11672: Update 'transient_lastDdlTime' for Iceberg tables
  • CDPD-82868: Skip test_hive_impala_iceberg_reloads in FENG
  • IMPALA-13931: TestIcebergRestCatalog.test_rest_catalog_basic failed at setup
  • IMPALA-13718: Skip reloading Iceberg tables when metadata JSON file is the same
  • IMPALA-13268: Integrate Iceberg ScanMetrics into Impala query profiles
  • IMPALA-13972: TestIcebergRestCatalog.test_rest_catalog_basic should check erasure coding
  • IMPALA-13933: run-iceberg-rest-server.sh should use IMPALA_MAVEN_OPTIONS
  • IMPALA-13882: Fix Iceberg v2 deletes with tuple caching
  • IMPALA-13932: Add file path and position-based duplicate check for IcebergMergeNode
  • IMPALA-13934: Do quick pointer comparison in IcebergDeleteBuilder
  • IMPALA-13586: Initial support for Iceberg REST Catalogs
  • IMPALA-13609: Store Iceberg snapshot id for COMPUTE STATS
  • IMPALA-13738 (Part1): Refactor IcebergTable.getPartialInfo()
  • IMPALA-13880: Skip Iceberg interop tests with Hive if the default filesystem is not HDFS
  • IMPALA-13611: Add interop tests for Iceberg tables
  • IMPALA-13836: Fix TestIcebergV2Table.test_missing_data_files failure in erasure-coding build
  • IMPALA-13854: IcebergPositionDeleteChannel uses incorrect capacity
  • IMPALA-13853: Don't adjust Iceberg field IDs for data files that don't have complex types
  • IMPALA-13674: Enable MERGE statement for Iceberg tables with equality deletes
  • IMPALA-13814: test_iceberg_table_metrics fails in non-DFS builds
  • IMPALA-13654: Tolerate missing data files of Iceberg tables
  • IMPALA-13739: Part2: Introduce IcebergFileDescriptor
  • IMPALA-13770 (Addendum): Close expressions in IcebergMergeCasePlan
  • IMPALA-13770: Updating Iceberg tables with UDFs crashes Impala
  • IMPALA-13768: Redundant Iceberg delete records are shuffled around which cause error "Invalid file path arrived at builder"
  • IMPALA-13737: Directly load file metadata via IcebergFileMetadataLoader

2025.0.19.1000-43

Hive - Iceberg changes
No new features or fixes.
Impala - Iceberg changes
  • CDPD-87405: IMPALA-14185: Error unnesting nested array from Iceberg with DELETE files
  • CDPD-85228: IMPALA-14154: IllegalStateException with Iceberg table with DELETE
  • IMPALA-14014: Fix COMPUTE STATS with TABLESAMPLE clause

2025.0.19.1-74

Hive - Iceberg changes
  • CDPD-72164: HIVE-28276: Iceberg: Make Iceberg split threads configurable when table scanning
  • CDPD-72045: HIVE-28368: Iceberg: Unable to read PARTITIONS Metadata table.
  • CDPD-71812: HIVE-28353: Iceberg: Reading *Files Metadata table fails if the column is of TIMESTAMP type.
  • CDPD-70374: HIVE-28275: Iceberg: Add support for 'If Not Exists' and 'or Replace' for Create Tag.
  • CDPD-72046: HIVE-28299: Iceberg: Optimize show partitions through column projection
  • DWX-18658: HIVE-28256: Iceberg: Major QB Compaction on partition level with evolution
  • CDPD-71472: HIVE-28323: Iceberg: Allow reading tables irrespective of whether they were created with hive engine enabled or not.
  • DWX-18477: HIVE-28282: Merging into iceberg table fails with copy on write when values clause has a function call
  • CDPD-70373: HIVE-28274: Iceberg: Add support for 'If Not Exists' and 'or Replace' for Create Branch.
  • CDPD-70435: HIVE-27880: Iceberg: Support creating a branch on an empty table
  • CDPD-70188: HIVE-28278: Iceberg: Stats: IllegalStateException Invalid file: file length 0
  • CDPD-69309: HIVE-28132: Iceberg: Add support for Replace Tag.
  • CDPD-69704: HIVE-28266: Iceberg: select count(*) from data_files metadata tables gives wrong result
  • CDPD-69311: HIVE-28225: Iceberg: Delete on entire table fails on COW mode.
  • DWX-17603: HIVE-28077: Iceberg: Major QB Compaction on partition level
  • CDPD-68139: HIVE-28131: Iceberg: Add support for Replace Branch.
Impala - Iceberg changes
  • IMPALA-13932: Add file path and position-based duplicate check for IcebergMergeNode
  • IMPALA-13825: Extend Docker container build to custom base images
  • IMPALA-13854: IcebergPositionDeleteChannel uses incorrect capacity
  • IMPALA-13853: Don't adjust Iceberg field IDs for data files that don't have complex types
  • IMPALA-13737: Directly load file metadata via IcebergFileMetadataLoader
  • IMPALA-13789: Defer creating Path objects in loading file metadata
  • IMPALA-13772: Fix Workload Management DMLs Timeouts
  • IMPALA-13768: Redundant Iceberg delete records are shuffled around which cause error "Invalid file path arrived at builder"
  • IMPALA-13594: Read Puffin stats also from older snapshots
  • CDPD-78207: Disable ESTIMATE_DUPLICATE_IN_PREAGG in downstream
  • IMPALA-13205: Do not include Iceberg position fields for MERGE statements with INSERT merge clauses
  • IMPALA-13656: MERGE redundantly accumulates memory in HDFS WRITER
  • IMPALA-13324: Enable statement rewrite for merge queries for IcebergMergeImpl
  • IMPALA-13655: UPDATE redundantly accumulates memory in HDFS WRITER
  • IMPALA-13501: Clean up uncommitted Iceberg files after validation check failure
  • IMPALA-13086: Lower AggregationNode estimate using stats predicate
  • IMPALA-13305: Better thrift compatibility checks based on pyparsing
  • IMPALA-13589: SELECT INPUT__FILE__NAME can crash Impala
  • IMPALA-11265: Part2: Store Iceberg file descriptors in encoded format
  • IMPALA-13370: Read Puffin stats from metadata.json property if available
  • IMPALA-13495: Make exceptions from the Calcite planner easier to classify
  • IMPALA-13484: Don't call alter_table() on HMS when loading Iceberg table
  • IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode
  • IMPALA-13467: Fix partition list size calculation for empty Iceberg scan nodes
  • IMPALA-13247: Support Reading Puffin files for the current snapshot
  • IMPALA-12861: Fix mixed file format listing for Iceberg tables
  • IMPALA-13463: Impala should ignore case of Iceberg schema elements
  • IMPALA-13425: Iceberg tables crash server with Calcite planner
  • IMPALA-13364: Schema resolution doesn't work for migrated partitioned Iceberg tables that have complex types
  • IMPALA-13220: Docs for Iceberg DROP PARTITION
  • IMPALA-11265: Part1: Clear GroupContentFiles once used
  • IMPALA-12732: Add support for MERGE statements for Iceberg tables
  • IMPALA-13254: Optimize REFRESH for Iceberg tables
  • IMPALA-12867: Filter files to OPTIMIZE based on file size
  • IMPALA-13274: Filter out illegal output for certain join nodes
  • IMPALA-13296: Check column compatibility earlier for table migration
  • IMPALA-12850: Add better error message for REFRESH iceberg_tbl PARTITION(...)
  • IMPALA-12857: Add flag to enable merge-on-read even if tables are configured with copy-on-write
  • IMPALA-13088, IMPALA-13109: Use RoaringBitmap instead of sorted vector of int64s
  • IMPALA-13085: Add warning and NULL out DECIMAL values in Iceberg metadata tables
  • IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables
  • IMPALA-11499: Refactor UrlEncode function to handle special characters
  • IMPALA-13035: Querying metadata tables from non-Iceberg tables throws IllegalArgumentException
  • IMPALA-12973, IMPALA-11491, IMPALA-12651: Support BINARY nested in complex types in select list
  • IMPALA-13002: Iceberg V2 tables with Avro delete files aren't read properly
  • IMPALA-12543: Detect self-events before finishing DDL
  • IMPALA-12990: Fix impala-shell handling of unset rows_deleted
  • IMPALA-13003: Handle Iceberg AlreadyExistsException
  • IMPALA-13006: Restrict Iceberg tables to Parquet
  • IMPALA-12996: Add support for DATE in Iceberg metadata tables
  • IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder
  • IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
  • IMPALA-12970: Fix ConcurrentModificationException for Iceberg table scans