May 2, 2024 - Hotfix

Review the fixed issues and changed behaviors in this hotfix release of Cloudera Data Warehouse on Public Cloud.

Fixed issues

This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud fixes the following issues.

Upgrade your Virtual Warehouse to get these hotfixes:
  • HIVE-28051 A new housekeeping task is added to clean up local folders on Hive Virtual Warehouse startup and periodically after startup, resolving the issue with disk overflow.

    When a Hive LLAP daemon crashes, it can leave behind unnecessary files in the LLAP local directories. If these files are not cleaned up periodically, the disk can fill up and Hive queries can fail with an invalid disk error exception.

    The following properties are introduced to periodically delete the unnecessary files:

    hive.llap.local.dir.cleaner.cleanup.interval
    Specifies the interval at which the LocalDirCleaner service in the LLAP daemon checks for stale or old files.
    Default value: 2 hours
    hive.llap.local.dir.cleaner.file.modify.time.threshold
    Specifies the threshold time for the LocalDirCleaner service. If a file's last modification time is older than the threshold, the file is deleted.
    Default value: 24 hours
  • IMPALA-12827 Fixes event processing errors when write IDs of an AbortTxnEvent are cleaned up by the HMS cleaner housekeeping threads.

  • IMPALA-12831 Fixes a race condition that occurs when a table is invalidated and updated concurrently.

  • IMPALA-12832 Implicitly invalidates a table instead of resulting in an ERROR state during event processing.

  • IMPALA-12835 Fixes the event processor, which was not syncing file metadata for non-partitioned ACID tables when incremental refresh on transactional tables was turned off.

  • IMPALA-12851 Fixes an issue where a txnId was not added to the tableWriteIds mapping in Catalog.

  • IMPALA-12855 Fixes the possibility of encountering a NullPointerException when refreshing a partition that has just been dropped.

  • IMPALA-12356 Fixes the incorrect identification of self-events.

  • IMPALA-12969 Fixes a conditional JVM heap leak in array allocation on deserialization failures.
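
The cleanup behavior described for HIVE-28051 can be illustrated with a minimal Python sketch. This is not the Hive LocalDirCleaner implementation; it only models the two properties above under the assumption that staleness is judged by a file's last-modified time. The function names and constants are hypothetical and exist only for this example.

```python
import os
import time

# Defaults taken from the release note. The constant names are illustrative
# and are not part of any Hive API.
CLEANUP_INTERVAL_SECONDS = 2 * 60 * 60         # hive.llap.local.dir.cleaner.cleanup.interval
MODIFY_TIME_THRESHOLD_SECONDS = 24 * 60 * 60   # hive.llap.local.dir.cleaner.file.modify.time.threshold


def find_stale_files(local_dir, threshold_seconds, now=None):
    """Return files under local_dir whose last-modified time exceeds the threshold."""
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # the file vanished between listing and stat
            if now - mtime > threshold_seconds:
                stale.append(path)
    return sorted(stale)


def clean_once(local_dir, threshold_seconds=MODIFY_TIME_THRESHOLD_SECONDS, now=None):
    """Run one cleanup pass and return the paths that were deleted."""
    removed = []
    for path in find_stale_files(local_dir, threshold_seconds, now):
        try:
            os.remove(path)
            removed.append(path)
        except OSError:
            pass  # best effort: skip files that cannot be removed
    return removed
```

In this model, a scheduler would call clean_once once at startup and then again every CLEANUP_INTERVAL_SECONDS. A shorter interval finds stale files sooner, while a shorter threshold deletes files that are less old.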

Behavior changes

This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud has the following behavior changes:

Summary:
Change in value for the write.delete.mode property
Before this release:
The default value for the write.delete.mode property was 'merge-on-read'.
After this release:
The default value for the write.delete.mode property is changed to 'copy-on-write'.

This change might result in issues while deleting Iceberg table records.
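
To make the difference between the two modes concrete, here is a toy Python model. It is a sketch based on the general Iceberg semantics of these modes, not on Iceberg or Hive code: data files are modeled as lists of rows, and all function names are hypothetical. With copy-on-write, a delete rewrites each affected data file. With merge-on-read, data files stay untouched and the deleted row positions are recorded separately, then applied when the table is read.

```python
def delete_copy_on_write(data_files, predicate):
    """Rewrite every data file that holds a matching row; no delete records are kept."""
    rewritten = []
    for rows in data_files:
        if any(predicate(r) for r in rows):
            rows = [r for r in rows if not predicate(r)]  # new copy of the file
        rewritten.append(rows)
    return rewritten


def delete_merge_on_read(data_files, predicate):
    """Leave data files as they are and record (file, position) of each deleted row."""
    deletes = {
        (f, i)
        for f, rows in enumerate(data_files)
        for i, r in enumerate(rows)
        if predicate(r)
    }
    return data_files, deletes


def read_merge_on_read(data_files, deletes):
    """Readers skip the recorded positions while scanning the data files."""
    return [
        r
        for f, rows in enumerate(data_files)
        for i, r in enumerate(rows)
        if (f, i) not in deletes
    ]
```

The trade-off in this model is where the work happens: copy-on-write pays at delete time by rewriting files, while merge-on-read pays at read time by merging delete records.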

If you want to continue to use the 'merge-on-read' mode for new Iceberg tables, perform the following steps:

  1. Log in to the CDP web interface and navigate to the Data Warehouse service.
  2. From the Data Warehouse service, click Database Catalogs, locate your Database Catalog and then click > Edit.
  3. In the Database Catalogs detail page, click Configurations > Metastore and select the hive-site configuration file.
  4. Add the following configuration key and value:
    write.delete.mode=merge-on-read
  5. Click Apply Changes.
  6. Click Virtual Warehouses, locate your virtual warehouse and then click > Edit.
  7. In the Virtual Warehouse Details page, click Configurations > Hiveserver2 and select the hive-site configuration file.
  8. Add the following configuration key and value:
    write.delete.mode=merge-on-read
  9. Click Apply Changes.

If you want to set 'merge-on-read' for older Iceberg tables that were created before upgrading to this CDW version, perform the following steps:

  1. Run the following queries to modify the table properties of the old Iceberg tables:
    ALTER TABLE [***OLD TABLE NAME***] SET TBLPROPERTIES('write.delete.mode'='dummy');
    ALTER TABLE [***OLD TABLE NAME***] SET TBLPROPERTIES('write.delete.mode'='merge-on-read');
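
If many older tables need the change, the two-statement sequence above can be generated rather than typed by hand. The following Python helper is a hypothetical convenience sketch, not part of CDW. It only builds the statement strings, takes the property name as an argument (pass the property name used in the statements above), and assumes the table names are trusted identifiers that need no escaping.

```python
def alter_statements(table_names, property_name, value="merge-on-read", placeholder="dummy"):
    """Build the placeholder-then-value ALTER TABLE sequence for each table."""
    statements = []
    for table in table_names:
        # First set the placeholder value, then the intended value,
        # mirroring the order of the two queries in the step above.
        for v in (placeholder, value):
            statements.append(
                f"ALTER TABLE {table} SET TBLPROPERTIES('{property_name}'='{v}');"
            )
    return statements
```

The returned list can be written to a file and reviewed before it is run against a Virtual Warehouse.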