CDP Public Cloud: August 2024 Release Summary

This Release Summary covers the major features introduced in the CDP Public Cloud Management Console, Data Hub, and data services.

Data Engineering

This release (1.22.0-h1) of the Cloudera Data Engineering (CDE) service on CDP Public Cloud does not contain new features, but includes the following fixes:

DEX-13103: Fix dex-runtime-python-builder to use python 3.8 even for Spark 3.3+

Apache Spark now comes with Python 3.8 installed for versions 3.3 and 3.5.

DEX-14027: Spark 3.5.1 on RAZ; few jobs are failing with error ‘org.apache.hadoop.fs.s3a.impl.InstantiationIOException’

Jobs running on RAZ-enabled clusters on Apache Spark 3.5.1 failed with the org.apache.hadoop.fs.s3a.impl.InstantiationIOException error.

DEX-14037: Restore tries to create deleted default VC

When you performed a CDE jobs backup and restore operation, CDE also tried to restore a default virtual cluster (VC) that had been deleted before the backup was created. This has been corrected: only the VCs included in the backup archive are now restored.

DEX-14231: Allow setting Spark configs to empty & multiple ‘=’ separators values in the UI

You can now set a Spark configuration to an empty value, or to a value that itself contains one or more ‘=’ characters, in the UI. This applies both to the create-job Spark configurations and to the VC-level Spark configurations.
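
As an illustration of the behavior described above, the following minimal Python sketch shows one way a key=value entry can be parsed so that empty values and values containing ‘=’ are both accepted. It is not the CDE implementation; the function name parse_spark_conf is hypothetical, and the only rule assumed is "split on the first ‘=’ only".

```python
def parse_spark_conf(entry: str) -> tuple[str, str]:
    """Split a 'key=value' entry on the FIRST '=' only (hypothetical helper).

    Everything after the first '=' is kept verbatim as the value, so the value
    may be empty or may itself contain further '=' characters.
    """
    key, sep, value = entry.partition("=")
    key = key.strip()
    if not sep or not key:
        raise ValueError(f"expected key=value, got {entry!r}")
    return key, value


# A value that contains '=' characters of its own.
print(parse_spark_conf("spark.driver.extraJavaOptions=-Da=1 -Db=2"))

# An empty value is kept as an empty string instead of being rejected.
print(parse_spark_conf("spark.driver.extraJavaOptions="))
```

Splitting on every ‘=’ instead of only the first one is the usual reason such values get truncated or rejected.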

DEX-14313: CDE Backup/Restore Upgrade Does Not Update K8 version

When you upgraded CDE by using the jobs backup and restore operation, the Kubernetes version was not upgraded.

DEX-14412: Runtime API sometimes crashes due to panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *v1.Event

The Runtime API sometimes crashed with the panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *v1.Event error.

DEX-14424: TGT Secret name validation and sanitisation incorrect

Kerberos ticket-granting tickets (TGTs) are saved as a secret whose name is based on the CDP username. An issue with the validation and sanitization of CDP usernames has been fixed.
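
To make the kind of sanitization involved more concrete, here is a minimal Python sketch that derives a Kubernetes-safe secret name from a username. It is an assumption-laden illustration, not the CDE implementation: the function name tgt_secret_name, the "tgt-" prefix, the hash suffix, and the conservative 63-character limit are all hypothetical choices. The only external fact relied on is that Kubernetes object names may contain just lowercase alphanumerics and ‘-’ (plus ‘.’ for some kinds) and must start and end with an alphanumeric character.

```python
import hashlib
import re

MAX_LEN = 63  # conservative limit chosen for this sketch (valid as a DNS label)


def tgt_secret_name(username: str, prefix: str = "tgt-") -> str:
    """Derive a Kubernetes-safe secret name from a username (hypothetical helper)."""
    # Lowercase, then collapse every run of disallowed characters into one '-'.
    cleaned = re.sub(r"[^a-z0-9]+", "-", username.lower()).strip("-")
    base = cleaned or "user"
    # A short hash of the ORIGINAL username keeps two different usernames from
    # colliding after sanitization (for example "john.doe" and "john_doe").
    digest = hashlib.sha256(username.encode("utf-8")).hexdigest()[:8]
    room = MAX_LEN - len(prefix) - len(digest) - 1
    base = base[:room].rstrip("-")
    return f"{prefix}{base}-{digest}"


print(tgt_secret_name("John.Doe@EXAMPLE.COM"))
```

Validation can then be a single check of the result against the naming pattern, which makes it easy to test sanitization and validation together.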

Data Warehouse

Review the fixed issues and changed behaviors in this hotfix release of Cloudera Data Warehouse on Public Cloud:

Fixed issues

DWX-19003: Unable to set t-shirt size for a Cloudera Data Visualization instance

You could not set or configure a t-shirt size for a Cloudera Data Visualization instance while creating or editing it from CDW. By default, any Cloudera Data Visualization instance created in CDW release version 1.9.1 used the small t-shirt size. This issue has been resolved.

DWX-19034: Unable to create an Impala Virtual Warehouse with HA enabled on older runtime versions

Enabling HA on Impala Virtual Warehouses on runtime versions lower than 2024.0.18.0-206 caused errors such as the following on the catalogd pods:

I0805 13:33:15.674875 1 thrift-util.cc:198] TSocket::open() getaddrinfo() <Host: statestored-0 Port: 24000>Name or service not known
I0805 13:33:15.674933 1 thrift-client.cc:82] Couldn't open transport for statestored-0:24000 (Could not resolve host for client socket.)

This happened because enabling HA on an Impala Virtual Warehouse also enabled Statestore HA, which is available only starting with runtime 2024.0.18.0-206. This issue has been resolved.
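
The version dependency described above can be sketched as a simple gate. The following Python example is illustrative only and does not reflect how CDW implements the check: the helper names runtime_key and supports_statestore_ha are hypothetical, and it assumes that runtime versions always follow the dotted-numbers-plus-build form seen in 2024.0.18.0-206 and that the build number after the hyphen is comparable. The sample versions other than 2024.0.18.0-206 are made up.

```python
import re

# Minimum runtime with Statestore HA, as stated in this release note.
STATESTORE_HA_MIN_RUNTIME = "2024.0.18.0-206"


def runtime_key(version: str) -> tuple[int, ...]:
    """Turn '2024.0.18.0-206' into (2024, 0, 18, 0, 206) for numeric comparison."""
    if not re.fullmatch(r"\d+(\.\d+)*-\d+", version):
        raise ValueError(f"unexpected runtime version format: {version!r}")
    return tuple(int(part) for part in re.split(r"[.-]", version))


def supports_statestore_ha(version: str) -> bool:
    """Return True when the runtime is at or above the Statestore HA minimum."""
    return runtime_key(version) >= runtime_key(STATESTORE_HA_MIN_RUNTIME)


# Hypothetical sample versions, used only to exercise the comparison.
for candidate in ("2024.0.9.0-12", "2024.0.18.0-206", "2024.0.19.0-3"):
    print(candidate, supports_statestore_ha(candidate))
```

Comparing the parsed integers rather than the raw strings matters here: as plain text, "2024.0.9.0-12" would sort after "2024.0.18.0-206".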

DWX-19035: Service discovery does not work with Impala HA

If you created an Impala Virtual Warehouse with HA enabled in CDW 1.9.1-b233, and then tried to link to it from Cloudera Data Visualization, you could not view the required Virtual Warehouse on the Create New Data Connection modal. This issue has been resolved.

IMPALA-13270: Addressing IllegalStateException in Complex Views post upgrade

After upgrading to the 2024.0.18.0-206 runtime, queries that generated runtime filters in which the same column identifier appeared repeatedly failed with the following error: IllegalStateException: null. This issue has been fixed.

IMPALA-13272: Stability Improvement for analytic functions on collections

The sorting process incorrectly included unnecessary elements, causing errors during array operations and leading to frequent query failures. This issue has been fixed to ensure only complete data entries are used in the sorting process. This prevents crashes and maintains stable execution of analytic functions on collections.

Known issues

Review the known issues in this release of the Cloudera Data Warehouse (CDW) service on Public Cloud.

DWX-19138: The option to enable Impala query logging is unavailable on the CDW UI

You do not see the Log Impala queries option while creating or editing a Virtual Warehouse on the CDW UI.

Workaround: Use the CDP CLI to create or edit the Virtual Warehouse, and specify the --impala-query-log option to enable Impala query logging.

Behavior changes

This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud has the following behavior changes:

Summary: Selecting a resource template replaces configuring Cloudera Data Visualization T-shirt sizes

Before this release: You could configure the following sizes for your Cloudera Data Visualization instance:

  • Small (default), 8 GB
  • Medium, 16 GB
  • Large, 24 GB

After this release: You can select one of the following CDW resource templates for your Cloudera Data Visualization instance:

  • Default resources
  • Medium resources
  • Large resources

Observability

Cloudera Observability Premium for Cloudera Data Hub

Cloudera Observability Premium is now generally available for Cloudera Data Hub - Public Cloud customers on version 7.2.18 SP1 (7.2.18.100) and higher. Cloudera Observability introduces new governance, auditing, and monitoring capabilities for Cloudera workloads. Key premium features include: