July 26, 2024

Review the new features introduced in this release of Cloudera Data Warehouse (CDW) service on CDP Public Cloud.

What's new in CDW Public Cloud

General availability of Virtual Warehouse and Database Catalog workload version selections
The CDW UI now lists the workload versions that match your cluster, and you can select one during cluster installation. The Database Catalog list contains versions compatible with your Kubernetes version and your cluster environment (DWX version). The Virtual Warehouse list contains versions compatible with your Kubernetes version, your cluster environment (DWX version), and your Database Catalog version.
General availability of Impala workload-aware autoscaling
Workload-aware autoscaling allocates Impala Virtual Warehouse resources based on the workload that is running. Instead of the single fixed executor group size used by the previous autoscaling implementation, you choose multiple executor group sets, each sized for your workload requirements. This feature is now generally available. See Workload Aware Auto-Scaling in Impala.
Improved Impala Autoscaling Dashboard
You can now use the new Impala Autoscaling Dashboard to monitor Impala autoscaling in a warehouse that uses either workload-aware autoscaling or regular autoscaling. To access the dashboard, open the Web UI tab on the Virtual Warehouse Details page and click the Impala Autoscaler Web UI option. See About the Impala Autoscaling Dashboard.
Ability to forward Prometheus metrics from CDW to an external endpoint
In this release, you can configure Prometheus in CDW to push its metrics to an external endpoint, such as Prometheus, Grafana, Thanos, or another endpoint. See Forwarding Prometheus metrics from CDW to an endpoint.
Automatically backing up and restoring CDW
This release adds more automation to the backup and restore procedures for AWS and Azure environments and clarifies the documentation of the automatic, semi-automatic, and manual procedures.

To get the supported Kubernetes version for this release, you back up your old AWS or Azure environment and start up a new environment using the restoration process. The backup/restore feature saves your environment parameters, making it possible to recreate your environment with the same settings, URL, and connection strings you used in your previous environment.

Ability to configure Impala Statestore high availability
You can now configure high availability for Impala Statestore pods in a Virtual Warehouse, with active and passive modes ensuring continuity and reliability during failovers. See Configuring Impala Statestore high availability.
Downloading the UDF development package from CDW UI
You can now download the Impala UDF development package directly from the CDW UI. See Building and deploying UDFs.

What's new in CDW on Azure environments

Azure AKS 1.29 upgrade
Cloudera supports Azure Kubernetes Service (AKS) version 1.29. In 1.9.1-b233 (released July 26, 2024), when you activate an environment, CDW automatically provisions AKS 1.29. To upgrade to AKS 1.29 from an earlier version of CDW, you must back up and restore CDW. To avoid compatibility issues between CDW and AKS, upgrade to version 1.29.
Addition of new Azure instance types
This release lets you select the Standard_E16pds_v5 Azure Virtual Machine, an AKS Ampere® Altra® Arm-based instance type, for an Impala Virtual Warehouse. For more information about using this instance type, see Activating an Azure environment from CDW.

What's new in CDW on AWS environments

AWS environment permissions support for the EKS start/stop feature
AWS permissions have been expanded to support the Elastic Kubernetes Service (EKS) start/stop feature:
  • rds:StartDBInstance
  • rds:StopDBInstance
  • rds:DescribeDBInstances
  • autoscaling:DescribeAutoScalingGroups
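
As an illustration only, the four actions listed above could be collected into an IAM policy document as in the Python sketch below. The statement ID, the wildcard resource, and the helper name are assumptions made for this example, not values from the CDW documentation. Scope the resource to match your own environment and security policy.

```python
import json

# Actions listed in this release note for the EKS start/stop feature.
EKS_START_STOP_ACTIONS = [
    "rds:StartDBInstance",
    "rds:StopDBInstance",
    "rds:DescribeDBInstances",
    "autoscaling:DescribeAutoScalingGroups",
]


def build_policy(actions, resource="*"):
    """Return an IAM policy document that allows the given actions.

    The Sid and the default "*" resource are hypothetical placeholders.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CdwEksStartStop",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


policy = build_policy(EKS_START_STOP_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the policy attached to your cross-account role to confirm that all four actions are present.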
Addition of new AWS instance types
This release lets you select the r6gd.4xlarge and r7gd.4xlarge Arm-based instance types for an Impala Virtual Warehouse. For more information about using these instance types, see Activating an AWS environment from CDW.
Ability to use envelope encryption for EKS secrets
Envelope encryption of EKS Secrets is now applied by default using the CDW KMS key. See Encrypt Kubernetes secrets with AWS KMS on existing clusters.

What's new in Iceberg on CDW Public Cloud

CDP support for Iceberg version 1.4.3
The Apache Iceberg component has been upgraded from 1.3.0 to 1.4.3.
Support for Iceberg data compaction
You can compact Iceberg tables and optimize them for read operations from Hive and Impala. Compaction is an essential table maintenance activity that creates a new snapshot, which contains the table content in a compact form. See Iceberg data compaction.
SQL support for querying Iceberg metadata tables
Apache Iceberg stores extensive metadata for its tables. From Hive and Impala, you can query the metadata tables as you would query a regular table. For example, you can use projections, joins, filters, and so on. See Query metadata tables feature.
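
The following Python sketch shows what such queries might look like. It assumes the `database.table.metadata_table` naming form and the standard Iceberg metadata tables `snapshots` and `history`. The database name `sales_db`, the table name `orders`, and the helper function are hypothetical. See the Query metadata tables feature documentation for the exact syntax supported by your Hive or Impala version.

```python
def metadata_query(database, table, metadata_table, columns=("*",), where=None):
    """Build a SELECT statement against an Iceberg metadata table.

    Assumes the metadata table is addressed as database.table.metadata_table.
    Inputs are inserted as-is, so pass only trusted identifiers.
    """
    target = f"{database}.{table}.{metadata_table}"
    sql = f"SELECT {', '.join(columns)} FROM {target}"
    if where:
        sql += f" WHERE {where}"
    return sql


# List every snapshot of a hypothetical table.
all_snapshots = metadata_query("sales_db", "orders", "snapshots")

# Project two columns and filter, as you would with a regular table.
appends_only = metadata_query(
    "sales_db",
    "orders",
    "snapshots",
    columns=("snapshot_id", "committed_at"),
    where="operation = 'append'",
)

print(all_snapshots)
print(appends_only)
```

You can run the generated statements from any Hive or Impala client connected to the Virtual Warehouse.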
Impala support for reading Iceberg equality deletes for NiFi
Cloudera supports row-level deletes. Starting with this release, you can read equality deletes from Impala, with support added for Apache NiFi. See the Delete data feature.

What's new in Hue on CDW Public Cloud

General availability (GA) of the SQL AI Assistant
Hue uses Large Language Models (LLMs) to help you generate SQL queries from natural language prompts. It also provides options to optimize, explain, and fix queries, promoting efficient and accurate practices for accessing and manipulating data. You can use several AI services and models, such as OpenAI’s GPT service, Amazon Bedrock, and Azure’s OpenAI service, to run the Hue SQL AI Assistant.
Introduction of task server in Hue and significant improvement in the file upload functionality
A new Task Server page has been added to the Hue web interface. The Hue task server enables the following functionalities:
  • It improves the file-upload experience, allowing you to upload multiple files up to 5 GB each in parallel.
  • It helps you schedule tasks to clean up Hue documents and the /tmp directory, improving the cluster maintenance experience and performance.
See About the Hue task server in CDW.