Cloudera Public Cloud: December 2024 Release Summary
The Release Summary of Cloudera Public Cloud summarizes major features introduced in Cloudera Management Console, Cloudera Data Hub, and data services.
Cloudera Data Catalog
Cloudera Data Catalog 2.0.28 introduces the following changes:
Column name based tagging
You can override sampling-based profiling for a column by tagging it when the column name matches a preset regular expression pattern, instead of requiring a certain percentage of the column values to match. This is useful for assets with skewed proportions, where relying on sampling alone would not result in correct tagging.
For more information, see Cluster Sensitivity Profiler configuration and Setting up column name based tagging.
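The difference between the two tagging modes can be illustrated with a minimal sketch. This is not the profiler's implementation; the function name, parameters, patterns, and threshold below are hypothetical and chosen only to show why a name-based override helps with skewed data.

```python
import re


def tag_column(name, sampled_values, value_pattern, threshold, name_pattern=None):
    """Return True if the column should receive the tag (illustrative only)."""
    # Name-based override: a matching column name tags the column
    # regardless of what the sampled values look like.
    if name_pattern is not None and re.search(name_pattern, name, re.IGNORECASE):
        return True
    # Sampling-based tagging: tag only if enough sampled values match.
    if not sampled_values:
        return False
    matches = sum(1 for v in sampled_values if re.fullmatch(value_pattern, str(v)))
    return matches / len(sampled_values) >= threshold


# A skewed column: most sampled rows are empty, so sampling alone misses it.
values = ["", "", "", "a@example.com"]
email_values = r"[^@\s]+@[^@\s]+\.[^@\s]+"  # hypothetical value pattern

print(tag_column("customer_email", values, email_values, 0.5))
print(tag_column("customer_email", values, email_values, 0.5,
                 name_pattern=r"e[-_]?mail"))  # hypothetical name pattern
```

In this sketch only one of four sampled values matches, so the sampling check falls below the 50% threshold, while the name pattern still tags the column.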
Cloudera DataFlow
The Cloudera DataFlow 2.9.0-h4-b3 hotfix resolves an issue where environments with a high number of deployments could unpredictably fail to upgrade during a standard upgrade process.
Fixed issues
Fixed an issue, identified during testing, where environments with 50+ flow deployments sometimes encountered incomplete upgrade processes. This release ensures that upgrades can complete successfully, even for environments with a large number of deployments.
Cloudera Data Hub
This release of the Cloudera Data Hub service introduces the following changes:
Cloudera Runtime 7.3.1
Cloudera Runtime 7.3.1 is now available and can be used for creating Cloudera Data Hub clusters. For more information about the new Cloudera Runtime version, see Cloudera Runtime. If you need to upgrade your existing Cloudera environment, your upgrade path may be complex. To determine your upgrade path, refer to the Upgrading to Runtime 7.3.1 documentation.
Cloudera Management Console
This release of the Cloudera Management Console service introduces the following changes:
Cloudera Runtime 7.3.1
Cloudera Runtime 7.3.1 is now available and can be used for registering an environment with a 7.3.1 Data Lake. For more information about the new Cloudera Runtime version, see Cloudera Runtime. If you need to upgrade your existing Cloudera environment, your upgrade path may be complex. To determine your upgrade path, refer to the Upgrading to Runtime 7.3.1 documentation.
Cloudera Observability
The Real-time monitoring feature of Cloudera Observability now supports a high-availability cluster setup with multiple active HiveServer2 roles. This feature requires CDP Data Hub 7.2.18.500 or higher.
For more information, see Monitor environment health and performance using Cloudera Observability Real-time monitoring.
Fixed issues
Fixed an issue, identified during testing, where Spark jobs were not shown on the Jobs & Queries tab of the Cloudera Observability Real-time monitoring user interface. The issue occurred when Cloudera Data Hub clusters were created with versions from 7.2.18.100 to 7.2.18.400. The issue is fixed in Cloudera Data Hub version 7.2.18.500 or higher.
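If you track cluster versions in a script, the version thresholds above can be checked with a short sketch like the following. The function names are hypothetical, and the affected range is assumed to be inclusive at both ends. Versions are compared as tuples of integers because plain string comparison orders dotted versions incorrectly.

```python
def parse_version(text):
    """Turn a dotted version string such as "7.2.18.500" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


MIN_FIXED_VERSION = parse_version("7.2.18.500")


def supports_ha_realtime_monitoring(version):
    """True if the version meets the 7.2.18.500 minimum stated above."""
    return parse_version(version) >= MIN_FIXED_VERSION


def affected_by_missing_spark_jobs(version):
    """True if the version is in the 7.2.18.100 to 7.2.18.400 range (assumed inclusive)."""
    return parse_version("7.2.18.100") <= parse_version(version) <= parse_version("7.2.18.400")


for v in ["7.2.18.300", "7.2.18.500", "7.3.1"]:
    print(v, supports_ha_realtime_monitoring(v), affected_by_missing_spark_jobs(v))
```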
Cloudera Operational Database
Cloudera Operational Database 1.47 introduces enhancements to the CDP CLI and adds a new storage type that you can select while creating a new operational database.
Cloudera Operational Database supports JDK17
Creating a database with Cloudera Runtime 7.3.1 requires JDK17. If you use an earlier runtime version, only JDK8 and JDK11 are supported for database creation.
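The JDK rule can be expressed as a small lookup, shown here as a sketch rather than as anything the service exposes. The function name is hypothetical, only the first three version components are compared, and runtime versions later than 7.3.1 are rejected because this summary does not cover them.

```python
def supported_jdks(runtime_version):
    """Return the JDK major versions allowed for database creation (illustrative)."""
    # Compare only major.minor.maintenance, for example "7.3.1.0" -> (7, 3, 1).
    version = tuple(int(part) for part in runtime_version.split("."))[:3]
    if version == (7, 3, 1):
        return [17]
    if version < (7, 3, 1):
        return [8, 11]
    # Later runtimes are not described in this release summary.
    raise ValueError(f"JDK support for runtime {runtime_version} is not covered here")


print(supported_jdks("7.3.1"))
print(supported_jdks("7.2.18"))
```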
New CLI commands to list and upload certificates
Cloudera Operational Database supports two new CLI commands, list-certificates and upload-certificate.
In an Auto-TLS setup, Cloudera Manager maintains a global certificate trust store across the cluster to ensure a mutual trust relationship between cluster nodes in secure TLS connections.
You can now upload custom certificates into the global certificate store and distribute them across all nodes, enabling secure Cloudera Operational Database connections from your infrastructure without changing the existing PKI infrastructure, certificates, or Root CA.
The feature is designed to support mTLS authentication from outside of Cloudera Operational Database’s network, but it could also be useful for TLS connections from Cloudera Operational Database to other networks in general. The command details are as follows.
- list-certificates: This command lists the SHA-1 fingerprints of the certificates in the Global Trust Store.
For example:
cdp opdb list-certificates --environment <environment_name> --database <database_name>
- upload-certificate: This command uploads a single, PEM-encoded certificate to the Global Trust Store and refreshes all the nodes in the cluster.
For example:
cdp opdb upload-certificate --environment <environment_name> --database <database_name> --certificate <custom_certificate_in_PEM_format>
For more information, see CDP CLI documentation.
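To confirm that an uploaded certificate appears in the trust store, it can help to compute its SHA-1 fingerprint locally and compare it with the list-certificates output. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the colon-separated uppercase hex format is an assumption, since the exact output format of list-certificates is not described here.

```python
import base64
import hashlib

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def sha1_fingerprint(pem_text):
    """Return the SHA-1 fingerprint of a single PEM-encoded certificate."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if len(lines) < 3 or lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("expected one PEM-encoded certificate")
    body_lines = lines[1:-1]
    # Reject bundles: this sketch handles exactly one certificate per input.
    if any(line.startswith("-----") for line in body_lines):
        raise ValueError("expected a single certificate, not a bundle")
    # The fingerprint is the SHA-1 digest of the DER bytes, not of the PEM text.
    der = base64.b64decode("".join(body_lines))
    digest = hashlib.sha1(der).hexdigest().upper()
    # Assumed display format: colon-separated byte pairs.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

The function expects the full contents of the PEM file as a string, for example the result of reading the certificate file that you passed to upload-certificate.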
New storage type support during database creation
The Cloudera Operational Database UI supports a new storage type, Cloud Storage with Caching and Data Tiering, that you can select while creating an operational database. This storage type is equivalent to cloud storage that supports time-based priority caching, where data within a specified time range is given a higher priority.
You must have the COD_DATATIERING entitlement to be able to use this storage type.
For more information, see Creating a database using Cloudera Operational Database.
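The idea of time-based priority caching can be pictured with a conceptual sketch. It does not describe how Cloudera Operational Database implements the feature. The block structure, the seven-day window, and the eviction rule are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def eviction_order(blocks, now, hot_window):
    """Order cached blocks for eviction: lower-priority data first (illustrative)."""
    def key(block):
        # Data inside the configured time range gets higher priority,
        # so it sorts after everything outside the range.
        is_hot = now - block["timestamp"] <= hot_window
        return (is_hot, block["timestamp"])
    return [block["name"] for block in sorted(blocks, key=key)]


now = datetime(2024, 12, 1)
blocks = [
    {"name": "a", "timestamp": now - timedelta(days=1)},
    {"name": "b", "timestamp": now - timedelta(days=40)},
    {"name": "c", "timestamp": now - timedelta(days=10)},
]
print(eviction_order(blocks, now, hot_window=timedelta(days=7)))
```

In this sketch, blocks older than the window are evicted first, oldest first, and the block inside the window is kept the longest.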
Cloudera Data Warehouse
Review the new features, fixes, behavioral changes, and preview features in Cloudera Data Warehouse 1.9.3-b166.
Ability to select an instance type for Virtual Warehouses is GA
You can select AWS or Azure compute instance types, such as r6id.4xlarge or Standard_E16_v3, while creating a Virtual Warehouse, using either the CDP CLI or the UI. See Creating a Virtual Warehouse.
If you select a compute instance type both during environment activation and creating a Virtual Warehouse, then what you select for a Virtual Warehouse takes precedence.
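The precedence rule can be summarized in a one-function sketch. The function name and the placeholder value "type-from-activation" are hypothetical, and None stands for "no selection made".

```python
def effective_instance_type(environment_choice, virtual_warehouse_choice):
    """Return the instance type that applies to a Virtual Warehouse (illustrative)."""
    # A selection made for the Virtual Warehouse wins over the one
    # made during environment activation.
    if virtual_warehouse_choice is not None:
        return virtual_warehouse_choice
    return environment_choice


print(effective_instance_type("type-from-activation", "r6id.4xlarge"))
print(effective_instance_type("type-from-activation", None))
```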
Improvements to Impala Autoscaler Dashboard
The following improvements were introduced for the Impala Autoscaler Dashboard:
- Ability to select the log-level configuration for the autoscaler and autoscaler metrics containers.
- A new “Understanding The Dashboard” page has been added which explains the metrics displayed on the UI and how they are calculated.
- Empty data points that manifest as gaps in the graphs are skipped. Zero values are accurately displayed.
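The distinction between an empty data point and a zero value is easy to get wrong in dashboard code, so a short sketch may help. It is not the dashboard's implementation. Missing samples are represented here as None, which is an assumption.

```python
def plottable_points(samples):
    """Drop missing samples (None) but keep genuine zero readings."""
    # Testing "is not None" matters: a plain truthiness check ("if value")
    # would also discard 0, which is a valid measurement.
    return [(t, value) for t, value in samples if value is not None]


samples = [(0, 3), (1, None), (2, 0), (3, 5)]
print(plottable_points(samples))
```

The sample at t=1 is skipped instead of being drawn as a gap, while the zero at t=2 is kept.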
Ability to view end-of-support information through UI and CDP CLI
Cloudera Data Warehouse releases reach the end of support every six months. The Cloudera Data Warehouse UI displays whether your deployment is nearing its end of support time or is unsupported, enabling you to plan an upgrade. You can also view the upgrade instructions on the UI. The end of support information is also displayed when you run the list-clusters and describe-clusters CDP CLI commands.
Streamlined option for downloading Cloudera Data Warehouse diagnostic bundles
Cloudera Data Warehouse users can now easily download diagnostic bundles with a direct Collect option that reduces the need for prior time interval and log selection adjustments. This update enables faster, more efficient access to relevant diagnostic data.
For more information, see Cloudera Data Warehouse Diagnostic Bundle Documentation, Downloading Cloudera Data Warehouse and Kubernetes diagnostic bundles and Diagnostic bundles for Cloudera Data Warehouse and Kubernetes.
What’s new in Cloudera Data Warehouse on Azure environments
Azure AKS 1.30 upgrade
Cloudera supports the Azure Kubernetes Service (AKS) version 1.30. In 1.9.3-b166 (released December 5, 2024), when you activate an Environment, Cloudera Data Warehouse automatically provisions AKS 1.30. To upgrade to AKS 1.30 from an earlier version of Cloudera Data Warehouse, you must back up and restore Cloudera Data Warehouse.
Note: Using the Azure CLI or Azure portal to upgrade the AKS cluster is not supported. Doing so can cause the cluster to become unusable and can cause downtime. For more information about upgrading, see Upgrading an Azure Kubernetes Service (AKS) cluster.
What’s new in Cloudera Data Warehouse on AWS environments
AWS EKS 1.30 upgrade
Cloudera supports the AWS Elastic Kubernetes Service (EKS) version 1.30. In 1.9.3-b166 (released December 5, 2024), when you activate an Environment, Cloudera Data Warehouse automatically provisions EKS 1.30. To upgrade to EKS 1.30 from an earlier version of Cloudera Data Warehouse, you must back up and restore Cloudera Data Warehouse. For more information about upgrading, see Upgrading an Amazon Kubernetes Service (EKS) cluster.
What’s new in Impala in Cloudera Data Warehouse
Improved log storage location access for S3 and Azure in Cloudera Data Warehouse
Cloudera Data Warehouse now offers a simplified way to locate Impala diagnostic logs in S3 and Azure. With fewer required identifiers and clear directory paths, users can efficiently access the logs they need. Find details on the updated steps in S3 and Azure.
What’s new in Hue in Cloudera Data Warehouse
Enhanced AI Integration in Hue SQL AI Assistant
The Hue SQL AI Assistant now supports Cloudera AI Workbench and Cloudera AI Inference service. These integrations enable the Hue SQL AI Assistant to use private models hosted within Cloudera-managed infrastructure, which improves security and privacy while you use GenAI for SQL-related tasks in Hue.
- Cloudera AI Workbench: This enables you to securely deploy and run your own models within a virtual private cloud. This configuration enhances control and privacy within your environment. For more information, see Configure SQL AI Assistant using Cloudera AI Workbench.
- Cloudera AI Inference service: This provides a production-grade serving environment for hosting predictive and generative AI models. This service simplifies model deployment and maintenance. For more information, see Configure SQL AI Assistant using Cloudera AI Inference service.
For more information about the Fixed issues, Known issues, Behavioral changes and Deprecation notices, see the Cloudera Data Warehouse Release Notes.
Review the fixed issues in the Cloudera Data Warehouse 1.9.4-b147 hotfix.
DWX-19898: Databus producer fails with multiple bucket configurations
After a fresh install or DBC upgrade, the databus-producer fails when separate buckets are configured for data and logs. The error occurs because it expects all paths to be in the primary data bucket, but logs are stored in a different bucket.
The issue was resolved by updating the databus-producer version to 1.1.0-b4 in the Cloudera Data Warehouse 1.9.4-b147 release (released December 18, 2024).