August 4, 2022
This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud introduces these changes.
Azure AKS 1.22 provisioning
In release 1.4.2-b118 (released August 4, 2022), CDW automatically provisions AKS 1.22 when you activate an environment.
Do not upgrade your AKS cluster to 1.22 in this release using the Azure CLI. Doing so can cause the cluster to become unusable and can cause downtime. For more information about upgrading, see Upgrading an Azure Kubernetes Service cluster for CDW.
Azure AKS managed identity support
You can now specify a user-assigned, managed identity for the AKS cluster when you activate the Azure environment.
Support for fine-grained access using minimum permissions in Azure environments
You can configure minimum permissions to govern access control between CDW, Azure resources, and the Azure storage account.
Azure Kubernetes Service deployment configuration options
Changes in configuration properties simplify configuring a DNS zone for AKS.
Cluster subdomain definition improvement
In release version 1.4.2-b118 (released August 4, 2022), if you have the CDW_CUSTOM_CLUSTER_ID entitlement, you can define a cluster subdomain to retain your JDBC URLs. When you activate an environment, define your old subdomain using the following format:
env-xxx.dw
Defining the old subdomain in this way retains your old Virtual Warehouse names in the cluster. During environment activation, your old URLs continue to work, including JDBC URLs, Hive and Impala Virtual Warehouse URLs, Grafana URLs, and other URLs.
This way of defining your old subdomain replaces the behavior in effect for Versions 2021.0.1.-b10 (released August 27, 2021) through 1.4.1-b86 (released June 22, 2022). For information about the old way of defining a cluster domain, and the JDBC URL changes you had to make, see the August 27, 2021 release notes.
Simplification of Azure activation UI from CDW
Apache Iceberg GA in CDW
Impala enhancements
This release of Cloudera Data Warehouse includes the following new Impala features:
- Printing query results in vertical format
Impala-shell now includes a new command option '-E' or '--vertical' to support printing of query results in vertical format.
- Resolving ORC columns by names
Before this release, Impala resolved ORC columns by index. In this release, a query option ORC_SCHEMA_RESOLUTION is added to support resolving ORC columns by names.
- Retrieving the data file name
Impala now supports including a virtual column in a standard SELECT statement, select INPUT__FILE__NAME from <tablename>, to retrieve the name of the data file that stores the actual row in a table.
- Consolidating the Ranger audit logs for the same table
Impala now consolidates the Ranger audit log entries of column accesses granted by the same policy for columns in the same table, after all the requests for accessing an object are processed.
- BYTES function support
Impala now supports the BYTES() function. This function returns the number of bytes contained in a byte string.
- Configurable socket timeout for HTTP transport
Impala supports configuring socket timeout using the new impala-shell config option --http_socket_timeout_s. When you set a reasonable timeout, an impala-shell client can retry in case of connection issues.
- UTF-8 mode support
Some Impala STRING types now support UTF-8 aware behavior to ensure consistent results for non-ASCII characters in the string in both Hive and Impala.
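The impala-shell options listed above can be combined on one command line. The following Python sketch only assembles and prints such a command; it does not connect to anything. The helper name build_impala_shell_args, the coordinator address, and the table name sample_table are placeholders invented for illustration, and the use of --protocol=hs2-http assumes your client connects over HTTP transport.

```python
import shlex


def build_impala_shell_args(impalad, query, vertical=False, http_socket_timeout_s=None):
    """Assemble an impala-shell argument list.

    Hypothetical helper for illustration only; it is not part of Impala or CDW.
    """
    # Assumption: the client uses the HTTP transport (hs2-http protocol).
    args = ["impala-shell", "-i", impalad, "--protocol=hs2-http"]
    if vertical:
        # --vertical (short form -E) prints query results in vertical format.
        args.append("--vertical")
    if http_socket_timeout_s is not None:
        # Socket timeout, in seconds, for the HTTP transport.
        args.append(f"--http_socket_timeout_s={http_socket_timeout_s}")
    args += ["-q", query]
    return args


# Placeholder address and table name; replace them with your own values.
args = build_impala_shell_args(
    "coordinator.example.com:28000",
    "select INPUT__FILE__NAME from sample_table limit 5",
    vertical=True,
    http_socket_timeout_s=30,
)

# Print a copy-and-paste friendly command line with shell quoting applied.
print(shlex.join(args))
```

Keeping the arguments in a list until the final shlex.join call means the same list could be passed to a process launcher without re-parsing the quoted query.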