August 4, 2022

This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud introduces these changes.

Azure AKS 1.22 provisioning

In this release 1.4.2-b118 (released August 4, 2022), when you activate an environment, CDW automatically provisions AKS 1.22.

Do not upgrade your AKS cluster to 1.22 in this release using the Azure CLI. Doing so can cause the cluster to become unusable and can cause downtime. For more information about upgrading, see Upgrading an Azure Kubernetes Service cluster for CDW.

Azure AKS managed identity support

You can now specify a user-assigned, managed identity for the AKS cluster when you activate the Azure environment.

Support for fine-grained access using minimum permissions in Azure environments

You can configure minimum permissions to govern access control between CDW, Azure resources, and the Azure storage account.

Azure Kubernetes Service deployment configuration options

Changes in configuration properties simplify configuring a DNS zone for AKS.

Cluster subdomain definition improvement

In release version 1.4.2-b118 (released August 4, 2022), if you have the CDW_CUSTOM_CLUSTER_ID entitlement, you can define a cluster subdomain to retain your JDBC URLs. When you activate an environment, define your old subdomain using the following format:

env-xxx.dw

Defining the old subdomain in this way retains your old Virtual Warehouse names in the cluster. During environment activation, your old URLS continue to work, including JDBC URLS, Hive and Impala Virtual Warehouse URLS, Grafana URLS, and other URLS.

The new way of defining your old subdomain replaces the behavior in effect for Versions 2021.0.1.-b10 (released August 27, 2021) through 1.4.1-b86 (released June, 22, 2022). For information about the old way of defining a cluster domain, and the JDBC URL changes you had to made, see the August 27, 2021 release notes.

Simplification of Azure activation UI from CDW

Azure documentation for Data Warehouse has been updated to reflect improvements in the UI:

Apache Iceberg GA in CDW

The following new features are generally available (GA) in Iceberg V1 in CDW: Impala BI features were released on a GA basis already.

Impala enhancements

This release of Cloudera Data Warehouse includes the following new Impala features:

  • Printing query results in vertical format

    Impala-shell now includes a new command option '-E' or '--vertical' to support printing of query results in vertical format.

  • Resolving ORC columns by names

    Before this release, Impala resolved ORC columns by index. In this release, a query option ORC_SCHEMA_RESOLUTION is added to support resolving ORC columns by names.

  • Retrieving the data file name

    Impala now supports including a virtual column in a standard SELECT statement select INPUT__FILE__NAME from <tablename> to retrieve the name of the data file that stores the actual row in a table.

  • Consolidating the ranger audit logs for the same table

    Impala now consolidates the Ranger audit log entries of column accesses granted by the same policy for columns in the same table, after all the requests for accessing an object are processed.

  • BYTES function support

    Impala now supports the BYTES() function. This function returns the number of bytes contained in a byte string.

  • Configurable socket timeout for http transport

    Impala supports configuring socket timeout using the new impala-shell config option --http_socket_timeout_s. When you set a reasonable timeout, an impala-shell client can retry in case of connection issues.

  • UTF-8 mode support

    Some Impala STRING types now support UTF-8 aware behavior to ensure consistent results for non-ASCII characters in the string in both Hive and Impala.