What's new in Cloudera Data Warehouse Private Cloud

Learn about the new features in Cloudera Data Warehouse (CDW) service on CDP Private Cloud Data Services 1.5.1.

Ability to upgrade Database Catalogs and Virtual Warehouses

After upgrading the CDP platform to the latest version, you can upgrade the Database Catalogs and Virtual Warehouses to the latest available versions from the CDW web interface or from the CLI. The option to upgrade is displayed only if you are upgrading CDP Private Cloud platform from 1.5.0 to a newer version (1.5.1). For more information, see Upgrading Database Catalogs and Virtual Warehouses in CDW Private Cloud.

CDW supports Kerberos authentication for Beeline, Impala-shell, Impyla, and other JDBC clients

In addition to LDAP, you can now use Kerberos to authenticate requests from CLI-based Business Intelligence clients such as Beeline, Impala-shell, Impyla, and other JDBC clients. You can store user credentials in Kerberos Keytab files, so that you can run commands without having to specify the username and password as command-line parameter or enter them after running a command. See Authenticating users in CDW Private Cloud.

Group-level access control for Impala Virtual Warehouse

CDW enables you to allow one or more user groups to access a particular Impala Virtual Warehouse. As a result, only specific users can connect to a Virtual Warehouse, from all supported channels (Hue, JDBC, or other Business Intelligence tools). The option to specify user groups when creating a new Virtual Warehouse is enabled by default. You can disable this option from the Advanced Settings page in the Data Warehouse UI. See Configuring access control for Impala Warehouses.

Enhancements to the Impala custom pod configuration UI

The user experience to create and customize the custom pod configurations for Impala has been enhanced. See the revised how-to topic Creating custom pod configurations for Impala Virtual Warehouses.

Workload-aware autoscaling for Impala (Preview)

Using workload-aware autoccaling, you can configure multiple executor groups within a single Virtual Warehouse that can independently autoscale to allow handling of different workloads in the same Virtual Warehouse. According to each query’s resource requirement, the query is scheduled on an executor group size that is appropriate for that query. For more information, see Workload aware autoscaling in Impala.

You must select the Enable workload-aware autoscaling for Impala option from the Advanced Configurations to use workload-aware autoscaling. See Enabling workload-aware autoscaling for Impala.

Configurations are copied from base to CDW for seamless workload migration

Configurations such as default table types, compression format, file format, parquet resolution, timezones, and so on are copied from CDP Private Cloud Base to CDW Data Service when you activate an Environment in CDW. This enables seamless migration of Impala and Hive LLAP workloads to CDW. See List of configurations copied from the Base cluster to CDW Private Cloud.

Ability to browse the Ozone filesystem using the Hue File Browser

You can browse files and directories on the Ozone filesystem on the base cluster just like you can on S3 or ADLS Gen2. To enable browsing files on Ozone, you must first enable the Ozone File Browser. See Enabling browsing Ozone from Hue.

Support for Ozone erasure-coded data in Impala

Impala now supports reading from Ozone data stored with Erasure Coding (EC). The Ozone EC feature provides data durability and fault tolerance along with reduced storage space and ensures data durability similar to the Ratis THREE replication approach. EC can be considered as an alternative to replication. For more information, see Erasure Coding Overview.

Support for Ozone erasure-coded data in Hive

Hive now supports reading data stored in Ozone that is enabled with Erasure Coding (EC). The Ozone EC feature provides data durability and fault-tolerance along with reduced storage space and ensures data durability similar to Ratis/THREE replication approach. For more information, see Using Ozone storage with Cloudera Data Warehouse.

The “enable_queries_list” configuration has been removed from Hue jobbrowser safety valve section

The enable_queries_list configuration in the CDW Hue safety valve was used to display or hide the Queries tab on the Job Browser page. This configuration has been removed. The Queries tab is displayed by default based on the type of Virtual Warehouse, whether it is Hive or Impala. You can override the query_store configuration and hide the Queries tab. For more information, see Hue configurations in Cloudera Data Warehouse.

Hive ACID compaction observability

Compaction observability is a notification and information system based on metrics about the health of the compaction process. You can access Grafana dashboards from CDW web interface to view alerts about compaction status, the issue, and recommended actions.

The following list describes a few of the 25 notifications:
  • Oldest initiated compaction passed threshold
  • Large number of compaction failures
  • More than one host is initiating compaction

Create an environment in this release to use this feature. For more information, see Compaction observability.

CDW in Private Cloud supports Apache Iceberg V2 on HDFS and Ozone (Preview)

Apache Iceberg integration with Cloudera Data Platform (CDP) enhances the Lakehouse architecture by extending multifunction analytics to a petabyte scale for multi-cloud and hybrid use cases. From Hive or Impala, you use Apache Iceberg features in CDW on HDFS and Ozone, which include time travel, create table as select (CTAS), and schema and partition evolution. To use Iceberg with CDW, you must upgrade to CDP Private Cloud Data Services 1.5.1.

Apache Iceberg V1 and V2 is in technical preview in the Private Cloud 1.5.1 release and is not recommended for production deployments. Cloudera recommends that you try this feature in test or development environments.

Deprecations

Data Analytics Studio (DAS) has been deprecated and is no longer available with CDW Private Cloud starting with 1.5.1. DAS features have been migrated to the Hue Query Processor, along with the ability to query data in Hive, Impala, and Unified Analytics mode. After you upgrade to this release, any DAS instances that are running will be removed from the cluster.