What's new in Cloudera Data Warehouse Private Cloud

Changes to the database recommendations

In older releases, Cloudera recommended that you use an external database for the Hive MetaStore (HMS) and the Control Plane service, so that you could back up and restore data as needed. With the introduction of the Data Recovery Service in 1.5.0, you no longer need to manually back up and restore the data in the database. Starting with 1.5.0, Cloudera recommends that you use an embedded database for the HMS and the Control Plane service.

CDW in Private Cloud supports Unified Analytics

Unified Analytics brings SQL equivalency without syntax changes to CDW SQL engines. It also provides significant optimization equivalency to these engines, unifying common techniques such as subquery processing, join ordering, and materialized views. To take advantage of this feature, create a new Impala Virtual Warehouse and enable the Unified Analytics option, or simply create a new Hive Virtual Warehouse. For more information about the features, limitations, and usage of Unified Analytics, see Unified Analytics overview.

The Unified Analytics feature has the following limitations:

The query isolation feature for Unified Analytics is in technical preview in the Private Cloud 1.5.0 release and is not recommended for use in production deployments. Cloudera recommends that you try this feature in test or development environments.
Queries with left or right anti joins are not supported.

CDW in Private Cloud supports Apache Iceberg on HDFS (Preview)

Apache Iceberg integration with Cloudera Data Platform (CDP) enhances the Lakehouse architecture by extending multifunction analytics to petabyte scale for multi-cloud and hybrid use cases. From Hive or Impala, you can use Apache Iceberg features in CDW, which include time travel, create table as select (CTAS), and schema and partition evolution. To use Iceberg with CDW, you must upgrade to CDP Private Cloud Data Services 1.5.0.

Apache Iceberg V1 is in technical preview in the Private Cloud 1.5.0 release and is not recommended for production deployments. Cloudera recommends that you try this feature in test or development environments.

For the list of supported features and more information, see Iceberg overview.
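As a rough illustration of the Iceberg features named above, the following sketch builds example HiveQL statements for CTAS, time travel, schema evolution, and partition evolution. It assumes Hive's Iceberg syntax (`STORED BY ICEBERG`, `FOR SYSTEM_TIME AS OF`, `SET PARTITION SPEC`). Impala uses slightly different DDL, and not every statement is necessarily supported in this preview release, so check Iceberg overview before relying on any of them. The table names, column names, and timestamp are hypothetical.

```python
# Illustrative only: example HiveQL for the Iceberg features listed above.
# Table names, column names, and the timestamp are hypothetical.

def iceberg_examples(source: str = "sales", target: str = "sales_iceberg") -> dict:
    """Return example HiveQL statements keyed by the feature they illustrate."""
    return {
        # Create table as select (CTAS) into a new Iceberg table.
        "ctas": f"CREATE TABLE {target} STORED BY ICEBERG AS SELECT * FROM {source}",
        # Time travel: read the table as it was at a past point in time.
        "time_travel": f"SELECT * FROM {target} FOR SYSTEM_TIME AS OF '2023-01-01 00:00:00'",
        # Schema evolution: add a column without rewriting existing data files.
        "schema_evolution": f"ALTER TABLE {target} ADD COLUMNS (discount DECIMAL(5,2))",
        # Partition evolution: change the partition layout used for new data.
        "partition_evolution": f"ALTER TABLE {target} SET PARTITION SPEC (month(sold_at))",
    }


if __name__ == "__main__":
    for feature, statement in iceberg_examples().items():
        print(f"-- {feature}\n{statement};\n")
```

You could paste the printed statements into Hue against a Hive Virtual Warehouse in a test or development environment.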

Ability to back up and restore Kubernetes data using DRS

The Data Recovery Service (DRS) in CDP enables you to back up and restore Kubernetes namespaces behind CDW entities such as Database Catalogs and Virtual Warehouses on demand. CDW leverages DRS and provides CDP CLI endpoints that you can use to create and restore backups of CDW namespaces. These backups cover CDW metadata and configurations such as Kubernetes objects, persistent volumes, autoscaling configuration, and so on. To learn more about the Data Recovery Service, see Data Recovery Service overview. For the list of available sub-commands for CDW, see Using DRS with CDW.

Support added for ADLS Gen2 object storage (Preview)

CDW supports using ADLS Gen2 containers for storing tables. CDW exposes Hive and Impala tables stored on ABFS containers as SQL tables, which you can query using Hue. However, you cannot browse and import files to create tables from ABFS in Hue. For more information, see Supported object storage services for Cloudera Data Warehouse Private Cloud.

This feature is in Technical Preview and is not recommended for production deployments. Cloudera recommends that you try this feature in test or development environments.
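To make the ABFS case concrete, the sketch below builds an ABFS location URI and a matching external table statement of the kind you might run from Hue. The URI layout (`abfs://<container>@<account>.dfs.core.windows.net/<path>`) is the standard ABFS form. The container, storage account, table, and column names are hypothetical, and `STORED AS PARQUET` is an arbitrary choice for illustration.

```python
# Illustrative only: container, account, table, and column names are hypothetical.

def abfs_uri(container: str, account: str, path: str) -> str:
    """Build an ABFS URI for a path inside an ADLS Gen2 container."""
    return f"abfs://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def external_table_ddl(table: str, columns: list, location: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement pointing at the given location."""
    cols = ", ".join(f"{name} {type_}" for name, type_ in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols}) "
        f"STORED AS PARQUET LOCATION '{location}'"
    )


if __name__ == "__main__":
    location = abfs_uri("data", "myaccount", "/warehouse/events")
    print(external_table_ddl("events", [("id", "BIGINT"), ("payload", "STRING")], location))
```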

Non-default Database Catalogs are deprecated

The ability to create non-default Database Catalogs has been deprecated and is disabled by default. Cloudera encourages you to use the default Database Catalog that is created when you activate an environment. However, if you need to use a non-default Database Catalog, then you can enable the Create multiple Database Catalogs option from the Advanced Settings page as described in Enabling the option to create additional Database Catalogs in CDW Private Cloud.

Ability to migrate Hive workloads from CDP Private Cloud Base to CDW Data Service on CDP Private Cloud

You can migrate Hive workloads from CDP Private Cloud Base to CDW Private Cloud to leverage the autoscaling, workload optimization, isolation, data caching, and other capabilities that CDW offers. For more information, see Migrating Hive workloads from CDP Private Cloud Base to CDW Private Cloud.

Improved troubleshooting experience for failed Database Catalogs and Virtual Warehouses

A new option called View Logs is displayed on Database Catalogs and Virtual Warehouses that have failed. It enables you to quickly view the logs and start investigating the issue. This is in addition to the diagnostic bundles that you can generate as needed.

CDW supports Cloudera Data Visualization (CDV) 7.0.5

To learn what’s new in CDV 7.0.5, see What's new in Data Visualization 7.0.5.

Support for using Hive user-defined functions (UDF) is GA

You can export user-defined functions (UDF) to a JAR file from a Hadoop- and Hive-compatible Java project and store the JAR file on HDFS. Using Hive commands, you can register the UDF based on the JAR file, and call the UDF from a Hive query using Hue in CDW. For more information, see Creating a user-defined function in Cloudera Data Warehouse.

This feature is generally available (GA) starting with 1.5.0, and you can use it in production environments.
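The registration step described above can be sketched as follows. The helper builds a Hive `CREATE FUNCTION ... USING JAR` statement for a JAR stored on HDFS. The function name, Java class name, and JAR path are hypothetical placeholders, and the HDFS check is only a guard for this sketch, not a CDW requirement beyond what is stated above.

```python
# Illustrative only: function, class, and JAR names are hypothetical.

def register_udf_statement(function_name: str, class_name: str, jar_uri: str) -> str:
    """Build a Hive CREATE FUNCTION statement that loads the UDF from a JAR."""
    # The workflow above stores the JAR on HDFS, so this sketch rejects other schemes.
    if not jar_uri.startswith("hdfs://"):
        raise ValueError("this sketch expects the JAR to be stored on HDFS")
    return f"CREATE FUNCTION {function_name} AS '{class_name}' USING JAR '{jar_uri}'"


if __name__ == "__main__":
    print(register_udf_statement(
        "udftypeof",
        "com.example.hiveudf.TypeOf",
        "hdfs:///tmp/udf/typeof-1.0.jar",
    ))
```

After the statement runs in Hue, a query such as `SELECT udftypeof(some_column) FROM some_table` would call the function, where `some_column` and `some_table` are again placeholders.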

Support for uploading auxiliary JARs

CDW enables administrators to upload auxiliary JARs to the Hive classpath that might be required to support dependency JARs, third-party SerDes, or any Hive extensions. For more information, see Uploading additional JARs to CDW.