November 18, 2022

This release of Cloudera Data Warehouse (CDW) service on CDP Private Cloud Data Services introduces the new features and improvements that are described in this section.

Ability to use deterministic namespace in Kerberos principals

CDW now uses deterministic namespaces and environment IDs. The Kerberos principals for Database Catalogs and Environments use the service hostname and a deterministic namespace name that is based on the name of the Database Catalog or Environment.

When you specify an Environment or Database Catalog name, CDW adds a prefix to that name and to the Kerberos principal name that is based on it. For more information, see Predefined Kerberos principals in Cloudera Data Warehouse Private Cloud.

Ability to install and manage Cloudera Data Warehouse clusters using CDP CLI

You can install CDP CLI on your computer and use it to install and manage CDW clusters on CDP Private Cloud. To install CDP CLI, see CLI client setup. For the list of CDW sub-commands, see https://cloudera.github.io/cdp-dev-docs/cli-docs/dw/index.html.
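As an illustration of scripting the CLI, the following minimal Python sketch builds a `cdp dw` command line and extracts cluster IDs from JSON output. The helper names are hypothetical, and the sub-command and option names used in the example (`list-clusters`, `list-vws`, `--cluster-id`) as well as the `{"clusters": [{"id": ...}]}` output shape are assumptions; confirm them against the sub-command reference linked above before relying on them.

```python
import json
import shutil
import subprocess


def build_dw_command(subcommand, **options):
    """Build the argv list for a `cdp dw <subcommand>` call.

    Keyword names are turned into long options, for example
    cluster_id="env-abc" becomes ["--cluster-id", "env-abc"].
    """
    argv = ["cdp", "dw", subcommand]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


def cluster_ids(cli_output):
    """Extract cluster IDs from JSON text.

    Assumes output shaped like {"clusters": [{"id": "..."}]}.
    """
    payload = json.loads(cli_output)
    return [cluster["id"] for cluster in payload.get("clusters", [])]


def list_cluster_ids():
    """Run the (assumed) list-clusters sub-command and return the cluster IDs."""
    if shutil.which("cdp") is None:
        raise RuntimeError("CDP CLI not found on PATH; see CLI client setup")
    result = subprocess.run(
        build_dw_command("list-clusters"),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return cluster_ids(result.stdout)
```

Keeping command construction and output parsing in separate functions lets you check both without a CDP installation; only `list_cluster_ids()` needs the CLI to be installed and configured.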

Real-time graphs in Grafana display scratch and cache disk utilization

You can monitor and track the scratch and cache disk utilization at the cluster level and at the Virtual Warehouse level. This enables you to size a Virtual Warehouse optimally and calculate the memory and disk requirements. For more information, see Monitoring Data Warehouse service resources with Grafana dashboards.

Hue is now the unified next-generation SQL assistant in CDP

Hue now combines the capabilities of Data Analytics Studio (DAS), such as query optimization and the query debugging framework, with its own rich query editor experience, making Hue the next-generation SQL assistant in CDP. From the Job Browser page, you can search the query history; view query details, the visual explain plan, and DAG information; compare two queries; and download debug bundles for troubleshooting Hive queries. To support this feature, a new service called Query Processor is added to the CDP stack as a dependency for Hue.

A new tab called Hue query processor has been added under the Database Catalog > Edit > CONFIGURATIONS section. The “hue-query-processor.json” and “hue-event-processor.json” files are no longer available under the Virtual Warehouse > Edit > CONFIGURATIONS > Hue > Configuration files dropdown menu. For more information, see About Hue Query Processor.

Data Analytics Studio (DAS) is deprecated

DAS is deprecated and is not installed by default. DAS will be unavailable in future releases. Cloudera encourages you to use Hue to run Hive LLAP workloads. If you need to use DAS, you can enable it from the Advanced Settings page. See Enabling Data Analytics Studio in CDW Private Cloud.

Ability to configure Impala coordinator high availability

You can configure up to five Impala coordinators to run concurrently in an active-active configuration with cookie-based load balancing to resolve or mitigate query concurrency problems. To enable the active-active configuration, select the Enabled (Active-Active) option while creating a Virtual Warehouse. For more information, see Configuring Impala coordinator high availability in CDW Private Cloud.
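To make the idea of cookie-based load balancing concrete, here is a conceptual Python sketch. It is not CDW's implementation: the `StickyBalancer` class, the cookie name, and the round-robin choice for new clients are all assumptions made for illustration. It shows only the general technique, in which a client without a cookie is assigned a coordinator and later requests carrying that cookie return to the same coordinator.

```python
import itertools

MAX_COORDINATORS = 5                   # upper limit stated in these release notes
COOKIE_NAME = "coordinator"            # hypothetical cookie name


class StickyBalancer:
    """Toy cookie-based balancer over a fixed set of active coordinators."""

    def __init__(self, coordinators):
        coordinators = list(coordinators)
        if not 1 <= len(coordinators) <= MAX_COORDINATORS:
            raise ValueError("expected between 1 and 5 coordinators")
        self._coordinators = coordinators
        # New clients are spread round-robin (an assumption for this sketch).
        self._next = itertools.cycle(coordinators)

    def route(self, cookies):
        """Return (coordinator, cookies_to_send_back) for one request."""
        pinned = cookies.get(COOKIE_NAME)
        if pinned in self._coordinators:
            # A known cookie pins the client to its earlier coordinator.
            return pinned, dict(cookies)
        chosen = next(self._next)
        return chosen, {**cookies, COOKIE_NAME: chosen}


balancer = StickyBalancer(["coord-0", "coord-1", "coord-2"])
first, jar = balancer.route({})        # new client, gets a cookie
again, _ = balancer.route(jar)         # same client, same coordinator
```

Because every coordinator in the set accepts new clients, all of them are active at once, while the cookie keeps each client's follow-up requests on the coordinator that holds its session state.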

Ability to spill Impala queries to HDFS

You can configure heavy Impala queries to write intermediate files during large sorts, joins, aggregations, or analytic function operations to a remote scratch space on HDFS. To enable this feature, you must configure the Impala daemon to use the specified locations for writing the intermediate files and then specify the HDFS URI while creating the Impala Virtual Warehouse. For more information, see Enabling Impala to spill to HDFS in CDW.
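Before entering the HDFS URI for the remote scratch space, it can help to sanity-check its shape. The following Python sketch uses only the standard library; the function name, the example host, port, and path, and the specific rules it applies (an `hdfs` scheme, a host, and a non-root directory path) are assumptions for illustration, not the validation that CDW performs.

```python
from urllib.parse import urlparse


def check_hdfs_scratch_uri(uri):
    """Return a list of problems found in a candidate HDFS scratch URI.

    An empty list means the URI passed these illustrative checks.
    """
    parts = urlparse(uri)
    problems = []
    if parts.scheme != "hdfs":
        problems.append("scheme must be hdfs")
    if not parts.hostname:
        problems.append("missing NameNode host")
    if parts.path in ("", "/"):
        problems.append("missing directory path")
    return problems


# Hypothetical host, port, and directory used only as an example.
print(check_hdfs_scratch_uri("hdfs://namenode.example.com:8020/tmp/impala-scratch"))
print(check_hdfs_scratch_uri("/tmp/impala-scratch"))
```

The first call reports no problems, while the second flags the missing scheme and host, which is the typical mistake of pasting a local path where a full URI is expected.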

New Advanced Configurations menu for enabling and disabling deprecated and Technical Preview features

A new Advanced Configurations menu has been added to the CDW web interface, which opens an Advanced Settings page. On this page, you can enable or disable Technical Preview and deprecated features that are not installed or available out of the box when you install the Private Cloud data services. For example, you can enable DAS, which has been deprecated, or enable third-party S3 providers in Private Cloud.

Using the Refresh option to apply configuration changes

A new Refresh option has been added to the more options menu for Database Catalogs and Virtual Warehouses that helps you to apply configuration changes that you made at an environment level, from the Management Console, or from the Advanced Settings page. In most cases, this helps you to avoid deleting and recreating Database Catalogs or Virtual Warehouses. To learn more about the use cases in which you can use the Refresh option, see About the Refresh option.

Support added for AWS S3 and third-party object storage

CDW supports using AWS S3 object storage services for storing tables. Other similar, compatible, on-premises object stores that support the S3 protocol could work as well. CDW exposes Hive and Impala tables stored on S3 as SQL tables which you can query using Hue. However, you cannot browse and import files to create tables from S3 in Hue. For more information, see Third-party object storage support for CDW Private Cloud.

Ability to change delegation username and password

From the Environment Details page, you can update the delegation username and password that CDW uses to impersonate authorization requests from Hue to the Impala engine. For more information, see Changing delegation username and password.

Ability to configure Impala coordinator and executor pod size is GA

You can optimize the performance of your Impala Virtual Warehouse and resources used in an environment based on your hardware configuration by customizing the amount of resources allocated to the Impala coordinators, executors, and catalog daemons. This helps you to better leverage intra-query parallelism and achieve powerful compute clusters with fewer nodes.

This feature is generally available (GA) starting with CDW 1.4.1, and you can use it in production environments.

Earlier, CDW allowed you to specify the path and size for scratch and cache space for Impala executor and coordinator pods. Starting with CDW 1.4.1, you can only specify the size for these parameters.

A new parameter, Overhead size, has been added, which allows you to specify the storage size for resources that are used by the tools that the container runs. For more information, see Creating custom pod configurations for Impala Virtual Warehouses.

Data Visualization integration in Cloudera Data Warehouse is GA

CDW integrates Data Visualization for building graphic representations of data, dashboards, and visual applications based on CDW data or on other data sources that you connect to. Authorized users can explore data using graphics, such as pie charts and histograms, and collaborate using dashboards. BI analysts who can access your environment can use these features. To get started with Cloudera Data Visualization, see Creating a Data Visualization instance in CDW.

This feature is generally available (GA) starting with CDW 1.4.1, and you can use it in production environments.

SSL support for Oracle database is GA

CDW supports Oracle database and can connect to SSL-enabled Oracle on the base cluster. For optimum security, the network connection between the default Database Catalog Hive MetaStore (HMS) in CDW and the relational database hosting the base cluster’s HMS must be encrypted with SSL. For more information, see Configuring Oracle database to use SSL for Data Warehouse.

This feature is generally available (GA) starting with CDW 1.4.1, and you can use it in production environments.

Improved read performance of ORC tables by Impala

This release continues to improve Impala's read performance on ORC tables.

Ozone filesystem support added for Hive and Impala (Preview)

You can use Apache Ozone storage with CDW Private Cloud. This feature is in Technical Preview and Cloudera recommends that you try this in test and development environments. For more information, see Using Ozone storage with Cloudera Data Warehouse Private Cloud.