November 22, 2022

This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud introduces these changes.

AWS EKS Kubernetes version upgrade

The CDW application uses Kubernetes (K8S) clusters to deploy and manage Hive and Impala in the cloud. Kubernetes versions are updated every 3 months on average. When the version is updated, minor versions are deprecated.

To avoid compatibility issues between CDW and AWS resources, you must upgrade the version of Kubernetes that supports your existing CDW clusters to version 1.21.

AWS environments you activate using this release of Cloudera Data Warehouse, and later, will use version 1.22.

If a MIGRATE icon appears in the upper right corner of the environment tile, your AWS environment has not been migrated from Helm 2 to Helm 3. Update the Helm package manager for your environment before you attempt to upgrade its Kubernetes version.

For information about upgrading the AWS EKS Kubernetes version, see Upgrade CDW for EKS upgrade.

Support for Amazon Elastic Kubernetes Service 1.22 cluster

This release 1.5.1-b110 (released November 22, 2022) automatically uses and provisions Amazon Elastic Kubernetes Service (EKS) 1.22 when you activate an environment from CDW. In this release, upgrading a cluster to EKS 1.22 is not supported.

Forward logs to your observability system

In this release, you can forward logs from environments activated in Cloudera Data Warehouse (CDW) to observability and monitoring systems such as Datadog, New Relic, or Splunk. You configure a CDW environment for these systems using the UI to provide a fluentd configuration.

Rebuild a Database Catalog or Virtual Warehouse

Rebuilding a Database Catalog and rebuilding a Virtual Warehouse, you clean up resources and redeploy your Database Catalog and Virtual Warehouse using your existing DBC or VW image, or the latest available image. You might consider rebuilding to get a feature in a later DBC or VW image, to perform housekeeping, or to troubleshoot a problem.

AWS compute instance type selection is available using the CDP CLI

This release of CDW supports selection of a single AWS compute instance type and additional fallback instance types. From the CDP CLI, you can select the following options:
  • computeInstanceTypes

    An array of AWS compute instance types from which you can select a single instance type that the environment of your Virtual Warehouse will use.

  • additionalInstanceTypes

    An array of additional AWS instance types for fallback in order of priority to fail over in the event the compute instance type is unavailable. You can specify one of the additional instance types as the default if you do not configure a compute instance type.

When selecting an instance type, consider your computing, memory, networking, and storage needs. To fetch a string of available instance types, use the following CLI command:
cdp dw describe-allowed-instance-types
Example output:
{
     "aws": {
     "default": [
     "r5d.4xlarge"
     ],
     "allowed": [
     "r5d.4xlarge",
     ...
     "azure": {
     "default": [
     "Standard_E16ds_v4"
     ],
     "allowed": [
     "Standard_E16_v3",
     ...
Specify the instance types in aws-options or azure-options of the create-cluster command. For example:
cdp \
     --profile ${PROFILE} \
     dw \
     create-cluster \
     --environment-crn ${AWS_CB_ENV_CRN} \
     --use-private-load-balancer \
     --aws-options computeInstanceTypes=r5dn.4xlarge,additionalInstanceTypes=r5d.4xlarge,r5dn.4xlarge
     
     cdp \
     --profile ${PROFILE} \
     dw \
     create-cluster \
     --environment-crn ${AZURE_CB_ENV_CRN} \
     --use-private-load-balancer \
     --azure-options computeInstanceTypes=Standard_E16_v3
Use list-clusters and describe-cluster to see cluster settings. CLI documentation link.

Enable CloudWatch Logs

If you use Amazon CloudWatch, you can enable CloudWatch logs when you activate an environment from CDW or edit environment details. Add required permissions to your IAM policy, and then access CloudWatch logs from AWS.

Nodes dashboards added to Grafana

New nodes dashboards show metrics for the AWS/Azure Virtual Machines that host the Kubernetes cluster.

Cloudera Data Warehouse audit events

You can retrieve the following audit events using the CDP CLI that occur in CDW:
CreateEnvironment CloneDbCatalog UpgradeHiveVirtualWarehouse
DeleteEnvironment UpgradeDbCatalog CreateImpalaVirtualWarehouse
UpdateEnvironment CreateHiveVirtualWarehouse DeleteImpalaVirtualWarehouse
UpgradeEnvironment DeleteHiveVirtualWarehouse StartImpalaVirtualWarehouse
CreateDbCatalog StartHiveVirtualWarehouse StopImpalaVirtualWarehouse
DeleteDbCatalog StopHiveVirtualWarehouse UpdateImpalaVirtualWarehouse
StartDbCatalog UpdateHiveVirtualWarehouse CloneImpalaVirtualWarehouse
StopDbCatalog CloneHiveVirtualWarehouse UpgradeImpalaVirtualWarehouse
UpdateDbCatalog

For more information about experimental CLI commands for Cloudera Data Warehouse, go to Version Mapping. Click the CDP CLI Reference link for your CDW version. Scroll to Available Commands, and click dw.

Data Analytics Studio (DAS) is deprecated

DAS has been deprecated and is disabled by default. It will be removed in future releases. Cloudera encourages you to use Hue to run Hive LLAP workloads. If you need to use DAS, then you can enable it by setting the das.event-pipeline.enabled property to true in the Database Catalog configurations. For more information, see Enabling Data Analytics Studio in CDW Public Cloud.

Hue Query Processor scan frequency decreased to 5 minutes

The Hue Query Processor scans the event processor pipeline to retrieve the Hive query history and query details and display them on the Job Browser page. The scan frequency has been decreased from 2 milliseconds to 5 minutes to optimize resource utilization. As a result, you may notice a delay in viewing the query history and query details on the Job Browser page for queries that finish executing in less than 5 minutes. However, you can still view the query history from the Query history tab below the query editor.