November 21, 2022
This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud introduces these changes.
AWS EKS Kubernetes version upgrade
The CDW application uses Kubernetes (K8S) clusters to deploy and manage Hive and Impala in the cloud. Kubernetes versions are updated every 3 months on average. When the version is updated, older minor versions are deprecated.
To avoid compatibility issues between CDW and AWS resources, you must upgrade the version of Kubernetes that supports your existing CDW clusters to version 1.21.
AWS environments you activate using this release of Cloudera Data Warehouse will use version 1.22.
If a MIGRATE icon appears in the upper right corner of the environment tile, your AWS environment has not been migrated from Helm 2 to Helm 3. Update the Helm package manager for your environment before you attempt to upgrade its Kubernetes version.
For information about upgrading the AWS EKS Kubernetes version, see Upgrade CDW for EKS upgrade.
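Before you plan the upgrade, it can help to compare a cluster's current Kubernetes version against the 1.21 target. The following is a minimal sketch, not part of CDW: version_lt is a hypothetical helper, the value of current is an assumed placeholder that you replace with the version shown for your environment, and the comparison relies on sort -V being available on your system.

```shell
# Hypothetical helper: succeeds (exit 0) when version $1 is lower than version $2.
# Assumes a sort implementation that supports -V (version sort).
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

current="1.20"   # assumed value; replace with your cluster's Kubernetes version
target="1.21"    # upgrade target named in this release note

if version_lt "$current" "$target"; then
  echo "Kubernetes ${current} is below ${target}: upgrade required"
else
  echo "Kubernetes ${current} meets or exceeds ${target}"
fi
```

Version sort is used instead of plain string comparison so that a version such as 1.9 is correctly treated as lower than 1.21.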
Support for Amazon Elastic Kubernetes Service 1.22 cluster
This release, 1.5.1-b110 (released November 22, 2022), automatically provisions Amazon Elastic Kubernetes Service (EKS) 1.22 when you activate an environment from CDW. Upgrading an existing cluster to EKS 1.22 is not supported in this release.
Forward logs to your observability system
In this release, you can forward logs from environments activated in Cloudera Data Warehouse (CDW) to observability and monitoring systems such as Datadog, New Relic, or Splunk. You configure a CDW environment for these systems using the UI to provide a fluentd configuration.
Rebuild a Database Catalog or Virtual Warehouse
By rebuilding a Database Catalog or a Virtual Warehouse, you clean up resources and redeploy the Database Catalog (DBC) or Virtual Warehouse (VW) using either your existing DBC or VW image or the latest available image. You might consider rebuilding to get a feature in a later DBC or VW image, to perform housekeeping, or to troubleshoot a problem.
AWS compute instance type selection is available using the CDP CLI
- computeInstanceTypes
An array of AWS compute instance types from which you can select a single instance type that the environment of your Virtual Warehouse will use.
- additionalInstanceTypes
An array of additional AWS instance types, listed in order of priority, to fall back on if the compute instance type is unavailable. If you do not configure a compute instance type, you can specify one of the additional instance types as the default.
To list the instance types that are allowed, run:
cdp dw describe-allowed-instance-types
Example output:
{
"aws": {
"default": [
"r5d.4xlarge"
],
"allowed": [
"r5d.4xlarge",
...
Specify the instance types in the aws-options of the create-cluster command. For example:
cdp \
--profile ${PROFILE} \
dw \
create-cluster \
--environment-crn ${AWS_CB_ENV_CRN} \
--use-private-load-balancer \
--aws-options computeInstanceTypes=r5dn.4xlarge,additionalInstanceTypes=r5d.4xlarge,r5dn.4xlarge
Use list-clusters and describe-cluster to see cluster settings.
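When you script environment activation, the --aws-options value can be assembled from a compute instance type and a prioritized fallback list. The following is a minimal sketch: build_aws_options is a hypothetical helper, not a CDP CLI feature, and the key=value format it emits is taken only from the create-cluster example above. The script prints the resulting command for review instead of running it. PROFILE and AWS_CB_ENV_CRN are the same placeholders used in that example.

```shell
# Hypothetical helper, not part of the CDP CLI.
# $1 is the compute instance type; any remaining arguments are fallback
# instance types in priority order.
build_aws_options() {
  local compute="$1"
  shift
  local opts="computeInstanceTypes=${compute}"
  if [ "$#" -gt 0 ]; then
    # Join the remaining arguments with commas.
    local IFS=,
    opts="${opts},additionalInstanceTypes=$*"
  fi
  printf '%s\n' "$opts"
}

aws_options="$(build_aws_options r5dn.4xlarge r5d.4xlarge r5dn.4xlarge)"

# Print the command rather than executing it, so it can be checked first.
echo cdp --profile "${PROFILE:-default}" dw create-cluster \
  --environment-crn "${AWS_CB_ENV_CRN:-<environment-crn>}" \
  --use-private-load-balancer \
  --aws-options "${aws_options}"
```

Checking the chosen types against the output of describe-allowed-instance-types before building the options is left out of this sketch.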
CLI documentation link.
Enable CloudWatch Logs
If you use Amazon CloudWatch, you can enable CloudWatch logs when you activate an environment from CDW or edit environment details. Add required permissions to your IAM policy, and then access CloudWatch logs from AWS.
Nodes dashboards added to Grafana
New nodes dashboards show metrics for the AWS/Azure Virtual Machines that host the Kubernetes cluster.
Cloudera Data Warehouse audit events
CreateEnvironment | CloneDbCatalog | UpgradeHiveVirtualWarehouse |
DeleteEnvironment | UpgradeDbCatalog | CreateImpalaVirtualWarehouse |
UpdateEnvironment | CreateHiveVirtualWarehouse | DeleteImpalaVirtualWarehouse |
UpgradeEnvironment | DeleteHiveVirtualWarehouse | StartImpalaVirtualWarehouse |
CreateDbCatalog | StartHiveVirtualWarehouse | StopImpalaVirtualWarehouse |
DeleteDbCatalog | StopHiveVirtualWarehouse | UpdateImpalaVirtualWarehouse |
StartDbCatalog | UpdateHiveVirtualWarehouse | CloneImpalaVirtualWarehouse |
StopDbCatalog | CloneHiveVirtualWarehouse | UpgradeImpalaVirtualWarehouse |
UpdateDbCatalog |
For more information about experimental CLI commands for Cloudera Data Warehouse, go to Version Mapping. Click the CDP CLI Reference link for your CDW version. Scroll to Available Commands, and click dw.
Data Analytics Studio (DAS) is deprecated
DAS has been deprecated and is disabled by default. It will be removed in future releases.
Cloudera encourages you to use Hue to run Hive LLAP workloads. If you need to use DAS, you can enable it by setting the das.event-pipeline.enabled property to true in the Database Catalog configurations. For more information, see Enabling Data Analytics Studio in CDW Public Cloud.
Hue Query Processor scan frequency decreased to 5 minutes
The Hue Query Processor scans the event processor pipeline to retrieve the Hive query history and query details and display them on the Job Browser page. To optimize resource utilization, the scan frequency has been decreased from once every 2 milliseconds to once every 5 minutes. As a result, you may notice a delay before the query history and query details appear on the Job Browser page for queries that finish executing in less than 5 minutes. However, you can still view the query history from the Query history tab below the query editor.