May 3, 2021
This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud introduces these changes.
CDE Airflow integration to schedule ETL jobs for Hive Virtual Warehouses
In this release, CDE Airflow, based on Apache Airflow, which enables you to configure, schedule, and monitor workflows, can now be used to schedule ETL jobs that target Hive Virtual Warehouses running in CDW. For more information, see CDE Airflow ETL workloads in Hive Virtual Warehouses.
AWS environments updates
- Helm 2 to Helm 3 migration
- Helm is a package manager for the Kubernetes platform. CDW, which runs on Kubernetes, has migrated to Helm version 3 to take advantage of the improved security and upgrading capabilities offered by Helm 3. Any AWS environments that were created prior to this platform migration still run on Helm version 2. Although you can continue to use the Database Catalogs and Virtual Warehouses running on AWS environments that use Helm 2, there are significant limitations. Cloudera recommends that you migrate your AWS environments to Helm 3 as soon as you can. For more information about the Helm 2 to Helm 3 migration for AWS environments, see Helm 2 to Helm 3 migration.
- AWS EKS Kubernetes version upgrade
The CDW application uses Kubernetes (K8S) clusters to deploy and manage Hive, Impala, and Real-time Event Store Analytics (Druid) in the cloud. Kubernetes versions are updated every 3 months on average. When the version is updated, minor versions are deprecated. Amazon Elastic Kubernetes Service (EKS) is updating to Kubernetes version 1.16 and is ending support for version 1.15 on May 2, 2021. To avoid compatibility issues between CDW and AWS resources, the version of Kubernetes that supports your CDW clusters must be updated.
For more information about upgrading AWS environments for AWS EKS Kubernetes cluster updates, see Upgrade CDW for EKS Kubernetes version upgrade.
- Ability to set scratch space limit for spilling Impala queries on Azure environments
- Memory-intensive SQL operations need temporary disk space when Impala is close to exceeding its memory limit on a particular host. You can configure the scratch space between 300 GiB and 16 TiB per node while creating an Impala Virtual Warehouse in the Cloudera Data Warehouse (CDW) service. For details, see Setting scratch space limit for spilling Impala queries in Azure environments.