Preview features in Cloudera Data Warehouse on Public Cloud
This release of the Cloudera Data Warehouse (CDW) service on CDP Public Cloud introduces the following technical preview features.
Enabling the Hive Virtual Warehouse to spill to an EBS volume (Preview)
To prevent failures when query data exceeds memory capacity, you can spill data to Amazon gp3 Elastic Block Store (EBS) volumes. To do so, select the Additional LLAP Spill Disk (EBS) option when you create a Hive Virtual Warehouse. CDP automatically provisions the gp3 volume type for spilling Hive queries when you create or reactivate a Hive Virtual Warehouse on the latest CDW environment. Using the EBS volume incurs cost. For more information about EBS volumes, see the Amazon documentation.
Improvements to the shared Hue service (Preview)
- Name change from Query Editor to Shared Hue Service in the left navigation pane in the CDW UI.
- Shared Hue service supports upgrade and rebuild operations similar to other CDW components.
- Added a one-time option to copy saved queries and query history while creating a shared Hue service instance.
For more information, see Deploying shared Hue service in Data Warehouse Public Cloud (Preview).
Ability to log and manage Impala workloads (Preview)
CDW lets you enable logging of Impala queries on an existing Virtual Warehouse or while creating a new Impala Virtual Warehouse. Logging Impala queries in Cloudera Data Warehouse (CDW) gives you increased observability of the workloads running on Impala, which you can use to improve the performance of your Impala Virtual Warehouses.
This feature represents a significant enhancement to query profiling capabilities. You can have Impala archive crucial data from each query's profile into dedicated database tables known as the query history table and live query table. These tables are part of the sys database and are designed to store valuable information that can later be queried using any Impala client, providing a consolidated view of reports from previously executed queries.
For more information, see Impala workload management in Data Warehouse Public Cloud (Preview).
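As a rough illustration of querying these tables from an Impala client, the Python sketch below only builds the SQL text. The table names sys.impala_query_log (query history) and sys.impala_query_live (live queries) are assumptions taken from upstream Apache Impala and are not stated in this release note, so verify them in your environment. The helper name recent_queries_sql is hypothetical.

```python
# Assumed table names in the sys database; confirm them before use.
QUERY_HISTORY_TABLE = "sys.impala_query_log"
LIVE_QUERY_TABLE = "sys.impala_query_live"


def recent_queries_sql(table, limit=10):
    """Build a simple SELECT statement over one of the workload tables.

    Hypothetical helper: it validates the row limit and returns SQL text
    that can be pasted into or submitted from any Impala client.
    """
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {table} LIMIT {limit}"


history_sql = recent_queries_sql(QUERY_HISTORY_TABLE, limit=5)
live_sql = recent_queries_sql(LIVE_QUERY_TABLE)
print(history_sql)
print(live_sql)
```

The generated statements use SELECT * on purpose, because the column layout of these tables is not described here.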
Introducing AI-enhanced UDF development package in Impala (Preview)
- A built-in AI function, ai_generate_text, that enables direct access to Large Language Models (LLMs) from SQL queries by inputting a prompt and retrieving the response.
- This integration into existing workflows simplifies the process, reducing complexity and enhancing the user experience, allowing for quicker setup and deployment of UDFs in Impala.
For more information, see Advantages and use cases of Impala AI functions (Preview).
Support for Impala external JDBC data sources (Preview)
Apache Impala now supports reading from external JDBC data sources. An external JDBC table represents a table or a view in a remote RDBMS database or another Impala cluster. Using external JDBC tables, you can connect Impala to a database, such as MySQL, PostgreSQL, or another Impala cluster and read the data in the remote tables.
For more information, see Using Impala to query external JDBC data sources (Preview).
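To make the idea of an external JDBC table more concrete, here is a minimal Python sketch that renders the kind of DDL statement used to define one. The STORED BY JDBC clause and the property keys (database.type, jdbc.url, jdbc.driver, driver.url, dbcp.username, dbcp.password, table) follow upstream Apache Impala examples and should be treated as assumptions to check against the linked documentation. The helper name, table name, host, credentials, and driver path are all hypothetical.

```python
def render_jdbc_table_ddl(table_name, columns, properties):
    """Render a CREATE EXTERNAL TABLE ... STORED BY JDBC statement as text.

    Hypothetical helper. No quoting or escaping is applied, so pass only
    trusted values.
    """
    column_list = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    property_list = ",\n  ".join(
        f'"{key}"="{value}"' for key, value in properties.items()
    )
    return (
        f"CREATE EXTERNAL TABLE {table_name} (\n  {column_list}\n)\n"
        "STORED BY JDBC\n"
        f"TBLPROPERTIES (\n  {property_list}\n)"
    )


# Illustrative values only: a remote PostgreSQL table named "customers".
ddl = render_jdbc_table_ddl(
    "customers_jdbc",
    [("id", "INT"), ("name", "STRING")],
    {
        "database.type": "POSTGRES",
        "jdbc.url": "jdbc:postgresql://db.example.com:5432/sales",
        "jdbc.driver": "org.postgresql.Driver",
        "driver.url": "/path/to/postgresql-jdbc.jar",
        "dbcp.username": "impala_reader",
        "dbcp.password": "change-me",
        "table": "customers",
    },
)
print(ddl)
```

Once such a table exists, it can be read with ordinary SELECT statements, for example SELECT id, name FROM customers_jdbc.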
AWS environment permissions support for the EKS start/stop feature (Preview)
- rds:StartDBInstance
- rds:StopDBInstance
- rds:DescribeDBInstances
- autoscaling:DescribeAutoScalingGroups
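If you maintain a custom IAM policy for your CDW environment, the actions listed above could be granted with a statement along these lines. This is a minimal Python sketch that prints a policy document. The Sid, the wildcard resource, and the helper name are illustrative assumptions and are not taken from the Cloudera documentation, so scope the resource to match your own security requirements.

```python
import json

# Actions listed in this release note for the EKS start/stop feature.
EKS_START_STOP_ACTIONS = [
    "rds:StartDBInstance",
    "rds:StopDBInstance",
    "rds:DescribeDBInstances",
    "autoscaling:DescribeAutoScalingGroups",
]


def build_policy(actions, resource="*"):
    """Return a minimal IAM policy document allowing the given actions.

    Hypothetical helper: the Sid and the default wildcard resource are
    placeholders chosen for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CdwEksStartStopPreview",
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


policy = build_policy(EKS_START_STOP_ACTIONS)
print(json.dumps(policy, indent=2))
```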
Ability to select an instance type for Virtual Warehouses (Preview)
You can now specify the AWS or Azure instance type, such as r6id.4xlarge or Standard_E16_v3, to use for a Virtual Warehouse when you create it. You are no longer limited to the instance types that were specified while activating the environment in CDW. See Creating a Virtual Warehouse with ARM compute instance types using CDP CLI.
Impala support for reading Iceberg equality deletes for NiFi (Preview)
Cloudera supports row-level deletes, and starting with this release you can read equality deletes from Impala, with support added for Apache NiFi. See the Delete data feature.