What's new in Cloudera Data Warehouse on cloud
Review the new features introduced in this release of Cloudera Data Warehouse service on Cloudera on cloud.
- Azure AKS 1.33 upgrade
- Cloudera supports Azure Kubernetes Service (AKS) version 1.33. In 1.11.1 (released October 22, 2025), when you activate an Environment, Cloudera Data Warehouse automatically provisions AKS 1.33. To upgrade to AKS 1.33 from a lower version, you must back up and restore Cloudera Data Warehouse.
- Integration of Hive ARM architecture support
- Cloudera Data Warehouse now includes full operational support for Hive Virtual Warehouses on ARM architecture instance types. Hive workloads run natively on the ARM-based AWS and Azure instances currently supported by the platform, giving you immediate access to the associated cost and performance efficiencies. For more information, see Compute instance types.
What's new in Hive on Cloudera Data Warehouse on cloud
- Upgrading Calcite
- Hive has been upgraded to Calcite version 1.33. This upgrade introduces various query optimizations that can improve query performance.
- Hive on ARM Architecture
- Hive is now fully supported on ARM architecture instances, including AWS Graviton and Azure ARM. This enables you to run your Hive workloads on more cost-effective and energy-efficient hardware.
What's new in Impala on Cloudera Data Warehouse on cloud
- Enable global admission controller
- The admission controller is now a standalone service that maintains a consistent view of cluster resource usage and can make admission decisions without risking over-admission. This feature is enabled by default for new Impala Virtual Warehouses running in High Availability (HA) Active-Active mode. If needed, you can disable it through the Cloudera web interface, but this action is permanent. For more information, see Impala admissiond and Configuring admission control.
- Impala AES encryption and decryption support
- Impala now supports AES (Advanced Encryption Standard) encryption and decryption for better interoperability with other systems. AES-GCM is the default mode for strong security, but you can also use other modes, such as CTR, CFB, and ECB, for different needs.
This feature works with both 128-bit and 256-bit keys and includes integrity checks to keep your data safe and confidential. For more information, see AES encryption and decryption support.
Apache Jira: IMPALA-13039
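As an illustration, an encrypt/decrypt round trip might look like the following sketch. The function names and signatures shown (aes_encrypt, aes_decrypt, a string key argument) are assumptions based on the description above, not confirmed API; check the Impala built-in function reference for your release.

```sql
-- Hypothetical round trip: encrypt a value with a 128-bit (16-byte) key,
-- then decrypt it back. Function names and argument order are assumed.
SELECT aes_decrypt(aes_encrypt('secret data', '16bytekey1234567'),
                   '16bytekey1234567');
```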
- Query cancellation supported during analysis and planning
- This new feature allows you to cancel Impala queries even while they are in the Frontend stage, which includes analysis and planning. Previously, you could not cancel a query while it was waiting for operations like loading metadata from the Catalog Server. With this update, Impala now registers the planning process and can interrupt it to cancel the query.
Apache Jira: IMPALA-915
- Improved memory estimation and control for large queries
- Impala now uses a more realistic approach to memory estimation for large operations such as SORT, AGGREGATION, and HASH JOIN.
- Expose query cancellation status to UDF interface
- Impala now exposes the query cancellation status to the User-Defined Function (UDF) interface. This new feature allows complex or time-consuming UDFs to periodically check whether the query has been cancelled by the user. If cancellation is detected, the UDF can stop its work and fail fast.
- CDPD-76276: Auto-optimized parquet collection queries
- Impala now automatically boosts query performance for tables with collection data types by setting parquet_late_materialization_threshold to 1 when data can be skipped during filtering. This ensures maximum efficiency by reading only the data needed. For more information, see Late Materialization of Columns.
Apache Jira: IMPALA-3841
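The query option that this feature tunes automatically can also be set manually for a session. A minimal sketch, assuming the standard SET syntax for Impala query options; the table and column names are placeholders:

```sql
-- Manually set the late-materialization threshold for the current session;
-- the new feature applies the equivalent of this automatically for
-- collection-type columns when rows can be skipped during filtering.
SET parquet_late_materialization_threshold=1;
SELECT item FROM events t, t.tags WHERE t.id = 5;
```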
- Impala now supports Hive’s legacy timestamp conversion to ensure consistent interpretation of historical timestamps
- When reading Parquet or Avro files written by Hive using legacy timestamp conversion, Impala's timezone calculation for UTC timestamps could be incorrect, particularly for historical dates and timezones like Asia/Kuala_Lumpur or Singapore before 1982. This meant the timestamps displayed in Impala differed from those in Hive. Impala now applies Hive's legacy timestamp conversion when reading these files, so both engines display the same timestamps.
- CDPD-82251: Impala-shell now shows row count and elapsed time for most statements in HiveServer2 mode
- When running Impala queries, some commands over the HiveServer2 protocol (such as REFRESH or INVALIDATE) did not show the Fetched X row(s) in Ys output in Impala-shell, even though the Beeswax protocol showed it.
- CDPD-84069: Support for arbitrary encodings in text and sequence files
- Impala now supports reading from and writing to Text and Sequence files that use arbitrary character encodings, such as GBK, beyond the default UTF-8.
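How the encoding is declared may vary by release; the sketch below assumes the serialization.encoding serde property used by Hive's text serde, with a placeholder table name. Check the release documentation for the exact mechanism.

```sql
-- Hypothetical: mark an existing text table's files as GBK-encoded
-- so Impala transcodes on read and write instead of assuming UTF-8.
ALTER TABLE gbk_events SET SERDEPROPERTIES ('serialization.encoding'='GBK');
```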
- Expanded compression levels for ZSTD and ZLIB
- Impala has extended the configurable range of compression levels for the ZSTD and ZLIB (GZIP/DEFLATE) codecs. This enhancement allows for better optimization of the trade-off between compression ratio and write throughput.
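In Impala, the write codec is typically chosen with the COMPRESSION_CODEC query option, which accepts a codec:level suffix for level-aware codecs. The table name and level value below are placeholders; consult the query option reference for the level ranges supported in your release.

```sql
-- Write Parquet data with ZSTD at an explicit compression level
-- (codec:level syntax; the level shown is a placeholder).
SET COMPRESSION_CODEC=ZSTD:12;
INSERT INTO sales_archive SELECT * FROM sales;
```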
- IMPALA-12992: Impala now supports tables created with the Hive JDBC Storage handler
- Previously, Impala had difficulty reading tables created using the Hive JDBC Storage handler because table properties, such as JDBC driver and DBCP configurations, were defined differently from Impala-created tables. Impala now recognizes these properties and can query such tables.
- IMPALA-10349: Constant folding is now supported for non-ASCII and binary strings
- Previously, the query planner could not apply the optimization known as constant folding if the resulting value contained non-ASCII characters or was a non-UTF8 binary string. This meant that important query filters could not be simplified, which prevented key performance optimizations like predicate pushdown to the storage engine (for example, Iceberg or Parquet stat filtering). These expressions are now folded correctly, restoring those optimizations.
What's new in Iceberg on Cloudera Data Warehouse on cloud
- Integrate Iceberg scan metrics into Impala query profiles
- Iceberg scan metrics are now integrated into the Frontend section of Impala query profiles, providing deeper insight into query planning performance for Iceberg tables. The query profile now displays scan metrics from Iceberg's planFiles() API, including total planning time, counts of data/delete files and manifests, and the number of skipped files. Metrics are displayed on a per-table basis; if a query scans multiple Iceberg tables, a separate metrics section appears in the profile for each one.
For more information, see IMPALA-13628
- Delete orphan files for Iceberg tables
- You can now use the following syntax to remove orphan files for Iceberg tables:

```sql
-- Remove orphan files older than '2022-01-04 10:00:00'.
ALTER TABLE ice_tbl EXECUTE remove_orphan_files('2022-01-04 10:00:00');
-- Remove orphan files older than 5 days from now.
ALTER TABLE ice_tbl EXECUTE remove_orphan_files(now() - interval 5 days);
```

This feature removes all files from a table's data directory that are not linked from metadata files and that are older than the value of the older_than parameter. Deleting orphan files from time to time is recommended to keep the size of a table's data directory under control. For more information, see IMPALA-14492.
- Allow forced predicate pushdown to Iceberg
- Since IMPALA-11591, Impala has optimized query planning by avoiding predicate pushdown to
Iceberg unless it is strictly necessary. While this default behavior makes planning faster, it
can miss opportunities to prune files early based on Iceberg's file-level statistics.
- A new table property, impala.iceberg.push_down_hint, is introduced, which allows you to force predicate pushdown for specific columns. The property accepts a comma-separated list of column names, for example, 'col_a, col_b'. If a query contains a predicate on any column listed in this property, Impala pushes that predicate down to Iceberg for evaluation during the planning phase. For more information, see IMPALA-14123.
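Based on the property described above, enabling forced pushdown for two columns might look like the following; the table and column names are placeholders:

```sql
-- Force predicate pushdown to Iceberg for col_a and col_b,
-- so file-level statistics can prune files during planning.
ALTER TABLE ice_tbl
SET TBLPROPERTIES ('impala.iceberg.push_down_hint'='col_a, col_b');
```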
- UPDATE operations now skip rows that already have the desired value
- The UPDATE statement for Iceberg and Kudu tables is optimized to reduce unnecessary writes. Previously, an UPDATE operation would modify all rows matching the WHERE clause, even if those rows already contained the new value. For Iceberg tables, this resulted in writing unnecessary new data and delete records. With this enhancement, Impala automatically adds an extra predicate to the UPDATE statement to exclude rows that already match the target value. For more information, see IMPALA-12588.
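Conceptually, the optimization behaves as if an extra inequality predicate were added to the statement. A sketch with placeholder table and column names; the exact internal rewrite is an assumption based on the description above:

```sql
-- What you write:
UPDATE ice_tbl SET status = 'closed' WHERE created < '2024-01-01';

-- What Impala effectively executes, skipping rows already set to 'closed':
UPDATE ice_tbl SET status = 'closed'
WHERE created < '2024-01-01' AND status IS DISTINCT FROM 'closed';
```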
