What's New in Apache Impala

Learn about the new features of Impala in Cloudera Runtime 7.2.7.

Impyla Support

This release adds support for using Impyla to connect to Impala and submit SQL queries. Impyla is a Python client wrapper around the HiveServer2 Thrift service. It connects to either Hive or Impala and implements the Python DB API 2.0.

See Configuring Impyla for Impala for more information.
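
Because Impyla implements the Python DB API 2.0, a query follows the usual connect, cursor, execute, fetch sequence. The following is a minimal sketch rather than a recommended setup. The host name is a placeholder, port 21050 assumes the default HiveServer2 port for Impala, and run_query and main are hypothetical helpers introduced only for illustration. Authentication and TLS settings are omitted and depend on your cluster.

```python
def run_query(cursor, sql):
    """Execute one SQL statement on a DB API 2.0 cursor and return all rows."""
    cursor.execute(sql)
    return cursor.fetchall()


def main():
    # Imported here so that run_query stays usable without Impyla installed.
    from impala.dbapi import connect

    # Placeholder host; 21050 is assumed to be the Impala HiveServer2 port.
    conn = connect(host="impala-coordinator.example.com", port=21050)
    try:
        cursor = conn.cursor()
        for row in run_query(cursor, "SHOW DATABASES"):
            print(row)
    finally:
        conn.close()
```

Calling main() requires the impyla package and a reachable coordinator. Since run_query relies only on the DB API 2.0 cursor interface, it can be exercised with any other compliant driver.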

ROLE-related Statements in Impala

Impala does not currently support ROLE-related DDL statements for Ranger. However, if you are migrating or upgrading your workload from CDH to CDP, you can migrate the role-based authentication rules and manage them using the Ranger admin UI.

Impala's Built-in Mask Functions through Overloads

Hive UDFs implemented through the GenericUDF class support many more features. Although Impala users can call Hive UDFs, Impala does not yet support Hive UDFs based on the GenericUDF class, so you cannot use Hive's mask functions in Impala. However, Impala provides built-in mask functions that are implemented through overloads.

See Limitations on Mask Functions for more information.

Improvements in impala-shell "profile" Command

Previously, the impala-shell "profile" command returned only the profile of the most recent query attempt. This release adds support for returning both the original and the retried profiles of a retried query.

See Understanding Performance using Query Profile for more information.

The File Handle Cache Supports ABFS

Impala can now cache ABFS file handles for tables that store their data in ABFS storage.

See Scalability Considerations for File Handle Caching for more information.

Query Options for Local Time-Related Flags

Until this release, the use_local_tz_for_unix_timestamp_conversions setting handled discrepancies in INTERVAL operations, and the convert_legacy_hive_parquet_utc_timestamps setting handled time zone compatibility between Hive and Impala. Both settings were controlled through startup flags. In this release, both settings are also available as query options. The two flags now apply only on the Coordinator, where they set the defaults for the corresponding query options. However, if you set these options through the default_query_options configuration, those values take precedence over the old flags.

See TIMESTAMP data type and Query options for more information.
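
As an illustration of setting the two query options per session, the sketch below renders them as SET statements and runs them on a DB API 2.0 cursor, such as one obtained from Impyla. The helper names and the LOCAL_TIME_OPTIONS dictionary are hypothetical, and the value true is only an example. Choose values that match your data.

```python
# Example values only; pick the settings that fit your tables.
LOCAL_TIME_OPTIONS = {
    "use_local_tz_for_unix_timestamp_conversions": True,
    "convert_legacy_hive_parquet_utc_timestamps": True,
}


def render_set_statements(options):
    """Turn a dict of query options into a list of SET statements."""
    statements = []
    for name, value in options.items():
        # Render Python booleans as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        statements.append(f"SET {name}={value}")
    return statements


def apply_query_options(cursor, options):
    """Run each SET statement on an open DB API 2.0 cursor."""
    for statement in render_set_statements(options):
        cursor.execute(statement)
```

Options applied this way last for the session that the cursor belongs to, so apply them before running the queries that depend on them.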