What's New in Apache Impala

Learn about the new features of Impala in Cloudera Runtime 7.1.8.

UTF-8 mode support

Certain Impala STRING data types now support UTF-8 aware behavior to ensure consistent results for non-ASCII characters in the string value, in both Hive and Impala.

Querying arrays using JOIN and UNNEST

Impala now supports using the UNNEST function in the SELECT list or in the FROM clause in the SELECT statement to query arrays. When you use the UNNEST function, you can provide more than one array in the SELECT statement. If you use JOIN functions for querying arrays it yields a “joining unnest” functionality; however, the latter provides a “zipping unnest” functionality.

Zipping unnest on arrays from views

As part of this release, you can use the zipping unnest functionality on arrays from views as the source. Before this release, this zipping functionality worked for arrays only in Tables but did not support views as a source. For more information about using this zipping unnest functionality, see Zipping unnest on arrays from Views.

Support for Kerberos over HTTP

Currently, Impala-shell does not support Kerberos over HTTP. When we have Knox as a gateway, users cannot access Impala using impala-shell with Kerberos authentication. As part of this release, Impala-shell now supports connecting to a Kerberized Impala instance over HTTP in a cluster, to issue queries.

Multi-row transactions in Kudu

This release adds broader transactional support between Kudu and Impala. Since the support for multi-row transactions has been added in Kudu, you can now use an INSERT query in Impala to insert multiple rows into Kudu's table in the context of a single transaction.

Ranger audit behavior enhancements

Before this release, the Ranger authorization is called for each Impala object; that is, database, table and each column, and this generates a bulky audit for a larger number of columns. This release consolidates the log entries of several columns’ accesses into one entry in the same table, which saves space.

Impala hash functions

This release adds built-in functions to compute SHA-1 digest and SHA-2 family of hashing functions. This new function enables you to use queries where certain fields must apply a one-way hash function to prevent the values of those fields from being displayed.

Asynchronous model for some DDL statements

This release adds a new query option ENABLE_ASYNC_DDL_EXECUTION that you can configure to execute any request from an Impala client to the Impala server asynchronously in different threads without blocking the RPC. Using this asynchronous model, you can get a query handle and poll for state and results to avoid Impala clients hanging indefinitely.

Impala-shell with Python 3

Since Python 2.7 has reached the end of life, impala-shell can now be used with Python 3 by installing the latest release from PyPI at https://pypi.org/project/impala-shell/4.2.0a1/ .

READ support for FULL-ACID ORC Tables

Until this release, Impala in CDP supported INSERT-ONLY transactional tables allowing both READ and WRITE operations. The latest version of Impala in CDP 7.1.8 now also supports READ of FULL ACID ORC tables without modifying any configurations.