What's New in Apache Impala

This release of Impala contains the following changes and enhancements from previous releases.

Continue reading:

New Features in CDH 5.16.x
New Features in CDH 5.15.x
New Features in CDH 5.14.x
New Features in CDH 5.13.x
New Features in CDH 5.12.x
New Features in CDH 5.12 / Impala 2.9
New Features in CDH 5.11.x
New Features in Impala 2.8.x / CDH 5.10.x
New Features in Impala 2.7.x / CDH 5.9.x
New Features in Impala 2.6.x / CDH 5.8.x
New Features in Impala 2.5.x / CDH 5.7.x
New Features in Impala 2.4.x / CDH 5.6.x
New Features in Impala 2.3.x / CDH 5.5.x
New Features in Impala 2.2.x for CDH 5.4.3 and 5.4.4
New Features in Impala 2.2.x / CDH 5.4.x
New Features in Impala 2.1.8 / CDH 5.3.10
New Features in Impala 2.1.7 / CDH 5.3.9
New Features in Impala 2.1.6 / CDH 5.3.8
New Features in Impala 2.1.5 / CDH 5.3.6
New Features in Impala 2.1.0 / CDH 5.3.0
New Features in Impala 2.0.0 / CDH 5.2.0
New Features in Impala 1.4.0 / CDH 5.1.0
New Features in Impala 1.3.2 / CDH 5.0.4
New Features in Impala 1.3.1 / CDH 5.0.3
New Features in Impala 1.3.0 / CDH 5.0.0
New Features in Impala 1.2.4
New Features in Impala 1.2.3
New Features in Impala 1.2.2
New Features in Impala 1.2.1
New Features in Impala 1.2.0 (Beta)
New Features in Impala 1.1.1
New Features in Impala 1.1
New Features in Impala 1.0.1
New Features in Impala 1.0
New Features in Version 0.7 of the Impala Beta Release
New Features in Version 0.6 of the Impala Beta Release
New Features in Version 0.5 of the Impala Beta Release
New Features in Version 0.4 of the Impala Beta Release
New Features in Version 0.3 of the Impala Beta Release
New Features in Version 0.2 of the Impala Beta Release

New Features in CDH 5.16.x

The following are some of the most significant new features in this release.

Fine-Grained Privileges

Sentry and Impala introduced fine-grained privileges to provide object-level privileges to roles.

Fine-grained privileges include the REFRESH and CREATE privileges, which allow users to create databases and tables and to run commands that update metadata on Impala databases and tables. See the Impala Sentry documentation for the new privileges and the object scopes on which you can grant them.

The following new privileges were added:

The REFRESH privilege
The CREATE privilege
The SELECT and INSERT privileges on SERVER

If a role has SELECT or INSERT privilege on an object in Impala before upgrading to CDH 5.16.1, that role will automatically get the REFRESH privilege during the upgrade.
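
As a rough illustration of how the new privileges might be granted, the statements below use hypothetical names throughout: the roles `etl_role` and `analyst_role`, the database `sales_db`, the table `sales_db.orders`, and a Sentry server named `server1`. They are a sketch rather than a recommended policy; check the Impala Sentry documentation for the scopes each privilege supports.

```sql
-- All role, database, table, and server names here are hypothetical.

-- Let a role create tables in one database.
GRANT CREATE ON DATABASE sales_db TO ROLE etl_role;

-- Let a role run metadata-updating commands such as REFRESH on one table.
GRANT REFRESH ON TABLE sales_db.orders TO ROLE etl_role;

-- Server-scoped read access. server1 is assumed to match the
-- server name configured for Sentry in this cluster.
GRANT SELECT ON SERVER server1 TO ROLE analyst_role;
```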

Object Ownership

Object ownership designates an owner for a database, table, or view in Sentry. The owner of an object has the OWNER privilege, which is equivalent to the ALL privilege on the object. See the Object Ownership documentation for information about enabling object ownership.

If the object ownership feature is enabled, Sentry grants the OWNER privilege to the user who creates the object. Whether or not object ownership is enabled, HMS stores the object creator as the default object owner. Previously, HMS stored the Kerberos user as the object owner.

The following statements were added to Impala to support object ownership via Sentry:

ALTER DATABASE SET OWNER
ALTER TABLE SET OWNER
ALTER VIEW SET OWNER
SHOW GRANT USER
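
The statements listed above could be used roughly as follows. The user `alice`, the role `etl_role`, and the object names are hypothetical, and the sketch assumes object ownership is already enabled in Sentry.

```sql
-- All user, role, and object names here are hypothetical.

-- Transfer ownership of a database, a table, and a view.
ALTER DATABASE sales_db SET OWNER USER alice;
ALTER TABLE sales_db.orders SET OWNER ROLE etl_role;
ALTER VIEW sales_db.recent_orders SET OWNER USER alice;

-- List the privileges held by a user, including OWNER.
SHOW GRANT USER alice;
```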

Admission Control Enhancement

A new query option, MAX_MEM_ESTIMATE_FOR_ADMISSION, sets an upper limit on the memory estimate of a query. Use it as a workaround when an overestimate prevents a query from being admitted.
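
A minimal sketch of setting the option for a session in impala-shell follows. The `10g` cap and the table name are illustrative values, not recommendations; pick a limit based on the memory the query actually uses.

```sql
-- Cap the memory estimate that admission control considers for
-- queries in this session. 10g is an arbitrary example value.
SET MAX_MEM_ESTIMATE_FOR_ADMISSION=10g;

-- Hypothetical query whose planner estimate would otherwise be too high.
SELECT COUNT(*) FROM sales_db.orders;
```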

General Performance Improvements

A new query option, SHUFFLE_DISTINCT_EXPRS, controls the shuffling behavior when a query has both grouping and distinct expressions.
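
For example, the option might be enabled before running a query that combines a GROUP BY clause with a DISTINCT aggregate. The table and column names below are hypothetical, and whether the setting helps depends on the data, so compare plans with EXPLAIN before relying on it.

```sql
-- Enable the option for the current session.
SET SHUFFLE_DISTINCT_EXPRS=true;

-- Hypothetical query with both a grouping expression (region)
-- and a distinct expression (customer_id).
SELECT region, COUNT(DISTINCT customer_id)
FROM sales_db.orders
GROUP BY region;
```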

Metadata Performance Improvements

Incremental Stats

The following enhancements improve Impala stability by reducing the chance that catalogd and impalad crash from running out of memory when incremental stats are in use.

- Incremental stats are now compressed in memory in catalogd, which reduces the catalogd memory footprint.
- Incremental stats are fetched on demand from catalogd by impalad coordinators. This reduces the memory footprint of impalad coordinators and statestored, and also reduces the network traffic needed to broadcast metadata.

See Loading Incremental Statistics from Catalogd for details.

Automatic Invalidation of Metadata

Note: This feature is experimental and not recommended for use in production clusters.

To keep the size of metadata bounded and to reduce the chance of the catalogd cache running out of memory, this release introduces an automatic metadata invalidation feature with time-based and memory-based invalidation.

Automatic invalidation of metadata improves stability by lowering the chance of running out of memory, but it may hurt performance, because metadata for an invalidated table must be reloaded the next time the table is referenced. The feature is turned off by default.

See Startup Options for Automatic Invalidation of Metadata for details.