What's New in Apache Impala

Learn about the new features of Impala in Cloudera Runtime 7.2.11.

Added a new scale argument to the ndv() function

The optional scale argument must be an integer in the range 1 to 10. It maps to the precision used by the HyperLogLog (HLL) algorithm according to the following formula:

precision = scale + 8
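To make the mapping concrete, the following minimal Python sketch applies the formula to a few scale values. The function names are hypothetical and exist only for this illustration. The register count (2^precision) and the 1.04/sqrt(m) error estimate are textbook properties of HyperLogLog, assumed here for illustration rather than taken from Impala's implementation.

```python
import math


def hll_precision(scale: int) -> int:
    """Map an NDV scale to the HLL precision: precision = scale + 8."""
    if isinstance(scale, bool) or not isinstance(scale, int):
        raise TypeError("scale must be an integer")
    if not 1 <= scale <= 10:
        raise ValueError("scale must be in the range 1 to 10")
    return scale + 8


def hll_registers(scale: int) -> int:
    """Number of registers in a textbook HLL sketch: m = 2 ** precision."""
    return 2 ** hll_precision(scale)


def hll_relative_error(scale: int) -> float:
    """Textbook HLL standard error estimate: about 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(hll_registers(scale))


if __name__ == "__main__":
    for scale in (1, 5, 10):
        print(
            f"scale={scale:2d}  precision={hll_precision(scale):2d}  "
            f"registers={hll_registers(scale):6d}  "
            f"approx. error={hll_relative_error(scale):.4%}"
        )
```

Under these assumptions, each increment of scale doubles the number of registers, which trades more memory per NDV() computation for a smaller expected estimation error.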

See NDV Function for more information.

Added a new query option DELETE_STATS_IN_TRUNCATE

The DELETE_STATS_IN_TRUNCATE query option controls whether table statistics are deleted or retained when a table is truncated. The default value is 1 (true), which means that table statistics are deleted as part of the truncate operation.

See Query Options for more information.

Added a new query option KUDU_REPLICA_SELECTION

With the KUDU_REPLICA_SELECTION query option, you can direct scans to leader replicas only and skip non-leader replicas. When KUDU_REPLICA_SELECTION is set to LEADER_ONLY, the Impala planner generates a query plan that scans Kudu tables only at their leader replicas.

See Query Options for more information.

Added a new query option DEFAULT_NDV_SCALE

As a cluster admin, you can improve the precision of NDV() using the DEFAULT_NDV_SCALE query option. Adjusting its value changes the default precision setting for NDV(), so SQL authors do not have to rewrite their queries to adjust the precision.
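The difference between the two approaches can be sketched with a small Python helper that builds the statements a client would send. The helper names, the table sales, and the column customer_id are hypothetical. The sketch also assumes that DEFAULT_NDV_SCALE accepts the same 1 to 10 range as the scale argument and that the option is set with a SET statement, as in impala-shell.

```python
from typing import Optional


def check_scale(scale: int) -> int:
    """Validate a scale value against the documented 1..10 range."""
    if isinstance(scale, bool) or not isinstance(scale, int):
        raise TypeError("scale must be an integer")
    if not 1 <= scale <= 10:
        raise ValueError("scale must be in the range 1 to 10")
    return scale


def ndv_expr(column: str, scale: Optional[int] = None) -> str:
    """Build an NDV() call, with or without the optional scale argument."""
    if scale is None:
        return f"NDV({column})"
    return f"NDV({column}, {check_scale(scale)})"


def set_default_ndv_scale(scale: int) -> str:
    """Build the SET statement that changes the session default (assumed form)."""
    return f"SET DEFAULT_NDV_SCALE={check_scale(scale)};"


if __name__ == "__main__":
    # Option 1: rewrite each query to pass the scale explicitly.
    print(f"SELECT {ndv_expr('customer_id', 5)} FROM sales;")

    # Option 2: set the default once and leave existing queries unchanged.
    print(set_default_ndv_scale(5))
    print(f"SELECT {ndv_expr('customer_id')} FROM sales;")
```

With the second option, existing queries that call NDV() without a scale argument pick up the new default without being edited.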

See Query Options for more information.