What's New in Apache Hive

New features and functional updates for Hive are introduced in Cloudera Runtime 7.3.2, its service packs, and cumulative hotfixes.

Cloudera Runtime 7.3.2

Cloudera Runtime 7.3.2 introduces new features of Hive and includes all service packs and cumulative hotfixes from 7.3.1.100 through 7.3.1.706. For a comprehensive record of all updates in Cloudera Runtime 7.3.1.x, see New Features.

Hive user experience enhancements

Cloudera now provides several Hive user experience enhancements:

Improved error handling for the SHOW PARTITIONS command: The SHOW PARTITIONS command now returns a concise execution error instead of a full stack trace when you run it against a non-partitioned table. This change simplifies the output and helps you identify configuration issues more quickly. (HIVE-26926)
Enhanced error messages for the STORED BY clause: The error message displayed when you provide an invalid identifier or literal in the STORED BY clause is now more informative. This improvement provides better guidance if you mistakenly use STORED BY instead of STORED AS or provide an incorrect storage format. (HIVE-27957)
Enhanced task attempt log clarity: The task attempt log now includes explicit context for boolean values to improve readability. You can now clearly identify whether a task attempt is guaranteed or not within the log messages. (HIVE-28246)
Enabled vectorized mode support for custom UDFs: You can now use custom User-Defined Functions (UDFs) in vectorized execution mode. This enhancement ensures that custom functions, such as TIME_PARSE, integrate correctly with vectorized query plans to improve processing performance. (HIVE-28830)
Resolved NullPointerException in TezSessionPoolManager: A NullPointerException (NPE) that occurred in the TezSessionPoolManager when the resource plan was null is now resolved. This fix improves the stability of the Tez session pool when updating triggers from an active resource plan. (HIVE-29007)

Apache Jira: HIVE-26926, HIVE-27957, HIVE-28246, HIVE-28830, HIVE-29007

Dropping Hive Metastore statistics

You can now drop statistics associated with tables, partitions, and columns in the Hive Metastore (HMS). This feature is particularly useful during migration or replication, where large volumes of statistical data, generated for every table, partition, and column combination, can significantly increase copy times. By removing unnecessary statistics, you can improve operational efficiency and reduce data transfer overhead.
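To illustrate why statistics volume matters during migration, the following Python sketch gives a rough, back-of-the-envelope estimate. It assumes one column-statistics entry per partition and column pair, which is a simplification of how HMS stores statistics. The table names and sizes are hypothetical.

```python
# Rough estimate of column-statistics entries held for a set of tables.
# Assumption: one entry per (partition, column) pair; an unpartitioned
# table is counted as a single partition. All names and sizes are made up.

def estimate_stats_entries(tables):
    """Map each table name to its estimated number of statistics entries."""
    return {
        name: max(partitions, 1) * columns
        for name, (partitions, columns) in tables.items()
    }

tables = {
    "sales": (3650, 40),   # ten years of daily partitions, 40 columns
    "customers": (0, 25),  # unpartitioned, 25 columns
}

per_table = estimate_stats_entries(tables)
total = sum(per_table.values())
print(per_table)
print(total)
```

Under these assumptions, a single heavily partitioned table dominates the total, which is why dropping statistics before a copy can shorten it.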

Apache Jira: HIVE-28655

Enhanced Hive Metastore notification fetching with table filters

The Hive Metastore (HMS) notification fetch API now supports optional database and table name filters. This enhancement allows external engines and clients to fetch pending events for a specific table or list of tables, rather than scanning all notifications. Additionally, the HMS notification log now includes an index on the table name, so that filtered fetches avoid full scans of the log. This change enables more efficient metadata synchronization, particularly for clients that track only a few tables.
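The effect of the optional filters can be modeled with a short Python sketch. This is not the HMS API: the Event class, the fetch_events function, and the sample log are hypothetical and only illustrate the filtering semantics described above, assuming case-insensitive name matching.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for one notification log entry."""
    event_id: int
    db_name: str
    table_name: str
    event_type: str

def fetch_events(
    log: Iterable[Event],
    last_event_id: int,
    db_name: Optional[str] = None,
    table_names: Optional[List[str]] = None,
) -> List[Event]:
    """Return events newer than last_event_id that match the optional filters."""
    wanted = {t.lower() for t in table_names} if table_names else None
    result = []
    for event in log:
        if event.event_id <= last_event_id:
            continue  # already consumed by this client
        if db_name is not None and event.db_name.lower() != db_name.lower():
            continue  # different database
        if wanted is not None and event.table_name.lower() not in wanted:
            continue  # table not requested
        result.append(event)
    return result

log = [
    Event(1, "sales", "orders", "ALTER_TABLE"),
    Event(2, "sales", "customers", "ADD_PARTITION"),
    Event(3, "hr", "orders", "DROP_TABLE"),
    Event(4, "sales", "orders", "ADD_PARTITION"),
]

# Without filters the client receives every pending event.
all_pending = fetch_events(log, last_event_id=1)
# With filters it receives only the events for sales.orders.
filtered = fetch_events(log, last_event_id=1, db_name="sales", table_names=["orders"])
print([e.event_id for e in all_pending])
print([e.event_id for e in filtered])
```

In this model the filtering happens in a loop over the whole log. In HMS, the new index on the table name is what lets the equivalent lookup avoid reading every row.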

Apache Jira: HIVE-27499

New command to display HiveServer2 and Hive Metastore connections

You can now use the SHOW PROCESSLIST command to display active operations and connection details for HiveServer2 (HS2) and Hive Metastore (HMS) instances. This feature provides a view of current sessions, including user names, IP addresses, query IDs, and execution states, similar to the process list functionality in MySQL. This command helps you troubleshoot stuck queries, monitor service load, and identify inappropriate connections for termination.
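As a sketch of how process list data might be used for monitoring, the following Python example summarizes sessions by user and flags connections from outside an expected network. The rows, field names, and network range are hypothetical; the actual columns returned by SHOW PROCESSLIST may be named and formatted differently.

```python
from collections import Counter
import ipaddress

# Hypothetical rows, as if already fetched from a SHOW PROCESSLIST result.
rows = [
    {"user": "etl", "ip": "10.0.4.17", "query_id": "q-001", "state": "RUNNING"},
    {"user": "etl", "ip": "10.0.4.17", "query_id": "q-002", "state": "RUNNING"},
    {"user": "analyst", "ip": "10.0.9.3", "query_id": "q-003", "state": "PENDING"},
    {"user": "guest", "ip": "203.0.113.8", "query_id": "q-004", "state": "RUNNING"},
]

def load_by_user(rows):
    """Count active operations per user to spot who is driving the load."""
    return Counter(r["user"] for r in rows)

def outside_network(rows, cidr):
    """Return rows whose client address is not inside the given network."""
    network = ipaddress.ip_network(cidr)
    return [r for r in rows if ipaddress.ip_address(r["ip"]) not in network]

busiest = load_by_user(rows).most_common(1)
suspects = [r["query_id"] for r in outside_network(rows, "10.0.0.0/8")]
print(busiest)
print(suspects)
```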

Apache Jira: HIVE-27829

Upgrading Calcite: Hive has been upgraded to Calcite version 1.33. This upgrade introduces various query optimizations that can improve query performance.
Hive on ARM Architecture: Hive is now fully supported on ARM architecture instances, including AWS Graviton and Azure ARM. This enables you to run your Hive workloads on more cost-effective and energy-efficient hardware.
ZooKeeper SASL authentication for Hive clients: You can now configure Hive clients to authenticate with a ZooKeeper ensemble that enforces Simple Authentication and Security Layer (SASL). By using the new hive.zookeeper.client.sasl.enforce property, JDBC and HS2 clients can successfully establish sessions in Kerberized environments that require secure service discovery and locking. For more information, see Configuring Zookeeper SASL for Hive.
Impala now supports OAuth authentication: Cloudera now provides support for OAuth authentication using JWT bearer tokens. For more information, see Impala OAuth Authentication.