What's New in Apache Hive

This topic lists new Hive features in this release of Cloudera Runtime.

Auto-translation for Spark-Hive reads, no HWC session needed

Based on your configuration of spark.sql.extensions, Spark reads Hive ACID tables in the Hive metastore (HMS) either directly or through HWC. The HWC session is created transparently, so you can use existing Spark application code without modification. For details, see HWC configuration information.
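As a launch sketch, auto-translation hinges on setting spark.sql.extensions when the HWC assembly jar is on the classpath. The jar path, metastore URI, and extension class name below are illustrative and depend on your distribution:

```
spark-shell \
  --jars /path/to/hive-warehouse-connector-assembly.jar \
  --conf spark.sql.extensions=com.hortonworks.spark.sql.rule.Extensions \
  --conf spark.datasource.hive.warehouse.metastoreUri=thrift://metastore-host:9083
```

With the extension registered, plain spark.sql("SELECT ...") statements against Hive ACID tables are translated through HWC without any HWC-specific code in the application.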

Hive Warehouse Connector Spark direct reads

Spark Direct Reader is a Spark DataSource V1 implementation for fast reads of Hive ACID transactional tables from Spark. Low-latency analytical processing (LLAP) is not needed. Spark Direct Reader is intended for Extract-Transform-Load/Extract-Load-Transform (ETL/ELT) processes.
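A minimal read sketch, assuming the connector registers a "HiveAcid" data source as in the upstream reader (the format string, option name, and table name below are assumptions and may differ in your release):

```scala
// Read a Hive ACID transactional table directly from Spark, without LLAP.
// "default.acid_sales" is an illustrative table name.
val df = spark.read
  .format("HiveAcid")
  .option("table", "default.acid_sales")
  .load()
df.show()
```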

Authorization of external file writes from Spark

Ranger now authorizes read/write access to external files from Spark through the Hive metastore API (HMS API), in addition to authorizing read/write access to managed Hive tables from Spark through HiveServer (HS2).

Scheduled Queries, Rebuilding Materialized Views Automatically

You can schedule Hive queries to run on a recurring basis, monitor query progress, temporarily ignore a query schedule, and limit the number of queries running in parallel. For example, you can use scheduled queries to start compaction and to periodically rebuild materialized views. For details, see the Apache Hive Language Manual. In CDP, you need to enable scheduled queries.
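As a sketch of the Language Manual syntax, a scheduled query that periodically rebuilds a materialized view might look like the following (the schedule name, database, and view name are illustrative):

```sql
-- Allow scheduled queries to execute (in CDP, scheduled queries must
-- also be enabled in the service configuration).
SET hive.scheduled.queries.executor.enabled=true;

-- Rebuild a materialized view every 10 minutes.
CREATE SCHEDULED QUERY rebuild_sales_summary
EVERY 10 MINUTES
AS ALTER MATERIALIZED VIEW mydb.sales_summary REBUILD;
```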

SHOW DATABASES supports SQL LIKE patterns

In a SHOW DATABASES LIKE statement, you can use wildcards, and in this release of Hive, the pattern can match any number of characters or exactly one character. See examples.
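For example, assuming databases named db1 and db10 exist (the names are illustrative):

```sql
-- '%' matches any number of characters.
SHOW DATABASES LIKE 'db%';   -- matches db1 and db10
-- '_' matches exactly one character.
SHOW DATABASES LIKE 'db_';   -- matches db1 only
```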

Support for quantified comparison predicates ANY/SOME/ALL in subqueries

Apache Hive now supports ALL and SOME/ANY for uncorrelated subqueries. Prior to the introduction of this feature, quantified comparison predicates such as <>ANY and =ALL were disallowed in these constructs. See examples.
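A sketch of the new predicates (table and column names are illustrative):

```sql
-- <> ANY is true when part_val differs from at least one value
-- returned by the subquery.
SELECT * FROM parts
WHERE part_val <> ANY (SELECT val FROM part_history);

-- = ALL is true when amount equals every value
-- returned by the subquery.
SELECT * FROM orders
WHERE amount = ALL (SELECT max_amount FROM limits);
```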

New configuration parameter for Hive: hive.server2.tez.sessions.per.default.queue

This parameter sets the number of Tez sessions that HiveServer (HS2) launches for each YARN queue named in hive.server2.tez.default.queues.
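A configuration sketch for hive-site.xml (the value shown is illustrative):

```xml
<!-- Maintain four Tez sessions for each queue named in
     hive.server2.tez.default.queues. -->
<property>
  <name>hive.server2.tez.sessions.per.default.queue</name>
  <value>4</value>
</property>
```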