Cloudera Runtime Release Notes

What's New in Apache Hive

This topic lists new features for Apache Hive in this release of Cloudera Runtime.

Hive 3 tables are ACID-compliant (atomicity, consistency, isolation, and durability), which is critical to meeting the right-to-be-forgotten requirement of the General Data Protection Regulation (GDPR).
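As an illustration of why ACID support matters for erasure requests, the following minimal sketch issues a row-level DELETE against a transactional table. The table `customers`, the column `customer_id`, and the helper `erase_customer` are hypothetical names. The sketch assumes a Python DB-API 2.0 cursor that accepts pyformat parameters, which is the style PyHive declares; other drivers may differ.

```python
# Hypothetical table and column names, for illustration only.
ERASE_CUSTOMER = "DELETE FROM customers WHERE customer_id = %(customer_id)s"


def erase_customer(cursor, customer_id):
    """Delete one customer's rows from a transactional (ACID) Hive table.

    `cursor` is assumed to be a DB-API 2.0 cursor that accepts
    pyformat-style parameters. The driver binds the value, so it is
    not concatenated into the statement text.
    """
    cursor.execute(ERASE_CUSTOMER, {"customer_id": customer_id})
```

Row-level DELETE works only on transactional tables, so the same statement against a non-ACID table is expected to fail.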

Hive metastore (HMS) interoperates with multiple engines, such as Impala and Spark, which simplifies interoperation between engines and access to user data.

Hive processes transactions using low-latency analytical processing (LLAP) or the Hive-on-Tez execution engine.

You can use Hive to query data from Apache Spark applications without workarounds. The Hive Warehouse Connector supports reading and writing Hive tables from Spark.

Apache Ranger secures Hive data by default. To meet demands for concurrency improvements, ACID support for GDPR, security, and other features, Hive tightly controls the location of the warehouse on a file system or object store, as well as memory resources.

You can configure who uses query resources, how much can be used, and how fast Hive responds to resource requests. Workload management can improve parallel query execution, cluster sharing for queries running on Hive LLAP, and performance of non-LLAP queries.

Because multiple queries frequently need the same intermediate rollup or joined table, you can avoid costly, repeated computation of shared query portions by precomputing and caching intermediate tables in materialized views.
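The following sketch shows what precomputing a shared rollup might look like. The source table `sales`, its columns, the view name `sales_by_region_mv`, and both helper functions are hypothetical. The sketch assumes a Python DB-API 2.0 cursor connected to Hive and a transactional source table.

```python
# Hypothetical names throughout; the source table is assumed to be transactional.
CREATE_ROLLUP = (
    "CREATE MATERIALIZED VIEW sales_by_region_mv AS "
    "SELECT region, SUM(amount) AS total_amount "
    "FROM sales GROUP BY region"
)
REBUILD_ROLLUP = "ALTER MATERIALIZED VIEW sales_by_region_mv REBUILD"


def create_rollup(cursor):
    """Precompute the per-region rollup once and store the result."""
    cursor.execute(CREATE_ROLLUP)


def refresh_rollup(cursor):
    """Bring the stored rollup up to date after the source table changes."""
    cursor.execute(REBUILD_ROLLUP)
```

Queries that aggregate `sales` by region can then read the stored result instead of recomputing the aggregation each time.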

When launched, Hive creates two databases from JDBC data sources: information_schema and sys. All Metastore tables are mapped into your tablespace and available in sys. The information_schema data reveals the state of the system, similar to the data in the sys database. You can query information_schema using standard SQL queries, which are portable from one DBMS to another.
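As a sketch of such a portable query, the example below lists the columns of every table in one database by reading `information_schema.columns`. The helper `list_columns` is a hypothetical name. It assumes a Python DB-API 2.0 cursor with pyformat parameters and the SQL-standard column names (`table_schema`, `table_name`, `column_name`, `data_type`, `ordinal_position`).

```python
# Standard information_schema column names are assumed here.
LIST_COLUMNS = (
    "SELECT table_name, column_name, data_type "
    "FROM information_schema.columns "
    "WHERE table_schema = %(schema)s "
    "ORDER BY table_name, ordinal_position"
)


def list_columns(cursor, schema):
    """Return (table_name, column_name, data_type) rows for one database."""
    cursor.execute(LIST_COLUMNS, {"schema": schema})
    return cursor.fetchall()
```

Because the query uses only standard information_schema columns, the same statement text should need little or no change on another DBMS that implements the standard views.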
