What's New in Apache Hive
This topic lists new Hive features in this release of Cloudera Runtime.
- Scheduled Queries, Rebuilding Materialized Views Automatically Using SQL
You can schedule Hive queries to run on a recurring basis, monitor query progress, temporarily ignore a query schedule, and limit the number of queries running in parallel. For example, you can use scheduled queries to start compaction and to periodically rebuild materialized views. For details, see the Apache Hive Language Manual. In CDP, you need to enable scheduled queries.
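The following is a minimal sketch of a scheduled materialized view rebuild, based on the scheduled query syntax described in the Apache Hive Language Manual. The schedule name `mv_rebuild` and the materialized view name `mv_recent_sales` are hypothetical, and the sketch assumes that scheduled queries are already enabled in your cluster. Treat it as an illustration to check against the manual for your Hive version, not as a definitive procedure.

```sql
-- Hypothetical names: mv_rebuild (schedule) and mv_recent_sales (materialized view).
-- Assumes scheduled queries have already been enabled in the cluster.

-- Rebuild the materialized view every 10 minutes (Quartz cron expression).
CREATE SCHEDULED QUERY mv_rebuild
  CRON '0 */10 * * * ? *'
  DEFINED AS ALTER MATERIALIZED VIEW mv_recent_sales REBUILD;

-- Temporarily ignore the schedule without dropping it, then turn it back on.
ALTER SCHEDULED QUERY mv_rebuild DISABLE;
ALTER SCHEDULED QUERY mv_rebuild ENABLE;

-- Monitor the defined schedules and their executions.
SELECT * FROM information_schema.scheduled_queries;
SELECT * FROM information_schema.scheduled_executions;

-- Remove the schedule when it is no longer needed.
DROP SCHEDULED QUERY mv_rebuild;
```

Disabling a schedule rather than dropping it keeps its definition in place, so you can resume it later without re-creating it.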
- Auto-translation for Spark-Hive reads, no HWC session needed
Spark reads Hive ACID tables in HMS either directly or through HWC, based on your configuration of spark.sql.extensions. The HWC session is created transparently. You can use existing Spark application code without modification.
- Hive Warehouse Connector Spark direct reads
Spark Direct Reader is a Spark Datasource V1 implementation for reading Hive ACID (transactional) tables from Spark. Spark Direct Reader is intended for Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) processes.
- Authorization of external file writes from Spark
Ranger now authorizes read/write access to external files from Spark through the Hive metastore API (HMS API) in addition to read/write access to managed Hive tables from Spark through HiveServer (HS2).
- Specifying a top level directory for managed tables when creating a Hive database