Apache Hive overview

Creating a table

To improve usability and functionality, Hive 3 significantly changed table creation.

Hive has changed table creation in the following ways:
  • Creates ACID-compliant tables, which is the default in HDP
  • Supports simple writes and inserts
  • Writes to multiple partitions
  • Inserts multiple data updates in a single SELECT statement
  • Eliminates the need for bucketing
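
As a rough illustration of writing to multiple partitions and feeding several inserts from one SELECT source, the following HiveQL sketch may help. The table and column names (staging_sales, sales_by_region, big_sales) are hypothetical and not from this page, and reading the "single SELECT statement" item as Hive's multi-insert form is an assumption. The SET line may be unnecessary if your distribution already defaults to nonstrict dynamic partitioning.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE sales_by_region (id INT, amount DECIMAL(10,2))
PARTITIONED BY (region STRING);

CREATE TABLE big_sales (id INT, amount DECIMAL(10,2));

-- Allow every partition value to be determined dynamically
-- (may already be the default in your environment).
SET hive.exec.dynamic.partition.mode=nonstrict;

-- One scan of the source feeds two inserts. The first insert writes to
-- as many partitions as there are distinct region values.
FROM staging_sales s
INSERT INTO TABLE sales_by_region PARTITION (region)
  SELECT s.id, s.amount, s.region
INSERT INTO TABLE big_sales
  SELECT s.id, s.amount WHERE s.amount > 1000;
```

Note that neither target table declares buckets with CLUSTERED BY, in line with the last item in the list above.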

If you have an ETL pipeline that creates tables in Hive, the tables are now created as ACID tables. Hive tightly controls access to these tables and periodically performs compaction on them. The way you access managed Hive tables from Spark and other clients has changed. In CDP, access to external tables requires you to set up security access permissions.

Before Upgrade

In HDP 2.6.5, by default CREATE TABLE created a non-ACID table.

After Upgrade

By default CREATE TABLE creates a full, ACID transactional table in ORC format.
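
The difference can be sketched in HiveQL as follows. This is a minimal example under stated assumptions: the table names, columns, and the LOCATION path are hypothetical, and the exact defaults depend on how your cluster is configured.

```sql
-- After upgrade: a plain CREATE TABLE yields a managed, full ACID table in ORC.
CREATE TABLE customers (id INT, name STRING);

-- The same intent written out explicitly.
CREATE TABLE customers_explicit (id INT, name STRING)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- To keep a non-ACID table that other tools read directly,
-- declare it as external (hypothetical path).
CREATE EXTERNAL TABLE customers_ext (id INT, name STRING)
STORED AS ORC
LOCATION '/warehouse/tablespace/external/hive/customers_ext';

-- Inspect the table type and table parameters to confirm what was created.
DESCRIBE FORMATTED customers;
```

In the DESCRIBE FORMATTED output, the table type and the transactional entry under the table parameters indicate whether the table is managed and ACID.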

Action Required

To access Hive ACID tables from Spark, you connect to Hive using the Hive Warehouse Connector (HWC). To write ACID tables to Hive from Spark, you use the HWC and HWC API. Set up Ranger policies and HDFS ACLs for tables.