Changes to HDP Hive tables

You need to locate and use your Apache Hive 3 tables after moving to CDP. You also need to understand the changes that occur during the process.

Managed, ACID tables that are not owned by the hive user remain managed tables in CDP, but hive becomes the owner.

In CDP, the format of a Hive table is the same as it was in HDP. For example, native or non-native tables remain native or non-native, respectively.

In CDP, the location of managed tables or partitions do not change under any one of the following conditions:

  • The old HDP 2.x table or partition directory was not in its default location /apps/hive/warehouse before the upgrade.
  • The old table or partition is in a different file system than the new warehouse directory.
  • The old table or partition directory is in a different encryption zone than the new warehouse directory.

Otherwise, in CDP managed files reside in the Hive warehouse /warehouse/tablespace/managed/hive. The location of external files in CDP does not change. By default, Hive places any new external tables you create in /warehouse/tablespace/external/hive. In CDP, the hive.metastore.warehouse.dir property is set to this location, designating it the Hive warehouse location.

Changes to table references using dot notation

Upgrading to CDP includes the Hive-16907 bug fix, which rejects `db.table` in SQL queries. The dot (.) is not allowed in table names. To reference the database and table in a table name, both must be enclosed in backticks as follows: `db`.`table`.

Changes to ACID properties

Hive 3.x supports transactional and non-transactional tables. Transactional tables have atomic, consistent, isolation, and durable (ACID) properties. In Hive 2.x, the initial version of ACID transaction processing was ACID v1. In Hive 3.x, the mature version of ACID is ACID v2, which is the default table type in CDP.

Native and non-native storage formats

Storage formats are a factor in changes to table types. Hive 2.x and 3.x support the following native and non-native storage formats:

  • Native: Tables with built-in support in Hive, such as those in the following file formats:
    • Text
    • Sequence File
    • RC File
    • AVRO File
    • ORC File
    • Parquet File
  • Non-native: Tables that use a storage handler, such as the DruidStorageHandler or HBaseStorageHandler

Changes to HDP table types

The following table compares Hive table types and ACID operations from HDP 2.x to CDP. The ownership of the Hive table file is a factor in determining table types and ACID operations in CDP.
Table 1. HDP 2.x and CDP Table Type Comparison
Table Type ACID v1 Format Owner (user) of Hive Table File Table Type ACID v2
External No Native or non-native hive or non-hive External No
Managed Yes ORC hive or non-hive Managed, updatable Yes
Managed No ORC hive Managed, updatable Yes
non-hive External, with data delete No
Managed No Native (but non-ORC) hive Managed, insert only Yes
non-hive External, with data delete No
Managed No Non-native hive or non-hive External, with data delete No