Changes to HDP Hive tables
You need to locate and use your Apache Hive 3 tables after moving to CDP. You also need to understand the changes that occur during the process.
Managed, ACID tables that are not owned by the hive
user remain managed
tables in CDP, but hive
becomes the owner.
In CDP, the format of a Hive table is the same as it was in HDP. For example, native or non-native tables remain native or non-native, respectively.
In CDP, the location of managed tables or partitions do not change under any one of the following conditions:
- The old HDP 2.x table or partition directory was not in its default location
/apps/hive/warehouse
before the upgrade. - The old table or partition is in a different file system than the new warehouse directory.
- The old table or partition directory is in a different encryption zone than the new warehouse directory.
Otherwise, in CDP managed files reside in the Hive warehouse
/warehouse/tablespace/managed/hive
. The location of external files in CDP
does not change. By default, Hive places any new external tables you create in
/warehouse/tablespace/external/hive
. In CDP, the
hive.metastore.warehouse.dir
property is set to this location, designating
it the Hive warehouse location.
Changes to table references using dot notation
Upgrading to CDP includes the Hive-16907 bug fix, which rejects `db.table` in SQL queries.
The dot (.) is not allowed in table names. To reference the database and table in a table
name, both must be enclosed in backticks as follows: `db`.`table`
.
Changes to ACID properties
Hive 3.x supports transactional and non-transactional tables. Transactional tables have atomic, consistent, isolation, and durable (ACID) properties. In Hive 2.x, the initial version of ACID transaction processing was ACID v1. In Hive 3.x, the mature version of ACID is ACID v2, which is the default table type in CDP.
Native and non-native storage formats
Storage formats are a factor in changes to table types. Hive 2.x and 3.x support the following native and non-native storage formats:
- Native: Tables with built-in support in Hive, such as those in the following file
formats:
- Text
- Sequence File
- RC File
- AVRO File
- ORC File
- Parquet File
- Non-native: Tables that use a storage handler, such as the DruidStorageHandler or HBaseStorageHandler
Changes to HDP table types
HDP 2.x | CDP | ||||
---|---|---|---|---|---|
Table Type | ACID v1 | Format | Owner (user) of Hive Table File | Table Type | ACID v2 |
External | No | Native or non-native | hive or non-hive | External | No |
Managed | Yes | ORC | hive or non-hive | Managed, updatable | Yes |
Managed | No | ORC | hive | Managed, updatable | Yes |
non-hive | External, with data delete | No | |||
Managed | No | Native (but non-ORC) | hive | Managed, insert only | Yes |
non-hive | External, with data delete | No | |||
Managed | No | Non-native | hive or non-hive | External, with data delete | No |