Changes to HDP Hive tables
As a Data Scientist, Architect, Analyst, or other Hive user you need to locate and use your Apache Hive 3 tables after an upgrade. You also need to understand the changes that occur during the upgrade process.
Managed, ACID tables that are not owned by the hive user remain managed tables after the upgrade, but hive becomes the owner.
After the upgrade, the format of a Hive table is the same as before the upgrade. For example, native or non-native tables remain native or non-native, respectively.
After the upgrade, the location of a managed table or partition does not change if any one of the following conditions is true:
- The old table or partition directory was not in its default location, /apps/hive/warehouse, before the upgrade.
- The old table or partition is in a different file system than the new warehouse directory.
- The old table or partition directory is in a different encryption zone than the new warehouse directory.
Otherwise, the upgrade process from HDP to CDP moves managed files to the Hive warehouse, /warehouse/tablespace/managed/hive. The upgrade process carries the external files over to CDP with no change in location. By default, Hive places any new external tables you create in /warehouse/tablespace/external/hive. The upgrade process sets the hive.metastore.warehouse.dir property to this location, designating it the Hive warehouse location.
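The relocation rules above can be read as a small decision function. The following Python sketch is illustrative only and is not part of the upgrade tooling: the names `is_under` and `managed_data_moves` and the two boolean parameters are hypothetical, and the sketch assumes that "default location" means any path at or below /apps/hive/warehouse, written without a file system scheme.

```python
from posixpath import normpath

# Paths taken from the text above.
OLD_DEFAULT_WAREHOUSE = "/apps/hive/warehouse"
NEW_MANAGED_WAREHOUSE = "/warehouse/tablespace/managed/hive"


def is_under(path: str, root: str) -> bool:
    """Return True if path equals root or lies below it (assumed prefix rule)."""
    path, root = normpath(path), normpath(root)
    return path == root or path.startswith(root + "/")


def managed_data_moves(old_location: str,
                       same_file_system: bool,
                       same_encryption_zone: bool) -> bool:
    """Hypothetical helper: True if the upgrade would relocate managed data.

    Any one of the three conditions listed above is enough to keep the
    data where it is.
    """
    if not is_under(old_location, OLD_DEFAULT_WAREHOUSE):
        return False  # the table was not in the old default location
    if not same_file_system:
        return False  # different file system than the new warehouse
    if not same_encryption_zone:
        return False  # different encryption zone than the new warehouse
    return True       # otherwise the files move to NEW_MANAGED_WAREHOUSE
```

For example, `managed_data_moves("/apps/hive/warehouse/sales.db/orders", True, True)` returns `True`, whereas a table stored under a custom directory such as `/data/custom/orders` stays in place. Both paths are made-up examples.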
Changes to table references using dot notation
Upgrading to CDP includes the HIVE-16907 bug fix, which rejects `db.table` in SQL queries. The dot (.) is not allowed in table names. To reference the database and table in a table name, enclose each one in backticks separately, as follows: `db`.`table`.
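If your applications build queries as strings, one way to apply this rule consistently is to quote the database and table names separately before joining them. The Python sketch below assumes a hypothetical helper, `quote_table_reference`, and made-up names `sales` and `orders`. Rejecting identifiers that contain a backtick is a simplification of this sketch, not a rule stated by Hive.

```python
def quote_table_reference(database: str, table: str) -> str:
    """Return `database`.`table`, with each identifier quoted on its own."""
    for name in (database, table):
        # A dot is not allowed in table names; backticks are rejected here
        # only to keep the sketch simple (an assumption, not a Hive rule).
        if not name or "." in name or "`" in name:
            raise ValueError(f"unsupported identifier: {name!r}")
    return f"`{database}`.`{table}`"


query = f"SELECT * FROM {quote_table_reference('sales', 'orders')}"
print(query)  # SELECT * FROM `sales`.`orders`
```

Passing the combined string `sales.orders` as a single identifier raises `ValueError`, which mirrors the rejected `db.table` form.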
Changes to ACID properties
Hive 3.x in Cloudera Private Cloud Base supports transactional and non-transactional tables. Transactional tables have atomicity, consistency, isolation, and durability (ACID) properties. In Hive 2.x, the initial version of ACID transaction processing was ACID v1. In Hive 3.x, the mature version of ACID is ACID v2, which is the default table type in Cloudera Private Cloud Base.
Native and non-native storage formats
Storage formats are a factor in upgrade changes to table types. Hive 2.x and 3.x support the following native and non-native storage formats:
- Native: Tables with built-in support in Hive, such as those in the following file formats:
  - Text
  - Sequence File
  - RC File
  - AVRO File
  - ORC File
  - Parquet File
- Non-native: Tables that use a storage handler, such as the DruidStorageHandler or HBaseStorageHandler
CDP upgrade changes to HDP table types
HDP 2.x Table Type | HDP 2.x ACID v1 | Format | Owner (user) of Hive Table File | CDP Table Type | CDP ACID v2
---|---|---|---|---|---
External | No | Native or non-native | hive or non-hive | External | No
Managed | Yes | ORC | hive or non-hive | Managed, updatable | Yes
Managed | No | ORC | hive | Managed, updatable | Yes
Managed | No | ORC | non-hive | External, with data delete | No
Managed | No | Native (but non-ORC) | hive | Managed, insert only | Yes
Managed | No | Native (but non-ORC) | non-hive | External, with data delete | No
Managed | No | Non-native | hive or non-hive | External, with data delete | No
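The table above can also be expressed as a lookup function, which may help when you audit many tables before an upgrade. This Python sketch only restates the table. The names `CdpTable` and `cdp_table_type` and the string codes for table type and storage format are hypothetical, and the sketch rejects combinations that the table does not list, such as an ACID v1 table that is not ORC.

```python
from typing import NamedTuple


class CdpTable(NamedTuple):
    table_type: str
    acid_v2: bool


def cdp_table_type(hdp_type: str, acid_v1: bool,
                   storage: str, owner_is_hive: bool) -> CdpTable:
    """Map an HDP 2.x table to its CDP table type, following the table above.

    hdp_type: "external" or "managed"
    storage:  "orc", "native" (native but non-ORC), or "non-native"
    """
    if hdp_type not in ("external", "managed"):
        raise ValueError(f"unknown table type: {hdp_type!r}")
    if storage not in ("orc", "native", "non-native"):
        raise ValueError(f"unknown storage format: {storage!r}")

    if hdp_type == "external":
        return CdpTable("External", False)
    if acid_v1:
        # The table lists ACID v1 only for managed ORC tables.
        if storage != "orc":
            raise ValueError("combination not covered by the table")
        return CdpTable("Managed, updatable", True)
    # Managed, non-ACID tables from here on.
    if storage == "non-native" or not owner_is_hive:
        return CdpTable("External, with data delete", False)
    if storage == "orc":
        return CdpTable("Managed, updatable", True)
    return CdpTable("Managed, insert only", True)
```

For instance, `cdp_table_type("managed", False, "native", True)` returns `CdpTable("Managed, insert only", True)`, which corresponds to the row for a hive-owned, non-ACID managed table in a native, non-ORC format.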