Hive post-HDP-upgrade tasks

You need to perform several critical tasks after upgrading to CDP Private Cloud Base, such as checking table locations and correcting locations of files. Before beginning these tasks, you need a little information about Hive changes to ACID properties.

A brief description of storage formats is provided as background information for understanding how the upgrade process relocates tables.

Changes include the location of tables, the HDFS path to the Hive warehouse, file ownership, table types, and ACID-compliance. You need to understand the following changes that occur during the upgrade before performing the post-upgrade tasks.

Hive Changes to ACID Properties

Hive 2.x and 3.x have transactional and non-transactional tables. Transactional tables have atomic, consistent, isolation, and durable (ACID) properties. In Hive 2.x, the initial version of ACID transaction processing is ACID v1. In Hive 3.x, the mature version of ACID is ACID v2, which is the default table type in HDP CDP Private Cloud Base.

Native and Non-native Storage Formats in Hive

Storage formats are a factor in upgrade changes to table types. No change has occurred to storage formats in CDP Private Cloud Base, but storage formats determine table changes that occur during the upgrade. Hive 2.x and 3.x supports the following Hadoop native and non-native storage formats:

  • Native: Tables with built-in support in Hive, such as those in the following file formats:

  • Text

  • Sequence File

  • RC File

  • AVRO File

  • ORC File

  • Parquet File

  • Non-native: Tables that use a storage handler, such as the DruidStorageHandler or HBaseStorageHandler

After the upgrade, the format of a Hive table is the same as before the upgrade. For example, native or non-native tables remain native or non-native, respectively.