Running the Hive Upgrade Check tool
The Hive Strict Managed Migration (HSMM) process uses the public Hive Thrift API to materialize every table and determine whether it needs to be upgraded. That process is very time consuming. If you expedited the Hive upgrade by modifying the upgrade process to skip materializing every table in the metastore, you must identify the databases and tables that are subject to the upgrade process, and then either run HSMM on them or run the provided scripts.
- Check SERDE Definitions and Availability
- Handle Missing Table or Partition Locations
- Manage Table Location Mapping
- Make Tables SparkSQL Compatible
Obtain the Hive Upgrade Check tool.
Download the Hive Upgrade Check tool from its community-based GitHub location.
Follow the instructions in the GitHub README to run the tool.
The Hive Upgrade Check tool (v.220.127.116.11+) creates a YAML file (hsmm_whitelist.yaml) that identifies the databases and tables that require attention.
Follow the recommendations that the Hive Upgrade Check tool reports.
At a minimum, you must run the following processes, which are described in the GitHub README:
- process ID 1 Table / Partition Location Scan - Missing Directories
- process ID 3 Hive 3 Upgrade Checks - Managed Non-ACID to ACID Table Migrations
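If you track the upgrade in a runbook, a small helper can confirm that both minimum processes have been covered before you continue. The following is a minimal sketch only: it is not part of the Hive Upgrade Check tool, it does not invoke the tool, and the names REQUIRED_PROCESSES, missing_processes, and report are hypothetical. The process IDs and descriptions are taken from the list above.

```python
# Hypothetical runbook helper; not part of the Hive Upgrade Check tool.
# Process IDs and descriptions are copied from the list of minimum processes.
REQUIRED_PROCESSES = {
    1: "Table / Partition Location Scan - Missing Directories",
    3: "Hive 3 Upgrade Checks - Managed Non-ACID to ACID Table Migrations",
}


def missing_processes(completed_ids):
    """Return the required process IDs, sorted, that have not been run yet."""
    done = set(completed_ids)
    return sorted(pid for pid in REQUIRED_PROCESSES if pid not in done)


def report(completed_ids):
    """Return one readable line per required process that is still outstanding."""
    return [
        f"process ID {pid}: {REQUIRED_PROCESSES[pid]}"
        for pid in missing_processes(completed_ids)
    ]


if __name__ == "__main__":
    # Example: only process ID 1 has been recorded as run so far.
    for line in report([1]):
        print("Still required ->", line)
```

Record the IDs yourself as you complete each process; the helper only compares that record against the two required IDs.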