Running the Hive Upgrade Check tool

The Hive Strict Metastore Migration (HSMM) uses the public Hive Thrift API to materialize every table and determine whether it needs to be upgraded. That process is very time consuming. If you expedited the Hive upgrade by modifying the upgrade process to skip materializing every table in the metastore, you need to identify the databases and tables that are subject to the upgrade process, and then either run HSMM on them or run the scripts that the Hive Upgrade Check tool provides.

Use the Hive Upgrade Check community tool to help you identify the tables that need upgrading.

The Hive Upgrade Check process includes many reports that should be run well in advance of the upgrade and used to clean up the Hive Metastore. Allow adequate time and planning for this work.

This procedure assumes that you modified the Hive Strict Metastore Migration to skip processing Hive tables in your databases, and then completed the upgrade process to CDP.
  1. Obtain the Hive Upgrade Check tool.
    Download the Hive Upgrade Check tool from the community-based GitHub location.
  2. Follow the instructions in the GitHub README to run the tool.
    The Hive Upgrade Check tool (v.2.3.5.0+) creates a YAML file (hsmm_whitelist.yaml) that identifies the databases and tables that require attention.
  3. Run HSMM on the databases identified in hsmm_whitelist.yaml, or apply the scripts suggested by the tool to make the adjustments.
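
Before step 3, it can help to list which databases and tables appear in hsmm_whitelist.yaml. The following Python sketch is illustrative only. The layout of the file is not described here, so the sketch assumes a hypothetical layout: one top-level key, with each database as an indented key followed by a list of its tables. The function name, the sample key "databases", and the sample database and table names are all invented for the example. Check the file that the tool actually produces before relying on this, and prefer a real YAML library for anything beyond a quick look.

```python
# Minimal sketch: summarize an hsmm_whitelist.yaml-style file.
# ASSUMPTION: the layout below is hypothetical, not the tool's documented format.
# This is a reader for that one simple layout, not a general YAML parser.

SAMPLE = """\
databases:
  sales_db:
    - orders
    - customers
  hr_db:
    - employees
"""


def databases_needing_attention(whitelist_text):
    """Return {database: [tables]} for text in the assumed layout."""
    result = {}
    current_db = None
    for raw in whitelist_text.splitlines():
        line = raw.rstrip()
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped.startswith("- "):
            # A list item is a table that belongs to the current database.
            if current_db is not None:
                table = stripped[2:].strip().strip("\"'")
                result[current_db].append(table)
        elif stripped.endswith(":") and line.startswith(" "):
            # An indented key is treated as a database name.
            current_db = stripped[:-1].strip().strip("\"'")
            result[current_db] = []
        # An unindented key (the top-level key) is ignored.
    return result


if __name__ == "__main__":
    for db, tables in databases_needing_attention(SAMPLE).items():
        print(f"{db}: {len(tables)} table(s) flagged: {', '.join(tables)}")
```

To inspect a real file, read its contents with open("hsmm_whitelist.yaml").read() and pass the text to the function, adjusting the parsing if the actual layout differs from the assumed one.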