Administrative tools for Hive Metastore integration
Kudu provides the command line tools kudu hms list, kudu hms precheck, kudu hms check, and kudu hms fix to allow
administrators to find and fix metadata inconsistencies between the internal Kudu catalog and the
Hive Metastore catalog.
The kudu hms tools should be run from the command line as the Kudu
admin user. They require the full list of master addresses to be specified:
$ sudo -u kudu kudu hms check master-name-1:7051,master-name-2:7051,master-name-3:7051
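Because every kudu hms invocation takes the same comma-separated master list, it can help to build that list once and reuse it. The following is a minimal sketch, not part of Kudu: the function name join_masters is hypothetical, the host names are placeholders, and it assumes that all masters listen on the same RPC port (7051, as in the command above).

```shell
#!/bin/sh
# Hypothetical helper (not part of Kudu): join host names into the
# comma-separated master address list that the kudu hms tools expect.
# Assumption: every master listens on the same RPC port.
join_masters() {
  port="$1"
  shift
  list=""
  for host in "$@"; do
    # Append "host:port", adding a comma before every entry but the first.
    list="${list:+$list,}$host:$port"
  done
  printf '%s\n' "$list"
}

# Example usage (host names are placeholders):
#   MASTERS=$(join_masters 7051 master-name-1 master-name-2 master-name-3)
#   sudo -u kudu kudu hms check "$MASTERS"
```

Keeping the list in one variable avoids accidentally passing a partial set of masters to one of the tools.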
To see a full list of the options available with the kudu hms tool, use the --help flag.
When fine-grained authorization is enabled, the Kudu admin user, commonly "kudu", needs to have
access to all the Kudu tables to be able to run the kudu hms tools.
This can be done by configuring the user as a trusted user via the --trusted_user_acl master configuration.
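Before running the tools, it may be useful to confirm that the admin user is actually listed as trusted. The sketch below is not part of Kudu. It assumes that the master flags are kept in a flag file with one --flag=value entry per line, that --trusted_user_acl holds a comma-separated list of user names, and that the last entry is the effective one if the flag appears more than once. The function name is_trusted_user and any file path passed to it are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kudu): succeed if the given user is
# listed in the --trusted_user_acl entry of a master flag file.
# Assumptions: one "--flag=value" entry per line, a comma-separated user
# list, and the last entry taking effect when the flag is repeated.
is_trusted_user() {
  user="$1"
  flagfile="$2"
  # Extract the value of the last --trusted_user_acl entry, if any.
  acl=$(sed -n 's/^--trusted_user_acl=//p' "$flagfile" | tail -n 1)
  # Wrap both sides in commas so only whole user names match.
  case ",$acl," in
    *",$user,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (the path is a placeholder):
#   is_trusted_user kudu /path/to/master.gflagfile && echo "kudu is trusted"
```

This only inspects a file on disk; it does not query the running master, so the result reflects the configuration as written rather than as loaded.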
If the Hive Metastore is configured with fine-grained authorization using Apache Sentry, the
Kudu admin user needs to have read and write privileges on HMS table entries. Configure this in
the Hive Metastore using the sentry.metastore.service.users property.
kudu hms list
- The kudu hms list tool scans the Hive Metastore catalog and lists the HMS entries (including table name and type) for Kudu tables, as indicated by their HMS storage handler.
kudu hms check
- The kudu hms check tool scans the Kudu and Hive Metastore catalogs and validates that the two catalogs agree on what Kudu tables exist. The tool makes suggestions on how to fix any inconsistencies that are found. Typically, the suggestion is to run the kudu hms fix tool; however, certain inconsistencies must be fixed using Impala Shell.
kudu hms precheck
- The kudu hms precheck tool scans the Kudu catalog, checks whether there are multiple Kudu tables whose names differ only by case, and logs the conflicting table names.
kudu hms fix
- The kudu hms fix tool analyzes the Kudu and HMS catalogs and attempts to fix any automatically fixable issues, for instance, by creating a table entry in the HMS for each Kudu table that doesn't already have one. The --dryrun option shows the proposed fix instead of actually executing it. When no automatic fix is available, the tool suggests how to fix the issue manually.
kudu hms downgrade
- The kudu hms downgrade tool downgrades the metadata to the legacy format for Kudu and the Hive Metastore. Its use is discouraged unless necessary, since the legacy format may be deprecated in future releases.
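The --dryrun option described for kudu hms fix lends itself to a preview-then-apply workflow. The following is a minimal sketch, not part of Kudu: the function name hms_fix_with_preview and the KUDU variable are hypothetical, and the wrapper only chains the kudu hms fix invocations described above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Kudu): preview a fix before applying it.
# KUDU names the kudu binary; it can be overridden, for example in tests.
KUDU="${KUDU:-kudu}"

hms_fix_with_preview() {
  masters="$1"
  apply="${2:-no}"   # pass "yes" as the second argument to apply the fix

  # Always show the proposed changes first, without executing them.
  "$KUDU" hms fix "$masters" --dryrun || return 1

  # Apply the fix only when explicitly requested.
  if [ "$apply" = "yes" ]; then
    "$KUDU" hms fix "$masters"
  fi
}

# Example usage (host names are placeholders; run as the Kudu admin user):
#   hms_fix_with_preview master-name-1:7051,master-name-2:7051,master-name-3:7051
#   hms_fix_with_preview master-name-1:7051,master-name-2:7051,master-name-3:7051 yes
```

Defaulting to preview-only means that a mistyped invocation prints the proposed changes rather than modifying either catalog.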