Importing Hive Metadata using Command-Line (CLI) utility
You can use the Atlas-Hive import command-line utility to load Atlas with databases and tables present in Hive Metastore.
This utility supports importing metadata of a specific table, tables from a specific database or all databases and tables.
Consider a scenario where Hive has databases and tables that were created before the Hive hook for Atlas was enabled. In such a situation, the Atlas-Hive import utility can be used to bring Atlas in sync with Hive.
The utility can also be used when Atlas is unable to process specific messages because of errors in Kafka.
Supported Hive Metadata import options:
The Atlas-Hive utility supports the following options for importing Hive metadata:
-d <database regex> OR --database <database regex>
Specify the database name pattern which has to be synced with Atlas.
-t <table regex> OR --table <table regex>
Specify the table name pattern which has to be synced with Atlas. It must be used
-f <filename>
Imports all databases and tables in the specified file. The file must have one entry on each line where the entry is in the form of
For example, to import table t11 from database db1, table t21 from database db2, and all tables from database db3, the file content must be:
Do not specify any option to import all databases from Hive into Atlas.
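As an illustration of the -f option, the following sketch writes an import file for the three-entry scenario described above. It assumes the database:table entry format documented for the Apache Atlas import-hive.sh script, with a bare database name standing for all tables in that database. The file name hive-import-list.txt is hypothetical.

```shell
# Hypothetical file name; any path readable by the import script works.
import_file="${TMPDIR:-/tmp}/hive-import-list.txt"

# One entry per line. The database:table form is an assumption based on the
# Apache Atlas import-hive.sh file format; a bare database name is assumed
# to select every table in that database.
cat > "$import_file" <<'EOF'
db1:t11
db2:t21
db3
EOF

# Show what was written.
cat "$import_file"
```

The resulting path is what would be passed to the -f option.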
A sample usage of the script
Suppose the Atlas hook in Hive is not configured, so no “live” data is reflected in Atlas.
Later, you configure the Atlas hook, but the Hive databases already contain entities that need to be reflected in Atlas. In such cases, the Atlas-Hive import script reads the databases and tables from Hive metadata and creates the corresponding entities in Atlas.
An example of the Atlas-Hive script:
<atlas bundle>/hook-bin/import-hive.sh [-d <database regex> OR --database <database regex>] [-t <table regex> OR --table <table regex>]
<atlas bundle>/hook-bin/import-hive.sh [-f <filename>]
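The usage forms above might be invoked as in the sketch below. The ATLAS_HOME location, the run_import helper, the file name, and the database and table patterns are all hypothetical. The helper only prints the command when the script is not found, so the sketch can be tried without an Atlas installation.

```shell
# Hypothetical install location; substitute your own <atlas bundle> path.
ATLAS_HOME="${ATLAS_HOME:-/opt/atlas}"
IMPORT_SCRIPT="$ATLAS_HOME/hook-bin/import-hive.sh"

# Hypothetical helper: run the import script if it is present and executable,
# otherwise print the command that would have been run.
run_import() {
  if [ -x "$IMPORT_SCRIPT" ]; then
    "$IMPORT_SCRIPT" "$@"
  else
    echo "dry-run: $IMPORT_SCRIPT $*"
  fi
}

# Import the tables of every database matching a pattern (example pattern).
run_import -d 'sales.*'

# Import the matching tables within a matching database (example patterns).
run_import -d 'sales' -t 'orders.*'

# Import the databases and tables listed in a file (hypothetical file name).
run_import -f hive-import-list.txt

# Import all databases and tables: no options.
run_import
```

Quoting the patterns keeps the shell from expanding characters such as * before they reach the script.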
Limitations of using Atlas-Hive import script
The Atlas-Hive import utility has the following limitations:
- Cannot delete entities that have been dropped from Hive but still exist in Atlas.
- Cannot create lineages.