Importing Hive Metadata using Command-Line (CLI) utility
You can use the Atlas-Hive import command-line utility to load Atlas with databases and tables present in Hive Metastore.
This utility supports importing metadata of a specific table, tables from a specific database or all databases and tables.
Consider a scenario where Hive has databases and tables prior to enabling Hive hook for Atlas. In such a situation, the Atlas-Hive import utility can be employed to ensure Hive and Atlas are in sync.
Supported Hive Metadata import options:
Atlas-Hive utility supports various options which can be used while importing Hive Metadata:
-d <database regex> OR --database <database regex>
Specify the database name pattern which has to be synced with Atlas.
-t <table regex> OR --table <table regex>
Specify the table name pattern which has to be synced with Atlas. It must be used
along with -d
.
-f <filename>
Imports all databases and tables in the specified file. The file must have one entry
on each line where the entry is in the form of
<database_name>:<table_name>
.
For example:
A scenario where the user wants to import two tables named t11 from database db1 and t21 from db2 and all tables from db3. The file content must be:
- db1:t11
- db2:t21
- db3
No options
Does not specify any option to import all databases from Hive into Atlas.
A sample usage of the script Atlas hook in Hive is not configured and hence no “Live” data gets reflected into Atlas.
Later, you configure the Atlas hook but it is observed that the Hive database already contains entities that need to reflect in Atlas. In such cases, an Atlas-hive import script reads the database and tables from Hive Metadata and creates entities in Atlas.
An example of Atlas-hive script:
Usage 1:
<atlas bundle>/hook-bin/import-hive.sh
Usage 2:
<atlas bundle>/hook-bin/import-hive.sh [-d <database regex> OR
--database <database regex>] [-t <table regex> OR --table <table
regex>]
Usage 3:
<atlas bundle>/hook-bin/import-hive.sh [-f <filename>]
File Format:
database1:tbl1
database1:tbl2
database2:tbl1
Limitations of using Atlas-Hive import script
The Atlas-Hive import utility has the following limitations:
- Cannot delete entities which are dropped from Hive but do exist in Atlas.
- Cannot create lineages.