Taking a mandatory snapshot of HDP tables
Before upgrading or migrating, you must take a snapshot of your Hive tables. Also record how many tables you have before and after the upgrade or migration so that you can compare the counts.
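One way to keep track of the table counts is sketched below. It is only an illustration: the helper names count_tables and compare_counts are hypothetical, and the sketch assumes you can produce a list with one table name per line (for example, from SHOW TABLES).

```shell
#!/bin/sh
# Hypothetical helpers for comparing table counts before and after.

# Count the non-blank lines on stdin (one table name per line).
count_tables() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Compare two counts; print a summary and return non-zero on mismatch.
compare_counts() {
    if [ "$1" -eq "$2" ]; then
        echo "OK: $1 tables before and after"
    else
        echo "MISMATCH: $1 tables before, $2 tables after"
        return 1
    fi
}
```

For example, assuming Beeline is available and my_database is your database, you might save the count with something like `beeline -u jdbc:hive2:// --silent=true --showHeader=false --outputformat=tsv2 -e 'USE my_database; SHOW TABLES;' | count_tables > before.txt`, repeat the command into after.txt later, and then run `compare_counts "$(cat before.txt)" "$(cat after.txt)"`.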
In Ambari, go to Services/Hive/Configs, and check the value of hive.metastore.warehouse.dir to determine the location of the Hive warehouse. The commands in this procedure use /apps/hive/warehouse.
On any node in the cluster, as the HDFS superuser, enable snapshots.
$ sudo su - hdfs
$ hdfs dfsadmin -allowSnapshot /apps/hive/warehouse
Allowing snapshot on /apps/hive/warehouse succeeded
Create a snapshot of the Hive warehouse.
$ hdfs dfs -createSnapshot /apps/hive/warehouse
Output includes the name and location of the snapshot.
Created snapshot /apps/hive/warehouse/.snapshot/s20181204-164645.898
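The two commands above can be wrapped in a small function, sketched here under some assumptions. The function name snapshot_dir, the HDFS_CMD override, and the default snapshot name pre-upgrade are hypothetical and only for illustration. The optional snapshot name argument to -createSnapshot and listing the .snapshot directory are standard HDFS features.

```shell
#!/bin/sh
# HDFS_CMD is a hypothetical override so the sketch can be dry-run
# without a cluster; by default it calls the real hdfs command.
HDFS_CMD="${HDFS_CMD:-hdfs}"

# Enable snapshots on a directory, create a named snapshot, and list
# the snapshots so you can confirm the result.
# Run as the HDFS superuser, because -allowSnapshot requires it.
snapshot_dir() {
    dir="$1"
    name="${2:-pre-upgrade}"   # hypothetical default snapshot name
    "$HDFS_CMD" dfsadmin -allowSnapshot "$dir" || return 1
    "$HDFS_CMD" dfs -createSnapshot "$dir" "$name" || return 1
    "$HDFS_CMD" dfs -ls "$dir/.snapshot"
}
```

For example, `snapshot_dir /apps/hive/warehouse` would enable snapshots on the warehouse directory and create a snapshot named pre-upgrade. If you omit the name in the underlying command, HDFS generates a timestamped name like the one shown above.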
Start Hive as a user who has SELECT privileges on the tables.
$ beeline
beeline> !connect jdbc:hive2://
Enter username for jdbc:hive2://: hive
Enter password for jdbc:hive2://: *********
Connected to: Apache Hive (version 1.2.1000.2.6.5.0-292)
Driver: Hive JDBC (version 1.2.1000.2.6.5.0-292)
Identify all tables outside
hive> USE my_database;
hive> SHOW TABLES;
Determine the location of each table using the DESCRIBE command. For example:
hive> DESCRIBE FORMATTED my_table partition (dt='20181130');
Create a snapshot of the directory shown in the location section of the output.
Repeat steps 5-7 for each database and its tables outside
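The per-table steps can be sketched as a small shell helper. This is a sketch under stated assumptions, not a definitive implementation. It assumes that "outside" refers to the Hive warehouse directory used earlier (/apps/hive/warehouse), and that the DESCRIBE FORMATTED output contains a row labeled Location:. All function names, and the HDFS_CMD and WAREHOUSE variables, are hypothetical.

```shell
#!/bin/sh
# Hypothetical overrides so the sketch can be dry-run without a cluster.
HDFS_CMD="${HDFS_CMD:-hdfs}"
WAREHOUSE="${WAREHOUSE:-/apps/hive/warehouse}"   # assumed warehouse path

# Print the value of the "Location:" row from DESCRIBE FORMATTED output
# on stdin. Handles both the Beeline table layout and plain CLI output.
extract_location() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "Location:") {
                for (j = i + 1; j <= NF; j++) {
                    if ($j != "|") { print $j; exit }
                }
            }
        }
    }'
}

# Strip the scheme and authority (for example hdfs://host:8020) from a URI.
path_of() {
    printf '%s\n' "$1" | sed 's#^[A-Za-z][A-Za-z0-9+.-]*://[^/]*##'
}

# Succeed when the location is NOT under the warehouse directory.
is_outside_warehouse() {
    p=$(path_of "$1")
    case "$p" in
        "$2"|"$2"/*) return 1 ;;
        *) return 0 ;;
    esac
}

# Read DESCRIBE FORMATTED output on stdin and snapshot the table
# directory only when it lies outside the warehouse.
snapshot_if_outside() {
    loc=$(extract_location)
    [ -n "$loc" ] || return 1
    if is_outside_warehouse "$loc" "$WAREHOUSE"; then
        dir=$(path_of "$loc")
        "$HDFS_CMD" dfsadmin -allowSnapshot "$dir" &&
        "$HDFS_CMD" dfs -createSnapshot "$dir"
    fi
}
```

For example, you might pipe the output for one table into the helper with something like `beeline -u jdbc:hive2:// --silent=true -e 'USE my_database; DESCRIBE FORMATTED my_table;' | snapshot_if_outside`, run as a user who is allowed to enable snapshots. Tables under the warehouse directory are skipped because the warehouse snapshot already covers them.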