Hive/Impala replication with snapshots

If you use Hive replication, Cloudera recommends that you make the Hive warehouse directory snapshottable.

The Hive warehouse directory is located in HDFS at the path specified by the hive.metastore.warehouse.dir property. The default location is /user/hive/warehouse.

To locate the Hive warehouse directory, perform the following steps:
  1. In Cloudera Manager, go to the Hive service > Configuration tab.
  2. Search for the hive.metastore.warehouse.dir property to view the location of the directory.
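On a host where the Hive client configuration is deployed, the same property can usually also be read from hive-site.xml. The following is a minimal sketch, assuming the file uses the standard Hadoop configuration/property XML layout. The function name warehouse_dir_from_xml and the sample XML are illustrative assumptions, not part of any Cloudera tooling.

```python
import xml.etree.ElementTree as ET

DEFAULT_WAREHOUSE_DIR = "/user/hive/warehouse"  # default location stated above


def warehouse_dir_from_xml(xml_text, default=DEFAULT_WAREHOUSE_DIR):
    """Return hive.metastore.warehouse.dir from hive-site.xml content.

    Falls back to `default` when the property is absent or empty.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hive.metastore.warehouse.dir":
            value = prop.findtext("value", "").strip()
            return value or default
    return default


# Illustrative configuration content (hypothetical, not from a real cluster).
SAMPLE = """
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
"""

print(warehouse_dir_from_xml(SAMPLE))
```

To use it against a real deployment, pass the content of the deployed hive-site.xml file instead of SAMPLE. The file location varies by installation, so it is not hard-coded here.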

After you locate the Hive warehouse directory, enable snapshots for it.
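One way to enable snapshots from the command line is the HDFS CLI command hdfs dfsadmin -allowSnapshot <path>, which requires HDFS superuser privileges. The sketch below wraps that command in Python. The helper names allow_snapshot_command and enable_snapshots are hypothetical, and the path is the default warehouse location, which may differ on your cluster.

```python
import shlex
import subprocess


def allow_snapshot_command(directory):
    """Build the HDFS CLI command that makes `directory` snapshottable."""
    if not directory.startswith("/"):
        raise ValueError("expected an absolute HDFS path: %r" % directory)
    return ["hdfs", "dfsadmin", "-allowSnapshot", directory]


def enable_snapshots(directory, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    cmd = allow_snapshot_command(directory)
    print(shlex.join(cmd))
    if not dry_run:
        # Needs the hdfs client on PATH and HDFS superuser privileges.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: only prints the command for the default warehouse location.
enable_snapshots("/user/hive/warehouse")
```

The dry run is the default so that the command can be reviewed before it changes anything. Afterwards, hdfs lsSnapshottableDir lists the directories that the current user can snapshot, which is a quick way to confirm the result.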

If you use external tables in Hive, also make snapshottable any directories that host external tables stored outside the Hive warehouse directory.

Similarly, if you are using Impala and are replicating any Impala tables using Hive/Impala replication, ensure that the storage locations for the tables and associated databases are also snapshottable.
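To work out which additional directories need snapshots enabled, each table or database location can be compared with the Hive warehouse directory. The following is a minimal sketch, assuming the locations are plain absolute HDFS paths without a scheme or authority. The function names and example paths are made up for illustration. Locations nested under another listed location are dropped, because HDFS does not allow nested snapshottable directories and the parent directory already covers them.

```python
import posixpath


def is_under(path, parent):
    """True if `path` equals `parent` or lies beneath it."""
    path = posixpath.normpath(path)
    parent = posixpath.normpath(parent)
    return path == parent or path.startswith(parent.rstrip("/") + "/")


def dirs_needing_snapshots(warehouse_dir, locations):
    """Return the locations outside the warehouse directory, sorted,
    without entries already covered by another listed location."""
    outside = sorted({posixpath.normpath(p) for p in locations
                      if not is_under(p, warehouse_dir)})
    result = []
    for path in outside:
        # A parent sorts before its children, so it is already in result.
        if not any(is_under(path, kept) for kept in result):
            result.append(path)
    return result


# Hypothetical table and database locations.
locations = [
    "/user/hive/warehouse/sales.db/orders",  # inside the warehouse
    "/data/ext/clicks",                      # nested under /data/ext
    "/data/ext",
    "/data/impala/events",
]
print(dirs_needing_snapshots("/user/hive/warehouse", locations))
# prints ['/data/ext', '/data/impala/events']
```

Whether to snapshot a shared parent such as /data/ext or each table directory separately is a design choice. The sketch keeps the parent only when it is itself one of the listed locations.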