Setting up the HDP cluster
After replicating one or more databases, you must set up the HDP cluster before you can verify replication. Setup requires stopping jobs.
- Run at least one incremental replication for a database before attempting verification.
- In the CDP cluster, find the dump directory path using the following query:
select * from sys.replication_metrics where policy_name='<policy name>' order by scheduled_execution_id desc limit 1;
- Find and copy the external table paths listed in the _file_list_external file in the CDP dump directory path. You will use these paths to set up Ranger policies in Ambari.
- On the HDP source cluster, stop all ETL jobs.
- Add a Ranger Deny policy (no writes) for all users, including ‘hive’, on all databases:
Database *, Table *, Hive column *
You need only one policy to deny any writes to managed tables or any access to any external tables.
- Add a Ranger Deny policy for all external table paths.
In Resource Path, paste the external table paths you copied from the _file_list_external file in the CDP dump directory path.
You can add a single policy or multiple policies for all the external table paths in all the databases.
- Disable the StatsUpdaterThread background thread by configuring the hive.metastore.stats.auto.analyze property to none.
- Disable the PartitionManagementTask background thread by configuring the metastore.partition.management.database.pattern property to ^*.
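The Deny policy for external table paths described above can also be drafted as a JSON body before you enter it in the Ranger UI. The following is a minimal sketch, not the documented procedure: it assumes you have already extracted the external table paths as plain strings, and it follows the general shape of a Ranger policy object (resources plus denyPolicyItems). The function name, the service name, the policy name, and the example path are all hypothetical, and using the "public" group to cover all users, including hive, is an assumption you should confirm against your Ranger setup.

```python
import json


def build_hdfs_deny_policy(paths, service_name, policy_name="deny-external-table-paths"):
    """Build a Ranger-style policy body that denies access to the given HDFS paths.

    paths: iterable of external table path strings (assumed already extracted
           from the _file_list_external file; the file format is not parsed here).
    service_name: name of the HDFS service in Ranger (hypothetical value below).
    """
    # Drop blank entries and surrounding whitespace left over from copy and paste.
    cleaned = [p.strip() for p in paths if p.strip()]
    if not cleaned:
        raise ValueError("no external table paths supplied")

    return {
        "service": service_name,
        "name": policy_name,
        "isEnabled": True,
        "resources": {
            # One policy can carry several paths; split into multiple
            # policies if that is easier to manage.
            "path": {"values": cleaned, "isRecursive": True},
        },
        "denyPolicyItems": [
            {
                # Deny every HDFS access type on the external table paths.
                "accesses": [
                    {"type": "read", "isAllowed": True},
                    {"type": "write", "isAllowed": True},
                    {"type": "execute", "isAllowed": True},
                ],
                # Assumption: the "public" group matches all users.
                "groups": ["public"],
                "users": [],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical path and service name, for illustration only.
    example_paths = ["/warehouse/tablespace/external/hive/sales.db/orders"]
    print(json.dumps(build_hdfs_deny_policy(example_paths, "cluster1_hadoop"), indent=2))
```

The sketch only builds and prints the policy body. It does not call Ranger, so you can review the output and paste the paths into Resource Path yourself.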