Configuring the HDP cluster

You must configure the HDP cluster before you dump the workload data that you want to replicate to CDP.

Prepare a cron script that sets the policies for chained REPL DUMP commands and controls when they run, for example at a specific time.
  1. In new and existing databases, include the repl.source.for property in the source database dbproperties file.
    Set the repl.source.for property value using the following format:
    'repl.source.for' = '[***policy1 name***], [***policy2 name***], [***policy3 name***]'

    For example, to create a new source database for policies named 1, 2, and 3, configure the source database properties file as follows:

    'repl.source.for' = '1, 2, 3'

    For example, to configure an existing source database named testdb, run the following command:

    ALTER DATABASE testdb SET
    DBPROPERTIES ('repl.source.for' = '[***policy1 name***], [***policy2 name***], [***policy3 name***]');
  2. On the HDP cluster, configure the mandatory HDP cluster configuration properties listed in the next topic.
  3. Run the REPL DUMP command along with the mandatory policy-level configuration parameters using a cron script.
    Use the following command syntax:
    [***cron syntax for regular intervals***] beeline -u jdbc:hive2://[***source database***] hive
    -e "repl dump [***source database***] with [***mandatory policy-level configuration
    parameters separated by commas***]"
    For help building the cron expression, see the Cron Expression Generator & Explainer website.
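
The cron-driven REPL DUMP in step 3 could be wrapped in a small script along the following lines. This is a minimal sketch, not a supported implementation: the HiveServer2 URL, the script path, the schedule, and the parameter names example.param.one and example.param.two are hypothetical placeholders, and the real mandatory policy-level parameters must be substituted. The script only prints the beeline command unless RUN_DUMP=1 is set, so it can be checked safely before it is scheduled.

```shell
#!/bin/sh
# Sketch of a wrapper script that cron could invoke to run REPL DUMP.
# Every value below is a hypothetical placeholder; replace it for your cluster.

# Assumed HiveServer2 JDBC URL of the source (HDP) cluster.
JDBC_URL="${JDBC_URL:-jdbc:hive2://source-host.example.com:10000}"

# Source database to dump (testdb matches the example in step 1).
SOURCE_DB="${SOURCE_DB:-testdb}"

# Comma-separated 'key'='value' pairs. These names are NOT real properties;
# substitute the mandatory policy-level configuration parameters.
DEFAULT_PARAMS="'example.param.one'='value1','example.param.two'='value2'"
POLICY_PARAMS="${POLICY_PARAMS:-$DEFAULT_PARAMS}"

# Build the REPL DUMP statement from a database name and a parameter list.
build_repl_dump() {
    printf "REPL DUMP %s WITH (%s)" "$1" "$2"
}

STATEMENT="$(build_repl_dump "$SOURCE_DB" "$POLICY_PARAMS")"

# Dry run by default: print the command instead of executing it.
if [ "${RUN_DUMP:-0}" = "1" ]; then
    beeline -u "$JDBC_URL" -e "$STATEMENT"
else
    echo "beeline -u $JDBC_URL -e \"$STATEMENT\""
fi

# Example crontab entry (daily at 02:00; the script path is hypothetical):
# 0 2 * * * RUN_DUMP=1 /opt/scripts/repl_dump.sh
```

Keeping the statement construction in its own function makes it easy to verify the generated REPL DUMP text before cron runs it against the source cluster.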