You need to configure the HDP cluster before you dump workload data that you want to replicate on CDP.
Prepare a cron script to set policies for chained REPL DUMP commands and to control execution, for example to run at a certain time.
-
In new and existing databases, include the repl.source.for property in the source database dbproperties file.
Set the repl.source.for property value using the following format:
'repl.source.for' = '[***policy1 name***], [***policy2 name***], [***policy3 name***]'
For example, to create a new source database for policies named 1, 2, and 3, configure the source database properties file as follows:
'repl.source.for' = '1, 2, 3'
For example, to configure an existing source database named testdb, run the following command:
ALTER DATABASE testdb SET DBPROPERTIES('repl.source.for'='[***policy1 name***], [***policy2 name***], [***policy3 name***]');
-
On the HDP cluster, configure the mandatory HDP cluster configuration properties listed in the next topic.
-
Run the REPL DUMP command along with the mandatory policy-level configuration parameters using a cron script.
Use the following command syntax:
[***cron syntax for regular intervals***] beeline -u jdbc:hive2://[***source database***] hive -e "repl dump [***source database***] with [***mandatory policy-level configuration parameters separated by comma***]"
To build the cron schedule expression, see the Cron Expression Generator & Explainer website.
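The steps above can be sketched as a small shell helper that a cron script might source. This is an illustrative sketch, not the documented script: the function names, the `SOURCE_DB` and `WITH_PARAMS` variables, and the script path in the crontab comment are all hypothetical, and the actual mandatory policy-level parameters are left as placeholders. The parentheses around the WITH clause follow Apache Hive's REPL DUMP syntax. The script only prints the statements; passing them to beeline with `-e` is left as a comment.

```shell
#!/bin/sh
# Sketch only: names and values below are hypothetical placeholders.

# Join policy names into the comma-separated value for 'repl.source.for'.
join_policies() {
    result=""
    for name in "$@"; do
        if [ -z "$result" ]; then
            result="$name"
        else
            result="$result, $name"
        fi
    done
    printf '%s\n' "$result"
}

# Build the ALTER DATABASE statement for an existing source database.
# Usage: alter_db_statement <database> <policy> [<policy> ...]
alter_db_statement() {
    db="$1"
    shift
    printf "ALTER DATABASE %s SET DBPROPERTIES('repl.source.for'='%s');\n" \
        "$db" "$(join_policies "$@")"
}

# Build the REPL DUMP statement.
# Usage: repl_dump_statement <database> <comma-separated 'key'='value' pairs>
repl_dump_statement() {
    printf "REPL DUMP %s WITH (%s);\n" "$1" "$2"
}

# Hypothetical inputs; replace the placeholders with the mandatory
# policy-level parameters listed in the next topic.
SOURCE_DB="testdb"
WITH_PARAMS="'[***parameter name***]'='[***value***]'"

# Dry run: print the statements instead of executing them.
alter_db_statement "$SOURCE_DB" 1 2 3
repl_dump_statement "$SOURCE_DB" "$WITH_PARAMS"

# In a real cron script, the statement would be passed to beeline, e.g.:
#   beeline -u "jdbc:hive2://[***source database***]" \
#       -e "$(repl_dump_statement "$SOURCE_DB" "$WITH_PARAMS")"
#
# Example crontab entry (hypothetical path) that runs the script daily at 02:00:
#   0 2 * * * /path/to/repl_dump.sh
```

Printing the statements first makes it easy to check the quoting of the property value and the WITH clause before scheduling the script.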