You must have a cluster registered with the DLM app to replicate data from
on-premises to Google Cloud. You must also register your cloud credentials. For more information,
see Register cloud credentials.
You must create a new replication policy to replicate data
from on-premises to a Google Cloud account. You can replicate on-premises data to Google
Cloud Storage using a single cluster. You must have the Infra Admin
or DLM Admin role to perform this set of tasks.
-
Select Policies and click Add Policy.
By default, HDFS is selected as the service on
the Create Replication Policy page.
-
Enter the replication policy name and description.
-
Click SELECT SOURCE and select the source type and source cluster
from the drop-down lists.
-
Provide the data replication folder path and click SELECT DESTINATION.
-
Select GCS as the destination type, and select the cloud
credential from the drop-down list.
-
Provide a folder path in the format bucket_name/path.
| Important |
|---|
| If the target dataset is not empty, the warning message "Target dataset directory /xxxx/xxx is not empty" appears. You can proceed by selecting the supressWarnings check box. Selecting the check box overwrites the target location, taking into account the conflict resolution between the HDFS location and the Hive external table base location directory. |
-
Click VALIDATE.
-
Once the validation is successful, click SCHEDULE.
-
Configure the job settings for the replication policy.
-
Click ADVANCED SETTINGS to set up the policy queue.
-
Click CREATE POLICY.
The data replication process is enabled.
View the job status on the Policies page, and verify that the job starts
and runs as expected.
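
The destination folder path in the steps above takes the form bucket_name/path, and a non-empty target triggers a warning. The following Python sketch shows one way to check such a value before entering it. It is not part of DLM: the names GcsTarget, split_gcs_target, and target_has_data are hypothetical, and the validation rules are assumptions for illustration only.

```python
from typing import Iterable, NamedTuple


class GcsTarget(NamedTuple):
    """Hypothetical holder for the two parts of a bucket_name/path value."""
    bucket: str
    path: str


def split_gcs_target(folder_path: str) -> GcsTarget:
    """Split a 'bucket_name/path' string into bucket and path.

    Assumption: surrounding whitespace and slashes are ignored, and both
    parts must be non-empty.
    """
    cleaned = folder_path.strip().strip("/")
    bucket, sep, path = cleaned.partition("/")
    if not bucket or not sep or not path:
        raise ValueError(f"expected bucket_name/path, got {folder_path!r}")
    return GcsTarget(bucket=bucket, path=path)


def target_has_data(object_names: Iterable[str], target: GcsTarget) -> bool:
    """Return True if any listed object name lies under the target path.

    object_names is a listing of the bucket that you obtain yourself;
    this sketch does not contact Google Cloud Storage.
    """
    prefix = target.path.rstrip("/") + "/"
    return any(name.startswith(prefix) for name in object_names)


if __name__ == "__main__":
    target = split_gcs_target("my-bucket/data/sales")
    print(target.bucket, target.path)
    print(target_has_data(["data/sales/part-0000"], target))
```

If target_has_data returns True for your listing, expect the non-empty target warning described in the Important note when you validate the policy.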