You must have a cluster registered with Data Lifecycle Manager (DLM) to act as the
destination for the data you replicate from Google Cloud. The cluster must have enough
storage to hold the replicated data. For more information, see Register cloud credentials.
You must create a new replication policy to replicate data from Google Cloud
Storage (GCS) to an on-premises cluster. You must have the Infra Admin or
DLM Admin role to perform this set of tasks.
-
Select Policies and click Add Policy.
By default, HDFS is selected as the service on the Create Replication Policy page.
-
Enter the replication policy name and description.
-
Click SELECT SOURCE.
-
Select GCS as the type and choose a Cloud Credential from the drop-down list,
then enter the GCS source path in the form container_name/path.
-
Click SELECT DESTINATION.
You must have one or more clusters in the DLM application.
-
Select the cluster type and the destination cluster from the drop-down lists.
-
Enter the destination path.
| Important |
|---|
| If the target dataset is not empty, the warning message "Target dataset directory /xxxx/xxx is not empty" appears. You can proceed by selecting the supressWarnings check box. Selecting the check box overwrites the target location, taking into account conflict resolution between the HDFS location and the Hive External Table base location directory. |
-
Click VALIDATE.
-
After validation succeeds, click SCHEDULE.
-
Configure the job settings for the replication policy.
-
Click ADVANCED SETTINGS to set up the policy queue.
-
Click CREATE POLICY.
The data replication process is enabled.
View the job status on the Policies page. Verify that the job starts and runs as
expected.
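
The source step above expects the GCS location in the form container_name/path. As a minimal sketch, the following Python helper checks a string against that form before you enter it in the policy. The names `GcsSource` and `parse_gcs_source` are hypothetical and are not part of DLM. Stripping leading and trailing slashes is also an assumption, because this page does not say how DLM treats them.

```python
from typing import NamedTuple


class GcsSource(NamedTuple):
    """Hypothetical holder for the two parts of a GCS source path."""
    container: str  # container (bucket) name
    path: str       # path inside the container, without a leading slash


def parse_gcs_source(source: str) -> GcsSource:
    """Split a 'container_name/path' string into container and path.

    Assumption: surrounding whitespace and slashes are ignored.
    Raises ValueError if either part is missing.
    """
    cleaned = source.strip().strip("/")
    container, sep, path = cleaned.partition("/")
    if not container or not sep or not path:
        raise ValueError(f"expected 'container_name/path', got {source!r}")
    return GcsSource(container=container, path=path)


if __name__ == "__main__":
    # Example value only; replace with your own container and path.
    print(parse_gcs_source("my-bucket/warehouse/sales"))
```

A value that contains only a container name, such as `my-bucket`, raises `ValueError`, which helps catch a missing path before the policy reaches validation.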