DLM Administration

Replication of on-premises Hive data to Google Cloud

You must register the Google Cloud Storage cloud account with the DLM App. For more information, see Register cloud credentials.
You can replicate on-premises data to Google Cloud with a single cluster. The metastore must be running in the cloud. HiveServer2 does not need to run in the cloud environment. You must have the Infra Admin or DLM Admin role to perform these tasks.
  1. Select Policies and click Add Policy. On the Create Replication Policy page, select HIVE as the service.
  2. Enter the replication policy name and description.
  3. Click SELECT SOURCE and choose Type, Source Cluster, and Select Database.
  4. Click SELECT DESTINATION and choose Type and Destination Cluster.
  5. Enter the Destination Database.
  6. Provide the Hive External Table Base Directory path: GCS://bucket_name/path
    The external table base directory path cannot be changed once the policy is created.
  7. Select Cloud Credential from the drop-down.
    Important
    If the target dataset is not empty, a warning message appears: "Target dataset directory /xxxx/xxx is not empty." To proceed, select the supressWarnings check box. Selecting the check box overwrites the target location, considering the conflict resolution between the HDFS location and the Hive External Table base directory location.
  8. Click VALIDATE.
  9. Once the validation is successful, click SCHEDULE.
  10. Configure the job settings for the replication policy.
  11. Click ADVANCED SETTINGS to set up the policy queue.
  12. Click CREATE POLICY.

    The data replication process is enabled.

    View the job status on the Policies page and verify that the job starts and runs as expected.
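
Because the Hive External Table Base Directory path cannot be changed once the policy is created, it can help to check the value before entering it in step 6. The following is a minimal Python sketch, not part of DLM: the `parse_base_dir` function and the `BaseDir` type are hypothetical helpers, and the pattern only mirrors the documented form GCS://bucket_name/path. The validation that DLM itself performs when you click VALIDATE may be stricter or different.

```python
import re
from typing import NamedTuple


class BaseDir(NamedTuple):
    """Hypothetical container for the two parts of the base directory path."""
    bucket: str
    path: str


# Assumption: the only accepted shape is the documented GCS://bucket_name/path,
# matched case-insensitively on the scheme. This is not DLM's own rule set.
_BASE_DIR_RE = re.compile(
    r"^gcs://(?P<bucket>[^/\s]+)/(?P<path>\S+)$",
    re.IGNORECASE,
)


def parse_base_dir(value: str) -> BaseDir:
    """Split a base directory string into bucket and path, or raise ValueError."""
    match = _BASE_DIR_RE.match(value.strip())
    if match is None:
        raise ValueError(f"expected GCS://bucket_name/path, got {value!r}")
    return BaseDir(match.group("bucket"), match.group("path"))
```

For example, `parse_base_dir("GCS://my-bucket/warehouse/external")` returns the bucket `my-bucket` and the path `warehouse/external`, while a value without a path component, such as `GCS://my-bucket`, raises `ValueError`. The bucket and path names here are placeholders.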