DLM Administration

Setting target cluster for cloud storage in Hive

Before you replicate Hive data from an on-premises cluster to any supported cloud storage, set up the target cluster for Hive cloud replication on cloud storage instances, with the Hive warehouse directory located on that cloud storage.

The target cluster is a Data Lake cluster with metadata services such as HMS, Ranger, Atlas, and the DLM engine.

For the specific cloud account that is used for data replication, you must set up the applicable path values for the Hive replication function and the Hive Metastore parameters.
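The warehouse path values for the three supported cloud storage types follow the URI patterns listed in the sections of this topic. As an illustration only, the following Python sketch assembles a hive.metastore.warehouse.dir value from its parts. The function name warehouse_dir and its parameters are hypothetical and are not part of DLM or Hive.

```python
def warehouse_dir(scheme, container, path, storage_account=None):
    """Build a hive.metastore.warehouse.dir value for a cloud target.

    scheme          -- "s3a", "wasb", or "gs"
    container       -- bucket name (S3, GCS) or container name (WASB)
    path            -- warehouse path inside the bucket or container
    storage_account -- storage account name, required for WASB only
    """
    # Drop leading/trailing slashes so the URI has exactly one separator.
    path = path.strip("/")

    if scheme in ("s3a", "gs"):
        return f"{scheme}://{container}/{path}"

    if scheme == "wasb":
        if not storage_account:
            raise ValueError("wasb requires a storage account name")
        return (
            f"wasb://{container}@{storage_account}"
            f".blob.core.windows.net/{path}"
        )

    raise ValueError(f"unsupported scheme: {scheme}")


# Example with placeholder names (not real resources):
print(warehouse_dir("s3a", "my-bucket", "/apps/hive/warehouse"))
```

The sketch only formats the string. It does not check that the bucket or container exists or that the cluster has credentials to reach it.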

Amazon S3 cloud storage

When you set up a target cluster on Amazon S3, use the following Hive Metastore configuration:

hive.metastore.warehouse.dir=s3a://<bucket_name>/<warehouse_path>

The target cluster must have additional Amazon S3 credential configurations to access Amazon S3 storage buckets. For more information, see Configuring Access to S3.

Microsoft WASB cloud storage

When you set up a target cluster on WASB, use the following Hive Metastore configuration:

hive.metastore.warehouse.dir=wasb://<container_name>@<storage_account_name>.blob.core.windows.net/<warehouse_path>

The target cluster must have additional WASB credential configurations to access WASB storage containers. For more information, see Configuring Access to WASB.

Google Cloud Storage

When you set up a target cluster on Google Cloud Storage, use the following Hive Metastore configuration:

hive.metastore.warehouse.dir=gs://<bucket_name>/<warehouse_path>

The target cluster must have additional Google Cloud Storage credential configurations to access Google Cloud Storage buckets.

Add and save the following configurations in the core-site.xml file:

fs.gs.auth.service.account.email=<email_id_of_gcs_service_account>

fs.gs.auth.service.account.private.key.id=<private_key_id_of_gcs_service_account>

fs.gs.auth.service.account.private.key=<private_key_of_gcs_service_account>

You can find the values for these configurations in the JSON file that you downloaded when you registered the Google Cloud Storage credentials with the DLM app.
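As a rough sketch of how the three values could be pulled from that JSON file, the following Python snippet reads a service account key and returns the matching core-site.xml properties. It assumes the file uses the standard Google service account key field names client_email, private_key_id, and private_key. The function name gcs_core_site_properties is hypothetical and is not part of DLM.

```python
import json

# Assumed mapping from each core-site.xml property to the field that
# holds its value in a standard Google service account key file.
GCS_PROPERTY_TO_JSON_FIELD = {
    "fs.gs.auth.service.account.email": "client_email",
    "fs.gs.auth.service.account.private.key.id": "private_key_id",
    "fs.gs.auth.service.account.private.key": "private_key",
}


def gcs_core_site_properties(key_json_text):
    """Return the three fs.gs.auth.* properties from key file contents."""
    key = json.loads(key_json_text)

    # Fail early if the key file does not have the expected fields.
    missing = [
        field
        for field in GCS_PROPERTY_TO_JSON_FIELD.values()
        if field not in key
    ]
    if missing:
        raise KeyError(f"key file is missing fields: {missing}")

    return {
        prop: key[field]
        for prop, field in GCS_PROPERTY_TO_JSON_FIELD.items()
    }
```

To use it, pass the text of the downloaded key file, for example gcs_core_site_properties(open(key_path).read()), where key_path is the location of your own file. The snippet returns the values exactly as they are stored in the JSON file and does not write core-site.xml. Treat the private key as a secret and avoid printing or logging it.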

Note
For more information, see Registering Google Cloud Account.