Use the backup-cluster command to back up the configuration and settings of all
Database Catalogs, Virtual Warehouses, and Data Visualization instances in your Cloudera
Data Warehouse (CDW) environment.
- SSH into a host on your cluster from which you can access the CDP Private Cloud
Data Services cluster.
- Run the following command to back up the cluster:
cdp dw backup-cluster --cluster-id [***CDW-CLUSTER-ID***] [--cli-input-json <value>] [--generate-cli-skeleton]
Replace [***CDW-CLUSTER-ID***] with the actual cluster ID of your environment.
The cluster ID is a unique CDW environment identifier.
[--cli-input-json <value>] and [--generate-cli-skeleton] are optional parameters.
To specify the --cli-input-json parameter, first obtain the skeleton of the
JSON file by running the following command:
cdp dw backup-cluster --generate-cli-skeleton
The output of this command is the following JSON object:
{
"clusterId": ""
}
Fill in the cluster ID and pass the JSON string as the value of the
--cli-input-json option as follows:
cdp dw backup-cluster --cli-input-json '{"clusterId":"[***CDW-CLUSTER-ID***]"}'
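As a minimal sketch of this step, the following shell snippet builds the JSON string from a variable instead of typing it by hand. The function name build_backup_input and the sample cluster ID are hypothetical. Only the cdp dw backup-cluster command and its --cli-input-json option come from the procedure above. The printf call performs no JSON escaping, so the sketch assumes that the cluster ID contains no quotes or backslashes.

```shell
# Hypothetical helper: prints the JSON document for --cli-input-json.
# Assumes the cluster ID contains no characters that need JSON escaping.
build_backup_input() {
  printf '{"clusterId":"%s"}' "$1"
}

# Placeholder value for illustration; replace it with your CDW cluster ID.
CDW_CLUSTER_ID="env-example-01"
CLI_INPUT_JSON=$(build_backup_input "$CDW_CLUSTER_ID")
printf '%s\n' "$CLI_INPUT_JSON"

# On a host with the CDP CLI installed, pass the string to the command:
#   cdp dw backup-cluster --cli-input-json "$CLI_INPUT_JSON"
```

Keeping the cluster ID in a variable makes it easier to reuse the same value in later commands.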
The output contains the following information:
- clusterId: The ID of the cluster, a unique identifier of the
CDW environment.
- operationId: The ID of the backup operation. You can use the
operation ID to query the backup execution details using the CLI.
- timestamp: The date on which the backup was created.
- data: The backup data and configuration.
- md5: The MD5 hash of the encoded data. If the data and
its hash are lost, the cluster objects cannot be restored
automatically.
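Because the restoration depends on these fields, it can help to confirm that all of them are present before you continue. The following is a rough sketch only: check_backup_fields is a hypothetical helper, and it performs a plain text search for the field names listed above instead of parsing the JSON, so it does not validate the values or the MD5 hash.

```shell
# Hypothetical helper: reads the command output on stdin and checks that
# every field name listed above appears in it. This is a text match,
# not a JSON parser, and it does not verify the md5 value.
check_backup_fields() {
  output=$(cat)
  for key in clusterId operationId timestamp data md5; do
    case "$output" in
      *"\"$key\""*) ;;  # field name found
      *) echo "missing field: $key" >&2; return 1 ;;
    esac
  done
}

# Example usage on a host with the CDP CLI installed:
#   cdp dw backup-cluster --cluster-id "$CDW_CLUSTER_ID" | check_backup_fields
```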
- Save the output in a file.
You need this information during the restoration process.
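One way to save the output, shown here as a sketch, is to pipe the command into a small helper that writes the file and fails if nothing was written. The helper name save_backup_output and the file name pattern are hypothetical. The restrictive umask is a suggested precaution, not a documented requirement.

```shell
# Hypothetical helper: writes stdin to the given file and fails if the
# file ends up empty. A newly created file gets owner-only permissions.
save_backup_output() {
  out_file="$1"
  ( umask 077 && cat > "$out_file" ) || return 1
  [ -s "$out_file" ]
}

# Example usage on a host with the CDP CLI installed (file name is arbitrary):
#   cdp dw backup-cluster --cluster-id "$CDW_CLUSTER_ID" \
#     | save_backup_output "cdw-backup-$(date +%Y%m%d-%H%M%S).json"
```

Store the file outside the cluster that you are backing up so that it remains available for the restoration.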
The Hue backup is stored in the following location:
hdfs://cdw-backups/[***TIMESTAMP***]_[***JOB-ID***]/[***ENVIRONMENT-NAME***]/hue-backup
The Data Visualization (CDV) backup is stored in the following location:
hdfs://cdw-backups/[***TIMESTAMP***]_[***JOB-ID***]/[***DATAVIZ-INSTANCE-NAME***]/viz-backup
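To illustrate how the placeholders combine, the sketch below fills in the two path templates from their parts. The function names are hypothetical and the values you pass are your own. The hdfs dfs -ls call in the comment assumes an HDFS client that can reach the backup location.

```shell
# Hypothetical helpers that fill in the path templates shown above.
hue_backup_path() {
  # $1 = timestamp, $2 = job ID, $3 = environment name
  printf 'hdfs://cdw-backups/%s_%s/%s/hue-backup\n' "$1" "$2" "$3"
}

viz_backup_path() {
  # $1 = timestamp, $2 = job ID, $3 = Data Visualization instance name
  printf 'hdfs://cdw-backups/%s_%s/%s/viz-backup\n' "$1" "$2" "$3"
}

# Example usage on a host with an HDFS client, using your own values:
#   hdfs dfs -ls "$(hue_backup_path "$TIMESTAMP" "$JOB_ID" "$ENVIRONMENT_NAME")"
```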
Monitor the database backup jobs. The backup process
automatically starts the Hue and Data Visualization database backup jobs.
Make sure that these jobs complete before you delete the cluster.
If you delete the cluster before the jobs are completed, you cannot recover the
application contents.