Managing Environments

Backing up Cloudera Data Warehouse using the backup-cluster command

Use the backup-cluster command to back up the configuration and settings of all the Database Catalogs, Virtual Warehouses, and Cloudera Data Visualization instances within your Cloudera Data Warehouse environment.

SSH into a host on your cluster from which you can access the Cloudera Data Services on premises cluster.
Run the following command to back up the cluster:
```
cdp dw backup-cluster --cluster-id [***CDW-CLUSTER-ID***] [--cli-input-json <value>] [--generate-cli-skeleton]
```
Replace [***CDW-CLUSTER-ID***] with the actual cluster ID of your environment. The cluster ID is the unique identifier of a Cloudera Data Warehouse environment.
[--cli-input-json <value>] and [--generate-cli-skeleton] are optional parameters.
To specify the --cli-input-json parameter, first obtain the skeleton of the JSON input by running the following command:
```
cdp dw backup-cluster --generate-cli-skeleton
```
The output of this command is a JSON object as follows:
```
{
    "clusterId": ""
}
```
Fill in the clusterId value and pass the resulting JSON string to the --cli-input-json option as follows:
```
cdp dw backup-cluster --cli-input-json '{"clusterId":"[***CDW-CLUSTER-ID***]"}'
```
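If the cluster ID is already held in a shell variable, the JSON argument can be assembled instead of typed by hand. The following is a minimal sketch: the function name backup_input_json is hypothetical, and it assumes the cluster ID contains no double quotes or backslashes, because no JSON escaping is performed.

```shell
# Hypothetical helper: print the --cli-input-json payload for a given cluster ID.
# Assumption: the ID contains no double quotes or backslashes (no escaping is done).
backup_input_json() {
    printf '{"clusterId":"%s"}\n' "$1"
}

# Example usage (commented out so that nothing runs against a real cluster):
# cdp dw backup-cluster --cli-input-json "$(backup_input_json "[***CDW-CLUSTER-ID***]")"
```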
The output contains the following information:
- clusterId: The ID of the cluster, a unique identifier of the Cloudera Data Warehouse environment.
- operationId: The ID of the backup operation. You can use the operation ID to query the backup execution details using the CLI.
- timestamp: The date on which the backup was created.
- data: The backup data and configuration.
- md5: The MD5 hash of the encoded data. If the data and its hash are lost, the cluster objects cannot be restored automatically.
Save the output in a file.
You need this information during the restoration process.
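One way to save the output is to redirect it to a file and keep the file only if the command succeeds. The following is a sketch under stated assumptions, not a prescribed procedure: the function name save_cdw_backup and the file naming are hypothetical, and only the backup-cluster invocation itself comes from the steps above.

```shell
# Hypothetical helper: run the backup and store the JSON response in a file.
# Arguments: the cluster ID and the path of the file to create.
save_cdw_backup() {
    local cluster_id="$1"
    local out_file="$2"
    local tmp_file="${out_file}.tmp"

    # Write to a temporary file first so that a failed run never leaves
    # a partial backup file behind.
    if ! cdp dw backup-cluster --cluster-id "$cluster_id" > "$tmp_file"; then
        rm -f "$tmp_file"
        echo "backup-cluster failed" >&2
        return 1
    fi

    # Refuse to keep an empty response.
    if [ ! -s "$tmp_file" ]; then
        rm -f "$tmp_file"
        echo "backup-cluster returned no output" >&2
        return 1
    fi

    mv "$tmp_file" "$out_file"
    # The file holds the backup data and its hash, so restrict access to the owner.
    chmod 600 "$out_file"
}

# Example usage (commented out; substitute your own cluster ID):
# save_cdw_backup "[***CDW-CLUSTER-ID***]" "cdw-backup-$(date +%Y%m%d-%H%M%S).json"
```

Keep the resulting file somewhere that outlives the cluster, because the data and md5 values are required for restoration.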

The Hue backup is stored in the following location:

hdfs://cdw-backups/[***TIMESTAMP***]_[***JOB-ID***]/[***ENVIRONMENT-NAME***]/hue-backup

The Cloudera Data Visualization backup is stored in the following location:

hdfs://cdw-backups/[***TIMESTAMP***]_[***JOB-ID***]/[***DATAVIZ-INSTANCE-NAME***]/viz-backup
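
Both locations follow the same pattern, so a small helper can compose them when you record where the backups were written. This is an illustrative sketch: the function name cdw_backup_path is hypothetical, the sample values are made up, and the timestamp is passed through as given because its exact format is not stated here.

```shell
# Hypothetical helper: compose a backup location from the documented pattern.
# Arguments: timestamp, job ID, environment or instance name, and the
# suffix (hue-backup or viz-backup). All values are used exactly as given.
cdw_backup_path() {
    printf 'hdfs://cdw-backups/%s_%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example with made-up values:
# cdw_backup_path "TIMESTAMP" "JOBID" "my-env" "hue-backup"
# If an HDFS client is configured on the host, the result could be listed with:
# hdfs dfs -ls "$(cdw_backup_path "TIMESTAMP" "JOBID" "my-env" "hue-backup")"
```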

Monitor the database backup jobs. The backup process automatically starts the Hue and Cloudera Data Visualization database backup jobs, which you can monitor. Make sure that these jobs complete before you delete the cluster. If you delete the cluster before the jobs complete, you cannot recover the application contents.
