Backing up the environment and objects

You can export the environment configuration and use it later to automatically restore the entire CDW environment and all logical objects, such as Database Catalogs, Virtual Warehouses, and Data Visualization applications. The procedure preserves the data and configurations of the logical objects.

  • You must temporarily deploy at least one Virtual Warehouse that runs 2023.0.14.0-15 or later to your environment as described in the steps below if you meet both of the following conditions:
    • You have not deployed Runtime version 2023.0.14.0-15 (released May 5, 2023) or later in any Virtual Warehouse in your cluster.
    • Every Virtual Warehouse in your cluster runs Runtime version 2023.0.13.0-20 (released Feb 7, 2023) or earlier.
      1. Create a Virtual Warehouse that runs 2023.0.14.0-15 or later.
      2. Delete the Virtual Warehouse you just created.

        The steps above resolve a Hue schema incompatibility issue before backing up and restoring Hue.

  • Add bucket encryption to your managed policy and attach the policy to the node instance role.
  • You must use the CDP CLI version 0.9.99 or later.
  • You must clean up Hue history before doing this backup if Hue is used heavily.
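
The CLI version requirement can be checked before you start. The following is a minimal sketch, not an official procedure. It assumes that `cdp --version` prints the version number as the last whitespace-separated token of its output and that GNU `sort -V` is available. `version_ge` and `REQUIRED_CDP_CLI` are hypothetical names introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED_CDP_CLI="0.9.99"

# Only query the CLI when it is installed. The output format is an assumption.
if command -v cdp >/dev/null 2>&1; then
  installed="$(cdp --version 2>&1 | awk '{print $NF}')"
  if version_ge "${installed}" "${REQUIRED_CDP_CLI}"; then
    echo "CDP CLI ${installed} meets the minimum ${REQUIRED_CDP_CLI}"
  else
    echo "CDP CLI ${installed} is older than ${REQUIRED_CDP_CLI}; upgrade before backing up" >&2
  fi
fi
```

A version-aware comparison is used because a plain string comparison would rank 0.9.100 below 0.9.99.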

The procedure below backs up the environment and objects, including Virtual Warehouse parameters.

Use the CDP CLI `dw backup-cluster` command to create the backup data.
export CDP_PROFILE=<test / prod / etc>
export CLUSTER_ID=<the-id-of-the-cluster> # the current ID (original ID) of the cluster

cdp \
  --profile "${CDP_PROFILE}" \
  dw backup-cluster \
  --cluster-id "${CLUSTER_ID}" 1>"dump_${CLUSTER_ID}.json"
Example content of the dump_${CLUSTER_ID}.json file:
{
  "clusterId": "env-lqhwqs",
  "operationId": "94197da9-fff7-4414-8b56-a30446c75119",
  "timestamp": "2023-08-16T20:22:00+00:00",
  "data": "UEsDBBQACAAIAAAAAAAAAAAA....",
  "md5": "5f427b11f01f5540fa961aba8ea232aa"
}
The cluster ID is the unique CDW environment identifier. You can use the operation ID to query the backup execution details using the CLI. The data field holds the object data and the configuration, and the md5 field is a hash of this data. If this file and the data it contains are lost, the cluster objects cannot be restored automatically.
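
Because the cluster objects cannot be restored without this file, a quick sanity check before you delete anything may be worthwhile. The sketch below only verifies that the dump file exists, is non-empty, and mentions the clusterId, operationId, data, and md5 fields described above. It does not validate the data payload or recompute the md5 value. `check_dump` is a hypothetical helper name, not part of the CDP CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks that a backup dump file is present, non-empty,
# and contains the field names described in the documentation.
check_dump() {
  local file="$1" key
  if [ ! -s "${file}" ]; then
    echo "missing or empty: ${file}" >&2
    return 1
  fi
  for key in clusterId operationId data md5; do
    if ! grep -q "\"${key}\"" "${file}"; then
      echo "field not found in ${file}: ${key}" >&2
      return 1
    fi
  done
  echo "dump file looks complete: ${file}"
}
```

For example, run `check_dump "dump_${CLUSTER_ID}.json"` after the backup command, and copy the file to durable storage only if the check succeeds.
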
  • Monitor database backup jobs

    The backup will automatically start the Hue backup and Data Visualization database backup jobs that you can monitor. Make sure that the database backup jobs finish before destroying the cluster. If the cluster is deleted before the jobs are finished, you cannot recover the application contents.

  • Alert settings

    The compaction observability alert settings are backed up. If the configuration has been modified, make a copy of the configurations, and apply them to the new cluster after restoration.

    Get the value of the Alert Manager settings in one of the following ways:
    • Use the CDW UI:

      Navigate to your environment tile, click Edit, and in Alert Settings, add the alert settings.

    • Use kubectl:

      kubectl get configmap -n istio-system alertmanager -o json

  • Azure environments

    Azure environments activated prior to 1.6.3-b319 (released May 5, 2023) support only manual environment backup. New activations require a managed identity for cluster creation, a setting that is not available on older clusters. Automatic recovery is therefore not an option if your Azure environment was activated prior to 1.6.3-b319 (released May 5, 2023).

  • Grafana dashboards

    Any changes made to the Grafana dashboards will be lost. Because a new cluster is provisioned, the data from the previous cluster is not carried over to the new Grafana deployment.