Restoring the environment and objects

Learn how to use the dw restore-cluster command, to which you pass either the environment's Cloudera resource name (crn) or the identifier of an activated environment.

Passing the Cloudera resource name (crn) activates the cluster from the backup file and restores all the entities and database contents.

Passing the identifier of an activated environment restores all the entities and database contents to the running environment. This option is useful when you need to change activation parameters, but it requires manual reactivation.

In the steps below, use dw restore-cluster to pass the Cloudera resource name (crn) to activate the cluster.

  • Your Azure cluster must run version 1.6.3-b319 (released May 5, 2023) or later.

    You cannot automatically activate an Azure cluster that runs version 1.6.2-b197 (released Feb 13, 2023) or earlier.

  • You must use the same Cloudera Data Warehouse version to restore files that you used to back up those files.

    Using a backup file from 1.6.2-b197 (released Feb 13, 2023) for restoration will not work.

  • Check that the size of your Hue backup file is smaller than 6 GB. If the backup file is 6 GB or larger, do not automatically restore the environment. Instead, go to the procedure for manually restoring the environment.
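
The size prerequisite above can be checked from a shell before you start. The following is a minimal sketch and not part of the Cloudera tooling: the function name backup_is_small_enough and the path hue_backup.zip are hypothetical, and the default limit assumes that "6GB" means 6 GiB (6 * 1024^3 bytes). Pass a different limit as the second argument if your interpretation differs.

```shell
# Hypothetical helper: succeeds (exit 0) only if the file is smaller than the limit.
# Usage: backup_is_small_enough <file> [limit_bytes]
# The default limit of 6442450944 bytes (6 GiB) is an assumption about "6GB".
backup_is_small_enough() {
  file="$1"
  limit_bytes="${2:-6442450944}"
  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 2
  fi
  # wc -c prints the byte count; strip padding that some platforms add.
  size_bytes=$(wc -c < "$file" | tr -d '[:space:]')
  [ "$size_bytes" -lt "$limit_bytes" ]
}

# Example (hue_backup.zip is a placeholder path):
#   backup_is_small_enough hue_backup.zip \
#     && echo "OK for automatic restore" \
#     || echo "Use the manual restore procedure"
```
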
  1. Get your environment resource name from the Cloudera portal by selecting the environment that is not activated, and clicking Manage.
    The environment properties open.
    Under the environment resource name, the Cloudera resource name (crn) appears.
    crn:cdp:environments:us-west-1:98765432-abcd-45d7-b645-7ccf9edbb73d:environment:00000000-7bf2-4aeb-af71-f2bf2c038588
  2. Create a CLI skeleton file to serve as the base file for the restore command.
    For example, replace the placeholder <your cluster name> with the environment resource name of the newly activated cluster (for example, env-npk886 shown in step 3 of Reactivating the environment).
    export CLUSTER_NAME="<your cluster name>"
    cdp \
      dw restore-cluster \
      --generate-cli-skeleton 1>restore_${CLUSTER_NAME}_cli_input.json
  3. Open restore_<CLUSTER_NAME>_cli_input.json for editing, and fill in the clusterId and the data fields.
    For example:
    {
       "clusterId": "crn:cdp:environments:us-west-1:98765432-abcd-45d7-b645-7ccf9edbb73d:environment:00000000-7bf2-4aeb-af71-f2bf2c038588",
       "data": "UEsDBBQ…AAAAAAAAAAABkYXRhUEsFBgAAAAABAAEAMgAAAKuBAQAAAA=="
    }
  4. Run the dw restore-cluster command. Provide the same CLUSTER_NAME that you used in step 2, and use the CDP_PROFILE from your CLI configuration.
    export CLUSTER_NAME="<your cluster name>"
    export CDP_PROFILE="<your CDP CLI profile>"
                  
    cdp \
      --profile ${CDP_PROFILE} \
      dw restore-cluster \
        --cli-input-json file://restore_${CLUSTER_NAME}_cli_input.json
    Example output:
    {
        "clusterId": "crn:cdp:environments:us-west-1:98765432-abcd-45d7-b645-7ccf9edbb73d:environment:00000000-7bf2-4aeb-af71-f2bf2c038588",
        "operationId": "62408134-3d8c-46e8-a914-0f427fc3b1b1",
        "action": "Create",
        "message": "the cluster will be created",
        "dbcRestorePlans": [
            {
                "ref": "test-aws-dl-default",
                "id": "warehouse-1692719478-xrm4",
                "action": "Create",
                "message": "the SDX-type DB Catalog will be created based on the data referenced in the backup as test-aws-dl-default"
            }
        ],
        "hueRestorePlans": [
            {
                "ref": "test-aws-dl-default",
                "id": "warehouse-1692719478-xrm4",
                "action": "Create",
                "message": "Hue restore is started for warehouse-1692719478-xrm4 DB Catalog, referenced in the backup data as test-aws-dl-default. Restore will overwrite Hue database with the backup if it isn't empty."
            }
        ],
        "hiveRestorePlans": [
            {
                "ref": "test-hive",
                "action": "Create",
                "message": "the test-hive Hive Virtual Warehouse will be created and attached to the warehouse-1692719478-xrm4 DB Catalog"
            }
        ],
        "impalaRestorePlans": [
            {
                "ref": "test-impala",
                "action": "Create",
                "message": "the test-impala Impala Virtual Warehouse will be created and attached to the warehouse-1692719478-xrm4 DB Catalog"
            }
        ],
        "vizRestorePlans": [
            {
                "ref": "test-viz",
                "action": "Create",
                "message": "the test-viz Data Visualization will be created"
            }
        ]
    }

    After several minutes, the environment is activated, and the Virtual Warehouses are created in the new cluster and attached to the Database Catalog. The Virtual Warehouse and Data Visualization IDs change.

    The Data Visualization database will be recovered. However, because this is a new deployment, the recovered connections will be broken.

  5. Monitor the environment restoration as described in Monitoring environment restoration.
  6. Adjust the Data Visualization connection settings to point to the new Virtual Warehouse(s).
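
Before running the restore command in step 4, it can help to confirm that the input file from step 3 was actually filled in. The following is a lightweight sketch under stated assumptions: the function name restore_input_looks_filled is hypothetical, and the check uses grep only to confirm that the clusterId and data fields hold non-empty strings. It does not validate JSON syntax or the contents of the backup data.

```shell
# Hypothetical helper: succeeds only if the file exists, is non-empty, and
# contains non-empty string values for "clusterId" and "data".
# This is a grep-based sanity check, not a JSON parser.
restore_input_looks_filled() {
  input_file="$1"
  if [ ! -s "$input_file" ]; then
    echo "missing or empty file: $input_file" >&2
    return 2
  fi
  grep -Eq '"clusterId"[[:space:]]*:[[:space:]]*"[^"]+"' "$input_file" \
    || { echo "clusterId is not filled in" >&2; return 1; }
  grep -Eq '"data"[[:space:]]*:[[:space:]]*"[^"]+"' "$input_file" \
    || { echo "data is not filled in" >&2; return 1; }
}

# Example, using the file name from steps 2 and 3:
#   restore_input_looks_filled "restore_${CLUSTER_NAME}_cli_input.json" \
#     && echo "input file looks ready for dw restore-cluster"
```
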