Handling upgrade failures for Cloudera Data Engineering

If your upgrade of Cloudera Data Engineering fails, you have the option to clone the service with the latest version of Cloudera Data Engineering. Learn how to handle an upgrade failure.

During a Cloudera Data Engineering upgrade, a backup is created as part of the upgrade preparation process. This procedure restores that backup to a new cluster.

The list of service backups is available in the Backup Library. To locate the Backup Library, in the left navigation menu of Cloudera Data Engineering select Administration, select Service Details, select the Maintenance tab, and select Backup Library.

To obtain the list of all available backups, in the CDP CLI, run:
cdp de list-backups

To obtain the list of service backups associated with a specific Cloudera environment, run:
cdp de list-backups --filter "environment(eq)[***CDP ENVIRONMENT NAME***]"
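When the backup list is needed from a script, the two commands above can be assembled programmatically. The following is a minimal Python sketch, not part of the Cloudera tooling: the helper name list_backups_command and the placeholder environment name are hypothetical, and the helper only builds the argument list shown above. It does not call the CLI and makes no assumption about the shape of the output.

```python
import shlex
from typing import List, Optional


def list_backups_command(environment_name: Optional[str] = None) -> List[str]:
    """Build the argument list for `cdp de list-backups`.

    Hypothetical helper: it reproduces only the command and the
    --filter expression shown in this document.
    """
    command = ["cdp", "de", "list-backups"]
    if environment_name:
        # Same filter syntax as in the text: environment(eq)[NAME]
        command += ["--filter", f"environment(eq)[{environment_name}]"]
    return command


if __name__ == "__main__":
    # Print a shell-quoted command line for a placeholder environment name.
    print(shlex.join(list_backups_command("my-environment")))
```

The returned list can be passed as-is to subprocess.run, which avoids the shell quoting problems that the parentheses and brackets in the filter expression would otherwise cause.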

The Cloudera Data Engineering backup includes the following:

  • Cloudera Data Engineering Service configurations
  • Virtual cluster names
  • Virtual cluster configurations
  • Virtual cluster file-based resources
  • Spark job definitions
  • Airflow job definitions
  • Spark Python-env resources

The following are not yet included in the backup:
  • Non file-based resources, for example, Python-venv resources and custom runtimes
  • Airflow custom operators & libraries
  • Logs
  • Job run history
  • Endpoints

  1. Ensure that the catchup option is not enabled for any user's Airflow jobs.

    Before the backup starts, if the Airflow DAG catchup options are enabled, disable them manually.

  2. By default, the restored Cloudera Data Engineering service is assigned the name of the original backed-up service, and a new service ID is generated. To prevent the restore from failing because of a naming conflict, choose one of the following options:
    1. Delete the original service that failed to upgrade during the Cloudera Data Engineering upgrade.
    2. Give the restored service a different name by using the --service-name option.
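The catchup check in the first prerequisite can be partly automated by scanning DAG source files. The sketch below is an illustration under stated assumptions, not a Cloudera-provided tool: the function name find_explicit_catchup is hypothetical, and it reports only calls that pass catchup=True explicitly. DAGs that omit the argument inherit the Airflow default from configuration, which this sketch does not inspect, so they still need a manual review.

```python
import ast
from typing import List


def find_explicit_catchup(source: str) -> List[int]:
    """Return the line numbers of calls that pass catchup=True.

    Hypothetical helper: it inspects Python source text only and
    never imports or runs the DAG file.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            # Matches DAG(..., catchup=True) and @dag(..., catchup=True).
            if (
                keyword.arg == "catchup"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                lines.append(node.lineno)
    return sorted(lines)


if __name__ == "__main__":
    example = 'dag = DAG(dag_id="example", catchup=True)\n'
    print(find_explicit_catchup(example))
```

Parsing with ast rather than searching for the text "catchup=True" avoids false matches in comments and strings.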

  1. Restore the service from the backup.
    cdp de restore-service --backup-id <backup-id> --environment-crn <environment-crn>

    Where:

    backup-id
    The ID of the backup that you are restoring from.
    environment-crn
    The Cloudera Resource Name (CRN) of the Cloudera environment with which the restored Cloudera Data Engineering service is associated. Currently, you can restore the service only to the same Cloudera environment with which the backed-up service is associated.
    For example:
    cdp de restore-service --backup-id 2 --environment-crn crn:cdp:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:environment:c67b9089-2d3b-4579-861d-c0df12a105b1
  2. Optional: To obtain a list of backups, run:
    cdp de list-backups
  3. Optional: To describe a particular backup, run:
    cdp de describe-backup --backup-id <backup-id>

    For example:

    $ cdp de describe-backup --backup-id 2 --profile priv
    {
        "backup": {
            "id": 2,
            "serviceID": "cluster-cf6h74lq",
            "serviceName": "dex-priv-default-azure-env-1689008683873",
            "environmentName": "dex-priv-default-azure-env",
            "environmentCrn": "crn:cdp:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:environment:c67b9089-2d3b-4579-861d-c0df12a105b1",
            "creator": "crn:altus:iam:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:user:0f9a97a7-23a7-43bd-bc71-ecdb2aa34ed5",
            "cloudPlatform": "AZURE",
            "status": "completed",
            "created": "2023-07-17T18:02:58.385455Z"
        }
    }
    
  4. Optional: In the case of an Airflow DAG failure, identify the impacted DAG on the Airflow UI and fix it.
    For more information, see the DAG-related steps in In-place upgrade with Airflow Operators and Libraries.
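The describe and restore steps above can be chained in a script: confirm the backup status first, then build the restore command. The following Python sketch is a rough illustration, not an official workflow. The function names are hypothetical. Treating "completed" as the status of a usable backup is an assumption taken from the example output above, and passing --service-name to restore-service is an assumption based on the naming-conflict prerequisite, which mentions the option without showing it on a command line.

```python
import json
from typing import List, Optional


def backup_is_completed(describe_backup_output: str) -> bool:
    """Check the status field in `cdp de describe-backup` JSON output.

    Assumption: "completed" (as in the example output) marks a backup
    that is safe to restore from.
    """
    backup = json.loads(describe_backup_output)["backup"]
    return backup.get("status") == "completed"


def restore_service_command(
    backup_id: int,
    environment_crn: str,
    service_name: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `cdp de restore-service`."""
    command = [
        "cdp", "de", "restore-service",
        "--backup-id", str(backup_id),
        "--environment-crn", environment_crn,
    ]
    if service_name:
        # Assumption: --service-name is accepted by restore-service.
        command += ["--service-name", service_name]
    return command


if __name__ == "__main__":
    # Trimmed version of the example output shown above.
    sample = '{"backup": {"id": 2, "status": "completed"}}'
    if backup_is_completed(sample):
        print(restore_service_command(2, "<environment-crn>"))
```

Keeping the status check separate from the command builder makes it possible to stop before a restore is attempted from a backup that has not finished.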
