Handling upgrade failures for Cloudera Data Engineering
If your upgrade of Cloudera Data Engineering (CDE) fails, you have the option to clone the service with the latest version of CDE. Learn how to handle an upgrade failure.
During a CDE upgrade, a backup is created as part of the upgrade preparation process. This procedure restores that backup in a new cluster.
The list of service backups is available in the Backup Library. To locate the Backup Library, in the left navigation menu of CDE select Administration, select Service Details, select the Maintenance tab, and select Backup Library.
To obtain the list of all available backups, run the following command in the CDP CLI:
cdp de list-backups
To obtain the list of service backups associated with a specific CDP environment, run:
cdp de list-backups --filter "environment(eq)[***CDP ENVIRONMENT NAME***]"
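The filtered listing can be wrapped in a small shell helper. This is a minimal sketch, not an official Cloudera script: the function names and the DRY_RUN switch are hypothetical, and it assumes the bracketed placeholder is replaced by the bare environment name. In dry-run mode (the default here) it only prints the command, so you can check the filter before calling the CDP CLI.

```shell
#!/usr/bin/env bash
# Sketch: list CDE backups for one CDP environment.
# build_backup_filter, list_env_backups, and DRY_RUN are illustrative names,
# not part of the CDP CLI.

# Build the filter expression used with "cdp de list-backups --filter".
build_backup_filter() {
  printf 'environment(eq)%s' "$1"
}

# Print the command when DRY_RUN=1 (default); otherwise run it with the CDP CLI.
list_env_backups() {
  local filter
  filter="$(build_backup_filter "$1")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'cdp de list-backups --filter "%s"\n' "$filter"
  else
    cdp de list-backups --filter "$filter"
  fi
}
```

For example, `list_env_backups my-cdp-env` prints the command for a hypothetical environment named my-cdp-env, and `DRY_RUN=0 list_env_backups my-cdp-env` runs it, provided the CDP CLI is installed and configured.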
The CDE backup includes the following:
- CDE Service configurations
- Virtual cluster names
- Virtual cluster configurations
- Virtual cluster file-based resources
- Spark job definitions
- Airflow job definitions
- Spark Python-env resources
- Non-file-based resources, for example, Python-venv resources and custom runtimes
- Airflow custom operators and libraries
- Logs
- Job run history
- Endpoints
Before you begin:
- Ensure that the catchup option is not enabled for any user's Airflow jobs. If the Airflow DAG catchup options are enabled, disable them manually before the backup starts.
- By default, the restored service receives the name and ID of the original backed-up service. To ensure that the restore does not fail due to name and ID conflicts, do one of the following:
  - Delete the original service that failed to upgrade.
  - Rename the service and assign a new ID to it using the --service-id and --service-name options.
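To help with the catchup check, the following sketch lists DAG files that may still have catchup enabled. It assumes that the Airflow DAG definitions are available as Python files in a local directory, and it is only a text search, not an Airflow or CDE API call. The function name is hypothetical. It reports every file that does not contain an explicit catchup=False, because a DAG that omits the setting may inherit an enabled default, depending on the Airflow version and configuration. Review each reported file manually.

```shell
#!/usr/bin/env bash
# Sketch: find DAG files without an explicit catchup=False.
# find_possible_catchup_dags is an illustrative name; the search is a heuristic.

find_possible_catchup_dags() {
  # grep -L prints the names of files that contain no matching line.
  # The exit status of grep is ignored because only the listing matters.
  find "$1" -type f -name '*.py' \
    -exec grep -LE 'catchup[[:space:]]*=[[:space:]]*False' {} + || true
}
```

For example, `find_possible_catchup_dags ./dags` prints one path per line for a hypothetical local ./dags directory. An empty result means that every scanned file sets catchup=False somewhere in its text.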