Learn how to use the backup-restore-based upgrade script in Cloudera Data Engineering on cloud.
This procedure applies to Cloudera Data Engineering versions 1.20.3-h2, 1.21.0-h2,
1.22.0, and higher.
If Apache Airflow connections and variables are involved, the original
backup-restore-based upgrade is not applicable. This document describes a
script-based fallback for performing the backup-restore-based upgrade that you can
use when the default upgrade method described in Handling upgrade failures for Cloudera Data
Engineering does not work.
Table 1. Contents included in the backup
Artifacts                                         Included in the automation script
Service config                                    Y
Service Logs                                      N
Virtual Cluster config                            Y
Virtual Cluster end-points                        N
Virtual Cluster event logs                        N
Job: Spark                                        Y
Job: Airflow                                      Y
Resource: files                                   Y
Resource: docker runtimes                         Y
Resource: python-venv (for Spark)                 Y
Resource: python-venv (for Airflow)               Y (Since 1.21.0)
Git Repository                                    Y
Credentials                                       Y
Spark Session                                     N
Spark Session logs (including statement history)  N
Job Runs                                          Y (Since 1.22.0)
Job Run logs (Driver, Executor, API)              N
Airflow DAG logs                                  N
Airflow connections                               Y
Airflow variables                                 Y
Before running the script, you must install the listed tools:
Install cdpcurl. For more information, see cdpcurl.
Download the CDE CLI from your Virtual Cluster page and add its path to the
PATH environment variable with the following command:
export PATH="$PATH:[***CLI-PATH***]"
Install the other tools according to your system requirements.
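As a quick pre-flight check, the following sketch reports which commands are not yet found on the PATH. The function name missing_tools is hypothetical, and the command names in the usage note (cdpcurl, cde, cdp, kubectl) are assumptions derived from the tools mentioned in this section, not an official requirement list.

```shell
# Print every command from the argument list that is not found on the PATH.
# Returns 0 when all commands are available, 1 otherwise.
# missing_tools is a hypothetical helper name, not part of any Cloudera tool.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}
```

For example, running missing_tools cdpcurl cde cdp kubectl prints one line per missing command and prints nothing when everything is installed.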
Configure the CDP CLI.
On the Cloudera Management Console, navigate to your
Profile page.
On the Access Keys tab, click
Generate Access Key.
Download the credentials file and copy its contents to ~/.cdp/credentials.
The credentials are set as the default profile. You can also rename
the profile. If you do so, set the CDP profile to your preference using
one of the listed methods:
The CDP_PROFILE environment variable
The cde-service-backup-restore-utils.properties file
The --cdp-profile command line option
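The credentials step can be sketched as follows. The function name install_cdp_credentials is hypothetical, and the file layout shown in the comments assumes the usual CDP CLI credentials format; the bracketed values are placeholders, not real keys. Note that the sketch overwrites an existing credentials file, so merge by hand if you already keep other profiles there.

```shell
# Copy a downloaded credentials file into place and restrict its permissions.
# $1: downloaded credentials file
# $2: target directory (optional, defaults to ~/.cdp)
# install_cdp_credentials is a hypothetical helper name.
install_cdp_credentials() {
  src=$1
  target_dir=${2:-"$HOME/.cdp"}
  [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
  mkdir -p "$target_dir" || return 1
  # Overwrites any existing credentials file in the target directory.
  cp "$src" "$target_dir/credentials" || return 1
  chmod 600 "$target_dir/credentials"
}

# Assumed layout of the credentials file (placeholders only):
#   [default]
#   cdp_access_key_id = [***ACCESS-KEY-ID***]
#   cdp_private_key = [***PRIVATE-KEY***]

# If you renamed the profile, one way to select it is the environment variable:
#   export CDP_PROFILE=[***PROFILE-NAME***]
```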
Configure the CDE CLI.
Configure the CDE CLI following the instructions in Configuring the CLI client. Cloudera
recommends that you use the Cloudera credentials, as they are needed for CDP CLI
too.
Set the CDE CLI configuration using one of the listed methods:
The environment variable
The cde-service-backup-restore-utils.properties file
The ~/.cde/config.yaml file
Run the following command once to make sure that the configuration works:
cde job list --vcluster-endpoint [***JOBS-API-URL***]
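If you want the check to produce an explicit pass or fail result, the verification command above can be wrapped as in this minimal sketch. The function name check_cde_cli and its messages are hypothetical; only the cde job list --vcluster-endpoint call comes from this procedure.

```shell
# Run the job listing once and report whether the CLI configuration works.
# $1: Jobs API URL of the Virtual Cluster
# check_cde_cli is a hypothetical helper name.
check_cde_cli() {
  jobs_api_url=$1
  if [ -z "$jobs_api_url" ]; then
    echo "usage: check_cde_cli JOBS-API-URL" >&2
    return 2
  fi
  # The job list itself is discarded; only the exit status matters here.
  if cde job list --vcluster-endpoint "$jobs_api_url" >/dev/null; then
    echo "CDE CLI configuration works"
  else
    echo "CDE CLI call failed; check the credentials and the endpoint" >&2
    return 1
  fi
}
```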
Authenticate to kubectl again so that you can access your cluster.
If the version of the new service is different from the original one,
download the CDE CLI tool of the new service and add its path to the
PATH environment variable.
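When two versions of the CDE CLI are installed, the shell runs whichever cde executable comes first on the PATH. The following sketch therefore prepends the directory of the new CLI instead of appending it, and prints the executable that will be used. The function name use_cde_cli is hypothetical, and prepending is a suggested variation, not a documented requirement.

```shell
# Put the directory that holds the new CDE CLI first on the PATH so that it
# takes precedence over an older copy, then print the executable that will run.
# $1: directory containing the new cde executable
# use_cde_cli is a hypothetical helper name.
use_cde_cli() {
  cli_dir=$1
  [ -x "$cli_dir/cde" ] || { echo "no executable cde in $cli_dir" >&2; return 1; }
  PATH="$cli_dir:$PATH"
  export PATH
  hash -r 2>/dev/null   # drop cached command locations
  command -v cde
}
```

Call the function in the current shell, not in a subshell, so that the PATH change persists for the upgrade script.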