Cloudera Data Engineering on premises upgrade FAQs

Review the frequently asked questions about upgrading Cloudera Data Engineering on premises.

Is Cloudera Data Engineering upgraded as part of the Cloudera Data Services on premises upgrade?

The Cloudera Data Engineering upgrade is a separate step and must be performed only after the Cloudera Data Services on premises upgrade is complete. Cloudera Data Engineering is not included in the scope of the Cloudera Data Services on premises upgrade itself.

Can I stay on the old version of Cloudera Data Engineering Service and Virtual Cluster if my Cloudera Data Services on premises is upgraded?

If your Cloudera Data Services on premises is upgraded through Cloudera Manager, you must not remain on an older version of the Cloudera Data Engineering Service and Virtual Clusters. Cloudera recommends upgrading the Cloudera Data Engineering service using the dex-upgrade-utils Docker image utility.

Will my endpoints change after upgrading Cloudera Data Engineering?

The endpoints for your Cloudera Data Engineering Service and Virtual Clusters change in the following ways:

For Cloudera Embedded Container Service, endpoints will change if you upgrade to Cloudera Data Services on premises 1.5.5 or higher versions. For more information, see Pre-upgrade - Upgrading Cloudera Data Engineering service.
For OpenShift Container Platform, endpoints will remain the same if you are upgrading to Cloudera Data Services on premises 1.5.5 or 1.5.5 SP1. If you are upgrading to Cloudera Data Services on premises 1.5.5 SP2 or higher versions, the endpoints will change. For more information, see Pre-upgrade - Upgrading Cloudera Data Engineering service.

Does the upgrade create a new Cloudera Data Engineering service?

Yes. The Cloudera Data Engineering upgrade is a service-wide operation that affects the Cloudera Data Engineering service and all associated Virtual Clusters. The process creates an exact replica of the existing Cloudera Data Engineering service and its Virtual Clusters, and then restores all artifacts and data to this newly created set of Virtual Clusters.

Do I need to create the Python environment (pyenv) that existed in the previous version?

The Cloudera Data Engineering upgrade process does not restore the pyenv environment from the backed-up cluster. You must recreate the pyenv manually in each Virtual Cluster after the upgrade.

Will my Job runs be restored after upgrading?

After a successful restore, you can see your previous Job runs and their associated Spark Job logs. However, Airflow Job logs are not included in the restore.

How much time does a Cloudera Data Engineering upgrade take?

The total time depends on the number of Virtual Clusters and the number of Jobs associated with each Virtual Cluster. Generally, creating the service takes approximately 15 minutes, and creating each Virtual Cluster might take a further 15 minutes. Restoring the jobs and other data in each Virtual Cluster then takes time proportional to the number of Jobs you have.

Example: Restoring a service that includes two Virtual Clusters, each having 100 jobs, might take approximately one hour.
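
As a rough illustration of the estimate above, the following Python sketch adds the approximate creation times to a restore time that scales with the number of Jobs. The function name and the per-Job restore figure are assumptions made for this example: 0.075 minutes per Job is back-calculated from the two-cluster example (one hour minus 45 minutes of creation time, spread over 200 Jobs) and is not a documented value. Actual times vary by environment.

```python
def estimate_upgrade_minutes(jobs_per_vc,
                             service_minutes=15.0,
                             vc_minutes=15.0,
                             restore_minutes_per_job=0.075):
    """Rough upgrade-time estimate in minutes.

    jobs_per_vc: one entry per Virtual Cluster, giving its number of Jobs.
    restore_minutes_per_job: ASSUMPTION, back-calculated from the FAQ example.
    """
    # Service creation plus one creation step per Virtual Cluster.
    creation = service_minutes + vc_minutes * len(jobs_per_vc)
    # Restore time grows in proportion to the total number of Jobs.
    restore = restore_minutes_per_job * sum(jobs_per_vc)
    return creation + restore


# Two Virtual Clusters with 100 Jobs each, as in the FAQ example.
print(round(estimate_upgrade_minutes([100, 100])), "minutes (approximate)")
```

Treat the result as a planning aid only; replace the default values with timings observed in your own environment.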

From which node do I run the upgrade script?

For Cloudera Embedded Container Service, the upgrade instructions must be executed on a worker node in the Cloudera Embedded Container Service environment.
For OpenShift Container Platform, you can run the upgrade instructions from any edge node, provided it has Docker installed and connectivity to your OpenShift Container Platform cluster. If you encounter name resolution errors, for example, a failure in name resolution error, include the --network=host argument in your Docker command.
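
The placement of the --network=host argument can be sketched as follows. This is a minimal Python illustration, not the documented upgrade command: the helper name, the image name, and the tool arguments are placeholders that you must replace with the values from the upgrade instructions. The only point shown is that Docker options such as --network=host go before the image name in a docker run command.

```python
import shlex


def docker_run_command(image, tool_args, use_host_network=False):
    """Build a `docker run` argument list (hypothetical helper)."""
    cmd = ["docker", "run"]
    if use_host_network:
        # Docker options must appear before the image name.
        cmd.append("--network=host")
    cmd.append(image)
    # Everything after the image is passed to the container.
    cmd.extend(tool_args)
    return cmd


# Placeholders only: substitute the image and arguments from the
# upgrade instructions for your release.
cmd = docker_run_command("<dex-upgrade-utils-image>",
                         ["<upgrade-arguments>"],
                         use_host_network=True)
print(shlex.join(cmd))
```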

When can I delete my old service?

After successfully completing the restore operation and verifying that jobs are running, you can proceed with the deletion of the old service.