Preparing for CDP PvC Data Services update for CDE

Upgrading the OpenShift Container Platform (OCP) version while the CDE service is enabled can corrupt the embedded MySQL database used by CDE. Complete the following steps before you start the OCP version upgrade.

  1. Stop running jobs and pause scheduled workloads
    1. Kill all the running Spark jobs in the CDE virtual clusters under all CDE services or wait for them to complete.
    2. Pause all Airflow jobs and scheduled Spark jobs.
  2. Identify the CDE Namespace
    1. Navigate to the Cloudera Data Engineering Overview page by clicking the Data Engineering tile in the Cloudera Data Platform (CDP) management console.
    2. In the CDE Services column, click Service Details for the CDE service.
    3. Note the Cluster ID shown on the page. For example, if the Cluster ID is cluster-abcd1234, then the CDE Namespace is dex-base-abcd1234.
    4. Use this CDE Namespace (dex-base-abcd1234 in the example above) when you run the Kubernetes commands in the following instructions.
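
The mapping from Cluster ID to CDE Namespace in the example above can be sketched in shell. This is a minimal sketch that assumes every Cluster ID has the form cluster-<suffix>, as in the example. The variable names cluster_id and cde_namespace are illustrative and not part of any Cloudera tooling.

```shell
# Cluster ID as noted from the Service Details page (example value from above).
cluster_id="cluster-abcd1234"

# Strip the leading "cluster-" prefix and prepend "dex-base-".
# Assumption: the Cluster ID always follows the cluster-<suffix> pattern.
cde_namespace="dex-base-${cluster_id#cluster-}"

echo "$cde_namespace"
```

For the example Cluster ID, this prints dex-base-abcd1234. If your Cluster ID does not start with cluster-, take the namespace from the Service Details page rather than relying on this substitution.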
  3. Scale down the CDE embedded database

    Access the OpenShift cluster with the OpenShift CLI or the Kubernetes CLI, and scale down the CDE embedded database statefulset to 0 replicas with one of the following commands:

    OpenShift CLI

    oc scale statefulset/cdp-cde-embedded-db --namespace <CDE Namespace> --replicas 0

    Kubernetes CLI

    kubectl scale statefulset/cdp-cde-embedded-db --namespace <CDE Namespace> --replicas 0
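
After scaling down, it can help to confirm that no database pods remain before you start the OCP upgrade. The following is a hedged sketch, not part of the documented procedure. It assumes kubectl is already authenticated against the cluster, and the function names embedded_db_replicas and embedded_db_is_stopped are hypothetical helpers introduced here for illustration.

```shell
# Print the number of pods the StatefulSet controller currently reports for
# the CDE embedded database in the given namespace.
embedded_db_replicas() {
  namespace="$1"
  kubectl get statefulset cdp-cde-embedded-db \
    --namespace "$namespace" \
    -o jsonpath='{.status.replicas}'
}

# Succeed only when the reported replica count is exactly 0.
embedded_db_is_stopped() {
  [ "$(embedded_db_replicas "$1")" = "0" ]
}
```

A possible use, with the example namespace from above: embedded_db_is_stopped dex-base-abcd1234 && echo "embedded database is scaled down". If the check fails, wait a few moments and run it again, because the pod can take some time to terminate.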