General known issues with Cloudera Data Engineering

Learn about the general known issues with the Cloudera Data Engineering (CDE) service on public clouds, the impact or changes to the functionality, and the workaround.

COMPX-5494: Yunikorn recovery intermittently deletes existing placeholders
On recovery, Yunikorn may intermittently delete placeholder pods. After recovery, there may be remaining placeholder pods. This may cause unexpected behavior during rescheduling.

There is no workaround for this issue. To avoid any unexpected behavior, Cloudera suggests removing all the placeholders manually before restarting the scheduler.

DWX-8257: CDW Airflow Operator does not support SSO

Although Virtual Warehouse (VW) in Cloudera Data Warehouse (CDW) supports SSO, this is incompatible with the CDE Airflow service as, for the time being, the Airflow CDW Operator only supports workload username/password authentication.

Workaround: Disable SSO in the VW.
COMPX-7085: Scheduler crashes due to Out Of Memory (OOM) error in case of clusters with more than 200 nodes

Resource requirement of the YuniKorn scheduler pod depends on cluster size, that is, the number of nodes and the number of pods. Currently, the scheduler is configured with a memory limit of 2Gi. When running on a cluster that has more than 200 nodes, the memory limit of 2Gi may not be enough. This can cause the scheduler to crash because of OOM.


Increase resource requests and limits for the scheduler. Edit the YuniKorn scheduler deployment to increase the memory limit to 16Gi.

For example:

    cpu: "4"
    memory: 16Gi
    cpu: "2"
    memory: 8Gi
COMPX-6949: Stuck jobs prevent cluster scale down

Because of hanging jobs, the cluster is unable to scale down even when there are no ongoing activities. This may happen when some unexpected node removal occurs, causing some pods to be stuck in Pending state. These pending pods prevent the cluster from downscaling.

Workaround: Terminate the jobs manually.
DEX-3997: Python jobs using virtual environment fail with import error
Running a Python job that uses a virtual environment resource fails with an import error, such as:
Traceback (most recent call last):
  File "/tmp/spark-826a7833-e995-43d2-bedf-6c9dbd215b76/", line 3, in <module>
    from insurance.beneficiary import BeneficiaryData
ModuleNotFoundError: No module named 'insurance'
Workaround: Do not set the spark.pyspark.driver.python configuration parameter when using a Python virtual environment resource in a job.