General guidelines

Learn more about general guidelines, known issues, and limitations while scaling Cloudera Data Engineering deployments.

Typically, the total physical resources of a cluster are available for all Spark and Airflow jobs. However, in Cloudera Data Engineering on premises, capacity is consumed in a stack, and Cloudera Data Engineering jobs can only access the remaining unused resources. The following layers of consumption exist:
  • Physical hardware – The total bare-metal or virtual machine capacity.
  • Platform overhead – Resources consumed by the host Operating System, Kubernetes services, and platform-level daemonsets for storage and networking.
  • Cloudera Data Services on premises overhead – Resources consumed by the Cloudera Data Services on premises infrastructure, including the Cloudera Control Plane and Cloudera Shared Data Experience services.
  • Cloudera Data Engineering Service and Virtual Cluster infrastructure overhead – A fixed amount of resources consumed by the Cloudera Data Engineering service and by each Virtual Cluster's internal pods, such as the Airflow and API server pods.
  • Available Job Capacity – The remaining resources available to run Spark executors and Airflow task pods.
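
The stacked consumption model can be illustrated with simple arithmetic. The following sketch uses hypothetical numbers, not Cloudera sizing guidance; the point is only that job capacity is what remains after every layer above it is subtracted.

```python
# Illustrative capacity stack. All values are hypothetical examples.
# Units: CPU cores and GiB of memory.
total = {"cores": 128, "mem_gib": 512}            # physical hardware
platform_overhead = {"cores": 8, "mem_gib": 32}   # OS, Kubernetes, daemonsets
cdp_overhead = {"cores": 16, "mem_gib": 64}       # Control Plane, SDX services
cde_overhead = {"cores": 8, "mem_gib": 24}        # CDE service + Virtual Cluster pods

# Available Job Capacity = total minus every overhead layer above it.
available = {
    k: total[k] - platform_overhead[k] - cdp_overhead[k] - cde_overhead[k]
    for k in total
}
print(available)  # remaining capacity for Spark executors and Airflow task pods
```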

Limitations

  • Resource fragmentation – Jobs can remain in the Pending state even when monitoring dashboards indicate free resources in the cluster. This often occurs when the available capacity is scattered across multiple nodes in small chunks, none of which are large enough to host the job. The following resource fragmentation cases exist:
    • CPU Fragmentation – For example, a cluster might have 8 available CPU cores, but if this capacity is distributed as 0.5 cores on 16 different nodes, a Spark job requesting a single 4-core executor cannot be scheduled. The job remains in the Pending state because no single node can satisfy its request.
    • Memory Fragmentation – For example, a cluster might have 100 GB of free memory, but if this memory is distributed into 10 GB chunks on 10 different nodes, a job requesting an executor with 20 GB of RAM cannot be scheduled.
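
The scheduling rule behind both fragmentation cases can be sketched in a few lines: a pod is placed only if a single node can satisfy its entire request, so aggregate free capacity is irrelevant. Node counts and sizes below are hypothetical.

```python
# Minimal sketch of why fragmentation blocks scheduling: a pod fits only
# if one individual node can satisfy its full request.
def can_schedule(free_per_node, request):
    """Return True if any single node has enough free CPU and memory for the pod."""
    return any(
        node["cores"] >= request["cores"] and node["mem_gib"] >= request["mem_gib"]
        for node in free_per_node
    )

# 16 nodes with 0.5 free cores each: 8 cores free in aggregate,
# but a 4-core executor cannot be placed on any one of them.
nodes = [{"cores": 0.5, "mem_gib": 10}] * 16
executor = {"cores": 4, "mem_gib": 8}
print(can_schedule(nodes, executor))  # False: capacity exists only in aggregate
```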
  • Controlled log ingestion rate – Platform logging services can ingest logs at a maximum disk write rate on the order of 450 KiB/second per pod. To ensure log reliability, Spark jobs must not generate logs at an excessive rate: if a job's log output significantly exceeds this rate, it can overwhelm the logging pipeline and result in log loss.
  • Memory Overhead (Java or Scala vs Python) – For memory-critical workloads, the implementation language matters. The lower a job's memory overhead, the more executors can be scheduled. For example, in one memory-intensive workload, the Java or Scala job required a 30% memory overhead for stable execution, while the equivalent Python job required 40%. The lower overhead of the Java or Scala job allowed more executors to be scheduled, improving concurrency.

    Configuration override – Memory overhead can be configured per job, either as a percentage or as an absolute value.

    • Job Level Configuration – You can set these parameters directly in the Spark configurations for each job.
      • General overhead factor
        spark.kubernetes.memoryOverheadFactor=0.2

        This sets a 20% overhead for both driver and executor.

      • Executor-specific overhead
        spark.executor.memoryOverhead=2g

        This sets a fixed 2 GB overhead for executors regardless of the executor memory size.

      • Driver-specific overhead
        spark.driver.memoryOverhead=1g

        This sets a fixed 1 GB overhead for the driver.
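
      How these settings combine can be sketched as follows. Per the Apache Spark documentation, an explicit spark.executor.memoryOverhead takes precedence over the overhead factor, and factor-based overhead has a 384 MiB floor; the memory values below are illustrative.

      ```python
      # Sketch of how Spark sizes an executor pod's memory request on Kubernetes:
      # an explicit spark.executor.memoryOverhead wins; otherwise the overhead
      # factor applies, subject to a 384 MiB minimum.
      def executor_pod_memory_mib(executor_memory_mib,
                                  overhead_factor=0.1,
                                  explicit_overhead_mib=None):
          if explicit_overhead_mib is not None:
              overhead = explicit_overhead_mib
          else:
              overhead = max(int(executor_memory_mib * overhead_factor), 384)
          return executor_memory_mib + overhead

      # spark.executor.memory=8g with spark.kubernetes.memoryOverheadFactor=0.2
      print(executor_pod_memory_mib(8192, overhead_factor=0.2))        # 9830 MiB
      # spark.executor.memory=8g with spark.executor.memoryOverhead=2g
      print(executor_pod_memory_mib(8192, explicit_overhead_mib=2048)) # 10240 MiB
      ```

      The same arithmetic explains the concurrency impact described above: a larger overhead inflates every pod's memory request, so fewer executors fit on each node.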

    • Virtual Cluster Level Configuration – The memory overhead factors are defined at the virtual cluster level and can be adjusted there.
      • defaultMemoryOverheadFactor (for regular JVM jobs)
      • nonJVMMemoryOverheadFactor (for PySpark jobs)