Dynamic allocation
Dynamic allocation allows Spark to dynamically scale the cluster resources allocated to your application based on the workload. When dynamic allocation is enabled and a Spark application has a backlog of pending tasks, it can request executors. When the application becomes idle, its executors are released and can be acquired by other applications.
In Cloudera Data Platform (CDP), dynamic allocation is enabled by default. The table below describes properties to control dynamic allocation.
To disable dynamic allocation, set `spark.dynamicAllocation.enabled` to `false`. If you use the `--num-executors` command-line argument or set the `spark.executor.instances` property when running a Spark application, the number of initial executors is the greater of `spark.executor.instances` or `spark.dynamicAllocation.initialExecutors`.
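For example, dynamic allocation can be disabled for a single application at submission time; the application file and resource values below are illustrative, not recommendations:

```shell
# Disable dynamic allocation for this application and pin a fixed
# number of executors instead (application name and values are illustrative).
spark-submit \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 4 \
  --executor-memory 2g \
  your_app.py
```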
For more information on how dynamic allocation works, see Resource Allocation Policy.
When Spark dynamic resource allocation is enabled, all available resources are allocated to the first submitted application, causing subsequent applications to queue. To allow applications to acquire resources in parallel, allocate resources to pools, run the applications in those pools, and enable preemption for applications running in those pools.
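On YARN, one way to run applications in separate pools is to submit each application to its own queue with the `--queue` option; the queue and application names below are illustrative:

```shell
# Submit two applications to different YARN queues (pools) so they
# acquire resources in parallel; queue and file names are illustrative.
spark-submit --queue analytics app_a.py
spark-submit --queue etl app_b.py
```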
If you are using Spark Streaming, see the recommendation in Spark Streaming and Dynamic Allocation.
Property | Description |
---|---|
`spark.dynamicAllocation.executorIdleTimeout` | The length of time an executor must be idle before it is removed. Default: `60s`. |
`spark.dynamicAllocation.enabled` | Whether dynamic allocation is enabled. Default: `true` in CDP. |
`spark.dynamicAllocation.initialExecutors` | The initial number of executors for a Spark application when dynamic allocation is enabled. If `spark.executor.instances` (or its equivalent command-line argument, `--num-executors`) is set to a higher number, that number is used instead. Default: `spark.dynamicAllocation.minExecutors`. |
`spark.dynamicAllocation.minExecutors` | The lower bound for the number of executors. Default: `0`. |
`spark.dynamicAllocation.maxExecutors` | The upper bound for the number of executors. Default: `infinity`. |
`spark.dynamicAllocation.schedulerBacklogTimeout` | The length of time pending tasks must be backlogged before new executors are requested. Default: `1s`. |
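These properties can also be set cluster-wide in `spark-defaults.conf`; the values below are an illustrative sketch, not recommended settings:

```
spark.dynamicAllocation.enabled                  true
spark.dynamicAllocation.minExecutors             1
spark.dynamicAllocation.maxExecutors             20
spark.dynamicAllocation.executorIdleTimeout      120s
spark.dynamicAllocation.schedulerBacklogTimeout  1s
```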