Spark Streaming and Dynamic Allocation
Dynamic allocation is enabled by default, which means that executors are removed when idle. Dynamic allocation conflicts with Spark Streaming operations.
In Spark Streaming, data comes in batches, and executors run whenever
data is available. If the executor idle timeout is less than the batch
duration, executors are constantly added and removed. However, if the
executor idle timeout is greater than the batch duration, executors are
never removed. Therefore, Cloudera recommends that you disable dynamic
allocation by setting spark.dynamicAllocation.enabled
to false
when running streaming applications.