Using YARN queues with autoscaling

In addition to the usual considerations for configuring YARN queues, autoscaling requires some additional YARN configuration parameters and imposes a few constraints on how queues are defined.

Using a single queue for the entire YARN cluster

Nothing specific is required when YARN is configured with a single queue only.

Using multiple queues

Workloads executing in a queue trigger scaling operations only within the bounds of the resources allocated to that queue.

Configuring multiple queues for autoscaling requires manual steps on the YARN side. The same set of steps is also required before the queue configuration is changed:

  • Configure Resource Manager for queue-based autoscaling by setting the `yarn.resourcemanager.autoscaling.plugin-type` property to the value `fine` in Cloudera Manager.
  • Tune `user-limit-factor` and other queue configuration as usual.
  • Queue configuration should use `Absolute` values instead of percentages.
  • The `max-capacity` for `memory`, and optionally for `cores`, for a cluster should correspond to the total value available to YARN when scaled up to the configured `target maximum`.
  • The `maximum-capacity` for a queue should be configured to be less than or equal to the `maximum-capacity` for the entire cluster.
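
The two capacity rules above are simple arithmetic, and it can help to check them before applying a queue configuration. The following Python sketch is illustrative only and is not part of any Cloudera or YARN tooling. The node count, per-node resources, and queue names are hypothetical values, and it assumes every node contributes the same amount of memory and cores to YARN. The bracketed output follows the absolute-resource notation used by the Capacity Scheduler, `[memory=...,vcores=...]`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Resources:
    memory_mb: int
    vcores: int


def cluster_max_capacity(target_max_nodes: int, per_node: Resources) -> Resources:
    """Total resources available to YARN once scaled up to the target maximum.

    Assumes homogeneous nodes (an assumption of this sketch).
    """
    return Resources(
        memory_mb=target_max_nodes * per_node.memory_mb,
        vcores=target_max_nodes * per_node.vcores,
    )


def queues_exceeding_cluster(
    queue_max: Dict[str, Resources], cluster_max: Resources
) -> List[str]:
    """Return the names of queues whose maximum capacity exceeds the cluster's."""
    return sorted(
        name
        for name, res in queue_max.items()
        if res.memory_mb > cluster_max.memory_mb or res.vcores > cluster_max.vcores
    )


def as_absolute(res: Resources) -> str:
    """Format resources as an absolute value rather than a percentage."""
    return f"[memory={res.memory_mb},vcores={res.vcores}]"


# Hypothetical values: 10 nodes at the target maximum, 64 GiB and 16 cores each.
per_node = Resources(memory_mb=65536, vcores=16)
cluster_max = cluster_max_capacity(target_max_nodes=10, per_node=per_node)

# Hypothetical queue maximums; the second one is deliberately too large.
queue_max = {
    "root.etl": Resources(memory_mb=393216, vcores=96),
    "root.adhoc": Resources(memory_mb=786432, vcores=64),
}

print(as_absolute(cluster_max))                          # [memory=655360,vcores=160]
print(queues_exceeding_cluster(queue_max, cluster_max))  # ['root.adhoc']
```

In this example, `root.adhoc` requests more memory than the cluster can provide at its target maximum, so its maximum capacity would need to be lowered before the configuration satisfies the last rule.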

Instructions for these manual steps are in the following topic, Configuring multiple YARN queues for autoscaling.