1.1. Create and Configure YARN Capacity Scheduler Queues

Capacity Scheduler queues allocate cluster resources among users and groups. The queue settings can be accessed from Ambari > YARN > Configs > Scheduler or edited in capacity-scheduler.xml. YARN must be restarted for queue changes to take effect.

The following simple configuration example demonstrates how to set up Capacity Scheduler queues. It separates short-running and long-running queries into two queues:

  • hive1 -- this queue will be used for short-duration queries, and will be assigned 50% of cluster resources.

  • hive2 -- this queue will be used for longer-duration queries, and will be assigned 50% of cluster resources.

The following capacity-scheduler.xml settings are used to implement this configuration:

yarn.scheduler.capacity.root.queues=hive1,hive2
yarn.scheduler.capacity.root.hive1.capacity=50
yarn.scheduler.capacity.root.hive2.capacity=50
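
Capacity Scheduler expects the capacities of the queues directly under a parent to add up to 100. As a rough sanity check before applying settings like the ones above, the following Python sketch parses key=value lines and sums the capacities of the queues listed under root. The helper names parse_properties and child_capacity_total are hypothetical and are not part of YARN or Ambari.

```python
def parse_properties(text):
    """Parse key=value lines into a dict, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def child_capacity_total(props, parent="root"):
    """Sum the capacity values of the queues listed directly under parent."""
    prefix = "yarn.scheduler.capacity." + parent
    queues = [q.strip() for q in props[prefix + ".queues"].split(",")]
    return sum(float(props[prefix + "." + q + ".capacity"]) for q in queues)


# The three settings from this example, copied verbatim.
SETTINGS = """
yarn.scheduler.capacity.root.queues=hive1,hive2
yarn.scheduler.capacity.root.hive1.capacity=50
yarn.scheduler.capacity.root.hive2.capacity=50
"""

props = parse_properties(SETTINGS)
total = child_capacity_total(props)
print(total)
```

A total other than 100 suggests that a queue is missing from the list or that a capacity value was mistyped.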

Configure limits on usage for these queues and their users with the following settings:

yarn.scheduler.capacity.root.hive1.maximum-capacity=50
yarn.scheduler.capacity.root.hive2.maximum-capacity=50
yarn.scheduler.capacity.root.hive1.user-limit-factor=1
yarn.scheduler.capacity.root.hive2.user-limit-factor=1

Setting maximum-capacity to 50 places a hard limit on each queue at 50% of cluster resources. Because this equals the queue's configured capacity, neither queue can grow into the other's share. If maximum-capacity is set higher than 50, the queue can use more than its configured capacity when other resources in the cluster are idle. Even then, a single user can use only up to the configured queue capacity: the default user-limit-factor value of "1" means that any one user in the queue can occupy at most 1x the queue’s configured capacity. Together, these settings prevent users in one queue from monopolizing resources across all queues in a cluster.
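
To make the arithmetic concrete, the following Python sketch estimates the largest share of cluster resources that a single user in a queue could occupy. It assumes a simplified model in which the user's ceiling is the queue capacity multiplied by the per-user factor, capped by maximum-capacity. The real scheduler also weighs other inputs, such as the number of active users in the queue, which are ignored here. The function name max_user_share and its parameter names are hypothetical.

```python
def max_user_share(capacity, maximum_capacity, user_factor=1):
    """Estimate the largest percentage of the cluster one user can hold.

    capacity and maximum_capacity are percentages of cluster resources;
    user_factor is a multiplier of the queue's configured capacity.
    This is a simplified model, not the scheduler's exact computation.
    """
    return min(capacity * user_factor, maximum_capacity)


# Settings from this example: capacity 50, maximum-capacity 50, factor 1.
print(max_user_share(50, 50, 1))

# Hypothetical variation: maximum-capacity raised to 80, factor left at 1.
print(max_user_share(50, 80, 1))

# Hypothetical variation: factor raised to 2, still capped by maximum-capacity 80.
print(max_user_share(50, 80, 2))
```

Under this model the three calls give 50, 50, and 80 percent. Raising maximum-capacity alone lets the queue as a whole grow but leaves each user at the configured capacity, while raising the per-user factor lets one user grow up to the queue's maximum.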

This example is a basic introduction to queues. For more detailed information on allocating cluster resources using Capacity Scheduler queues, see the "Capacity Scheduler" section in the YARN Resource Management guide.