Chapter 5. Node Labels

This chapter describes how to use node labels to restrict YARN applications so that they run only on cluster nodes that have a specified node label.

As discussed in "Capacity Scheduler", the fundamental unit of scheduling in YARN is the queue. The capacity of each queue specifies the percentage of cluster resources that are available for applications submitted to the queue. Queues can be set up in a hierarchy that reflects the resource requirements and access restrictions required by the various organizations, groups, and users that utilize cluster resources.
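The effect of nesting queues on capacity can be illustrated with a small calculation. The sketch below assumes that each queue's configured capacity is read as a percentage of its parent's capacity, so a queue's share of the whole cluster is the product of the percentages along its path. The function name `absolute_capacity` and the queue names are hypothetical and chosen only for illustration; this is not the scheduler's own code.

```python
from typing import Dict


def absolute_capacity(queue_path: str, capacities: Dict[str, float]) -> float:
    """Return a queue's share of the whole cluster, as a percentage.

    Assumption: `capacities` maps each dotted queue path (for example
    "root.engineering") to a percentage of its parent queue's capacity.
    """
    share = 1.0
    parts = queue_path.split(".")
    # Walk from the root down to the requested queue, multiplying
    # the fractional capacity configured at each level.
    for depth in range(1, len(parts) + 1):
        prefix = ".".join(parts[:depth])
        share *= capacities[prefix] / 100.0
    return share * 100.0


# Hypothetical hierarchy: engineering gets 60% of the cluster,
# and its dev sub-queue gets half of that.
example_capacities = {
    "root": 100.0,
    "root.engineering": 60.0,
    "root.engineering.dev": 50.0,
}
dev_share = absolute_capacity("root.engineering.dev", example_capacities)
```

Under these assumed values, the `dev` queue ends up with roughly 30 percent of total cluster resources.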

Node labels can be assigned to cluster nodes. You can then associate node labels with capacity scheduler queues to specify which node labels each queue is allowed to access.

When a queue is associated with one or more node labels, all applications submitted to the queue run only on nodes with those specified labels. If no node label is assigned to a queue, the applications submitted to the queue can run on any node without a node label.
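The placement rule described above can be written as a short function. This is a simplified model of the rule as stated in this chapter, not the scheduler's implementation. It assumes that each node carries at most one label (represented as `None` when unlabeled); the function name `eligible_nodes` and the node and label names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def eligible_nodes(
    queue_labels: Set[str],
    node_labels: Dict[str, Optional[str]],
) -> List[str]:
    """Return the nodes on which applications from a queue may run.

    Assumption: `node_labels` maps each node name to its single label,
    or to None if the node has no label.
    """
    if not queue_labels:
        # A queue with no labels is limited to unlabeled nodes.
        return sorted(
            node for node, label in node_labels.items() if label is None
        )
    # A queue with labels is limited to nodes carrying one of them.
    return sorted(
        node for node, label in node_labels.items() if label in queue_labels
    )


# Hypothetical cluster: one GPU node, one SSD node, one unlabeled node.
example_nodes = {"node1": "gpu", "node2": None, "node3": "ssd"}
gpu_queue_nodes = eligible_nodes({"gpu"}, example_nodes)
default_queue_nodes = eligible_nodes(set(), example_nodes)
```

In this sketch, a queue associated with the `gpu` label is limited to `node1`, while a queue with no labels is limited to the unlabeled `node2`.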

Node labels are one of several YARN resource management capabilities, which also include CPU scheduling, CGroups, archival storage, and memory as storage.

