Configure node labels

You can configure node labels on a cluster by making configuration changes on the YARN ResourceManager host.

Enable Node Labels

To enable Node Labels on a cluster, make the following configuration changes on the YARN ResourceManager host.

  1. Create a Label Directory in HDFS

    Use the following commands to create a "node-labels" directory in which to store the Node Labels in HDFS.

    sudo su hdfs
    hadoop fs -mkdir -p /yarn/node-labels
    hadoop fs -chown -R yarn:yarn /yarn
    hadoop fs -chmod -R 700 /yarn

    -chmod -R 700 specifies that only the yarn user can access the "node-labels" directory.

    You can then use the following command to confirm that the directory was created in HDFS.

    hadoop fs -ls /yarn

    The new node label directory should appear in the list returned by this command. The owner should be yarn, and the permissions should be drwx------.

    Found 1 items
    drwx------ - yarn yarn 0 2014-11-24 13:09 /yarn/node-labels

    Use the following commands to create a /user/<username> directory that is required by the distributed shell.

    hadoop fs -mkdir -p /user/<username>
    hadoop fs -chown -R yarn:yarn /user/<username>
    hadoop fs -chmod -R 700 /user/<username>
  2. In Cloudera Manager, select the YARN service.
  3. Click the Configuration tab.
  4. Search for YARN Service Advanced Configuration.
  5. In YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml add the following:
    • Set the following property to enable Node Labels:
      Name: yarn.node-labels.enabled
      Value: true
    • Set the following property to reference the HDFS node label directory:
      Name: yarn.node-labels.fs-store.root-dir
      Value: hdfs://<host>:<port>/<absolute_path_to_node_label_directory>

      For example,

      Name: yarn.node-labels.fs-store.root-dir
      Value: hdfs://node-1.example.com:8020/yarn/node-labels/ 
  6. Start or Restart the YARN ResourceManager.
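
The ownership and permission check from step 1 can also be scripted. The following is a minimal shell sketch, not part of the product: the helper name check_label_dir_listing is hypothetical, and it assumes the hadoop fs -ls column layout shown in the sample listing in step 1 (permissions first, owner third, path last).

```shell
# Hypothetical helper: read `hadoop fs -ls` output on stdin and succeed only if
# the given path is listed with permissions drwx------ and owner yarn.
check_label_dir_listing() {
  awk -v path="$1" '
    $NF == path { found = 1; ok = ($1 == "drwx------" && $3 == "yarn") }
    END { exit !(found && ok) }'
}

# Demonstration against the sample listing from step 1.
echo "drwx------ - yarn yarn 0 2014-11-24 13:09 /yarn/node-labels" |
  check_label_dir_listing /yarn/node-labels && echo "node label directory looks correct"
```

On a cluster you would pipe the real listing into the helper instead, for example: hadoop fs -ls /yarn | check_label_dir_listing /yarn/node-labels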

Add Node Labels

Use the following command format to add Node Labels. You should run these commands as the yarn user. Node labels must be added before they can be assigned to nodes and associated with queues.

sudo su yarn
yarn rmadmin -addToClusterNodeLabels "<label1>(exclusive=<true|false>),<label2>(exclusive=<true|false>)"

For example, the following commands add the node label "x" as exclusive, and "y" as shareable (non-exclusive).

sudo su yarn
yarn rmadmin -addToClusterNodeLabels "x(exclusive=true),y(exclusive=false)"
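
When many labels are involved, the quoted argument can be assembled by a small helper rather than typed by hand. This is only a sketch: the function name build_add_labels_arg and the "label:exclusive" input convention are assumptions made for illustration, and the command is printed (dry run) rather than executed.

```shell
# Hypothetical helper: turn "label:exclusive" pairs into the argument string
# expected by -addToClusterNodeLabels, rejecting anything but true/false.
build_add_labels_arg() {
  local arg="" pair label exclusive
  for pair in "$@"; do
    label="${pair%%:*}"
    exclusive="${pair#*:}"
    case "$exclusive" in
      true|false) ;;
      *) echo "invalid exclusivity for $label: $exclusive" >&2; return 1 ;;
    esac
    arg="${arg:+$arg,}${label}(exclusive=${exclusive})"
  done
  printf '%s\n' "$arg"
}

# Dry run: print the command for labels x (exclusive) and y (non-exclusive).
echo "yarn rmadmin -addToClusterNodeLabels \"$(build_add_labels_arg x:true y:false)\""
```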

You can use the yarn cluster --list-node-labels command to confirm that Node Labels have been added:

[root@node-1 /]# yarn cluster --list-node-labels
15/07/11 13:55:43 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
15/07/11 13:55:43 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
Node Labels: <x:exclusivity=true>,<y:exclusivity=false>

You can use the following command format to remove Node Labels:

yarn rmadmin -removeFromClusterNodeLabels "<label1>,<label2>"

Assign Node Labels to Cluster Nodes

Use the following command format to add or replace node label assignments on cluster nodes:

yarn rmadmin -replaceLabelsOnNode "<node1>:<port>=<label1> <node2>:<port>=<label2>"

For example, the following commands assign node label "x" to "node-1.example.com", and node label "y" to "node-2.example.com". The port is omitted in this example.

sudo su yarn
yarn rmadmin -replaceLabelsOnNode "node-1.example.com=x node-2.example.com=y"

To remove node label assignments from a node, use -replaceLabelsOnNode, but do not specify any labels. For example, you would use the following commands to remove the "x" label from node-1.example.com:

sudo su yarn
yarn rmadmin -replaceLabelsOnNode "node-1.example.com"
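
If node-to-label assignments are kept in a plain list, the argument for -replaceLabelsOnNode can be generated from it. The sketch below assumes a hypothetical input format of one "<node> [label]" entry per line and a hypothetical helper name; a line without a label produces a bare node name, which corresponds to the removal form described above.

```shell
# Hypothetical helper: read "<node> [label]" lines on stdin and print the
# space-separated argument for -replaceLabelsOnNode.
build_replace_labels_arg() {
  awk '
    NF == 0 { next }                                  # skip empty lines
    { entry = (NF >= 2) ? ($1 "=" $2) : $1            # bare node clears labels
      out = (out == "") ? entry : (out " " entry) }
    END { print out }'
}

printf '%s\n' "node-1.example.com x" "node-2.example.com y" | build_replace_labels_arg
# prints: node-1.example.com=x node-2.example.com=y
```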

Associate Node Labels with Queues

Now that we have created Node Labels, we can associate them with queues in the capacity-scheduler.xml file.

You must specify a capacity for each node label that a queue can access. At every level of the queue hierarchy, the capacities that the direct children of a parent queue specify for a given node label must sum to 100%. The node labels that a queue can access (its accessible Node Labels) must be the same as, or a subset of, the accessible Node Labels of its parent queue.

Example:

Assume that a cluster has a total of 8 nodes. The first 3 nodes (n1-n3) have node label=x, the next 3 nodes (n4-n6) have node label=y, and the final 2 nodes (n7, n8) do not have any Node Labels. Each node can run 10 containers.

The queue hierarchy is as follows:

    root
      a
        a1
        a2
      b
        b1

Assume that queue “a” can access Node Labels “x” and “y”, and queue “b” can only access node label “y”. By definition, nodes without labels can be accessed by all queues.

Consider the following example label configuration for the queues:

capacity(a) = 40, capacity(a, label=x) = 100, capacity(a, label=y) = 50
capacity(b) = 60, capacity(b, label=y) = 50

This means that:

  • Queue “a” can access 40% of the resources on nodes without any labels, 100% of the resources on nodes with label=x, and 50% of the resources on nodes with label=y.

  • Queue “b” can access 60% of the resources on nodes without any labels, and 50% of the resources on nodes with label=y.

You can also see that for this configuration:

capacity(a) + capacity(b) = 100
capacity(a, label=x) + capacity(b, label=x) = 100 (b cannot access label=x, so its capacity for x is 0)
capacity(a, label=y) + capacity(b, label=y) = 100

For child queues under the same parent queue, the sum of the capacity for each label should equal 100%.

Similarly, we can set the capacities of the child queues a1, a2, and b1:

a1 and a2:

capacity(a.a1) = 40, capacity(a.a1, label=x) = 30, capacity(a.a1, label=y) = 50
capacity(a.a2) = 60, capacity(a.a2, label=x) = 70, capacity(a.a2, label=y) = 50

b1:

capacity(b.b1) = 100, capacity(b.b1, label=y) = 100

You can see that for the a1 and a2 configuration:

capacity(a.a1) + capacity(a.a2) = 100
capacity(a.a1, label=x) + capacity(a.a2, label=x) = 100
capacity(a.a1, label=y) + capacity(a.a2, label=y) = 100
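
The 100% rule can be checked mechanically before editing capacity-scheduler.xml. The following sketch is an illustration only: the helper name and the "<parent> <queue> <label> <capacity>" input format are assumptions, with "-" standing for nodes without a label. It sums the capacities of sibling queues per label and reports any sum other than 100.

```shell
# Hypothetical helper: input lines are "<parent> <queue> <label> <capacity>".
# Prints one line for each parent/label pair whose children do not sum to 100
# and returns a non-zero status in that case.
check_capacity_sums() {
  awk '
    { sum[$1 " " $3] += $4 }
    END {
      bad = 0
      for (key in sum)
        if (sum[key] != 100) { print "sum for " key " is " sum[key]; bad = 1 }
      exit bad
    }'
}

# The example configuration from this section.
check_capacity_sums <<'EOF' && echo "all sibling capacities sum to 100"
root a - 40
root a x 100
root a y 50
root b - 60
root b y 50
a a1 - 40
a a1 x 30
a a1 y 50
a a2 - 60
a a2 x 70
a a2 y 50
b b1 - 100
b b1 y 100
EOF
```

Queue b has no entry for label x because it cannot access that label, so queue a alone accounts for the full 100.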

How many resources can queue a1 access?

Resources on nodes without any labels:

Resource = 20 (total containers that can be allocated on nodes without a label, in this case n7, n8) * 40% (a.capacity) * 40% (a.a1.capacity) = 3.2 (containers)

Resources on nodes with label=x:

Resource = 30 (total containers that can be allocated on nodes with label=x, in this case n1-n3) * 100% (a.label-x.capacity) * 30% (a.a1.label-x.capacity) = 9 (containers)
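
The same arithmetic can be reproduced with a short helper. This is a sketch with a hypothetical function name; it multiplies the containers available in a partition by the parent and child queue percentages. The label=y result is not stated in the text above and is derived here from the example configuration (3 nodes with 10 containers each, 50% for queue a, 50% for queue a1).

```shell
# Hypothetical helper:
#   containers_for <total containers in partition> <parent queue %> <child queue %>
containers_for() {
  awk -v total="$1" -v parent="$2" -v child="$3" \
    'BEGIN { printf "%.1f\n", total * parent / 100 * child / 100 }'
}

containers_for 20 40 40    # nodes without labels (n7, n8): prints 3.2
containers_for 30 100 30   # label=x (n1-n3): prints 9.0
containers_for 30 50 50    # label=y (n4-n6): prints 7.5
```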

To implement this example configuration, you would add the following properties in the capacity-scheduler.xml file.

 
 Name: yarn.scheduler.capacity.root.queues
 Value: a,b
 
 Name: yarn.scheduler.capacity.root.accessible-node-labels.x.capacity
 Value: 100
  
 Name: yarn.scheduler.capacity.root.accessible-node-labels.y.capacity
 Value: 100
 

 <!-- configuration of queue-a -->
 
 Name: yarn.scheduler.capacity.root.a.accessible-node-labels
 Value: x,y
 
 Name: yarn.scheduler.capacity.root.a.capacity
 Value: 40
 
 Name: yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity
 Value: 100
 
 Name: yarn.scheduler.capacity.root.a.accessible-node-labels.y.capacity
 Value: 50
 
 Name: yarn.scheduler.capacity.root.a.queues
 Value: a1,a2
 

 <!-- configuration of queue-b -->
 
 Name: yarn.scheduler.capacity.root.b.accessible-node-labels
 Value: y
 
 Name: yarn.scheduler.capacity.root.b.capacity
 Value: 60
 
 Name: yarn.scheduler.capacity.root.b.accessible-node-labels.y.capacity
 Value: 50
 
 Name: yarn.scheduler.capacity.root.b.queues
 Value: b1
 

 <!-- configuration of queue-a.a1 -->
 
 Name: yarn.scheduler.capacity.root.a.a1.accessible-node-labels
 Value: x,y

 Name: yarn.scheduler.capacity.root.a.a1.capacity
 Value: 40
 
 Name: yarn.scheduler.capacity.root.a.a1.accessible-node-labels.x.capacity
 Value: 30
 
 Name: yarn.scheduler.capacity.root.a.a1.accessible-node-labels.y.capacity
 Value: 50


 <!-- configuration of queue-a.a2 -->
 
 Name: yarn.scheduler.capacity.root.a.a2.accessible-node-labels
 Value: x,y
 
 Name: yarn.scheduler.capacity.root.a.a2.capacity
 Value: 60
 
 Name: yarn.scheduler.capacity.root.a.a2.accessible-node-labels.x.capacity
 Value: 70
 
 Name: yarn.scheduler.capacity.root.a.a2.accessible-node-labels.y.capacity
 Value: 50


 <!-- configuration of queue-b.b1 -->
 
 Name: yarn.scheduler.capacity.root.b.b1.accessible-node-labels
 Value: y
 
 Name: yarn.scheduler.capacity.root.b.b1.capacity
 Value: 100
 
 Name: yarn.scheduler.capacity.root.b.b1.accessible-node-labels.y.capacity
 Value: 100
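
If you edit capacity-scheduler.xml directly, each Name/Value pair above corresponds to one property element in the standard Hadoop configuration format. The following sketch shows one way to convert such a listing; the helper name is hypothetical, and it assumes that each Value line follows its Name line and that values contain no spaces.

```shell
# Hypothetical helper: convert "Name: ..." / "Value: ..." pairs on stdin into
# Hadoop <property> elements. Other lines (such as comments) are ignored.
name_value_to_xml() {
  awk '
    $1 == "Name:"  { name = $2 }
    $1 == "Value:" {
      printf "<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n", name, $2
    }'
}

name_value_to_xml <<'EOF'
 Name: yarn.scheduler.capacity.root.queues
 Value: a,b

 Name: yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity
 Value: 100
EOF
```

The generated elements belong inside the configuration element of the file.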

Refresh Queues

After adding or updating queue node label properties in the capacity-scheduler.xml file, you must run the following commands to refresh the queues:

sudo su yarn
yarn rmadmin -refreshQueues 

Confirm Node Label Assignments

You can use the following commands to view information about node labels.

  • List all running nodes in the cluster: yarn node -list

    Example:

    [root@node-1 /]# yarn node -list
    14/11/21 12:14:06 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
    14/11/21 12:14:07 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
    Total Nodes:3
     Node-Id Node-State Node-Http-Address Number-of-Running-Containers
    node-3.example.com:45454 RUNNING node-3.example.com:50060 0
    node-1.example.com:45454 RUNNING node-1.example.com:50060 0
    node-2.example.com:45454 RUNNING node-2.example.com:50060 0
  • List all node labels in the cluster: yarn cluster --list-node-labels

    Example:

    [root@node-1 /]# yarn cluster --list-node-labels
    15/07/11 13:55:43 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
    15/07/11 13:55:43 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
    Node Labels: <x:exclusivity=true>,<y:exclusivity=false>
  • List the status of a node (includes node labels): yarn node -status <Node_ID>

    Example:

    [root@node-1 /]# yarn node -status node-1.example.com:45454
    14/11/21 06:32:35 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
    14/11/21 06:32:35 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
    Node Report : 
     Node-Id : node-1.example.com:45454
     Rack : /default-rack
     Node-State : RUNNING
     Node-Http-Address : node-1.example.com:50060
     Last-Health-Update : Fri 21/Nov/14 06:32:09:473PST
     Health-Report : 
     Containers : 0
     Memory-Used : 0MB
     Memory-Capacity : 1408MB
     CPU-Used : 0 vcores
     CPU-Capacity : 8 vcores
     Node-Labels : x

Node labels are also displayed in the ResourceManager UI on the Nodes and Scheduler pages.
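
To check a label assignment from a script, the Node-Labels line can be extracted from the node status report. This is a sketch under the assumption that the report uses the " : " separator shown in the example output above; the helper name is hypothetical.

```shell
# Hypothetical helper: print the value of the "Node-Labels" line from
# `yarn node -status` output read on stdin.
node_labels_from_status() {
  awk -F' : ' '$1 ~ /Node-Labels/ { print $2 }'
}

# Demonstration against two lines from the sample report.
printf '%s\n' " Node-State : RUNNING" " Node-Labels : x" | node_labels_from_status
# prints: x
```

On a cluster you would pipe the real report into the helper, for example: yarn node -status node-1.example.com:45454 | node_labels_from_status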

Specify a Child Queue with No Node Label

If no node label is specified for a child queue, it inherits the node label setting of its parent queue. To specify a child queue with no node label, use a blank space for the value of the node label.

For example:

 Name: yarn.scheduler.capacity.root.b.b1.accessible-node-labels
 Value: 

Set a Default Queue Node Label Expression

You can set a default node label on a queue. The default node label will be used if no label is specified when the job is submitted.

For example, to set "x" as the default node label for queue "b1", you would add the following property in the capacity-scheduler.xml file.

Name: yarn.scheduler.capacity.root.b.b1.default-node-label-expression
Value: x