1. Configuring Node Labels

To enable node labels, make the following configuration changes on the YARN ResourceManager host.

1. Create a Label Directory in HDFS

Use the following commands to create a "node-labels" directory in which to store the node labels in HDFS.

sudo su hdfs
hadoop fs -mkdir -p /yarn/node-labels
hadoop fs -chown -R yarn:yarn /yarn
hadoop fs -chmod -R 700 /yarn

-chmod -R 700 specifies that only the yarn user can access the "node-labels" directory.

You can then use the following command to confirm that the directory was created in HDFS.

hadoop fs -ls /yarn

The new node label directory should appear in the list returned by the following command. The owner should be yarn, and the permission should be drwx.

Found 1 items
drwx------ - yarn yarn 0 2014-11-24 13:09 /yarn/node-labels

Use the following commands to create a /user/yarn directory that is required by the distributed shell.

hadoop fs -mkdir -p /user/yarn
hadoop fs -chown -R yarn:yarn /user/yarn
hadoop fs -chmod -R 700 /user/yarn

The preceding commands assume that the yarn user will be submitting jobs with the distributed shell. To run the distributed shell with a different user, create the user, then use /user/<user_name> in the file paths of the commands above to create a new user directory.

2. Configure YARN for Node Labels

Add the following properties to the /etc/hadoop/conf/yarn-site.xml file on the ResourceManager host.

Set the following property to enable node labels:

 <property>
      <name>yarn.node-labels.manager-class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager</value>
</property>

Set the following property to reference the HDFS node label directory:

<property>
      <name>yarn.node-labels.fs-store.root-dir</name>
      <value>hdfs://<host>:<port>/<absolute_path_to_node_label_directory></value>
 </property>

For example:

<property>
     <name>yarn.node-labels.fs-store.root-dir</name>
     <value>hdfs://node-1.example.com:8020/yarn/node-labels/</value>
</property>

3. Start or Restart the YARN ResourceManager

In order for the configuration changes in the yarn-site.xml file to take effect, you must stop and restart the YARN ResourceManager if it is running, or start the ResourceManager if it is not running. To start or stop the ResourceManager, use the applicable commands in the "Controlling HDP Services Manually" section of the HDP Reference Guide.

4. Add and Assign Node Labels

For demonstration purposes, the following commands show how to use the yarn rmadmin client to add the node labels "x" and "y", but you can add your own node labels. You should run these commands as the yarn user. Node labels must be added before they can be assigned to nodes and associated with queues.

sudo su yarn
yarn rmadmin -addToClusterNodeLabels "x,y"

You can use the yarn cluster --list-node-labels command to confirm that node labels have been added:

 [root@node-1 /]# yarn cluster --list-node-labels
14/11/21 13:09:55 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
14/11/21 13:09:55 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
Node Labels: x,y
[root@node-1 /]# 
 

You can use the following command format to remove node labels:

yarn rmadmin -removeFromClusterNodeLabels "<label1>,<label2>"

[Note]Note

You cannot remove a node label if it is associated with a queue.

Use the following command format to add or replace node label assignments on cluster nodes:

yarn rmadmin -replaceLabelsOnNode "<node1>:<port>,<label1>,<label2> <node2>:<port>,<label1>,<label2>"

For example, the following commands assign node label "x" to "node-1.example.com", and node label "y" to "node-2.example.com".

sudo su yarn
yarn rmadmin -replaceLabelsOnNode "node-1.example.com,x node-2.example.com,y"
[Note]Note

You can only assign one node label to each node. Also, if you do not specify a port, the node label change will be applied to all NodeManagers on the host.

To remove node label assignments from a node, use -replaceLabelsOnNode, but do not specify any labels. For example, you would use the following commands to remove the "x" label from node-1.example.com:

sudo su yarn
yarn rmadmin -replaceLabelsOnNode "node-1.example.com"

5. Associating Node Labels with Queues

Now that we have created node labels, we can associate them with queues in the /etc/hadoop/conf/capacity-scheduler.xml file.

You must specify capacity on each node label of each queue, and also ensure that the sum of capacities of each node-label of direct children of a parent queue at every level is equal to 100%. Node labels that a queue can access (accessible node labels of a queue) must be the same as, or a subset of, the accessible node labels of its parent queue.

Example:

Assume that a cluster has a total of 8 nodes. The first 3 nodes (n1-n3) have node label=x, the next 3 nodes (n4-n6) have node label=y, and the final 2 nodes (n7, n8) do not have any node labels. Each node can run 10 containers.

The queue hierarchy is as follows:

Assume that queue “a” can access node labels “x” and “y”, and queue “b” can only access node label “y”. By definition, nodes without labels can be accessed by all queues.

Consider the following example label configuration for the queues:

capacity(a) = 40, capacity(a, label=x) = 100, capacity(a, label=y) = 50; capacity(b) = 60, capacity(b, label=y) = 50

This means that:

  • Queue “a” can access 40% of the resources on nodes without any labels, 100% of the resources on nodes with label=x, and 50% of the resources on nodes with label=y.

  • Queue “b” can access 60% of the resources on nodes without any labels, and 50% of the resources on nodes with label=y.

You can also see that for this configuration:

capacity(a) + capacity(b) = 100 capacity(a, label=x) + capacity(b, label=x) (b cannot access label=x, it is 0) = 100 capacity(a, label=y) + capacity(b, label=y) = 100

For child queues under the same parent queue, the sum of the capacity for each label should equal 100%.

Similarly, we can set the capacities of the child queues a1, a2, and b1:

a1 and a2: capacity(a.a1) = 40, capacity(a.a1, label=x) =30, capacity(a.a1, label=y) =50 capacity(a.a2) = 60, capacity(a.a2, label=x) =70, capacity(a.a2, label=y) =50 b1: capacity(b.b1) = 100 capacity(b.b1, label=y) = 100

You can see that for the a1 and a2 configuration:

capacity(a.a1) + capacity(a.a2) = 100 capacity(a.a1, label=x) + capacity(a.a2, label=x) = 100 capacity(a.a1, label=y) + capacity(a.a2, label=y) = 100

How many resources can queue a1 access?

Resources on nodes without any labels: Resource = 20 (total containers that can be allocated on nodes without label, in this case n7, n8) * 40% (a.capacity) * 40% (a.a1.capacity) = 3.2 (containers)

Resources on nodes with label=x

Resource = 30 (total containers that can be allocated on nodes with label=x, in this case n1-n3) * 100% (a.label-x.capacity) * 30% = 9 (containers)

To implement this example configuration, you would add the following properties in the /etc/hadoop/conf/capacity-scheduler.xml file.

 <property>
 <name>yarn.scheduler.capacity.root.queues</name>
 <value>a,b</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.accessible-node-labels.x.capacity</name>
 <value>100</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.accessible-node-labels.y.capacity</name>
 <value>100</value>
 </property>

 <!-- configuration of queue-a -->
 <property>
 <name>yarn.scheduler.capacity.root.a.accessible-node-labels</name>
 <value>x,y</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.capacity</name>
 <value>40</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity</name>
 <value>100</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.accessible-node-labels.y.capacity</name>
 <value>50</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.queues</name>
 <value>a1,a2</value>
 </property>

 <!-- configuration of queue-b -->
 <property>
 <name>yarn.scheduler.capacity.root.b.accessible-node-labels</name>
 <value>y</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.b.capacity</name>
 <value>60</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.b.accessible-node-labels.y.capacity</name>
 <value>50</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.b.queues</name>
 <value>b1</value>
 </property>

 <!-- configuration of queue-a.a1 -->
 <property>
 <name>yarn.scheduler.capacity.root.a.a1.accessible-node-labels</name>
 <value>x,y</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.a1.capacity</name>
 <value>40</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.a1.accessible-node-labels.x.capacity</name>
 <value>30</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.a1.accessible-node-labels.y.capacity</name>
 <value>50</value>
 </property>

 <!-- configuration of queue-a.a2 -->
 <property>
 <name>yarn.scheduler.capacity.root.a.a2.accessible-node-labels</name>
 <value>x,y</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.a2.capacity</name>
 <value>60</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.a2.accessible-node-labels.x.capacity</name>
 <value>70</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.a.a2.accessible-node-labels.y.capacity</name>
 <value>50</value>
 </property>

 <!-- configuration of queue-b.b1 -->
 <property>
 <name>yarn.scheduler.capacity.root.b.b1.accessible-node-labels</name>
 <value>y</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.b.b1.capacity</name>
 <value>100</value>
 </property>

 <property>
 <name>yarn.scheduler.capacity.root.b.b1.accessible-node-labels.y.capacity</name>
 <value>100</value>
 </property>

6. Refresh Queues

After adding or updating queue node label properties in the capacity-scheduler.xml file, you must run the following commands to refresh the queues:

sudo su yarn
yarn rmadmin -refreshQueues 

8. Confirm Node Label Assignments

You can use the following commands to view information about node labels.

  • List all running nodes in the cluster: yarn node -list

    Example:

    [root@node-1 /]# yarn node -list
    14/11/21 12:14:06 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
    14/11/21 12:14:07 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
    Total Nodes:3
     Node-Id Node-State Node-Http-Address Number-of-Running-Containers
    node-3.example.com:45454 RUNNING node-3.example.com:50060 0
    node-1.example.com:45454 RUNNING node-1.example.com:50060 0
    node-2.example.com:45454 RUNNING node-2.example.com:50060 0
    [root@node-1 /]# 
    
  • List all node labels in the cluster: yarn cluster --list-node-labels

    Example:

    [root@node-1 /]# yarn cluster --list-node-labels
    14/11/21 13:09:55 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
    14/11/21 13:09:55 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
    Node Labels: x,y
    [root@node-1 /]# 
    

  • List the status of a node (includes node labels): yarn node -status <Node_ID>

    Example:

    [root@node-1 /]# yarn node -status node-1.example.com:45454
    14/11/21 06:32:35 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
    14/11/21 06:32:35 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
    Node Report : 
     Node-Id : node-1.example.com:45454
     Rack : /default-rack
     Node-State : RUNNING
     Node-Http-Address : node-1.example.com:50060
     Last-Health-Update : Fri 21/Nov/14 06:32:09:473PST
     Health-Report : 
     Containers : 0
     Memory-Used : 0MB
     Memory-Capacity : 1408MB
     CPU-Used : 0 vcores
     CPU-Capacity : 8 vcores
     Node-Labels : x
    
    [root@node-1 /]# 
    

Node labels are also displayed in the ResourceManager UI on the Nodes and Scheduler pages.

Specifying a Child Queue with No Node Label

If no node label is specified for a child queue, it inherits the node label setting of its parent queue. To specify a child queue with no node label, use a blank space for the value of the node label.

For example:

 <property>
 <name>yarn.scheduler.capacity.root.b.b1.accessible-node-labels</name>
 <value> </value>
 </property>
 

Setting a Default Queue Node Label Expression

You can set a default node label on a queue. The default node label will be used if no label is specified when the job is submitted.

For example, to set "x"as the default node label for queue "b1", you would add the following property in the capacity-scheduler.xml file.

<property>
 <name>yarn.scheduler.capacity.root.b.b1.default-node-label-expression</name>
 <value>x</value>
</property> 

loading table of contents...