Configure node labels
You can configure node labels on a cluster by making configuration changes on the YARN ResourceManager host.
Enable Node Labels
To enable Node Labels on a cluster, make the following configuration changes on the YARN ResourceManager host.
- Create a Label Directory in HDFS
Use the following commands to create a "node-labels" directory in which to store the Node Labels in HDFS.
sudo su hdfs
hadoop fs -mkdir -p /yarn/node-labels
hadoop fs -chown -R yarn:yarn /yarn
hadoop fs -chmod -R 700 /yarn
-chmod -R 700 specifies that only the yarn user can access the "node-labels" directory.
You can then use the following command to confirm that the directory was created in HDFS.
hadoop fs -ls /yarn
The new node label directory should appear in the list returned by this command. The owner should be yarn, and the permissions should be drwx------.
Found 1 items
drwx------ - yarn yarn 0 2014-11-24 13:09 /yarn/node-labels
Use the following commands to create a /user/<username> directory that is required by the distributed shell.
hadoop fs -mkdir -p /user/<username>
hadoop fs -chown -R yarn:yarn /user/<username>
hadoop fs -chmod -R 700 /user/<username>
- In Cloudera Manager, select the YARN service.
- Click the Configuration tab.
- Search for YARN Service Advanced Configuration.
- In YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml, add the following:
- Set the following property to enable Node Labels:
Name: yarn.node-labels.enabled Value: true
- Set the following property to reference the HDFS node label directory
Name: yarn.node-labels.fs-store.root-dir Value: hdfs://<host>:<port>/<absolute_path_to_node_label_directory>
For example,
Name: yarn.node-labels.fs-store.root-dir Value: hdfs://node-1.example.com:8020/yarn/node-labels/
- Start or Restart the YARN ResourceManager.
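The two properties above can also be written as Hadoop-style property elements, which is the form yarn-site.xml itself uses. The following is a minimal Python sketch, assuming the Safety Valve field accepts that XML form. render_properties is a hypothetical helper, not part of Cloudera Manager or YARN, and the host and port are copied from the example above.

```python
import xml.etree.ElementTree as ET


def render_properties(props):
    """Render a dict of property names and values as <property> XML elements."""
    chunks = []
    for name, value in props.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        chunks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(chunks)


# Values taken from the example above; adjust the host, port, and path for your cluster.
node_label_props = {
    "yarn.node-labels.enabled": "true",
    "yarn.node-labels.fs-store.root-dir": "hdfs://node-1.example.com:8020/yarn/node-labels/",
}

snippet = render_properties(node_label_props)
print(snippet)
```

Generating the snippet from a dictionary keeps the property names in one place, which may help when the same settings are applied to several clusters.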
Add Node Labels
Use the following command format to add Node Labels. You should run these commands as the yarn user. Node labels must be added before they can be assigned to nodes and associated with queues.
sudo su yarn
yarn rmadmin -addToClusterNodeLabels "<label1>(exclusive=<true|false>),<label2>(exclusive=<true|false>)"
For example, the following commands add the node label "x" as exclusive, and "y" as shareable (non-exclusive).
sudo su yarn
yarn rmadmin -addToClusterNodeLabels "x(exclusive=true),y(exclusive=false)"
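When labels are managed from a script, the argument string in the command format above can be assembled from a mapping of label names to exclusivity. The following is a small Python sketch. add_labels_arg is a hypothetical helper name, and the sketch only builds the command line; it does not run it.

```python
def add_labels_arg(labels):
    """Build the -addToClusterNodeLabels argument from {label: exclusive}."""
    parts = []
    for name, exclusive in labels.items():
        flag = "true" if exclusive else "false"
        parts.append(f"{name}(exclusive={flag})")
    return ",".join(parts)


# Same labels as the example above: "x" exclusive, "y" shareable.
arg = add_labels_arg({"x": True, "y": False})
command = ["yarn", "rmadmin", "-addToClusterNodeLabels", arg]
print(command)
```

Passing the command as a list (for example to subprocess.run) avoids shell quoting problems with the parentheses in the argument.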
You can use the yarn cluster --list-node-labels command to confirm that Node Labels have been added:
[root@node-1 /]# yarn cluster --list-node-labels
15/07/11 13:55:43 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
15/07/11 13:55:43 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
Node Labels: <x:exclusivity=true>,<y:exclusivity=false>
Use the following command format to remove Node Labels:
yarn rmadmin -removeFromClusterNodeLabels "<label1>,<label2>"
Assign Node Labels to Cluster Nodes
Use the following command format to add or replace node label assignments on cluster nodes:
yarn rmadmin -replaceLabelsOnNode "<node1>:<port>=<label1> <node2>:<port>=<label2>"
For example, the following commands assign node label "x" to "node-1.example.com", and node label "y" to "node-2.example.com".
sudo su yarn
yarn rmadmin -replaceLabelsOnNode "node-1.example.com=x node-2.example.com=y"
To remove node label assignments from a node, use -replaceLabelsOnNode, but do not specify any labels. For example, you would use the following commands to remove the "x" label from node-1.example.com:
sudo su yarn
yarn rmadmin -replaceLabelsOnNode "node-1.example.com"
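Both cases, assigning a label and clearing it, use the same argument format, so a script can build the argument from a mapping of nodes to labels. The following Python sketch assumes one label per node, as in the examples above. replace_labels_arg is a hypothetical helper name, and a value of None stands for "remove the label".

```python
def replace_labels_arg(assignments):
    """Build the -replaceLabelsOnNode argument from {node: label or None}."""
    parts = []
    for node, label in assignments.items():
        # A node listed without "=<label>" has its label assignment removed.
        parts.append(f"{node}={label}" if label else node)
    return " ".join(parts)


assign = replace_labels_arg({"node-1.example.com": "x", "node-2.example.com": "y"})
clear = replace_labels_arg({"node-1.example.com": None})
print(assign)
print(clear)
```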
Associate Node Labels with Queues
Now that we have created Node Labels, we can associate them with queues in the capacity-scheduler.xml file.
You must specify capacity on each node label of each queue, and also ensure that the sum of capacities of each node-label of direct children of a parent queue at every level is equal to 100%. Node labels that a queue can access (accessible Node Labels of a queue) must be the same as, or a subset of, the accessible Node Labels of its parent queue.
Example:
Assume that a cluster has a total of 8 nodes. The first 3 nodes (n1-n3) have node label=x, the next 3 nodes (n4-n6) have node label=y, and the final 2 nodes (n7, n8) do not have any Node Labels. Each node can run 10 containers.
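The queue hierarchy is as follows: the root queue has two child queues, a and b; queue a has two child queues, a1 and a2; and queue b has one child queue, b1.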
Assume that queue “a” can access Node Labels “x” and “y”, and queue “b” can only access node label “y”. By definition, nodes without labels can be accessed by all queues.
Consider the following example label configuration for the queues:
capacity(a) = 40, capacity(a, label=x) = 100, capacity(a, label=y) = 50
capacity(b) = 60, capacity(b, label=y) = 50
This means that:
- Queue “a” can access 40% of the resources on nodes without any labels, 100% of the resources on nodes with label=x, and 50% of the resources on nodes with label=y.
- Queue “b” can access 60% of the resources on nodes without any labels, and 50% of the resources on nodes with label=y.
You can also see that for this configuration:
capacity(a) + capacity(b) = 100
capacity(a, label=x) + capacity(b, label=x) = 100 (b cannot access label=x, so its capacity for label=x is 0)
capacity(a, label=y) + capacity(b, label=y) = 100
For child queues under the same parent queue, the sum of the capacity for each label should equal 100%.
Similarly, we can set the capacities of the child queues a1, a2, and b1:
a1 and a2:
capacity(a.a1) = 40, capacity(a.a1, label=x) = 30, capacity(a.a1, label=y) = 50
capacity(a.a2) = 60, capacity(a.a2, label=x) = 70, capacity(a.a2, label=y) = 50
b1:
capacity(b.b1) = 100
capacity(b.b1, label=y) = 100
You can see that for the a1 and a2 configuration:
capacity(a.a1) + capacity(a.a2) = 100
capacity(a.a1, label=x) + capacity(a.a2, label=x) = 100
capacity(a.a1, label=y) + capacity(a.a2, label=y) = 100
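The rule that sibling capacities must sum to 100 for each label can be checked mechanically before editing capacity-scheduler.xml. The following is a minimal Python sketch using the example capacities above. The dictionary layout and the function names are assumptions made for illustration, and the empty string stands for nodes without any label.

```python
# Queue path -> {label: capacity}; "" means nodes without any label.
capacities = {
    "root.a": {"": 40, "x": 100, "y": 50},
    "root.b": {"": 60, "y": 50},
    "root.a.a1": {"": 40, "x": 30, "y": 50},
    "root.a.a2": {"": 60, "x": 70, "y": 50},
    "root.b.b1": {"": 100, "y": 100},
}


def sibling_sums(capacities):
    """Sum capacities per (parent queue, label) over the parent's direct children."""
    sums = {}
    for queue, per_label in capacities.items():
        parent = queue.rsplit(".", 1)[0]
        for label, capacity in per_label.items():
            key = (parent, label)
            sums[key] = sums.get(key, 0) + capacity
    return sums


def invalid_sums(capacities):
    """Return the (parent, label) pairs whose children do not sum to 100."""
    return {key: total for key, total in sibling_sums(capacities).items() if total != 100}


print(invalid_sums(capacities))
```

A label that no child queue lists is not reported by this sketch, because it only sums the capacities that are present.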
How many resources can queue a1 access?
Resources on nodes without any labels:
Resource = 20 (total containers that can be allocated on nodes without a label, in this case n7, n8) * 40% (a.capacity) * 40% (a.a1.capacity) = 3.2 (containers)
Resources on nodes with label=x:
Resource = 30 (total containers that can be allocated on nodes with label=x, in this case n1-n3) * 100% (a.label-x.capacity) * 30% (a.a1.label-x.capacity) = 9 (containers)
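The two calculations above follow one pattern: multiply the container total for a set of nodes by the capacity percentage of each queue on the path from the root to the target queue. The following is a short Python sketch of that arithmetic. containers_for_queue is a hypothetical helper name.

```python
def containers_for_queue(total_containers, capacity_percents):
    """Scale a container total by each capacity percentage along the queue path."""
    share = total_containers
    for percent in capacity_percents:
        share = share * percent
    # Divide once at the end so whole-number percentages stay exact until the last step.
    return share / (100 ** len(capacity_percents))


# Nodes without labels: n7 and n8, 10 containers each; path a (40%) -> a1 (40%).
a1_no_label = containers_for_queue(20, [40, 40])
# Nodes with label=x: n1-n3, 10 containers each; path a (100%) -> a1 (30%).
a1_label_x = containers_for_queue(30, [100, 30])
print(a1_no_label, a1_label_x)
```

The same function could be applied to label=y by passing the label=y capacities of queues a and a1.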
To implement this example configuration, you would add the following properties in the capacity-scheduler.xml file.
Name: yarn.scheduler.capacity.root.queues Value: a,b
Name: yarn.scheduler.capacity.root.accessible-node-labels.x.capacity Value: 100
Name: yarn.scheduler.capacity.root.accessible-node-labels.y.capacity Value: 100
<!-- configuration of queue-a -->
Name: yarn.scheduler.capacity.root.a.accessible-node-labels Value: x,y
Name: yarn.scheduler.capacity.root.a.capacity Value: 40
Name: yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity Value: 100
Name: yarn.scheduler.capacity.root.a.accessible-node-labels.y.capacity Value: 50
Name: yarn.scheduler.capacity.root.a.queues Value: a1,a2
<!-- configuration of queue-b -->
Name: yarn.scheduler.capacity.root.b.accessible-node-labels Value: y
Name: yarn.scheduler.capacity.root.b.capacity Value: 60
Name: yarn.scheduler.capacity.root.b.accessible-node-labels.y.capacity Value: 50
Name: yarn.scheduler.capacity.root.b.queues Value: b1
<!-- configuration of queue-a.a1 -->
Name: yarn.scheduler.capacity.root.a.a1.accessible-node-labels Value: x,y
Name: yarn.scheduler.capacity.root.a.a1.capacity Value: 40
Name: yarn.scheduler.capacity.root.a.a1.accessible-node-labels.x.capacity Value: 30
Name: yarn.scheduler.capacity.root.a.a1.accessible-node-labels.y.capacity Value: 50
<!-- configuration of queue-a.a2 -->
Name: yarn.scheduler.capacity.root.a.a2.accessible-node-labels Value: x,y
Name: yarn.scheduler.capacity.root.a.a2.capacity Value: 60
Name: yarn.scheduler.capacity.root.a.a2.accessible-node-labels.x.capacity Value: 70
Name: yarn.scheduler.capacity.root.a.a2.accessible-node-labels.y.capacity Value: 50
<!-- configuration of queue-b.b1 -->
Name: yarn.scheduler.capacity.root.b.b1.accessible-node-labels Value: y
Name: yarn.scheduler.capacity.root.b.b1.capacity Value: 100
Name: yarn.scheduler.capacity.root.b.b1.accessible-node-labels.y.capacity Value: 100
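The property names above follow a regular pattern built from the queue path and the label name. The following Python sketch makes the pattern explicit by generating the three kinds of per-queue properties for one queue. queue_properties is a hypothetical helper, and the sketch covers only the properties shown in this example.

```python
PREFIX = "yarn.scheduler.capacity"


def queue_properties(queue_path, capacity, label_capacities):
    """Return {property name: value} for one queue's capacity and node label settings."""
    base = f"{PREFIX}.{queue_path}"
    props = {
        # Labels the queue can access, as a comma-separated list.
        f"{base}.accessible-node-labels": ",".join(label_capacities),
        # Capacity on nodes without any label.
        f"{base}.capacity": str(capacity),
    }
    for label, label_capacity in label_capacities.items():
        # Capacity on nodes that carry this label.
        props[f"{base}.accessible-node-labels.{label}.capacity"] = str(label_capacity)
    return props


a1_props = queue_properties("root.a.a1", 40, {"x": 30, "y": 50})
for name, value in a1_props.items():
    print(f"Name: {name} Value: {value}")
```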
Refresh Queues
After adding or updating queue node label properties in the capacity-scheduler.xml file, you must run the following commands to refresh the queues:
sudo su yarn
yarn rmadmin -refreshQueues
Confirm Node Label Assignments
You can use the following commands to view information about node labels.
- List all running nodes in the cluster:
yarn node -list
Example:
[root@node-1 /]# yarn node -list
14/11/21 12:14:06 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
14/11/21 12:14:07 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
Total Nodes:3
Node-Id                   Node-State  Node-Http-Address         Number-of-Running-Containers
node-3.example.com:45454  RUNNING     node-3.example.com:50060  0
node-1.example.com:45454  RUNNING     node-1.example.com:50060  0
node-2.example.com:45454  RUNNING     node-2.example.com:50060  0
- List all node labels in the cluster:
yarn cluster --list-node-labels
Example:
[root@node-1 /]# yarn cluster --list-node-labels
15/07/11 13:55:43 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
15/07/11 13:55:43 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
Node Labels: <x:exclusivity=true>,<y:exclusivity=false>
- List the status of a node (includes node labels):
yarn node -status <Node_ID>
Example:
[root@node-1 /]# yarn node -status node-1.example.com:45454
14/11/21 06:32:35 INFO impl.TimelineClientImpl: Timeline service address: http://node-1.example.com:8188/ws/v1/timeline/
14/11/21 06:32:35 INFO client.RMProxy: Connecting to ResourceManager at node-1.example.com/240.0.0.10:8032
Node Report :
Node-Id : node-1.example.com:45454
Rack : /default-rack
Node-State : RUNNING
Node-Http-Address : node-1.example.com:50060
Last-Health-Update : Fri 21/Nov/14 06:32:09:473PST
Health-Report :
Containers : 0
Memory-Used : 0MB
Memory-Capacity : 1408MB
CPU-Used : 0 vcores
CPU-Capacity : 8 vcores
Node-Labels : x
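To check label assignments from a script, the Node-Labels field can be pulled out of the node report text. The following Python sketch parses a trimmed copy of the example report above. It assumes each field appears on its own line as "key : value" and that multiple labels, if present, are comma-separated; both are assumptions about the output layout, and the function names are hypothetical.

```python
# Trimmed copy of the example node report; INFO log lines are left out.
SAMPLE_REPORT = """\
Node Report :
Node-Id : node-1.example.com:45454
Rack : /default-rack
Node-State : RUNNING
Node-Labels : x
"""


def parse_node_report(text):
    """Split each 'key : value' line at the first colon into a dict entry."""
    report = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            report[key.strip()] = value.strip()
    return report


def node_labels(report):
    """Return the labels in the Node-Labels field as a list (empty if none)."""
    raw = report.get("Node-Labels", "")
    return [label.strip() for label in raw.split(",") if label.strip()]


report = parse_node_report(SAMPLE_REPORT)
print(report["Node-Id"], node_labels(report))
```

Splitting at the first colon only keeps values such as node-1.example.com:45454 intact.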
Node labels are also displayed in the ResourceManager UI on the Nodes and Scheduler pages.
Specify a Child Queue with No Node Label
If no node label is specified for a child queue, it inherits the node label setting of its parent queue. To specify a child queue with no node label, use a blank space for the value of the node label.
For example:
Name: yarn.scheduler.capacity.root.b.b1.accessible-node-labels Value:
Set a Default Queue Node Label Expression
You can set a default node label on a queue. The default node label will be used if no label is specified when the job is submitted.
For example, to set "x" as the default node label for queue "b1", you would add the following property in the capacity-scheduler.xml file.
Name: yarn.scheduler.capacity.root.b.b1.default-node-label-expression Value: x