YARN Resource Management

Deployment Considerations

For high availability (HA) purposes, do not deploy two HBase Masters on the same host.

The Thrift, Thrift2, and REST gateways are likely to receive heavy traffic and should be deployed on dedicated hosts.

Memory Considerations for Running One Component on a Node

Depending on the NodeManager configuration of each node, you can adjust the amount of memory requested by a component so that two such components cannot run on the same node (mutual exclusion). Typically, all NodeManagers are configured with the same memory value, independent of the physical memory actually installed on each node.

Assuming the memory capacity of each NodeManager is known (yarn.nodemanager.resource.memory-mb), you can configure the component to ask for 51% (just over half) of that capacity, so that no two such containers fit on the same node. You also need to ensure that the maximum possible memory allocation (yarn.scheduler.maximum-allocation-mb) allows a request of that size.

For example, if yarn.nodemanager.resource.memory-mb = yarn.scheduler.maximum-allocation-mb = 2048, set yarn.memory = 1280 for the RegionServer.

Then set the HBase Master/RegionServer maximum memory to 256 MB less than that to allow for the agent's memory consumption. The agent should not use more than 100 MB, but assuming ~256 MB leaves a comfortable margin. In this example, set the HBase Master/RegionServer memory limit to 1024 MB.
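For reference, a minimal resources.json sketch for this scenario might look like the following. The yarn.memory values are the container sizes discussed above; the daemon heap limits themselves are set separately in appConfig.json, and the exact keys there vary by app package version:

{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
  },
  "components": {
    "HBASE_MASTER": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.memory": "1280"
    },
    "HBASE_REGIONSERVER": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "1",
      "yarn.memory": "1280"
    }
  }
}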

Log Aggregation

Log aggregation is specified in the global section of resources.json:

"global": { 
                  "yarn.log.include.patterns": "", 
                  "yarn.log.exclude.patterns": "", 
                  "yarn.log.interval": "0"
     },

The yarn.log.interval unit is seconds.

You can specify the name(s) of log files (for example, agent.log) that you do not want to aggregate using yarn.log.exclude.patterns.
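For example, the following global settings exclude the agent log from aggregation; the interval value shown here (hourly) is illustrative:

"global": {
    "yarn.log.include.patterns": "",
    "yarn.log.exclude.patterns": "agent.log",
    "yarn.log.interval": "3600"
},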

The aggregated logs are stored in the HDFS /app-logs/ directory.

The following command can be used to retrieve the logs:

yarn logs -applicationId <app_id>
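If you do not know the application ID, you can list the running YARN applications first. For example (the application ID shown is illustrative):

# List running YARN applications and note the ID of the Slider application
yarn application -list

# Retrieve the aggregated logs for that application
yarn logs -applicationId application_1413942326640_0001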

Reserving Nodes for HBase

You can use YARN node labels to reserve cluster nodes for applications and their components.

Node labels are specified with the yarn.label.expression property. If no label is specified, only non-labeled nodes are used when allocating containers for component instances.

If a label expression is specified for slider-appmaster, it also becomes the default label expression for all components. To take advantage of the default label expression, omit the property entirely (see HBASE_REGIONSERVER in the example below). A label expression with an empty string ("yarn.label.expression":"") is equivalent to requesting nodes without labels.

For example, the following is a resources.json file for an HBase cluster that uses node labels. The label for the application instance is "hbase1", and the label expression for the HBASE_MASTER component is "hbase1_master". HBASE_REGIONSERVER instances automatically use the label "hbase1". Alternatively, if you specify "yarn.label.expression":"" for HBASE_REGIONSERVER, its containers will only be allocated on nodes with no labels.

{ "schema": "http://example.org/specification/v2.0.0", 
  "metadata": {
  }, 
  "global": {
  }, 
  "components": { 
            "HBASE_MASTER": { 
                  "yarn.role.priority": "1", 
                  "yarn.component.instances": "1", 
                  "yarn.label.expression":"hbase1_master"
           }, 
           "HBASE_REGIONSERVER": { 
                  "yarn.role.priority": "1", 
                  "yarn.component.instances": "1",
           }, 
           "slider-appmaster": { 
                  "yarn.label.expression":"hbase1"
           }
     }
}
            

Specifically, for the above example you would need to:

  • Create two node labels, "hbase1" and "hbase1_master" (using yarn rmadmin commands)

  • Assign the labels to nodes (using yarn rmadmin commands)

  • Create a queue by defining it in the capacity-scheduler.xml configuration file.

  • Allow the queue access to the labels, and ensure that appropriate minimum and maximum capacities are assigned.

  • Refresh the queues (yarn rmadmin -refreshQueues)

  • Create the Slider application against the above queue by passing the --queue parameter to the create command (see the sketch after this list).
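A minimal command sketch of these steps follows. The host names, the queue name "hbase", and the file paths are assumptions for illustration, and the yarn rmadmin syntax can vary slightly between Hadoop versions:

# 1. Create the two node labels
yarn rmadmin -addToClusterNodeLabels "hbase1,hbase1_master"

# 2. Assign the labels to nodes (host names are hypothetical)
yarn rmadmin -replaceLabelsOnNode "node1.example.com=hbase1 node2.example.com=hbase1_master"

# 3. After defining the "hbase" queue in capacity-scheduler.xml and granting it access
#    to the labels (accessible-node-labels and per-label capacity settings), refresh the queues
yarn rmadmin -refreshQueues

# 4. Create the Slider application against that queue
slider create hbase1 --template appConfig.json --resources resources.json --queue hbase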

Retrieving Effective hbase-site.xml

Once HBase is running on Slider, you can use the following command to retrieve hbase-site.xml so that your client can connect to the cluster:

slider registry --getconf hbase-site --name <cluster name> --format xml --dest <path to local hbase-site.xml> --filesystem <hdfs namenode> --manager <resource manager>

Note that the hbase.tmp.dir in the returned file may be inside the YARN local directory for the container (which is local to the host where the container is running).

For example:

 <property>
   <name>hbase.tmp.dir</name>
   <value>/grid/0/hadoop/yarn/local/usercache/test/appcache/application_1413942326640_0001/container_1413942326640_0001_01_000005/work/app/tmp</value>
   <source/>
 </property>

If the client does not have access to this directory, change hbase.tmp.dir to a directory that is writable by the user to avoid the permission issue.
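For example, you could override the property in the client-side hbase-site.xml; the path shown here is illustrative, and any directory writable by the client user will do:

 <property>
   <name>hbase.tmp.dir</name>
   <value>/tmp/hbase-client-tmp</value>
 </property>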

Retrieving REST gateway for Slider HBase

Retrieve the quicklinks for Slider HBase first. The "org.apache.slider.hbase.rest" entry shows the host name and port of the REST gateway.

Retrieving thrift gateway for Slider HBase

Retrieve the quicklinks for Slider HBase first. The "org.apache.slider.hbase.thrift" entry shows the host name and port of the thrift gateway.

Retrieving thrift2 gateway for Slider HBase

Retrieve the quicklinks for Slider HBase first. The "org.apache.slider.hbase.thrift2" entry shows the host name and port of the thrift2 gateway.
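In each case, the quicklinks can be read from the Slider registry. As a sketch (the application name hbase1 is illustrative, and the exact registry options depend on your Slider version):

slider registry --name hbase1 --getexp quicklinks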

Workaround for HBase Client Issue

After you create an HBase application instance in Slider, you can use the slider registry command to retrieve the hbase-site.xml file:

slider registry --name hbase1 --getconf hbase-site --format xml --out hbase-site.xml

The path of the hbase.tmp.dir property in the returned file will be inside the YARN local directory for the container (which is local to the host where the container is running):

 <property>
   <name>hbase.tmp.dir</name>
   <value>/grid/0/hadoop/yarn/local/usercache/test/appcache/application_1413942326640_0001/container_1413942326640_0001_01_000005/work/app/tmp</value>
   <source/>
 </property>

The HBase client (specifically the HBase shell) expects to see a directory named "jars" in this directory. If it is not there, the HBase client attempts to create it; because the client typically cannot write to the container's YARN local directory, this fails and shell operations cannot be performed:

[test@ip-10-0-0-66 hbase]$ hbase --config ./conf shell
...
hbase(main):001:0> status 'simple'
...
ERROR: Failed to create local dir /grid/0/hadoop/yarn/local/usercache/test/appcache/application_1413942326640_0001/container_1413942326640_0001_01_000005/work/app/tmp/local/jars, DynamicClassLoader failed to init

Workaround: Change hbase.tmp.dir to a directory that is writable by the user running "hbase shell", or to a directory that already contains the "jars" directory.
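Putting the workaround together, a minimal sketch might look like the following; the directory /home/test/hbase-tmp and the application name hbase1 are assumptions for illustration:

# Retrieve the effective configuration into the local client conf directory
slider registry --name hbase1 --getconf hbase-site --format xml --out ./conf/hbase-site.xml

# Create a tmp directory that the shell user can write to, then edit ./conf/hbase-site.xml
# so that hbase.tmp.dir points to it
mkdir -p /home/test/hbase-tmp

# Run the shell against the adjusted configuration
hbase --config ./conf shell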