Configuring CPU Scheduling
Use the following steps to configure CPU scheduling.
Enable CPU Scheduling in capacity-scheduler.xml
CPU scheduling is not enabled by default. To enable CPU scheduling, set the following property in the /etc/hadoop/conf/capacity-scheduler.xml file on the ResourceManager and NodeManager hosts. Replace the DefaultResourceCalculator portion of the <value> string with DominantResourceCalculator:
Property: yarn.scheduler.capacity.resource-calculator
Value: org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <!-- <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value> -->
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
Set Vcores in yarn-site.xml
In YARN, vcores (virtual cores) are used to normalize CPU resources across the cluster. The yarn.nodemanager.resource.cpu-vcores value sets the number of CPU cores that can be allocated for containers.
You should set the number of vcores to match the number of physical CPU cores on the NodeManager hosts. Set the following property in the /etc/hadoop/conf/yarn-site.xml file on the ResourceManager and NodeManager hosts:
Property: yarn.nodemanager.resource.cpu-vcores
Value: <number_of_physical_cores>
Example:
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>16</value>
</property>
You should also enable CGroups along with CPU scheduling. CGroups are used as the isolation mechanism for CPU processes. With CGroups strict enforcement activated, each CPU process receives only the resources it requests. Without CGroups activated, the DRF (Dominant Resource Fairness) scheduler still attempts to balance the load, but CPU usage is not enforced per container, so unpredictable behavior may occur.
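As a rough sketch of what enabling CGroups with strict enforcement might look like, the following yarn-site.xml fragment uses the stock Apache Hadoop property names for the Linux container executor and its CGroups resource handler. It is an assumption that these settings fit your Hadoop version and distribution. A working setup typically needs further settings not shown here (for example the CGroups hierarchy, mount behavior, and executor group), so treat this as illustrative rather than a complete configuration.

```xml
<!-- Illustrative sketch only: assumes the LinuxContainerExecutor and the
     CgroupsLCEResourcesHandler are available in your Hadoop version. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<!-- Strict enforcement: containers are capped at the CPU they requested,
     even when spare CPU is available on the host. -->
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
  <value>true</value>
</property>
```

With strict-resource-usage set to false, containers would be allowed to use idle CPU beyond their request, which trades predictability for utilization.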