Using scheduling to allocate resources

You can allocate CPU and memory among users and groups in a Hadoop cluster. You can also use scheduling to place application containers on the best possible nodes.

The CapacityScheduler is responsible for scheduling. It lets Hadoop applications run on a shared, multi-tenant cluster in an operator-friendly manner while maximizing the throughput and utilization of the cluster.
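
As a rough sketch, on a stock Apache Hadoop installation the ResourceManager is pointed at the CapacityScheduler through the yarn.resourcemanager.scheduler.class property in yarn-site.xml. The file location is an assumption here; a managed distribution may set this for you or expose it through its administration UI.

```xml
<!-- yarn-site.xml (assumed location): select the CapacityScheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
```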

The ResourceCalculator is part of the YARN CapacityScheduler. If you schedule on only one type of resource, use the DefaultResourceCalculator, which considers memory alone. If you schedule on multiple resource types, such as memory and CPU virtual cores (vcores), use the DominantResourceCalculator.
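
The calculator choice can be sketched as a single property. The fragment below assumes a stock Apache Hadoop layout in which the setting lives in capacity-scheduler.xml; it selects the DominantResourceCalculator so that the scheduler compares more than one resource type. Your distribution may expose the same setting elsewhere.

```xml
<!-- capacity-scheduler.xml (assumed location): choose the ResourceCalculator -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```

To go back to single-resource scheduling, the value would instead be org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator.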