Managing Data Operating System

Using Scheduling to Allocate Resources

You can allocate CPU, GPU, and memory among users and groups in a Hadoop cluster. You can use scheduling to allocate the best possible nodes for application containers.

The YARN CapacityScheduler is responsible for scheduling. It runs Hadoop applications on a shared, multi-tenant cluster in an operator-friendly manner while maximizing cluster throughput and utilization. The ResourceCalculator is the part of the CapacityScheduler that compares resources when scheduling decisions are made.

If you schedule on only one type of resource, use the DefaultResourceCalculator, which considers memory only. If you schedule on multiple resource types, such as memory, CPU virtual cores (vcores), and GPUs, use the DominantResourceCalculator.
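
The difference between the two calculators can be illustrated with a small Python sketch. This is a simplified model and not the Hadoop implementation: the `Resource` class, the function names, and the cluster and request sizes are all hypothetical. The single-resource case is modeled on memory, which is the resource the DefaultResourceCalculator compares. The multi-resource case takes the largest fraction of the cluster that a request needs across all resource types, which is the idea behind dominant-resource comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical two-dimensional resource: memory and CPU vcores."""
    memory_mb: int
    vcores: int


def memory_only_share(request: Resource, cluster: Resource) -> float:
    """Single-resource view: only the memory fraction is considered."""
    return request.memory_mb / cluster.memory_mb


def dominant_share(request: Resource, cluster: Resource) -> float:
    """Multi-resource view: the largest fraction across all resource types."""
    return max(
        request.memory_mb / cluster.memory_mb,
        request.vcores / cluster.vcores,
    )


# Illustrative numbers only (assumptions, not taken from a real cluster).
cluster = Resource(memory_mb=100_000, vcores=100)
cpu_heavy = Resource(memory_mb=1_000, vcores=20)   # little memory, many vcores
mem_heavy = Resource(memory_mb=10_000, vcores=2)   # much memory, few vcores

# Memory-only comparison ranks the CPU-heavy request as the smaller one.
cpu_heavy_looks_smaller = (
    memory_only_share(cpu_heavy, cluster) < memory_only_share(mem_heavy, cluster)
)

# Dominant-share comparison ranks the CPU-heavy request as the larger one.
cpu_heavy_is_larger = (
    dominant_share(cpu_heavy, cluster) > dominant_share(mem_heavy, cluster)
)

print(cpu_heavy_looks_smaller, cpu_heavy_is_larger)
```

In this sketch the memory-only view underestimates the CPU-heavy request because it ignores vcores entirely, which is why a multi-resource cluster calls for the DominantResourceCalculator.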