YARN

Component Java Heap CPU Other Recommendations
Job History Server
  • Minimum: 1 GB
  • Increase memory by 1.6 GB for each 100,000 tasks kept in memory. For example:

    5 jobs @ 100, 000 mappers + 20,000 reducers = 600,000 total tasks requiring 9.6 GB of heap.

    See the Other Recommendations column for additional tuning suggestions.

Set this value using the Java Heap Size of JobHistory Server in Bytes YARN configuration property.

Minimum: 1 core
  • Set the mapreduce.jobhistory.loadedtasks.cache.size property to a total loaded task count. Using the example in the Java Heap column to the left, of 650,000 total tasks, you can set it to 700,000 to allow for some safety margin. This should also prevent the JobHistoryServer from hanging during garbage collection, since the job count limit does not have a task limit.

NodeManager Minimum: 1 GB.

Configure additional heap memory for the following conditions:

  • Large number of containers
  • Large shxmluffle sizes in Spark or MapReduce

Set this value using the Java Heap Size of NodeManager in Bytes YARN configuration property.

  • Minimum: 8-16 cores
  • Recommended: 32-64 cores
Disks:
  • Minimum: 8 disks
  • Recommended: 12 or more disks
Networking:
  • Minimum: Dual 1Gbps or faster
  • Recommended: Single/Dual 10 Gbps or faster
ResourceManager Minimum: 6 GB
Configure additional heap memory for the following conditions:
  • More jobs
  • Larger cluster size
  • Number of retained finished applications (configured with the yarn.resourcemanager.max-completed-applications property.
  • Scheduler configuration

Set this value using the Java Heap Size of ResourceManager in Bytes YARN configuration property.

Minimum: 1 core
Other Settings
  • Set the ApplicationMaster Memory YARN configuration property to 512 MB
  • Set the Container Memory Minimum YARN configuration property to 1 GB.
N/A N/A