Recommended settings for G1GC

The recommended settings for the Garbage First Garbage Collector (G1GC) include allocating more Java heap space than for the Concurrent Mark Sweep (CMS) collector, and setting specific values for properties such as MaxGCPauseMillis and ParallelGCThreads.

No significant improvements have been observed in the NameNode startup process when using G1GC instead of CMS.

The following NameNode settings are recommended for G1GC in a large cluster:

  • Approximately 10% more Java heap space (-Xms and -Xmx) should be allocated to the NameNode than in a CMS setup.

    See Command Line Installation Guide for recommendations on setting the CMS heap size.

  • For large clusters (more than 50 million files), MaxGCPauseMillis should be set to 4000 (the value is in milliseconds).

  • You should set ParallelGCThreads to 20 (default for a 32-core machine), as opposed to 8 for CMS.

  • Other G1GC parameters should be left set to their default values.
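
The settings above could be expressed in hadoop-env.sh roughly as follows. This is a minimal sketch, not a definitive configuration: the 100 GB CMS heap size is a hypothetical starting value, G1_HEAP_GB and G1_OPTS are illustrative variable names, and HADOOP_NAMENODE_OPTS is assumed to be the variable your Hadoop version reads for NameNode JVM options.

```shell
# Sketch for hadoop-env.sh. CMS_HEAP_GB is a hypothetical example value;
# replace it with the heap size of your existing CMS setup.
CMS_HEAP_GB=100

# About 10% more heap than the CMS setup, rounded up to a whole GB.
G1_HEAP_GB=$(( (CMS_HEAP_GB * 110 + 99) / 100 ))

# G1GC with the recommended pause target and GC thread count;
# all other G1GC parameters are left at their defaults.
G1_OPTS="-XX:+UseG1GC -Xms${G1_HEAP_GB}g -Xmx${G1_HEAP_GB}g"
G1_OPTS="${G1_OPTS} -XX:MaxGCPauseMillis=4000 -XX:ParallelGCThreads=20"

# Append to any options that are already set (assumed variable name).
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:-} ${G1_OPTS}"
```

If the existing options still select CMS (for example, -XX:+UseConcMarkSweepGC), remove those flags first, because the JVM does not accept two collectors at the same time.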

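We have observed that G1GC does not stay within the maximum heap size (-Xmx) setting. VmPeak is the peak virtual memory size of the whole JVM process, so it includes some non-heap memory as well. For Xmx = 110 GB, we observed the following VM statistics: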

  • For CMS: Maximum heap (VmPeak) = 113 GB.

  • For G1GC: Maximum heap (VmPeak) = 147 GB.
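
To check the same statistic on your own system, VmPeak can be read from /proc/<pid>/status on Linux. The following is a minimal sketch under that assumption; vm_peak_gib is a hypothetical helper name, and the value is converted from the kB reported by the kernel to GiB.

```shell
# Print a process's peak virtual memory size (VmPeak) in GiB.
# Assumes a Linux /proc filesystem; $1 is the process ID.
vm_peak_gib() {
  awk '/^VmPeak:/ { printf "%.1f\n", $2 / 1024 / 1024 }' "/proc/$1/status"
}

# Example: inspect the current shell. For a NameNode, pass its PID instead,
# for instance the one reported by:
#   pgrep -f org.apache.hadoop.hdfs.server.namenode.NameNode
vm_peak_gib "$$"
```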