HDFS heap sizing
You can provision an HDFS cluster for optimal performance based on the desired storage capacity, using the following heap, CPU, and disk guidelines.
| Role | Java heap | CPU | Disk |
|------|-----------|-----|------|
| JournalNode | 1 GB (default). Set this value using the Java Heap Size of JournalNode in Bytes HDFS configuration property. | 1 core minimum | 1 dedicated disk |
| NameNode | See Sizing NameNode Heap Memory. Set this value using the Java Heap Size of NameNode in Bytes HDFS configuration property. | Minimum of 4 dedicated cores; more may be required for larger clusters. | |
| DataNode | Minimum: 4 GB. Maximum: 8 GB. Increase the heap for higher replica counts or a higher number of blocks per DataNode. Cloudera recommends an additional 1 GB of memory for every 1 million replicas above 4 million on a DataNode; for example, 5 million replicas require 5 GB of heap. Set this value using the Java Heap Size of DataNode in Bytes HDFS configuration property. | Minimum: 4 cores. Add more cores for highly active clusters. | The maximum acceptable storage density varies with the average block size: DataNode scalability is limited mostly by the number of replicas per DataNode, not the total bytes stored. That said, very dense DataNodes lengthen recovery times after a machine or rack failure. Cloudera does not support exceeding 100 TB per DataNode (for example, 12 x 8 TB spindles or 24 x 4 TB spindles) and does not support drives larger than 8 TB. |
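The DataNode sizing rules above can be sketched as two small helpers: 4 GB of base heap plus 1 GB per million replicas beyond 4 million, and a disk layout check against the 100 TB per-node and 8 TB per-drive limits. This is an illustrative sketch, not official tooling; the function names and the round-up of partial millions of replicas are assumptions.

```python
import math

def datanode_heap_gb(replicas: int) -> int:
    """Suggested DataNode heap in GB: 4 GB base, plus 1 GB for every
    1 million replicas above 4 million (partial millions rounded up,
    which is an assumption)."""
    base_gb, base_replicas = 4, 4_000_000
    extra = max(0, replicas - base_replicas)
    return base_gb + math.ceil(extra / 1_000_000)

def datanode_disks_supported(spindles: int, tb_per_drive: float) -> bool:
    """True if a drive layout stays within the documented limits:
    at most 100 TB total per DataNode and at most 8 TB per drive."""
    return tb_per_drive <= 8 and spindles * tb_per_drive <= 100

# 5 million replicas -> 5 GB of heap, matching the example above.
print(datanode_heap_gb(5_000_000))        # 5
# Both example layouts total 96 TB on drives of at most 8 TB.
print(datanode_disks_supported(12, 8))    # True
print(datanode_disks_supported(24, 4))    # True
print(datanode_disks_supported(10, 12))   # False: 12 TB drives are unsupported
```

Note that the rule can suggest more than the 8 GB maximum listed in the table for very high replica counts; in that case the table's stated maximum governs.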