Conclusion
Achieving optimal results from a Hadoop implementation begins with choosing the correct hardware and software stacks. The effort involved in the planning stages can pay off dramatically in terms of the performance and the total cost of ownership (TCO) associated with the environment.
The following composite system stack recommendations can help organizations in the planning stages:
Table 1.1 Sizing Recommendations

Machine Type | Workload Pattern/Cluster Type | Storage[1] | Processor (# of Cores) | Memory (GB) | Network |
---|---|---|---|---|---|
Slaves | Balanced workload | Twelve 2-3 TB disks | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
Slaves | Compute-intensive workload | Twelve 1-2 TB disks | 10 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
Slaves | Storage-heavy workload | Twelve 4+ TB disks | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
NameNode | Balanced workload | Four or more 2-3 TB RAID 10 with spares | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
ResourceManager | Balanced workload | Four or more 2-3 TB RAID 10 with spares | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |

[1] Reserve at least 2.5 GB of hard drive space for each version of HDP to be installed.
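
To make the slave storage figures in Table 1.1 concrete, the following is a minimal sketch that converts the disk counts into raw and approximate usable capacity per node. It assumes the HDFS default replication factor of 3 and does not subtract space for the operating system, logs, or intermediate job data. The names `SLAVE_PROFILES` and `node_capacity_tb` are hypothetical and are not part of HDP.

```python
# Disk counts and per-disk sizes (TB) from the slave rows of Table 1.1.
# "4+ TB" is open-ended in the table, so only its lower bound is listed.
SLAVE_PROFILES = {
    "balanced": (12, [2, 3]),
    "compute-intensive": (12, [1, 2]),
    "storage-heavy": (12, [4]),
}


def node_capacity_tb(disks, tb_per_disk, replication=3):
    """Return (raw, usable) capacity in TB for one slave node.

    Assumption: usable capacity is raw capacity divided by the HDFS
    replication factor (3 is the HDFS default). Space reserved for the
    OS, logs, and intermediate job data is not subtracted.
    """
    raw = disks * tb_per_disk
    return raw, raw / replication


if __name__ == "__main__":
    for profile, (disks, sizes) in SLAVE_PROFILES.items():
        for size in sizes:
            raw, usable = node_capacity_tb(disks, size)
            print(f"{profile}: {disks} x {size} TB = {raw} TB raw, "
                  f"about {usable:.0f} TB usable per node")
```

Under these assumptions, a balanced slave with twelve 2 TB disks holds 24 TB raw, or roughly 8 TB of replicated data. Dividing the expected data volume by the usable figure gives a first rough estimate of the number of slave nodes required.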