Ozone hardware recommendations
This guide helps you choose hardware for Ozone based on your data storage needs. Following these recommendations helps you get the best performance from your Ozone cluster.
Node Type | Chassis | CPU | Node RAM | RAM for each service | OS Disk | Meta Disk (NVMe) | Data Disk | Network | Disk Controllers | GPU |
---|---|---|---|---|---|---|---|---|---|---|
Master node (OM and SCM and Recon) | 1U | 2 x 20c | 256 GB | 64 GB | 2 x 480 GB SSD | 2 x 4 TB | - | 2x 25Gbps | - | - |
Datanode (Ozone, no compute) | 2U | 2 x 12c | 31 GB** | 2 x 1.5 TB | 24 x 16TB | 2x 12 Gbps (low) | ||||
Datanode (Ozone, mixed compute) | 2 x 24c | 512 GB | 2 x 3 TB | Optional | ||||||
Compute node (No Storage) | 1U | - | 1 x 4 TB | - | 1x 25Gbps | - |
**Avoid heap sizes between 32 GB and 47 GB. The JVM cannot use compressed ordinary object pointers (compressed oops) for heaps larger than 31 GB, which reduces the effective memory available to the process. If you need a heap larger than 31 GB, use at least 48 GB.
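The heap-size rule in the footnote can be expressed as a small check. This is a minimal sketch, not part of any Ozone tooling: the function name `heap_size_is_recommended` is hypothetical, and the 31 GB and 48 GB thresholds are taken directly from the footnote.

```python
# Thresholds taken from the footnote above; the function name is hypothetical.
MAX_COMPRESSED_OOPS_HEAP_GB = 31   # largest heap that still uses compressed oops
MIN_LARGE_HEAP_GB = 48             # smallest heap worth using above that limit


def heap_size_is_recommended(heap_gb: int) -> bool:
    """Return True if the heap size avoids the 32 GB to 47 GB range."""
    if heap_gb <= 0:
        raise ValueError("heap size must be positive")
    return heap_gb <= MAX_COMPRESSED_OOPS_HEAP_GB or heap_gb >= MIN_LARGE_HEAP_GB


if __name__ == "__main__":
    for size in (31, 32, 47, 48, 64):
        status = "ok" if heap_size_is_recommended(size) else "avoid"
        print(f"{size} GB heap: {status}")
```

A check like this could run in a provisioning script before service heap sizes are applied.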
Notes
The above configuration supports up to 10 billion keys because of the 4 TB NVMe on the master nodes.
The absolute minimum recommended configuration is 3 master nodes and 9 datanodes. This will support Erasure Coding with the RS(6,3) configuration with full High Availability. Additional datanodes can be added in increments of 1 to increase storage.
Network
The network between the datanodes and the compute nodes must not be oversubscribed by more than 2:1. Networking is sized to support the full (real-world) bandwidth of the drives across the network. More drives require faster networks, both at the server level and the switch level.
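The 2:1 limit can be checked with simple arithmetic. The sketch below assumes oversubscription is measured as the total server-facing bandwidth on a switch divided by its total uplink bandwidth. The function names and the rack layout in the example (16 nodes, 4 x 100 Gbps uplinks) are hypothetical illustrations, not values from this guide.

```python
MAX_OVERSUBSCRIPTION = 2.0  # the 2:1 limit stated above


def oversubscription_ratio(num_nodes: int, nic_gbps_per_node: float,
                           uplink_gbps_total: float) -> float:
    """Ratio of server-facing bandwidth to uplink bandwidth on one switch."""
    if uplink_gbps_total <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return (num_nodes * nic_gbps_per_node) / uplink_gbps_total


def within_limit(ratio: float) -> bool:
    """True if the ratio does not exceed the 2:1 recommendation."""
    return ratio <= MAX_OVERSUBSCRIPTION


if __name__ == "__main__":
    # Hypothetical rack: 16 nodes with 2 x 25 Gbps each, 4 x 100 Gbps uplinks.
    ratio = oversubscription_ratio(16, 2 * 25, 4 * 100)
    print(f"oversubscription {ratio:.1f}:1, within limit: {within_limit(ratio)}")
```

Halving the uplinks in this hypothetical rack would double the ratio and break the limit, which is why adding drives or nodes usually means revisiting switch capacity as well as server NICs.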
NVMe
NVMe should be configured in RAID1 pairs to provide business continuity for Ozone metadata in case of hardware failure.
The master nodes and datanodes use NVMe to store Ozone metadata. The compute nodes use NVMe for shuffle (Spark, MapReduce, and Tez) and caching (LLAP). The mixed compute datanodes use NVMe for both Ozone metadata and shuffle (Spark, MapReduce, and Tez) plus caching (LLAP).
Cloudera recommends mounting Ozone partitions across the NVMe drive pair as RAID1 (800GB) with the remaining space used for shuffle or cache as independent JBOD partitions. RAID can be configured either in hardware or in software.
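To see how much NVMe space is left for shuffle or cache, the split can be worked out per drive. This sketch assumes the 800 GB RAID1 volume consumes an 800 GB partition on each drive of the pair, and that drive sizes are decimal gigabytes. The function name `nvme_layout` is hypothetical.

```python
def nvme_layout(drive_gb: int, metadata_raid1_gb: int = 800, drives: int = 2) -> dict:
    """Split an NVMe pair into a mirrored metadata volume and JBOD scratch space.

    Assumes the RAID1 volume takes `metadata_raid1_gb` from every drive,
    so its usable size equals one partition.
    """
    if metadata_raid1_gb > drive_gb:
        raise ValueError("metadata partition does not fit on the drive")
    scratch_per_drive = drive_gb - metadata_raid1_gb
    return {
        "metadata_usable_gb": metadata_raid1_gb,
        "scratch_per_drive_gb": scratch_per_drive,
        "scratch_total_gb": scratch_per_drive * drives,
    }


if __name__ == "__main__":
    # Mixed compute datanode from the table: 2 x 3 TB NVMe.
    print(nvme_layout(3000))
```

Under these assumptions, a 2 x 3 TB pair leaves 2200 GB per drive, or 4400 GB in total, for shuffle and cache.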
Example sizing calculator
Suggested Value of Parameter | Logical Capacity 2PB | Logical Capacity 8PB | Logical Capacity 16PB | Additional information |
---|---|---|---|---|
Number of Data Nodes if using Erasure Coding RS(6,3) | 9 | 32 | 64 | These are calculated based on actual file storage required (See Row 1) |
Logical data size proposed (TB, EC 6,3) | 2304 | 8192 | 16384 | - |
Raw disk capacity (TB) | 3456 | 12288 | 24576 | - |
Number of Data Nodes if using triple replication | 16 | 64 | 128 | These are calculated based on actual file storage required (See Row 1) |
Logical data size - conservative using 3x (TB) | 2048 | 8192 | 16384 | - |
Raw disk capacity (TB) | 6144 | 24576 | 49152 | - |
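The relationship between the rows of the table can be sketched as a small calculator. It assumes each datanode contributes 24 x 16 TB of raw disk, as in the hardware table, that RS(6,3) stores 9 units of raw data for every 6 logical units, and that triple replication stores 3 copies. The function names are hypothetical, and the sketch ignores filesystem overhead and reserved headroom.

```python
DRIVES_PER_NODE = 24   # from the hardware table: 24 x 16 TB data disks
DRIVE_TB = 16
NODE_RAW_TB = DRIVES_PER_NODE * DRIVE_TB  # 384 TB raw per datanode


def ceil_div(a: int, b: int) -> int:
    """Integer division that rounds up."""
    return -(-a // b)


def datanodes_for_ec(logical_tb: int, data: int = 6, parity: int = 3) -> int:
    """Datanodes needed for erasure coding RS(data, parity)."""
    raw_tb = ceil_div(logical_tb * (data + parity), data)
    # An EC group needs at least data + parity datanodes.
    return max(ceil_div(raw_tb, NODE_RAW_TB), data + parity)


def datanodes_for_replication(logical_tb: int, replicas: int = 3) -> int:
    """Datanodes needed when every block is stored `replicas` times."""
    raw_tb = logical_tb * replicas
    return ceil_div(raw_tb, NODE_RAW_TB)


if __name__ == "__main__":
    print(datanodes_for_ec(2304))           # 9
    print(datanodes_for_ec(16384))          # 64
    print(datanodes_for_replication(2048))  # 16
```

The same logical capacity needs twice as many datanodes under triple replication as under RS(6,3), because the raw overhead is 3x instead of 1.5x.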