JBOD
An overview of using Kafka with JBOD.
JBOD (Just a Bunch Of Disks) refers to a system configuration where disks are used
independently rather than being organized into redundant arrays (RAID). Using RAID usually
results in more reliable storage even if the individual disks are not reliable, and RAID setups
are common in large-scale big data environments built on top of commodity hardware. However,
RAID-enabled configurations are more expensive and more complicated to set up. In a large number
of environments, JBOD configurations are preferred for the following reasons:
- Reduced storage cost: RAID-10 is recommended to protect against disk failures. However, scaling RAID-10 configurations can become excessively expensive. Because the data is already replicated across nodes, storing it redundantly on each node as well multiplies the storage space required.
- Improved performance: As with HDFS, the slowest disk in a RAID-10 configuration limits overall throughput, and writes need to go through a RAID controller. With JBOD, on the other hand, writes to each disk are isolated and do not pass through a controller, which increases I/O performance.
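
The storage cost argument can be illustrated with a rough calculation. The sketch below is not taken from any Kafka tooling: the function name, the 10 TB data size, and the replication factor of 3 are assumptions chosen for illustration. It relies only on the fact that RAID-10 mirrors every disk, so half of the raw capacity is usable.

```python
def raw_storage_tb(logical_tb, replication_factor, raid10=False):
    """Estimate the raw disk capacity (in TB) needed across the cluster.

    Hypothetical helper for illustration only; it ignores filesystem
    overhead, free-space headroom, and retention settings.
    """
    # Each replica is a full copy of the data stored on a different node.
    replicated = logical_tb * replication_factor
    # RAID-10 mirrors every disk, so raw capacity must be doubled again.
    return replicated * 2 if raid10 else replicated


# Assumed example values: 10 TB of data, replication factor 3.
jbod_tb = raw_storage_tb(10, 3)
raid10_tb = raw_storage_tb(10, 3, raid10=True)
print(f"JBOD: {jbod_tb} TB raw, RAID-10: {raid10_tb} TB raw")
```

Under these assumptions, RAID-10 doubles the raw capacity that replication across nodes already requires, which is the multiplication described in the first bullet.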