Filesystems
In Linux, there are several choices for formatting and organizing drives. However, only a few choices are optimal for Hadoop.
In RHEL and CentOS, the Logical Volume Manager (LVM) should not be used for data drives. It is not optimal and can lead to combining multiple drives into one logical disk, which is in complete contrast to how Hadoop manages fault tolerance across HDFS. It is beneficial to keep LVM enabled on the OS drives. Any performance impact that may occur is countered by the improvement of system manageability. Using LVM on the OS drives enables the admin to avoid over-allocating space on partitions. Space needs can change over time and the ability to dynamically grow a filesystem is better than having to rebuild a system. Do not use LVM to stripe or span logical volumes across multiple physical volumes to mimic RAID.
Cloudera recommends using an extent-based filesystem.
This includes ext3, ext4, and xfs. Most new
Hadoop clusters use the ext4 filesystem by default. RHEL 7 uses
xfs as its default filesystem.
Filesystem creation options
ext4 filesystems for use with Hadoop data volumes, Cloudera recommends reducing the superuser block
reservation from 5% to 1% for root (using the -m1 option) as well as setting
the following options:- use one inode per 1 MB (largefile)
- minimize the number of super block backups (sparse_super)
- enable journaling (has_journal)
- use b-tree indexes for directory trees (dir_index)
- use extent-based allocations (extent)
ext4
filesystem:mkfs –t ext4 –m 1 –O -T largefile sparse_super,dir_index,extent,has_journal /dev/sdb1
xfs
filesystem:mkfs –t xfs /dev/sdb1Disk mount options
By design, HDFS is a fault-tolerant filesystem. All drives used by DataNode machines for data
need to be mounted without the use of RAID. Drives should be mounted in the
/etc/fstab filesystem table using the noatime option
(which also implies nodiratime). In case of SSD or flash, turn on TRIM by specifying the discard option when mounting. This reduces
premature SSD wear and device failures, while primarily avoiding long garbage collection
pauses.
noatime mount option
specified:/dev/sda1 / ext4 noatime 0 0discard mount
option:/dev/sdb1 /data ext4 noatime,discard 0 0Disk mount naming convention
/data1
/data2
/data3
/data4
/data5
/data6