File system partitioning recommendations
This section explains the recommended file system partitioning for master and worker nodes in a CDP Private Cloud Base cluster and helps you set up those partitions.
Partitioning recommendations for all nodes
- Root partition: OS and core program files
- Swap: Size 2X system memory
Partitioning recommendations for worker nodes
- Hadoop worker node: Hadoop must have its own partitions for Hadoop files and logs. Drives must be formatted with XFS, ext4, or ext3, in that order of preference.
- Worker nodes: Each Hadoop partition must be mounted individually from its drive, using mount points in the /grid/[0-n] format.
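The worker node recommendations can be sketched as a small planning helper. This is a minimal illustration, not a Cloudera tool: the function names `pick_filesystem` and `plan_grid_mounts` and the device names are hypothetical, and the sketch only plans the layout; it does not format or mount anything.

```python
# Preference order taken from the recommendation above: XFS, then ext4, then ext3.
FS_PREFERENCE = ["xfs", "ext4", "ext3"]


def pick_filesystem(available):
    """Return the most preferred file system among those available, or None."""
    available_lower = {fs.lower() for fs in available}
    for fs in FS_PREFERENCE:
        if fs in available_lower:
            return fs
    return None


def plan_grid_mounts(devices):
    """Map each data drive to its own /grid/<n> mount point, in list order."""
    return {device: f"/grid/{index}" for index, device in enumerate(devices)}


if __name__ == "__main__":
    # Hypothetical device names; substitute the data drives on your host.
    drives = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]
    fs = pick_filesystem(["ext3", "ext4"])
    for device, mount_point in plan_grid_mounts(drives).items():
        print(f"{device} -> {mount_point} ({fs})")
```

Keeping the preference order in a single list means that the choice of file system follows the documented ordering wherever the helper is used.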
Hadoop worker node partitioning configuration example
- /swap: Cloudera recommends following the guidelines provided by your operating system vendor to configure the swap space on each host. If your vendor recommends a swap space range, then use the lowest recommended value.
- /root: 20 GB (sufficient space for existing files, future log file growth, and OS upgrades)
- /grid/0/: [full disk GB] first partition for Hadoop to use for local storage
- /grid/1/: second partition for Hadoop to use
- /grid/2/: third partition for Hadoop to use, and so on
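One way to check a host against the example layout above is to inspect its mount table. The sketch below assumes input in the whitespace-separated layout used by /proc/mounts on Linux (device, mount point, file system type, options, and so on). The function name `check_grid_mounts`, the sample data, and the rule that /grid/<n> numbering should have no gaps are illustrative assumptions, not Cloudera requirements.

```python
import re

# File systems named in the worker node recommendations.
ALLOWED_FS = ("xfs", "ext4", "ext3")
# Matches /grid/0, /grid/1/, and so on.
GRID_PATTERN = re.compile(r"^/grid/(\d+)/?$")


def check_grid_mounts(mounts_text):
    """Return a list of problems found among the /grid/<n> mounts."""
    problems = []
    seen = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        match = GRID_PATTERN.match(mount_point)
        if not match:
            continue  # not a Hadoop data mount
        seen.add(int(match.group(1)))
        if fs_type not in ALLOWED_FS:
            problems.append(f"{mount_point}: unsupported file system {fs_type}")
    if not seen:
        problems.append("no /grid/<n> mounts found")
    else:
        # Assumption: mount points are numbered 0..n without gaps.
        for index in sorted(set(range(max(seen) + 1)) - seen):
            problems.append(f"/grid/{index}: not mounted")
    return problems


# Hypothetical mount table used only for illustration.
SAMPLE = """\
/dev/sda2 / xfs rw 0 0
/dev/sdb1 /grid/0 xfs rw 0 0
/dev/sdc1 /grid/1 ext4 rw 0 0
/dev/sdd1 /grid/3 btrfs rw 0 0
"""

if __name__ == "__main__":
    for problem in check_grid_mounts(SAMPLE):
        print(problem)
```

On a real host, the same function could be given the contents of /proc/mounts instead of the sample string.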
Redundancy (RAID) recommendations
- Master nodes: Configure for reliability (RAID 10, dual Ethernet cards, dual power supplies, and so on).
- Worker nodes: RAID is not necessary because the cluster handles worker node failures automatically. Data is stored across at least three different hosts, so redundancy is built in. Build worker nodes for speed and low cost.
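Because redundancy comes from storing data on several hosts rather than from RAID, the raw disk space on worker nodes is divided by the number of copies. The following is a rough sizing sketch under stated assumptions: the function name and the cluster figures are hypothetical, the default of three copies is taken from the "at least three different hosts" statement above, and the estimate ignores space reserved for the OS, logs, and temporary data.

```python
def usable_capacity_tb(worker_nodes, disks_per_node, disk_size_tb, copies=3):
    """Estimate usable storage when every block is stored `copies` times."""
    raw_tb = worker_nodes * disks_per_node * disk_size_tb
    return raw_tb / copies


if __name__ == "__main__":
    # Hypothetical cluster: 10 worker nodes, 12 data disks each, 4 TB per disk.
    print(usable_capacity_tb(10, 12, 4))
```

With these example figures, 480 TB of raw disk yields roughly 160 TB of usable space.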