Operating System Requirements
This topic describes the operating system requirements for Cloudera software.
CDP Private Cloud Base Supported Operating Systems
Please see the Cloudera Support Matrix for detailed information about supported operating systems.
Operating System support for the CDP Private Cloud Base Trial Installer
SLES 12 SP5 is not supported when using the Trial Installer (cloudera-manager-installer.bin).
Important information about Runtime and Cloudera Manager Supported Operating Systems
Runtime provides parcels for select versions of RHEL-compatible operating systems.
CDP Private Cloud Base supported operating systems
IBM PowerPC on RHEL
The following components are not supported:
Operating System and IBM PowerPC support matrix
- IBM PowerPC only and CDP Private Cloud Base
- IBM PowerPC CPU, IBM Spectrum Scale Storage, and CDP Private Cloud Base. This is a
subset of what is supported generally on IBM PowerPC.
IBM PowerPC Support Documentation
- PowerPC 8 and 9 generally without Spectrum Scale Storage: https://www.ibm.com/docs/en/linux-on-systems?topic=lpo-supported-linux-distributions-virtualization-options-power8-power9-linux-power-systems
- PowerPC 10 generally without Spectrum Scale Storage: https://www.ibm.com/docs/en/linux-on-systems?topic=lpo-supported-linux-distributions-virtualization-options-power10-linux-power-servers
- IBM Spectrum Scale Storage with CDP Private Cloud Base on x86 and PowerPC combinations: https://www.ibm.com/docs/en/spectrum-scale-bda?topic=requirements-support-matrix
- Python - The Python dependencies for the different CDP components are listed below:
- Cloudera Manager
- Cloudera Manager supports the system Python on supported OSes, and does not support Python 3.
- Hue requires Python 2.7, and does not support Python 3.
- Spark 2.4 supports Python 2.7 and 3.4-3.7.
- Spark 3.0 supports Python 2.7 and 3.4 and higher, although support for Python 2 and 3.4 to 3.5 is deprecated.
- Spark 3.1 supports Python 3.6 and higher.
- If the right level of Python is not picked up by default, set the
PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables to point to the correct Python executable before running the pyspark command.
- Perl - Cloudera Manager requires perl.
- python-psycopg2 - Cloudera Manager 7 has a dependency on the
python-psycopg2 package. Hue in Runtime 7 requires a higher version of
psycopg2 than is required by the Cloudera Manager dependency. For more information, see Installing the
- iproute package - CDP Private Cloud Base has a dependency on the
iproute package. Any host that runs the Cloudera Manager Agent requires the package. The required version varies depending on the operating system:
Table 1. iproute package

| Operating System | iproute version |
|------------------|-----------------|
| RHEL             | iproute         |
| Ubuntu           | iproute2        |
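The Spark and Python pairings listed above can be sketched as a small lookup. This is only an illustration: the names `SPARK_PYTHON3_RANGES`, `SPARK_ALLOWS_PYTHON27`, `is_supported`, and `pyspark_env` are hypothetical, and the version ranges are copied from the list above without distinguishing deprecated combinations. `PYSPARK_PYTHON` is Spark's companion variable for the executor-side interpreter. Setting both variables to the same path is an assumption about the desired setup.

```python
import os

# Python 3 ranges copied from the list above: (minimum, maximum) inclusive,
# where a maximum of None means "and higher".
SPARK_PYTHON3_RANGES = {
    "2.4": ((3, 4), (3, 7)),
    "3.0": ((3, 4), None),
    "3.1": ((3, 6), None),
}
# Whether Python 2.7 is listed for each Spark line (deprecated on 3.0).
SPARK_ALLOWS_PYTHON27 = {"2.4": True, "3.0": True, "3.1": False}


def is_supported(spark: str, python: tuple) -> bool:
    """Return True if the (major, minor) Python version is listed for this Spark line."""
    if python == (2, 7):
        return SPARK_ALLOWS_PYTHON27[spark]
    low, high = SPARK_PYTHON3_RANGES[spark]
    if python < low:
        return False
    return high is None or python <= high


def pyspark_env(python_executable: str) -> dict:
    """Copy the current environment and point PySpark at one interpreter."""
    env = dict(os.environ)
    env["PYSPARK_PYTHON"] = python_executable
    env["PYSPARK_DRIVER_PYTHON"] = python_executable
    return env
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when launching the pyspark command, with the interpreter path adjusted to the host.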
The Hadoop Distributed File System (HDFS) is designed to run on top of an underlying filesystem in an operating system. Cloudera recommends that you use one of the following filesystems tested on the supported operating systems:
- ext3: This is the most tested underlying filesystem for HDFS.
- ext4: This scalable extension of ext3 is supported in more recent Linux releases.
- XFS: This is the default filesystem in RHEL 7.
- S3: Amazon Simple Storage Service
Kudu Filesystem Requirements - Kudu is supported on ext4 and
XFS. Kudu requires a kernel version and filesystem that support hole
punching. Hole punching is the use of the
fallocate(2) system call with the
FALLOC_FL_PUNCH_HOLE option set.
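One way to probe a directory for hole punching support is to attempt the call on a scratch file. The sketch below assumes a 64-bit Linux host (so that `off_t` is 64 bits wide) and calls `fallocate` through `ctypes`. The function name `supports_hole_punching` is hypothetical, and the flag values are taken from `<linux/falloc.h>`. It is a quick probe under those assumptions, not an official Kudu check.

```python
import ctypes
import ctypes.util
import tempfile

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def supports_hole_punching(directory: str) -> bool:
    """Try to punch a hole in a scratch file created inside `directory`."""
    # If find_library returns None, CDLL(None) falls back to the symbols
    # already loaded into the current process.
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fallocate = getattr(libc, "fallocate", None)
    if fallocate is None:  # the C library has no fallocate (not Linux)
        return False
    # int fallocate(int fd, int mode, off_t offset, off_t len);
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_int64, ctypes.c_int64]
    fallocate.restype = ctypes.c_int

    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        scratch.write(b"\0" * 8192)
        scratch.flush()
        # PUNCH_HOLE must be combined with KEEP_SIZE, per fallocate(2).
        mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
        return fallocate(scratch.fileno(), mode, 0, 4096) == 0
```

Calling `supports_hole_punching` with a directory on the intended Kudu data volume returns `False` when the kernel or filesystem rejects the request. The scratch file is removed automatically.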
File Access Time
Linux filesystems keep metadata that records when each file was
accessed. This means that even reads result in a write to the disk. To
speed up file reads, Cloudera recommends that you disable this option,
called atime, using the noatime mount option in /etc/fstab:
/dev/sdb1 /data1 ext4 defaults,noatime 0
Apply the change without rebooting:
mount -o remount /data1
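To audit several data mounts at once, the fstab entries can be parsed and checked for the noatime option. The following is a minimal sketch. The function name `missing_noatime` and the second device and mount point in the usage example (`/dev/sdc1`, `/data2`) are hypothetical.

```python
def missing_noatime(fstab_text: str, mount_points: set) -> list:
    """List the given mount points whose fstab entry lacks the noatime option."""
    missing = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and entries without an options field.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        mount_point, options = fields[1], fields[3].split(",")
        if mount_point in mount_points and "noatime" not in options:
            missing.append(mount_point)
    return missing


sample = (
    "# device   mount   type  options           dump\n"
    "/dev/sdb1 /data1 ext4 defaults,noatime 0\n"
    "/dev/sdc1 /data2 ext4 defaults 0\n"
)
print(missing_noatime(sample, {"/data1", "/data2"}))
```

On a real host the text would come from reading `/etc/fstab`, and the set of mount points would be the data directories used by the cluster services.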
Filesystem Mount Options
Filesystem mount options have a
sync option that allows you to write synchronously. Using the
sync filesystem mount option reduces
performance for services that write data to disks, such as HDFS, YARN,
Kafka and Kudu. In CDP, most writes are already replicated. Therefore,
synchronous writes to disk are unnecessary, expensive, and do not
measurably improve stability.
NFS and NAS options are not supported for use as DataNode Data Directory mounts, even when using Hierarchical Storage features.
Mounting /tmp as a filesystem with the
noexec option is sometimes done as an enhanced
security measure to prevent the execution of files stored there.
However, this causes multiple problems with various parts of Cloudera
Manager and CDP. Therefore, Cloudera does not support mounting
/tmp with the noexec option.
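Whether a mounted filesystem currently carries the sync or noexec option can be read from its `statvfs` flags. The sketch below is an assumption-laden illustration. The function name `mount_flags` is hypothetical, and the numeric fallbacks are the Linux values from `<sys/statvfs.h>`, used only when the `os` module does not expose the constants.

```python
import os

# Linux values from <sys/statvfs.h>, used if the os module lacks the names.
ST_NOEXEC = getattr(os, "ST_NOEXEC", 8)
ST_SYNCHRONOUS = getattr(os, "ST_SYNCHRONOUS", 16)


def mount_flags(path: str) -> dict:
    """Report whether the filesystem holding `path` is mounted noexec or sync."""
    flags = os.statvfs(path).f_flag
    return {
        "noexec": bool(flags & ST_NOEXEC),
        "sync": bool(flags & ST_SYNCHRONOUS),
    }
```

For example, `mount_flags("/tmp")["noexec"]` being true would indicate the unsupported /tmp configuration, and a true `"sync"` value on a data directory would indicate synchronous writes.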
Cloudera Manager automatically sets nproc configuration in
/etc/security/limits.conf, but this
configuration can be overridden by individual files in
/etc/security/limits.d/. This can cause problems with
Apache Impala and other components.
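To see which files might override the nproc setting, the entries under /etc/security/limits.d/ can be listed. This is a minimal sketch that assumes the standard four-column limits.conf layout (domain, type, item, value) and that only `*.conf` files in the directory are read. The function names `nproc_entries` and `nproc_overrides` are hypothetical, and the sample values are illustrative only.

```python
import glob
import os


def nproc_entries(text: str) -> list:
    """Parse limits.conf-style text into (domain, type, value) tuples for nproc."""
    entries = []
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) == 4 and fields[2] == "nproc":
            entries.append((fields[0], fields[1], fields[3]))
    return entries


def nproc_overrides(limits_dir: str = "/etc/security/limits.d") -> dict:
    """Map each *.conf file in limits_dir to the nproc entries it defines."""
    result = {}
    for path in sorted(glob.glob(os.path.join(limits_dir, "*.conf"))):
        with open(path) as handle:
            entries = nproc_entries(handle.read())
        if entries:
            result[path] = entries
    return result


sample = (
    "# illustrative override file\n"
    "*     soft  nproc   4096\n"
    "root  soft  nproc   unlimited\n"
    "hdfs  -     nofile  32768\n"
)
print(nproc_entries(sample))
```

Any file reported by `nproc_overrides()` is a candidate for review against the limits that Cloudera Manager configured.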
Make sure that the
nproc limits are set sufficiently
high, such as
nscd for Kudu
Although not a strict requirement, it's highly recommended that you
use nscd to cache both DNS name resolution and static
name resolution for Kudu.
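A quick way to confirm that nscd is configured to cache host lookups is to inspect its configuration file. The sketch below assumes the usual `enable-cache <service> <yes|no>` directive syntax of nscd.conf and assumes that the `hosts` cache is the one relevant to name resolution. The function name `cache_settings` is hypothetical.

```python
def cache_settings(nscd_conf_text: str) -> dict:
    """Map each service named in an enable-cache directive to True or False."""
    settings = {}
    for raw in nscd_conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) == 3 and fields[0] == "enable-cache":
            settings[fields[1]] = fields[2] == "yes"  # a later line wins
    return settings


sample = (
    "# illustrative nscd.conf fragment\n"
    "enable-cache passwd no\n"
    "enable-cache hosts  yes\n"
)
print(cache_settings(sample).get("hosts", False))
```

On a real host the text would typically be read from `/etc/nscd.conf`. The check only covers configuration, so whether the nscd service is actually running still needs to be verified separately.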