Networking and Security Requirements

This topic describes the networking and security requirements for CDP.

Cloudera Runtime and Cloudera Manager Supported Transport Layer Security Versions

The following components are supported by the indicated versions of Transport Layer Security (TLS):
Table 1. Components Supported by TLS

Component

Role Name Port Version
Cloudera Manager Cloudera Manager Server 7182 TLS 1.2
Cloudera Manager Cloudera Manager Server 7183 TLS 1.2
Flume 9099 TLS 1.2
Flume Avro Source/Sink TLS 1.2
Flume Flume HTTP Source/Sink TLS 1.2
HBase Master HBase Master Web UI Port 60010 TLS 1.2
HDFS NameNode Secure NameNode Web UI Port 50470 TLS 1.2
HDFS Secondary NameNode Secure Secondary NameNode Web UI Port 50495 TLS 1.2
HDFS HttpFS REST Port 14000 TLS 1.1, TLS 1.2
Hive HiveServer2 HiveServer2 Port 10000 TLS 1.2
Hue Hue Server Hue HTTP Port 8888 TLS 1.2
Impala Impala Daemon Impala Daemon Beeswax Port 21000 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala Daemon Impala Daemon HiveServer2 Port 21050 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala Daemon Impala Daemon Backend Port 22000 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala StateStore StateStore Service Port 24000 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala Daemon Impala Daemon HTTP Server Port 25000 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala StateStore StateStore HTTP Server Port 25010 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala Catalog Server Catalog Server HTTP Server Port 25020 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Impala Impala Catalog Server Catalog Server Service Port 26000 TLS 1.0, TLS 1.1, TLS 1.2

We recommend that clients use the highest supported version, TLS 1.2.

Oozie Oozie Server Oozie HTTPS Port 11443 TLS 1.1, TLS 1.2
Solr Solr Server Solr HTTP Port 8983 TLS 1.1, TLS 1.2
Solr Solr Server Solr HTTPS Port 8985 TLS 1.1, TLS 1.2
Spark History Server 18080 TLS 1.2
YARN ResourceManager ResourceManager Web Application HTTP Port 8090 TLS 1.2
YARN JobHistory Server MRv1 JobHistory Web Application HTTP Port 19890 TLS 1.2

Apache Ranger and Kerberos

Cloudera Runtime and Cloudera Manager Networking and Security Requirements

The hosts in a Cloudera Manager deployment must satisfy the following networking and security requirements:

  • Networking Protocols Support
    CDP requires IPv4. IPv6 is not supported and must be disabled.

    See also Configure Network Names.

  • Multihoming Support

    Multihoming Cloudera Runtime or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.

    Although some subareas of the product may work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues may arise because multihoming is not covered by our test matrix outside the Cloudera-certified partner appliances.

  • Entropy

    Data at rest encryption requires sufficient entropy to ensure randomness.

    See entropy requirements in Data at Rest Encryption Requirements.

  • Cluster hosts must have a working network name resolution system and correctly formatted /etc/hosts file. All cluster hosts must have properly configured forward and reverse host resolution through DNS. The /etc/hosts files must:
    • Contain consistent information about hostnames and IP addresses across all hosts
    • Not contain uppercase hostnames
    • Not contain duplicate IP addresses

    Cluster hosts must not use aliases, either in /etc/hosts or in configuring DNS. A properly formatted /etc/hosts file should be similar to the following example:

    127.0.0.1 localhost.localdomain localhost
    192.168.1.1 cluster-01.example.com cluster-01
    192.168.1.2 cluster-02.example.com cluster-02
    192.168.1.3 cluster-03.example.com cluster-03 
  • In most cases, the Cloudera Manager Server must have SSH access to the cluster hosts when you run the installation or upgrade wizard. You must log in using a root account or an account that has password-less sudo permission. For authentication during the installation and upgrade procedures, you must either enter the password or upload a public and private key pair for the root or sudo user account. If you want to use a public and private key pair, the public key must be installed on the cluster hosts before you use Cloudera Manager.

    For a list of sudo commands that are supported by Cloudera Manager, see Cloudera Manager sudo command options.

    Cloudera Manager uses SSH only during the initial install or upgrade. Once the cluster is set up, you can disable root SSH access or change the root password. Cloudera Manager does not save SSH credentials, and all credential information is discarded when the installation is complete.

  • The Cloudera Manager Agent runs as root so that it can make sure that the required directories are created and that processes and files are owned by the appropriate user (for example, the hdfs and mapred users).
  • Security-Enhanced Linux (SELinux) must not block Cloudera Manager or Runtime operations.
  • Firewalls (such as iptables and firewalld) must be disabled or configured to allow access to ports used by Cloudera Manager, Runtime, and related services.
  • For RHEL and CentOS, the /etc/sysconfig/network file on each host must contain the correct hostname.
  • Cloudera Manager and Runtime use several user accounts and groups to complete their tasks. The set of user accounts and groups varies according to the components you choose to install. Do not delete these accounts or groups and do not modify their permissions and rights. Ensure that no existing systems prevent these accounts and groups from functioning. For example, if you have scripts that delete user accounts not in an allowlist, add these accounts to the list of permitted accounts. Cloudera Manager, Runtime, and managed services create and use the following accounts and groups:
    Table 2. Users and Groups
    Component (Version) Unix User ID Groups Functionality
    Apache Atlas atlas atlas, hadoop Apache Atlas by default has atlas as user and group. It is configurable
    Apache Flink flink flink The Flink Dashboard runs as this user.
    Apache HBase hbase hbase The Master and the RegionServer processes run as this user.
    Apache HBase Indexer hbase hbase The indexer servers are run as this user.
    Apache HDFS hdfs hdfs, hadoop The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it.

    Apache Hive

    Hive on Tez

    hive hive The HiveServer2 process and the Hive Metastore processes run as this user.A user must be defined for Hive access to its Metastore DB (for example, MySQL or Postgres) but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml.
    Apache Impala impala impala, hive Impala services run as this user.
    Apache Kafka kafka kafka Kafka brokers, mirrorMaker, and Connect workers run as this user.
    Apache Knox knox knox Apache Knox Gateway Server runs as this user
    Apache Kudu kudu kudu Kudu services run as this user.
    Apache Livy livy livy The Livy Server process runs as this user
    Apache NiFi nifi nifi Runs as the nifi user
    Apache NiFi Registry nifiregistry nifiregistry Runs as the nifiregistry user
    Apache Oozie oozie oozie The Oozie service runs as this user.
    Apache Ozone hdfs hdfs, hadoop Ozone Manager, Storage Container Manager (SCM), Recon and Ozone Datanodes run as this user.
    Apache Parquet ~ ~ No special users.
    Apache Phoenix phoenix phoenix The Phoenix Query Server runs as this user
    Apache Ranger ranger ranger, hadoop Ranger Admin, Usersync and Tagsync services by default have ranger as user and ranger, hadoop as groups. It is configurable.
    Apache Ranger KMS kms kms Ranger KMS runs with kms user and group. It is configurable.
    Apache Ranger Raz rangerraz ranger Ranger Raz runs with rangerraz user and is part of the ranger group.
    Apache Ranger RMS rangerrms ranger Ranger RMS runs with rangerrms user and is part of the ranger group.
    Apache Solr solr solr The Solr processes run as this user.
    Apache Spark spark spark The Spark History Server process runs as this user.
    Apache Sqoop sqoop sqoop This user is only for the Sqoop1 Metastore, a configuration option that is not recommended.
    Apache YARN yarn yarn, hadoop Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos.
    Apache Zeppelin zeppelin zeppelin The Zeppelin Server process runs as this user
    Apache ZooKeeper zookeeper zookeeper The ZooKeeper processes run as this user. It is not configurable.
    Cloudera Manager (all versions) cloudera-scm cloudera-scm Clusters managed by Cloudera Manager run Cloudera Manager Server, monitoring roles, and other Cloudera Server processes as cloudera-scm. Requires keytab file named cmf.keytab because name is hard-coded in Cloudera Manager.
    Cruise Control cruisecontrol hadoop The Cruise Control process runs as this user.
    HttpFS httpfs httpfs The HttpFS service runs as this user. See HttpFS authentication for instructions on how to generate the merged httpfs-http.keytab file.
    Hue hue hue Hue services run as this user.
    Hue Load Balancer apache apache The Hue Load balancer has a dependency on the apache2 package that uses the apache user name. Cloudera Manager does not run processes using this user ID.
    Key Trustee Server keytrustee keytrustee The Key Trustee Server service runs as this user.
    Schema Registry schemaregistry hadoop The Schema Registry process runs as this user.
    Streams Messaging Manager streamsmsgmgr streamsmsgmgr The Streams Messaging Manager processes runs as this user.
    Streams Replication Manager streamsrepmgr streamsrepmgr The Streams Replication Manager processes runs as this user.
    Apache Omid omid hdfs