Port and network requirements

When you use CDH clusters or CDP Private Cloud Base clusters, make sure that the following ports are open and accessible on the source hosts so that the source on-premises cluster and CDP can communicate.

Service | Default Port | On-premises source cluster | CDP Data Lake | Description
--- | --- | --- | --- | ---
Cloudera Manager Admin Console HTTP | 7180 | Primary node | Primary node | Incoming port. Open on the source cluster to enable the Data Lake Cloudera Manager to communicate with the on-premises Cloudera Manager.
HDFS NameNode | 8020 | Primary node | Primary node | Used by HDFS and Hive replication to communicate from destination HDFS and MapReduce hosts to source HDFS NameNode(s).
Key Management Server (KMS) | 16000 | Primary node | Primary node | Required for replication of encrypted data. Uses the TCP protocol. Applies to both Java KeyStore KMS and Key Trustee KMS. For more information, see Migrating Keys.
HDFS DataNode | 50010 | Primary node | Primary node | Used by HDFS and Hive replication to communicate from destination HDFS and MapReduce hosts to source HDFS DataNode(s).
WebHDFS | 50070 | Primary node | Primary node | Used by DistCp and WebHDFS to copy data between a secure cluster and an insecure cluster. The web UI is also used to view the current status of HDFS and explore the file system.
YARN ResourceManager | 8032 | Primary node | Primary node | Used to access the YARN ResourceManager.
Hive Metastore | 9083 | Primary node | Primary node | Used for Hive/Impala replication to query or access Hive Metastore.
Cloudera Manager Agent | 9000 | Primary node | Primary node | Used to retrieve diagnostic and log information.
Data transfer from secondary node for AWS / ADLS Gen2 | 80 | Primary/secondary nodes | S3/ADLS endpoint | Outgoing port. Open on all the HDFS nodes for AWS and ADLS Gen2.
Cluster Connectivity Manager (CCM) | 6000-6049 | Primary node | CDP Control Plane endpoint | Used to register a CDH cluster or CDP Private Cloud Base cluster on CDP. Outgoing traffic must be allowed on the port range 6000-6049 on the source cluster so that it can communicate with CCM.
Data Lake cluster | 8443 | Primary node | Primary node | Outgoing port. Configure the port on the Data Lake cluster as the outgoing port for CDP Management Console to communicate with Cloudera Manager and Knox.
Data Lake cluster | 9443 | Primary node | Primary node | Outgoing port. Configure the port on the Data Lake cluster as the outgoing port for CDP Management Console to communicate with FreeIPA.
ZooKeeper, HBase services | 2181 (ZooKeeper), 16020 (HBase) | Primary / secondary nodes | Primary node | Required for HBase replication policies. Open the ports on the source and destination secondary nodes to ensure that the source HBase service can reach ZooKeeper and HBase services on the destination hosts.
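
The incoming ports listed above can be spot-checked with a short script before you create a replication policy. The following is a minimal Python sketch, not a Cloudera-provided tool: it only tests whether a TCP connection can be opened, the names SOURCE_PORTS, is_port_open, and check_ports are hypothetical, and it assumes every service still listens on its default port. Run it from a host that must reach the source cluster (for example, a Data Lake node), because reachability depends on where the connection originates.

```python
import socket

# Default ports copied from the port list above. A cluster that uses
# non-default ports needs different values here (assumption: defaults).
SOURCE_PORTS = {
    "Cloudera Manager Admin Console": 7180,
    "HDFS NameNode": 8020,
    "Key Management Server (KMS)": 16000,
    "HDFS DataNode": 50010,
    "WebHDFS": 50070,
    "YARN ResourceManager": 8032,
    "Hive Metastore": 9083,
    "Cloudera Manager Agent": 9000,
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports=SOURCE_PORTS, timeout=3.0):
    """Map each service name to True (reachable) or False (not reachable)."""
    return {name: is_port_open(host, port, timeout)
            for name, port in ports.items()}
```

For example, check_ports("cm-primary.example.com") returns a dictionary keyed by service name, where cm-primary.example.com stands in for your own source host. A False value means only that the TCP connection failed; it does not say whether a firewall, a security group, or a stopped service is the cause.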

Verify that the following network requirements are met:
  • The outgoing SSH port is open on the Cloudera Manager host.
  • For Hive replication, the Data Lake Cloudera Manager must be able to communicate with the on-premises Cloudera Manager.
  • On the AWS cluster and ADLS cluster, make sure that port 16020 is available for the secondary security group and that port 2181 is available for the secondary, primary, and leader security groups for HBase replication.
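
The HBase security group requirement in the last item can be expressed as a small lookup, which helps when you review exported security group rules by hand. This is a hedged Python sketch: the group names come from the text above, while REQUIRED_HBASE_PORTS, missing_hbase_ports, and the input format (a mapping of group name to a list of open ports) are illustrative assumptions and do not correspond to any cloud provider API.

```python
# Ports each security group must expose for HBase replication,
# as stated in the requirement above.
REQUIRED_HBASE_PORTS = {
    "secondary": {16020, 2181},
    "primary": {2181},
    "leader": {2181},
}


def missing_hbase_ports(open_ports_by_group):
    """Return {group: sorted list of required ports that are not open}.

    open_ports_by_group is a hypothetical input shape: group name mapped
    to an iterable of port numbers that the group currently allows.
    """
    missing = {}
    for group, required in REQUIRED_HBASE_PORTS.items():
        opened = set(open_ports_by_group.get(group, ()))
        gap = required - opened
        if gap:
            missing[group] = sorted(gap)
    return missing
```

For example, missing_hbase_ports({"secondary": [2181], "primary": [2181]}) returns {'secondary': [16020], 'leader': [2181]}, which shows that the secondary group still needs port 16020 and that the leader group has no rule for port 2181.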