Port and network requirements
When you use CDH clusters or CDP Private Cloud Base clusters, make sure that the following ports are open and accessible on the source hosts so that the on-premises source cluster and CDP can communicate.
|Service||Default Port||On-premises source cluster||CDP Data Lake||Description|
|Cloudera Manager Admin Console HTTP||7180||Primary node||Primary node||Incoming port. Open on the source cluster to enable the Data Lake Cloudera Manager to communicate with the on-premises Cloudera Manager.|
|HDFS NameNode||8020||Primary node||Primary node||Used by HDFS and Hive replication to communicate from destination HDFS and MapReduce hosts to source HDFS NameNode(s).|
|Key Management Server (KMS)||16000||Primary node||Primary node||Required for replication of encrypted data. Uses the TCP protocol. Applies to both Java KeyStore KMS and Key Trustee KMS. For more information, see Migrating Keys.|
|HDFS DataNode||50010||Primary node||Primary node||Used by HDFS and Hive replication to communicate from destination HDFS and MapReduce hosts to source HDFS DataNode(s).|
|WebHDFS||50070||Primary node||Primary node||Used when DistCp and WebHDFS are used to copy data between a secure cluster and an insecure cluster. The web UI is used to view the current status of HDFS and to explore file systems.|
|YARN Resource Manager||8032||Primary node||Primary node||Used to access the YARN ResourceManager.|
|Hive Metastore||9083||Primary node||Primary node||Used for Hive/Impala replication to query or access Hive Metastore.|
|Cloudera Manager Agent||9000||Primary node||Primary node||Used to retrieve diagnostic and log information.|
|Data transfer from secondary node for AWS / ADLS Gen2||80||Primary/secondary nodes||S3/ADLS endpoint||Outgoing port. Open on all the HDFS nodes for AWS and ADLS Gen2.|
|Cluster Connectivity Manager (CCM)||6000-6049||Primary node||CDP Control Plane endpoint||Used to register a CDH cluster or CDP Private Cloud Base cluster on CDP. Outgoing traffic must be allowed on ports 6000-6049 on the source cluster so that it can communicate with Cluster Connectivity Manager (CCM).|
|Data Lake cluster||8443||Primary node||Primary node||Outgoing port. Configure the port on the Data Lake cluster as the outgoing port for CDP Management Console to communicate with Cloudera Manager and Knox.|
|Data Lake cluster||9443||Primary node||Primary node||Outgoing port. Configure the port on the Data Lake cluster as the outgoing port for CDP Management Console to communicate with FreeIPA.|
|Primary / secondary nodes||Primary node||Required for HBase replication policies. Open the ports on the source and destination secondary nodes to ensure that the source HBase service can reach the ZooKeeper and HBase services on the destination hosts.|
- The outgoing SSH port is open on the Cloudera Manager host.
- For Hive replication, the Data Lake Cloudera Manager must be able to communicate with the on-premises Cloudera Manager.
- On the AWS cluster and the ADLS cluster, make sure that port 16020 is available for the secondary security group and that port 2181 is available for the secondary, primary, and leader groups for HBase replication.
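One way to verify the requirements above before setting up replication is to attempt a TCP connection to each default port from a host that needs access. The following Python sketch does this with the standard library only. It is an illustration, not a Cloudera tool: the function names `is_port_open` and `check_ports` and the `SOURCE_PORTS` mapping are hypothetical, and the port numbers are the defaults from the table, which may differ if your cluster uses custom ports.

```python
import socket

# Default ports copied from the table above. Adjust them if your
# cluster uses non-default ports. The keys are labels for reporting only.
SOURCE_PORTS = {
    "Cloudera Manager Admin Console HTTP": 7180,
    "HDFS NameNode": 8020,
    "Key Management Server (KMS)": 16000,
    "HDFS DataNode": 50010,
    "WebHDFS": 50070,
    "YARN Resource Manager": 8032,
    "Hive Metastore": 9083,
    "Cloudera Manager Agent": 9000,
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports=SOURCE_PORTS, timeout=3.0):
    """Map each service label to True (reachable) or False (not reachable)."""
    return {name: is_port_open(host, port, timeout) for name, port in ports.items()}
```

For example, `check_ports("cm-primary.example.com")` (a placeholder hostname) returns a dictionary that you can print to see which services are unreachable. A successful TCP connection shows only that the port is open and reachable from the host where the script runs; it does not confirm that the expected service is listening or that authentication will succeed.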