Ports for Replication Manager on CDP Public Cloud

Before you create replication policies, you must ensure that the required ports are open and available for data replication. You can verify the mandatory ports using the Replication Manager network security diagram.

The following network security diagram shows the minimum port configuration required to create replication policies:

Figure 1. Network security diagram for Replication Manager in CDP Public Cloud

The following ports are optional; open them only if your replication setup requires them. For the mandatory ports for each policy type, see Preparing to create HDFS replication policy, Preparing to create Hive replication policy, and Preparing to create HBase replication policy.

| Service | Default Port | On-premises source cluster | CDP Data Lake | Description |
|---|---|---|---|---|
| HDFS NameNode | 8020 | Primary node | Primary node | Used by HDFS and Hive replication to communicate from destination HDFS and MapReduce hosts to source HDFS NameNode(s). |
| Key Management Server (KMS) | 16000 | Primary node | Primary node | Required for replication of encrypted data. Uses the TCP protocol. Applies to both Java KeyStore KMS and Key Trustee KMS. For more information, see Migrating Keys. |
| HDFS DataNode | 50010 | Primary node | Primary node | Used by HDFS and Hive replication to communicate from destination HDFS and MapReduce hosts to source HDFS DataNode(s). Requires outbound connectivity to cloud storage. |
| WebHDFS | 50070 | Primary node | Primary node | Used when DistCp copies data over WebHDFS between a secure cluster and an insecure cluster. The web UI is used to view the current status of HDFS and to explore file systems. |
| Data Lake cluster | 9443 | Primary node | Primary node | Outgoing port. Configure this port on the Data Lake cluster as the outgoing port for CDP Management Console to communicate with FreeIPA. |
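
To confirm that the ports listed above are reachable before you create a policy, a quick TCP connection test can help. The following Python sketch is illustrative only and is not part of Replication Manager: the names `OPTIONAL_PORTS`, `is_port_open`, and `check_ports` are hypothetical, and the port numbers are the defaults from the table, which your clusters might override.

```python
import socket

# Default ports copied from the table above (service name -> port).
# These are defaults only; adjust them if your clusters use other values.
OPTIONAL_PORTS = {
    "HDFS NameNode": 8020,
    "Key Management Server (KMS)": 16000,
    "HDFS DataNode": 50010,
    "WebHDFS": 50070,
    "Data Lake cluster": 9443,
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports=OPTIONAL_PORTS, timeout=3.0):
    """Map each service name to True (reachable) or False (not reachable)."""
    return {
        service: is_port_open(host, port, timeout)
        for service, port in ports.items()
    }
```

For example, calling `check_ports("source-primary.example.com")` (a placeholder host name) from a destination host returns a dictionary that maps each service to `True` or `False`. A successful connection shows only that the port accepts TCP connections from where the script runs; it does not verify authentication, encryption settings, or that the expected service is listening.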