Default security group settings on AWS

Depending on what you chose during environment creation, CDP can create security groups for your environment automatically or you can provide your own security groups.

Environment security groups

Depending on what you chose during environment creation, CDP can create security groups for your environment automatically or you can provide your own security groups.

  • If you choose to use your own security groups, you are asked to create Knox and Default security groups as described in the Security groups documentation.
  • If you choose for CDP to create all security groups required for an environment, the following security groups are created:

Data Lake: master

AWS naming convention: ${environment-name}-${random-id}-ClusterNodeSecurityGroupmaster-${random-id}

Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to cluster hosts. You should open it to your organization’s CIDR.
TCP 443 Your CIDR and CDP CIDR This port is used to access the Data Lake and Data Hub cluster UIs via Knox gateway. You should open it to your organization’s CIDR in order to access cluster UIs.

This port is also required if you are planning to spin up Machine Learning workspaces since HTTPS access to ML workspaces is available over port 443. If you are not planning to use the Machine Learning service, you do not need to open this port.

When CCM is enabled, you only need to set this to your CIDR.

TCP 9443 CDP CIDR This port is used by CDP to maintain management control of clusters and data lakes.

This port is not used when CCM is enabled.

TCP, UDP 0-65535 Your VPC’s CIDR (for example 10.10.0.0/16) and your subnet’s CIDR (for example 10.0.2.0/24). This is required for internal communication within the VPC.
ICMP N/A Your internal VPC CIDR (for example 10.10.0.0/16). This is required for internal communication within the VPC.

Data Lake: IDBroker

AWS naming convention: ${environment-name}-${random-id}-ClusterNodeSecurityGroupidbroker-${random-id}

Azure naming convention: idbroker${dl-name}${numeric-id}sg

Protocol Port Range Source Description
TCP 22 Your CIDR

This is an optional port for end user SSH access to cluster hosts.

TCP, UDP 0-65535 Your VPC’s CIDR (for example 10.10.0.0/16) and your subnet’s CIDR (for example 10.0.2.0/24). This is required for internal communication within the VPC.
ICMP N/A Your internal VPC CIDR (for example 10.10.0.0/16). This is required for internal communication within the VPC.

FreeIPA

AWS naming convention: ${environment-name}-freeipa-${random-id}-ClusterNodeSecurityGroupmaster-${random-id}

Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to cluster hosts. You should open it to your organization’s CIDR.
TCP 9443 CDP CIDR This port is used by CDP to maintain management control of clusters and data lakes.

This port is not used when CCM is enabled.

TCP, UDP 0-65535 Your VPC’s CIDR (for example 10.10.0.0/16) and your subnet’s CIDR (for example 10.0.2.0/24). This is required for internal communication within the VPC.
ICMP N/A Your internal VPC CIDR (for example 10.10.0.0/16). This is required for internal communication within the VPC.

Database

AWS naming convention: dsecg-dbsvr-${random-id}

Protocol Port Range Source Description
TCP 5432 Your VPC’s CIDR (for example 10.10.0.0/16) This port is used for communication between the Data Lake and its attached database.

Data Hub security groups

Depending on what you chose during environment creation, CDP can create security groups for your Data Hub clusters automatically or it can use your pre-created security groups:

  • If during environment creation, you provided your own security groups, CDP uses these security groups when deploying clusters.
  • If during environment creation you chose for CDP to create new security groups, new security groups are created for each Data Hub cluster as follows:

Data Hub: master

AWS naming convention: ${cluster-name}-${random-i}-ClusterNodeSecurityGroupmaster-${random-id}

Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to cluster hosts. You should open it to your organization’s CIDR.
TCP 443 Your CIDR and CDP CIDR This port is used to access the Data Lake and Data Hub cluster UIs via Knox gateway. You should open it to your organization’s CIDR in order to access cluster UIs.

When CCM is enabled, you only need to set this to your CIDR.

TCP 9443 CDP CIDR This port is used by CDP to maintain management control of clusters and data lakes.

This port is not used when CCM is enabled.

TCP, UDP 0-65535 Your VPC’s CIDR (for example 10.10.0.0/16) and your subnet’s CIDR (for example 10.0.2.0/24). This is required for internal communication within the VPC.
ICMP N/A Your internal VPC CIDR (for example 10.10.0.0/16). This is required for internal communication within the VPC.

Data Hub: worker

AWS naming convention: ${cluster-name}-${random-id}-ClusterNodeSecurityGroupworker-${random-id}

Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to cluster hosts.
TCP, UDP 0-65535 Your VPC’s CIDR (for example 10.10.0.0/16) and your subnet’s CIDR (for example 10.0.2.0/24). This is required for internal communication within the VPC.
ICMP N/A Your internal VPC CIDR (for example 10.10.0.0/16). This is required for internal communication within the VPC.

Hub: compute

AWS naming convention: ${cluster-name}-${random-id}-ClusterNodeSecurityGroupcompute-${random-id}

Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to cluster hosts.
TCP, UDP 0-65535 Your VPC’s CIDR (for example 10.10.0.0/16) and your subnet’s CIDR (for example 10.0.2.0/24). This is required for internal communication within the VPC.
ICMP N/A Your internal VPC CIDR (for example 10.10.0.0/16). This is required for internal communication within the VPC.

Data Warehouse security groups

CDP always creates new security groups when Cloudera Data Warehouses (CDW) are deployed.

Machine Learning security groups

CDP always creates new security groups when Cloudera Machine Learning (CML) workspaces are deployed.

Data Engineering security groups

CDP always creates new security groups when Cloudera Data Engineering (CDE) clusters are deployed.

DataFlow security groups

CDP always creates new security groups when Cloudera DataFlow (CDF) environments are enabled.