Security groups (optional)

To get started, you can let CDP create the required network security groups for you. For production, you may want to use your own security groups.

Two security groups need to be created: the first security group is used for all gateway nodes and the second security group is used for all other nodes. The gateway nodes communicate with the Management Console and therefore require additional ports. These security groups are applied when the Data Lake and FreeIPA are created during environment creation, and when you create Data Hub clusters.

Review the following guidelines before adding security group rules. They list all inbound ports that need to be open and suggest what to enter as the source range:

“Knox” security group

This security group is used for gateway nodes:
| Protocol | Port Range | Source | Description |
| --- | --- | --- | --- |
| TCP | 22 | Your CIDR | This is an optional port for end user SSH access to cluster hosts. You should open it to your organization’s CIDR. |
| TCP | 443 | Your CIDR and CDP CIDR | This port is used to access the Data Lake and Data Hub cluster UIs via Knox gateway. You must open this port to your organization’s CIDR in order to access cluster UIs. This port is not needed for CCM. |
| TCP | 9443 | CDP CIDR | This port is used by CDP to maintain management control of clusters and data lakes. By default, when CDP creates the security groups automatically, it opens this port to the correct IP. This port is not needed for CCM. |
| TCP, UDP | 0-65535 | Your internal VPC CIDR, for example 10.10.0.0/16 | This is required for internal communication within the VPC. |
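The Knox rules listed above can also be written down as data and turned into Azure CLI commands. The following is a minimal sketch, not a definitive procedure: the CIDR values, the resource group name `my-rg`, the group name `knox-nsg`, the rule names, and the priorities are all hypothetical placeholders, and the real CDP CIDR must come from Cloudera's documentation for your region. The sketch only builds `az network nsg rule create` command strings and does not run them. The combined TCP, UDP row is split into one rule per protocol, which is a design choice made here to avoid a wildcard protocol.

```python
import shlex

# Hypothetical placeholder values: replace with your own ranges.
ORG_CIDR = "203.0.113.0/24"   # your organization's CIDR (assumed)
CDP_CIDR = "198.51.100.0/24"  # CDP CIDR (assumed; look up the real value)
VPC_CIDR = "10.10.0.0/16"     # internal network CIDR, as in the example above

# (rule name, protocol, port range, source CIDRs) for the Knox group.
KNOX_RULES = [
    ("ssh", "Tcp", "22", [ORG_CIDR]),                 # optional SSH access
    ("knox-https", "Tcp", "443", [ORG_CIDR, CDP_CIDR]),
    ("cdp-mgmt", "Tcp", "9443", [CDP_CIDR]),
    ("internal-tcp", "Tcp", "0-65535", [VPC_CIDR]),
    ("internal-udp", "Udp", "0-65535", [VPC_CIDR]),
]


def az_rule_commands(resource_group, nsg_name, rules, first_priority=100):
    """Build one 'az network nsg rule create' command string per rule."""
    commands = []
    for offset, (name, protocol, ports, sources) in enumerate(rules):
        args = [
            "az", "network", "nsg", "rule", "create",
            "--resource-group", resource_group,
            "--nsg-name", nsg_name,
            "--name", name,
            # Priorities must be unique within a group; step by 10.
            "--priority", str(first_priority + 10 * offset),
            "--direction", "Inbound",
            "--access", "Allow",
            "--protocol", protocol,
            "--source-address-prefixes", *sources,
            "--destination-port-ranges", ports,
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    for command in az_rule_commands("my-rg", "knox-nsg", KNOX_RULES):
        print(command)
```

Reviewing the printed commands before running them is a cheap way to confirm that each port is opened only to the intended source range.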

"Default" security group

This security group is used for all nodes except Knox gateway nodes:
| Protocol | Port Range | Source | Description |
| --- | --- | --- | --- |
| TCP | 22 | Your CIDR | This is an optional port for end user SSH access to the hosts. You should open it to your organization’s CIDR. |
| TCP | 443 | Your CIDR | This port is only required if you are planning to spin up Machine Learning workspaces, since HTTPS access to ML workspaces is available over port 443. This port is not used by the Management Console or any other services, so if you are not planning to use the Machine Learning service, you do not need to open this port. This port is not needed for CCM. |
| TCP | 9443 | CDP CIDR | This port is used by CDP to maintain management control of clusters and data lakes. By default, when CDP creates the security groups automatically, it opens this port to the correct IP. This port is not needed for CCM. |
| TCP | 5432 | Your VPC CIDR, for example 10.10.0.0/16 | This port is used by the Data Lake for communication with its attached database. |
| TCP, UDP | 0-65535 | Your VPC CIDR, for example 10.10.0.0/16 | This is required for internal communication within the VPC. |
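Several of the Default rules above are conditional: port 22 is optional, port 443 matters only for Machine Learning workspaces, and ports 443 and 9443 are marked as not needed for CCM. A small sketch of that decision logic follows. The function name, its parameters, and the CIDR values are hypothetical, and leaving out 443 and 9443 whenever CCM is in use is this sketch's reading of the "not needed for CCM" notes rather than a confirmed rule.

```python
# Hypothetical placeholder values: replace with your own ranges.
ORG_CIDR = "203.0.113.0/24"   # your organization's CIDR (assumed)
CDP_CIDR = "198.51.100.0/24"  # CDP CIDR (assumed; look up the real value)
VPC_CIDR = "10.10.0.0/16"     # internal network CIDR, as in the example above


def default_group_rules(allow_ssh=True, use_ml_workspaces=False, use_ccm=False):
    """Return the inbound rules for the Default group as a list of dicts."""
    rules = []
    if allow_ssh:
        # Optional end user SSH access.
        rules.append({"protocol": "TCP", "ports": "22", "source": ORG_CIDR})
    if use_ml_workspaces and not use_ccm:
        # HTTPS access to Machine Learning workspaces.
        rules.append({"protocol": "TCP", "ports": "443", "source": ORG_CIDR})
    if not use_ccm:
        # Management control by CDP (assumed unnecessary with CCM).
        rules.append({"protocol": "TCP", "ports": "9443", "source": CDP_CIDR})
    # Data Lake communication with its attached database.
    rules.append({"protocol": "TCP", "ports": "5432", "source": VPC_CIDR})
    # Internal communication within the network.
    rules.append({"protocol": "TCP, UDP", "ports": "0-65535", "source": VPC_CIDR})
    return rules


if __name__ == "__main__":
    for rule in default_group_rules(use_ml_workspaces=True):
        print(rule["protocol"], rule["ports"], rule["source"])
```

Keeping the conditions in one function makes it easier to see which ports a given deployment actually exposes.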

Creating security groups

Security groups can be created and managed from the Azure Portal > Network Security Groups. For instructions on how to create new security groups on Azure, refer to Filter network traffic with a network security group using the Azure portal.

On the Network Interface page > Settings, click Network security group.

On the Network Security Groups page, click Inbound Security Rules to view the list of rules.

In the Inbound security rules tab, click Add.

You need to create two security groups: Knox and Default. This terminology appears in the Management Console UI and CLI, so if you choose different names, make sure that you can still distinguish between the two security groups.

Use the guidelines and examples provided above when editing rules.
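Before entering a source range in a rule, it can help to check that the value is a well-formed CIDR block and that it is no broader than intended. The sketch below uses Python's standard `ipaddress` module; the helper names are hypothetical, and the sample ranges other than 10.10.0.0/16 are placeholders taken from documentation-only address space.

```python
import ipaddress


def normalize_source_range(cidr: str) -> str:
    """Return the canonical form of a CIDR block.

    Raises ValueError if the text is not a valid network, for example
    when host bits are set (10.10.1.5/16 instead of 10.10.0.0/16).
    """
    return str(ipaddress.ip_network(cidr, strict=True))


def is_within(inner: str, outer: str) -> bool:
    """Report whether one CIDR block lies entirely inside another."""
    return ipaddress.ip_network(inner).subnet_of(ipaddress.ip_network(outer))


def is_open_to_any(cidr: str) -> bool:
    """Report whether a range matches every address (prefix length 0)."""
    return ipaddress.ip_network(cidr).prefixlen == 0


if __name__ == "__main__":
    vpc_cidr = normalize_source_range("10.10.0.0/16")
    print(is_within("10.10.4.0/24", vpc_cidr))   # subnet inside the network
    print(is_open_to_any("203.0.113.0/24"))      # hypothetical organization CIDR
    try:
        normalize_source_range("10.10.1.5/16")
    except ValueError as error:
        print("rejected:", error)
```

A check like `is_open_to_any` is useful for ports 22 and 443, where the guidelines call for your organization's CIDR instead of an unrestricted source.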