Network security groups

Network security groups (NSGs) determine the inbound and outbound traffic to and from your CDP environment. That is, you should use security group settings to allow users from your organization access to CDP resources.

You have two options:

  • Use your existing security groups (recommended for production)
  • Have CDP create new security groups

You should verify the security group limits in your Azure account to ensure that you can create security groups for CDP.

Existing security groups

If you would like to create your own security groups, two security groups need to be created: the first security group will be used for all gateway nodes and the second security groups will be used for all other nodes. The gateway nodes communicate with the Management Console and therefore require additional ports. These security groups will be applied when creating a data lake and FreeIPA during environment creation and when you create Data Hub clusters.

Review the following guidelines prior to adding security groups rules. This describes all the inbound ports that need to be open and provides guidelines for what to enter as a source range:

“Knox” security group

This is used for gateway nodes:
Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to cluster hosts. You should open it to your organization’s CIDR.
TCP 443 Your CIDR and CDP CIDR This port is used to access the Data Lake and Data Hub cluster UIs via Knox gateway. You must open this port to your organization’s CIDR in order to access cluster UIs.

When CCM is enabled, you only need to set this to your CIDR.

TCP 9443 CDP CIDR This port is used by CDP to maintain management control of clusters and data lakes.

By default, when CDP creates the security groups automatically, it opens this port to the correct IP.

This port is not needed when CCM is enabled.

TCP, UDP 0-65535 Your internal VNet CIDR (for example 10.10.0.0/16). This is required for internal communication within the VNet.
ICMP N/A Your internal VNet CIDR (for example 10.10.0.0/16). This is required for internal communication within the VNet.

"Default" security group

This is used for all nodes except Knox gateway nodes:
Protocol Port Range Source Description
TCP 22 Your CIDR This is an optional port for end user SSH access to the hosts. You should open it to your organization’s CIDR.
TCP 443 Your CIDR This port is only required if you are planning to spin up Machine Learning workspaces since HTTPS access to ML workspaces is available over port 443. If you are not planning to use the Machine Learning service, you do not need to open this port.
TCP 9443 CDP CIDR This port is used by CDP to maintain management control of clusters and data lakes.

By default, when CDP creates the security groups automatically, it opens this port to the correct IP.

This port is not needed when CCM is enabled.

TCP, UDP 0-65535 Your VNet CIDR (for example 10.10.0.0/16). This is required for internal communication within the VNet. TCP port 5432 is used by the Data Lake for communication with its attached database.
ICMP N/A Your internal VNet CIDR (for example 10.10.0.0/16). This is required for internal communication within the VNet.

Security groups can be created and managed from the Azure Portal > Network Security Groups. For detailed instructions on how to create new security groups on Azure, refer to Filter network traffic with a network security group using the Azure portal.

On the Network Interface page > Settings, click Network security group.

On the Network Security Groups page, click Inbound Security Rules to view the list of rules.

In the Inbound security rules tab, click Add.

You need to create two security groups: Knox and Default (You will see this terminology in the Management Console UI and CLI, so if you decide to choose different names, make sure that you are able to distinguish between the two security groups).

New security groups

If you would like CDP to create the security groups for you, you need to provide a CIDR range for inbound traffic to Azure instances from your organization. CDP creates multiple security groups: one for each Data Lake host group, one for each FreeIPA host group, and one per host group when Data Hub, Data Warehouse, and Machine Learning clusters are created. On these security groups, CDP opens ports as described in Default security group settings documentation.