Security groups (optional)

To get started, you can let CDP create the required security groups for you. For production, you may want to use your own security groups.

Two security groups should be created: the first is used for all gateway nodes and the second is used for all other nodes. The gateway nodes communicate with the Management Console and therefore require additional ports. These security groups are applied to the data lake and FreeIPA during environment creation, and to Data Hub clusters when you create them.

Review the following guidelines before adding security group rules. They list all the inbound ports that need to be open and explain what to enter as the source range:

“Knox” security group

This security group is used for gateway (Apache Knox gateway) nodes:
| Protocol | Port Range | Source | Description |
|---|---|---|---|
| TCP | 22 | Your CIDR | This is an optional port for end user SSH access to cluster hosts. You should open it to your organization’s CIDR. |
| TCP | 443 | Your CIDR and 52.36.110.208/32, 52.40.165.49/32, 35.166.86.177/32 | This port is used to access the Data Lake and Data Hub cluster UIs via Knox gateway. You must open this port to your organization’s CIDR in order to access cluster UIs. |
| TCP | 9443 | 52.36.110.208/32, 52.40.165.49/32, 35.166.86.177/32 | This port is used by CDP to maintain management control of clusters and data lakes. By default, when CDP creates the security groups automatically, it opens this port to the correct IP. |
| TCP, UDP | 0-65535 | Your internal VPC CIDR (for example, 10.10.0.0/16) | This is required for internal communication within the VPC. |
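The Knox rules above can also be written down as data, which may help when reviewing them or scripting their creation. The following is a minimal sketch, not an official CDP tool: the helper names `ingress_rule` and `knox_rules` are hypothetical, the dictionary layout assumes the `IpPermissions` shape accepted by the boto3 EC2 client, and `203.0.113.0/24` is only a placeholder for your organization’s CIDR.

```python
# CDP control-plane addresses, copied from the Knox table above.
CDP_CIDRS = ["52.36.110.208/32", "52.40.165.49/32", "35.166.86.177/32"]


def ingress_rule(protocol, from_port, to_port, cidrs, description):
    """Build one inbound rule in the IpPermissions shape (assumed target: boto3 EC2)."""
    return {
        "IpProtocol": protocol,
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": c, "Description": description} for c in cidrs],
    }


def knox_rules(org_cidr, vpc_cidr, allow_ssh=True):
    """Return the inbound rules for the Knox (gateway) security group."""
    rules = []
    if allow_ssh:
        # Port 22 is optional: end-user SSH access to cluster hosts.
        rules.append(ingress_rule("tcp", 22, 22, [org_cidr], "Optional end-user SSH"))
    # Port 443: cluster UIs via Knox, open to your CIDR and the CDP addresses.
    rules.append(ingress_rule("tcp", 443, 443, [org_cidr] + CDP_CIDRS, "Cluster UIs via Knox"))
    # Port 9443: CDP management control, open to the CDP addresses only.
    rules.append(ingress_rule("tcp", 9443, 9443, CDP_CIDRS, "CDP management control"))
    # All TCP and UDP ports inside the VPC for internal communication.
    for protocol in ("tcp", "udp"):
        rules.append(ingress_rule(protocol, 0, 65535, [vpc_cidr], "Internal VPC traffic"))
    return rules


if __name__ == "__main__":
    for rule in knox_rules("203.0.113.0/24", "10.10.0.0/16"):
        sources = ", ".join(r["CidrIp"] for r in rule["IpRanges"])
        print(rule["IpProtocol"], rule["FromPort"], rule["ToPort"], sources)
```

Keeping the CDP addresses in one constant means the 443 and 9443 rules cannot drift apart if the list is ever updated.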

[Screenshot: example Knox security group rules in the AWS VPC console]

"Default" security group

This security group is used for all nodes except Knox gateway nodes:
| Protocol | Port Range | Source | Description |
|---|---|---|---|
| TCP | 22 | Your CIDR | This is an optional port for end user SSH access to the hosts. You should open it to your organization’s CIDR. |
| TCP | 443 | Your CIDR | This port is only required if you are planning to spin up Machine Learning workspaces since HTTPS access to ML workspaces is available over port 443. This port is not used by the Management Console or any other services, so if you are not planning to use the Machine Learning service, you do not need to open this port. |
| TCP | 5432 | Your VPC CIDR (for example, 10.10.0.0/16) | This port is used by the Data Lake for communication with its attached database. |
| TCP, UDP | 0-65535 | Your VPC CIDR (for example, 10.10.0.0/16) | This is required for internal communication within the VPC. |
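Because port 443 on the Default group is only needed for Machine Learning workspaces, it can help to make that choice explicit when listing the rules. The sketch below is illustrative only: `Rule`, `default_rules`, and the `use_ml` flag are hypothetical names, and `203.0.113.0/24` stands in for your organization’s CIDR. It also checks each CIDR with Python’s standard `ipaddress` module before building the list.

```python
import ipaddress
from typing import NamedTuple


class Rule(NamedTuple):
    protocol: str
    from_port: int
    to_port: int
    source: str
    note: str


def default_rules(org_cidr, vpc_cidr, allow_ssh=True, use_ml=False):
    """Return the inbound rules for the Default (non-gateway) security group."""
    # ip_network raises ValueError for malformed CIDRs or CIDRs with host bits set.
    for cidr in (org_cidr, vpc_cidr):
        ipaddress.ip_network(cidr)

    rules = []
    if allow_ssh:
        # Port 22 is optional: end-user SSH access to the hosts.
        rules.append(Rule("tcp", 22, 22, org_cidr, "optional end-user SSH"))
    if use_ml:
        # Port 443 is only needed for HTTPS access to ML workspaces.
        rules.append(Rule("tcp", 443, 443, org_cidr, "HTTPS to ML workspaces"))
    # Port 5432: Data Lake communication with its attached database.
    rules.append(Rule("tcp", 5432, 5432, vpc_cidr, "Data Lake database"))
    # All TCP and UDP ports inside the VPC for internal communication.
    rules.append(Rule("tcp", 0, 65535, vpc_cidr, "internal VPC traffic"))
    rules.append(Rule("udp", 0, 65535, vpc_cidr, "internal VPC traffic"))
    return rules


if __name__ == "__main__":
    for rule in default_rules("203.0.113.0/24", "10.10.0.0/16", use_ml=True):
        print(rule)
```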

[Screenshot: example Default security group rules in the AWS VPC console]

Creating security groups

On AWS, you can create security groups and edit their rules from the VPC console > Security Groups.

To create a security group, click Create security group and provide a name, a description, and the VPC in which to create the group.

To edit security group rules, select the security group and click Inbound Rules > Edit rules.

You need to create two security groups: Knox and Default. This terminology appears in the Management Console UI and CLI, so if you choose different names, make sure that you can still distinguish between the two security groups.

Use the guidelines and examples provided above when editing rules.
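The console steps above could also be scripted. The sketch below assumes the boto3 EC2 client methods `create_security_group` and `authorize_security_group_ingress`. To stay runnable without AWS credentials, it uses a hypothetical stand-in, `RecordingEC2`, that only records the calls. The group names `cdp-knox` and `cdp-default`, the VPC ID, and the helper functions are illustrative assumptions, not names defined by CDP.

```python
class RecordingEC2:
    """Stand-in for a boto3 EC2 client; records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []
        self._count = 0

    def create_security_group(self, **kwargs):
        self.calls.append(("create_security_group", kwargs))
        self._count += 1
        return {"GroupId": f"sg-dryrun{self._count:04d}"}

    def authorize_security_group_ingress(self, **kwargs):
        self.calls.append(("authorize_security_group_ingress", kwargs))
        return {"Return": True}


def create_group(ec2, vpc_id, name, description, ip_permissions):
    """Create one security group in the VPC and attach its inbound rules."""
    response = ec2.create_security_group(
        GroupName=name, Description=description, VpcId=vpc_id
    )
    group_id = response["GroupId"]
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=ip_permissions)
    return group_id


def create_knox_and_default(ec2, vpc_id, knox_permissions, default_permissions):
    """Create both groups and return their IDs keyed by the CDP terminology."""
    return {
        "Knox": create_group(
            ec2, vpc_id, "cdp-knox", "CDP gateway (Knox) nodes", knox_permissions
        ),
        "Default": create_group(
            ec2, vpc_id, "cdp-default", "CDP non-gateway nodes", default_permissions
        ),
    }


if __name__ == "__main__":
    # One placeholder rule; real rule lists should follow the tables above.
    ssh_only = [{
        "IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
        "IpRanges": [{"CidrIp": "203.0.113.0/24"}],
    }]
    ec2 = RecordingEC2()
    print(create_knox_and_default(ec2, "vpc-0123456789abcdef0", ssh_only, ssh_only))
    for name, kwargs in ec2.calls:
        print(name, sorted(kwargs))
```

With credentials and a region configured, a client obtained from `boto3.client("ec2")` could presumably be passed in place of `RecordingEC2`; that path is not exercised here. Returning the IDs keyed as Knox and Default keeps the two groups distinguishable even when different group names are chosen.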