VPC and subnets

When registering an AWS environment in CDP, you will be asked to select a VPC and two or more subnets.

You have two options:

  • Use your existing VPC and subnets for provisioning CDP resources.

  • Have CDP create a new VPC and subnets. All CDP resources will be provisioned into this new VPC and subnets.

Existing VPC and subnets

If you would like to use your own AWS VPC, it must meet the following requirements.
  • The VPC has at least three subnets, each in a different availability zone.
  • The VPC subnets must be connected to an Internet Gateway or a NAT Gateway. The VPC must be able to make outbound connections to the internet, or to the set of CIDRs and ports provided by Cloudera.
  • CDP supports both public and private subnets for Data Lake and Data Hub. If you use private subnets, you must enable CCM.
  • If you are planning to use the Machine Learning service, you must:
    • Enable DNS in the VPC.
    • Tag the VPC and the subnets as shared so that Kubernetes can find them. So that load balancers can choose the subnets correctly, you must also tag private subnets with the kubernetes.io/role/internal-elb:1 tag and public subnets with the kubernetes.io/role/elb:1 tag.
    • For more information about Machine Learning requirements, refer to AWS account prerequisites for ML workspaces.
  • If you are planning to use the Data Warehouse service, you must:

Verify the limits of the VPC and subnets available in your AWS account to ensure that you have enough resources to create clusters in CDP.
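
The subnet requirements listed above can be pre-checked before registration. The following is a minimal sketch, not a Cloudera tool: the Subnet record and the check_subnets function are hypothetical names, and the subnet data is assumed to have been collected already (for example, from the AWS console). It covers only the availability zone count and, optionally, the two load balancer role tags required for Machine Learning.

```python
from dataclasses import dataclass, field

# Tag keys quoted in the requirements above; the expected value is "1".
PRIVATE_ELB_TAG = "kubernetes.io/role/internal-elb"
PUBLIC_ELB_TAG = "kubernetes.io/role/elb"


@dataclass
class Subnet:
    """Hypothetical record describing one subnet of an existing VPC."""
    subnet_id: str
    availability_zone: str
    is_public: bool
    tags: dict = field(default_factory=dict)


def check_subnets(subnets, machine_learning=False):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # At least three subnets, each in a different availability zone.
    zones = {s.availability_zone for s in subnets}
    if len(zones) < 3:
        problems.append(
            f"need subnets in at least 3 availability zones, found {len(zones)}"
        )

    # Role tags are only checked when Machine Learning is planned.
    if machine_learning:
        for s in subnets:
            key = PUBLIC_ELB_TAG if s.is_public else PRIVATE_ELB_TAG
            if s.tags.get(key) != "1":
                problems.append(f"{s.subnet_id}: missing tag {key}:1")

    return problems
```

Calling check_subnets(my_subnets, machine_learning=True) returns one message per subnet that lacks the expected role tag. The check for the "shared" tag and for outbound connectivity is deliberately left out, because it depends on details not covered here.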

New VPC and subnets

If you would like CDP to create a new VPC, three subnets will be created automatically. You will need to specify a valid IPv4 CIDR that defines the range of private IPs for EC2 instances provisioned into these subnets. The default is 10.10.0.0/16. Consider changing the IP range to match your corporate policies for standardized IP address ranges. You will need to divide the address space as follows:

3 x /19 private subnets for DWX/MLX/CB
3 x /24 public subnets
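
To see whether a candidate CIDR has room for this layout, the division can be sketched with Python's standard ipaddress module. This is an illustration only: plan_subnets is a hypothetical helper, and the choice to take the first three /19 blocks as private and to carve the public /24 blocks out of the fourth /19 block is an assumption, not necessarily the arrangement CDP uses.

```python
import ipaddress


def plan_subnets(vpc_cidr="10.10.0.0/16"):
    """Split a VPC CIDR into 3 x /19 private and 3 x /24 public subnets.

    Assumed layout: the private subnets are the first three /19 blocks,
    and the public subnets are the first three /24 blocks of the fourth.
    """
    # IPv4Network raises ValueError for invalid input or set host bits.
    vpc = ipaddress.IPv4Network(vpc_cidr)

    # Four /19 blocks are needed, so the VPC must be /17 or larger.
    if vpc.prefixlen > 17:
        raise ValueError(f"{vpc} is too small for 3 x /19 plus 3 x /24")

    blocks = list(vpc.subnets(new_prefix=19))
    private = blocks[:3]
    public = list(blocks[3].subnets(new_prefix=24))[:3]
    return private, public


if __name__ == "__main__":
    private, public = plan_subnets()
    for net in private:
        print("private", net)
    for net in public:
        print("public ", net)
```

With the default 10.10.0.0/16, this yields private subnets 10.10.0.0/19, 10.10.32.0/19, and 10.10.64.0/19, and public subnets 10.10.96.0/24, 10.10.97.0/24, and 10.10.98.0/24. Each /19 holds 8,192 addresses and each /24 holds 256, before AWS reserves its per-subnet addresses.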

Private endpoints

By default, when creating a new network, CDP uses public endpoints. During environment registration, you can optionally select the “Create Private Endpoints” option to use private endpoints instead.