VPC and subnet (optional)

To get started quickly, you can let CDP create a VPC and subnets for you. Alternatively, you can create your own VPC, for example if you would like to use it for your production platform.

If you would like to use your own AWS VPC, it must meet the following requirements.
  • The VPC must have at least three subnets, each in a different availability zone.
  • The VPC subnets must be connected to an Internet Gateway or a NAT Gateway. The VPC should be able to make outbound connections to the internet, or to the set of CIDRs and ports provided by Cloudera.
  • CDP supports public subnets and private subnets for Data Lake and Data Hub. For private subnets, you must enable CCM.
  • If you are planning to use the Machine Learning service, you must:
    • Enable DNS in the VPC.
    • Tag the VPC and the subnets as shared so that Kubernetes can find them. For load balancers to be able to choose the subnets correctly, you are also required to tag private subnets with the kubernetes.io/role/internal-elb:1 tag, and public subnets with the kubernetes.io/role/elb:1 tag.
    • For more information about Machine Learning requirements, refer to AWS account prerequisites for ML workspaces.
  • If you are planning to use the Data Warehouse service, you must:

VPCs can be created and managed from the VPC console on AWS. For instructions on how to create a new VPC on AWS, refer to Create and configure your VPC in the AWS documentation.
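
Two of the requirements above (three availability zones, and the load balancer tags needed for Machine Learning) can be checked with a short script before you register the VPC. The following is a minimal sketch, not a Cloudera tool. It assumes each subnet is described by a dictionary using the "SubnetId", "AvailabilityZone", and "Tags" fields found in the AWS EC2 DescribeSubnets response. The "Public" flag, the function names, and the example subnet IDs are hypothetical and exist only for illustration. The sketch reads the subnet requirement as "at least three distinct availability zones" and does not check gateways, DNS, or the "shared" tag.

```python
from typing import Dict, List

# Tag keys quoted in the requirements above; the expected value is "1".
INTERNAL_ELB_TAG = "kubernetes.io/role/internal-elb"
PUBLIC_ELB_TAG = "kubernetes.io/role/elb"


def tags_of(subnet: Dict) -> Dict[str, str]:
    """Flatten an EC2-style [{"Key": ..., "Value": ...}] tag list into a dict."""
    return {tag["Key"]: tag["Value"] for tag in subnet.get("Tags", [])}


def check_subnets(subnets: List[Dict], for_ml: bool = False) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Requirement: at least three subnets, each in a different availability zone.
    zones = {subnet["AvailabilityZone"] for subnet in subnets}
    if len(zones) < 3:
        problems.append(
            f"subnets span {len(zones)} availability zone(s); at least 3 are required"
        )

    if for_ml:
        for subnet in subnets:
            # "Public" is a hypothetical flag supplied by the caller; it is not
            # part of the EC2 response. Missing or False means a private subnet.
            key = PUBLIC_ELB_TAG if subnet.get("Public") else INTERNAL_ELB_TAG
            if tags_of(subnet).get(key) != "1":
                problems.append(f"{subnet['SubnetId']}: missing tag {key}:1")

    return problems


# Example input with made-up subnet IDs: two private subnets and one public.
example_subnets = [
    {"SubnetId": "subnet-aaa", "AvailabilityZone": "us-west-2a", "Public": False,
     "Tags": [{"Key": INTERNAL_ELB_TAG, "Value": "1"}]},
    {"SubnetId": "subnet-bbb", "AvailabilityZone": "us-west-2b", "Public": False,
     "Tags": [{"Key": INTERNAL_ELB_TAG, "Value": "1"}]},
    {"SubnetId": "subnet-ccc", "AvailabilityZone": "us-west-2c", "Public": True,
     "Tags": [{"Key": PUBLIC_ELB_TAG, "Value": "1"}]},
]

for problem in check_subnets(example_subnets, for_ml=True):
    print(problem)
```

In practice, the subnet dictionaries would come from your own inventory of the VPC, with the public or private classification added from its route tables. Pass for_ml=False to skip the tag check when you do not plan to use the Machine Learning service.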