VPC and subnet (optional)
To quickly get started, CDP can create a VPC and subnets for you. You can also create your own VPCs if you would like to use them for your production platform.
If you would like to use your own AWS VPC, it must meet the following requirements.
- The VPC has at least three subnets, each in a different availability zone.
- The VPC subnets must be connected to an Internet Gateway or a NAT Gateway. The VPC should be able to make outbound connections to the internet or to the set of CIDRs and ports provided by Cloudera.
- CDP supports both public and private subnets for Data Lake and Data Hub. For private subnets, you must enable Cluster Connectivity Manager (CCM).
- If you are planning to use the Machine Learning service, you must:
  - Enable DNS in the VPC.
  - Tag the VPC and the subnets as shared so that Kubernetes can find them. For load balancers to be able to choose the subnets correctly, you are also required to tag private subnets with the kubernetes.io/role/internal-elb:1 tag, and public subnets with the
  - For more information about Machine Learning requirements, refer to AWS account prerequisites for ML workspaces.
- If you are planning to use the Data Warehouse service, you must:
  - Enable the VPC settings listed in the AWS environment requirements checklist for the Data Warehouse service.
  - If you plan to use the private networking feature in the Data Warehouse service, refer to Prerequisites for private networking in AWS environments.
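
Before registering your own VPC, it can help to check the subnet requirements above against an inventory of your subnets. The following is a minimal sketch, not a Cloudera tool. The function name check_subnets and the dictionary keys id, az, private, and tags are hypothetical and chosen only for illustration. It also assumes that "three subnets, each in a different availability zone" means at least three distinct zones, and that the kubernetes.io/role/internal-elb:1 tag is a tag with key kubernetes.io/role/internal-elb and value 1. Only the private subnet tag is checked.

```python
MIN_ZONES = 3  # "at least three subnets, each in a different availability zone"
INTERNAL_ELB_KEY = "kubernetes.io/role/internal-elb"  # assumed key/value split of the tag


def check_subnets(subnets, machine_learning=False):
    """Return a list of problems found; an empty list means the checks passed.

    Each subnet is a dict with the hypothetical keys:
      id (str), az (str), private (bool), tags (dict of str to str).
    """
    problems = []

    # Requirement: at least three subnets spread over three availability zones.
    zones = {s["az"] for s in subnets}
    if len(subnets) < MIN_ZONES or len(zones) < MIN_ZONES:
        problems.append(
            f"need at least {MIN_ZONES} subnets in {MIN_ZONES} different "
            f"availability zones, found {len(subnets)} subnets in {len(zones)} zones"
        )

    # Machine Learning only: private subnets need the internal load balancer tag.
    if machine_learning:
        for s in subnets:
            if s["private"] and s.get("tags", {}).get(INTERNAL_ELB_KEY) != "1":
                problems.append(
                    f"private subnet {s['id']} is missing the {INTERNAL_ELB_KEY}:1 tag"
                )

    return problems


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    inventory = [
        {"id": "subnet-a", "az": "us-west-2a", "private": True,
         "tags": {INTERNAL_ELB_KEY: "1"}},
        {"id": "subnet-b", "az": "us-west-2b", "private": True, "tags": {}},
        {"id": "subnet-c", "az": "us-west-2c", "private": False, "tags": {}},
    ]
    for problem in check_subnets(inventory, machine_learning=True):
        print(problem)
```

The function returns all problems rather than stopping at the first one, so a single run shows everything that needs fixing. The inventory itself would have to be built from your own AWS account data, which this sketch does not attempt.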