AWS environment requirements checklist

To activate an environment that is registered with CDP in the Cloudera Data Warehouse service, the environment's AWS VPC must meet the following requirements.

1. VPC has DNS resolution and DNS hostnames enabled

Ensure that your AWS VPC has DNS resolution and DNS hostnames enabled. For example, in the VPC Dashboard, click Your VPCs in the left navigation menu, and select the VPC you want to use for your Data Warehouse service environment on CDP. View the configuration details to make sure that DNS resolution and DNS hostnames are both set to Enabled. The AWS screen looks something like this:



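The same two attributes can also be checked programmatically. The following is a minimal sketch, assuming Python with boto3 installed and AWS credentials configured; the function names are hypothetical and not part of CDP. The checker works on responses shaped like those returned by the EC2 `describe_vpc_attribute` call, so it can be exercised without an AWS connection.

```python
def vpc_dns_problems(dns_support_resp, dns_hostnames_resp):
    """Return a list of problems found in two describe_vpc_attribute responses."""
    problems = []
    if not dns_support_resp.get("EnableDnsSupport", {}).get("Value", False):
        problems.append("DNS resolution (enableDnsSupport) is disabled")
    if not dns_hostnames_resp.get("EnableDnsHostnames", {}).get("Value", False):
        problems.append("DNS hostnames (enableDnsHostnames) is disabled")
    return problems


def fetch_and_check(vpc_id, region):
    """Query AWS for both attributes of one VPC and check them."""
    # boto3 is imported lazily so the checker above works without it.
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    support = ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute="enableDnsSupport")
    hostnames = ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute="enableDnsHostnames")
    return vpc_dns_problems(support, hostnames)
```

An empty list from `fetch_and_check` suggests that both settings are enabled for the VPC.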
2. DHCP option set uses default domain name with one domain

When you create the VPC for the Data Warehouse service, ensure that the DHCP options set attached to the VPC specifies only one domain and that this domain is the default domain name:

domain-name = <region>.compute.internal;

You can verify the setting in the VPC Dashboard of the AWS Console. Click the DHCP options set ID of the default DHCP options set (always named "-" by AWS) to view details, including the associated domain:



A details page appears:



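As a rough illustration, the single-domain rule can be expressed as a check over one entry of an EC2 `describe_dhcp_options` response. This is a sketch only: the function names are hypothetical, and the expected value simply follows the `<region>.compute.internal` pattern given above. AWS uses `ec2.internal` as the default domain name in us-east-1, so the expected value would need adjusting there.

```python
def dhcp_values(dhcp_options, key):
    """Collect the values stored under `key` in one DhcpOptions entry."""
    values = []
    for config in dhcp_options.get("DhcpConfigurations", []):
        if config.get("Key") == key:
            values.extend(v["Value"] for v in config.get("Values", []))
    return values


def domain_name_problems(dhcp_options, region):
    """Return problems with the domain-name option, or an empty list."""
    # Assumption: the default follows the pattern shown in this checklist.
    expected = f"{region}.compute.internal"
    # A single value may itself hold several space-separated domains.
    domains = " ".join(dhcp_values(dhcp_options, "domain-name")).split()
    if len(domains) != 1:
        return [f"expected exactly one domain, found {len(domains)}: {domains}"]
    if domains[0] != expected:
        return [f"domain-name is {domains[0]!r}, expected {expected!r}"]
    return []
```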
3. DHCP option set uses AmazonProvidedDNS

When you create the VPC for the Data Warehouse service, AWS automatically creates a set of DHCP options and associates them with the VPC. This set of options specifies the Amazon DNS Server as the default domain name server:

domain-name-servers = AmazonProvidedDNS;

Use this setting for all VPCs that you use with the Data Warehouse service. You can verify it on the same DHCP options set details page in the AWS Console VPC Dashboard described above.
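The DNS server setting can be verified in a similar way. The sketch below assumes boto3 and configured AWS credentials, and its function names are hypothetical. Note that the AWS API names this option `domain-name-servers`. The fetch helper assumes the VPC has a DHCP options set associated with it.

```python
def uses_amazon_provided_dns(dhcp_options):
    """True if the only configured name server is AmazonProvidedDNS."""
    servers = []
    for config in dhcp_options.get("DhcpConfigurations", []):
        if config.get("Key") == "domain-name-servers":
            servers.extend(v["Value"] for v in config.get("Values", []))
    return servers == ["AmazonProvidedDNS"]


def fetch_dhcp_options(vpc_id, region):
    """Fetch the DHCP options set currently associated with a VPC."""
    # boto3 is imported lazily so the checker above works without it.
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    options_id = ec2.describe_vpcs(VpcIds=[vpc_id])["Vpcs"][0]["DhcpOptionsId"]
    response = ec2.describe_dhcp_options(DhcpOptionsIds=[options_id])
    return response["DhcpOptions"][0]
```

For example, `uses_amazon_provided_dns(fetch_dhcp_options(vpc_id, region))` would return False for a VPC whose options set points at custom DNS servers.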

4. Ensure the correct subnets in VPC are specified

When you activate an environment for the Data Warehouse service, ensure that the correct subnets are selected. If the VPC contains more than three private subnets, only the top three are selected, and these might not be the subnets you intend to use for the Data Warehouse service.

5. Ensure private subnets have outbound internet connectivity

Your private subnets must have outbound internet connectivity. Check the route tables of the private subnets to verify the internet routing. Worker nodes need this connectivity to download Docker images for Kubernetes, to exchange billing and metering information, and to perform API server registration. For more information, see AWS Outbound Network Access Destinations.
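One way to review a route table is to look for an active default route and see where it points. The sketch below works on a single entry from the `RouteTables` list returned by the EC2 `describe_route_tables` call. The function names and the three labels are hypothetical, and an active default route is only a hint: it does not prove that the destinations the worker nodes need are actually reachable.

```python
def default_route_target(route_table):
    """Return the target ID of the active 0.0.0.0/0 route, or None."""
    target_keys = (
        "NatGatewayId",
        "TransitGatewayId",
        "GatewayId",
        "NetworkInterfaceId",
        "InstanceId",
    )
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("State") != "active":
            continue  # skip blackhole routes
        for key in target_keys:
            if route.get(key):
                return route[key]
    return None


def classify_outbound(route_table):
    """Label a route table by how (or whether) it routes to the internet."""
    target = default_route_target(route_table)
    if target is None:
        return "no-default-route"
    if target.startswith("igw-"):
        # Direct internet gateway route: this is a public subnet.
        return "public"
    return "private-with-egress"
```

A private subnet that is classified as `no-default-route` is a likely candidate for failed image downloads during activation.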

6. Ensure the AWS Security Token Service (STS) is activated

To successfully activate an environment in the Data Warehouse service, you must ensure that AWS STS is activated for the region in which your environment is located:

  1. On the AWS Management Console home page, select IAM under Security, Identity, & Compliance.
  2. In the Identity and Access Management (IAM) dashboard, select Account settings in the left navigation menu.
  3. On the Account settings page, scroll down to the section for Security Token Service (STS).
  4. In the Endpoints section, locate the region in which your environment is located and make sure that the STS service is activated.
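As a complement to the console steps above, the regional STS endpoint can be probed directly. This is a hedged sketch, assuming boto3 and valid AWS credentials; the function names are hypothetical, and the endpoint pattern applies to standard commercial regions only. A failed probe can also be caused by missing credentials or network restrictions, so treat it as a prompt to check the console rather than as proof that STS is deactivated.

```python
def sts_regional_endpoint(region):
    """Build the regional STS endpoint URL for a standard AWS region."""
    return f"https://sts.{region}.amazonaws.com"


def probe_sts(region):
    """Call GetCallerIdentity against the regional endpoint.

    Returns (True, "") on success, or (False, error message) on failure.
    """
    # Imported lazily so the URL helper above works without boto3.
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    client = boto3.client(
        "sts",
        region_name=region,
        endpoint_url=sts_regional_endpoint(region),
    )
    try:
        client.get_caller_identity()
    except (BotoCoreError, ClientError) as error:
        return False, str(error)
    return True, ""
```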

Prerequisite for enabling a private EKS API server (Preview)

By enabling a private EKS API server, you ensure that the EKS cluster is set up with only the private endpoint enabled, which prevents public access to your EKS API server from the internet. To set up the Amazon Elastic Kubernetes Service (EKS) cluster in private mode and to enable the private EKS API server, ensure that the Data Lake cluster is created with Cluster Connectivity Manager version 2 (CCMv2) enabled.

You must also run the following CDP CLI command:
cdp dw create-cluster --environment-crn crn:cdp:environments:us-west-1:XXXX --use-private-load-balancer --aws-options enablePrivateEKS=true,workerSubnetIds=privatesubnet-1,privatesubnet-2,privatesubnet-3,lbSubnetIds=privatesubnet-XXX --profile dev
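Because the `--aws-options` value packs several comma-separated lists into one string, it can be easier to assemble the command from separate lists of subnet IDs. The helper below is a hypothetical convenience sketch in Python, not part of the CDP CLI. It uses only the flags and option keys shown in the command above and prints the command rather than running it.

```python
import shlex


def build_create_cluster_command(environment_crn, worker_subnet_ids,
                                 lb_subnet_ids, profile=None):
    """Assemble the cdp dw create-cluster argument list for private EKS."""
    aws_options = ",".join([
        "enablePrivateEKS=true",
        "workerSubnetIds=" + ",".join(worker_subnet_ids),
        "lbSubnetIds=" + ",".join(lb_subnet_ids),
    ])
    command = [
        "cdp", "dw", "create-cluster",
        "--environment-crn", environment_crn,
        "--use-private-load-balancer",
        "--aws-options", aws_options,
    ]
    if profile:
        command += ["--profile", profile]
    return command


if __name__ == "__main__":
    # Placeholder values copied from the example command above.
    argv = build_create_cluster_command(
        "crn:cdp:environments:us-west-1:XXXX",
        ["privatesubnet-1", "privatesubnet-2", "privatesubnet-3"],
        ["privatesubnet-XXX"],
        profile="dev",
    )
    print(shlex.join(argv))
```

Returning a list rather than a single string means the result can be passed to `subprocess.run` without extra shell quoting.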