Enabling Public Endpoint Access Gateway

You can enable Public Endpoint Access Gateway during AWS environment registration after enabling Cluster Connectivity Manager (CCM).

During environment registration via the CDP web interface, you can optionally enable Public Endpoint Access Gateway. Once enabled, the gateway is used for the Data Lake and all the Data Hubs within the environment; it cannot be enabled for an individual Data Lake or Data Hub. Once it is enabled for an environment, it cannot be disabled. The gateway can be used either with an existing VPC or with a new VPC created by CDP.

Prerequisites

  • If you choose to enable Public Endpoint Access Gateway, CDP will create two AWS network load balancers (AWS NLB) per cluster (that is, for each Data Lake and each Data Hub). Make sure that your AWS NLB limits allow for the load balancer creation.
  • If you are using your existing network, you should have at least two public subnets in the VPC that you would like to use for CDP. The number of public subnets must be equal to the number of private subnets, and the availability zones of the public and private subnets must match.
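
The two prerequisites above can be checked before registration with a small script. The following is a minimal sketch, not part of CDP: the function names and the input shape (a mapping from subnet ID to availability zone) are hypothetical, and it assumes that "must match" means the public and private subnets cover the same availability zones the same number of times.

```python
from collections import Counter

# From the prerequisite above: CDP creates two AWS NLBs per cluster.
NLBS_PER_CLUSTER = 2


def required_nlbs(data_hub_count, data_lake_count=1):
    """Number of NLBs needed for one environment (hypothetical helper)."""
    return NLBS_PER_CLUSTER * (data_lake_count + data_hub_count)


def check_gateway_subnets(private_azs, public_azs):
    """Return a list of problems; an empty list means the checks passed.

    private_azs / public_azs map a subnet ID to its availability zone.
    This input shape is an assumption made for the example.
    """
    problems = []
    if len(public_azs) < 2:
        problems.append("need at least two public subnets")
    if len(public_azs) != len(private_azs):
        problems.append("public and private subnet counts differ")
    # Assumption: "match" means the same zones with the same multiplicity.
    if Counter(public_azs.values()) != Counter(private_azs.values()):
        problems.append("availability zones do not match")
    return problems
```

For example, an environment with one Data Lake and three Data Hubs would need `required_nlbs(3)` load balancers, which is eight under the two-per-cluster rule.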

Steps: CDP web interface

When registering your AWS environment, make sure to do the following:

  1. On the Region, Networking, Security and Storage page, select your existing VPC or select to have a new VPC created.
  2. If you selected an existing VPC, select at least two existing private subnets (or at least three subnets if you would like to provision Data Warehouse instances).
  3. Verify that the Enable Cluster Connectivity Manager option is selected. It is enabled by default and allows communication via private subnets.
  4. Click Enable Public Endpoint Access Gateway. This makes the UIs and APIs of the Data Lake and Data Hub clusters accessible over the internet.
  5. If you selected an existing VPC, under Select Endpoint Access Gateway Subnets, select the public subnets for which you would like to use the gateway. The number of subnets must be the same as under Select Subnets and the availability zones must match.
  6. Under Security Access Settings, restrict access so that connections are accepted only from your external network range.
  7. Finish registering your environment.
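
Because the gateway exposes cluster UIs and APIs over the internet, it can be worth confirming that the CIDR entered under Security Access Settings really falls inside your external network range. The following is a minimal sketch using Python's standard `ipaddress` module; the function name and the example ranges are hypothetical.

```python
import ipaddress


def is_restricted(cidr, allowed_range):
    """True if cidr lies entirely inside allowed_range (hypothetical helper)."""
    net = ipaddress.ip_network(cidr, strict=True)
    allowed = ipaddress.ip_network(allowed_range, strict=True)
    # subnet_of() raises TypeError for mixed IPv4/IPv6, so compare versions first.
    return net.version == allowed.version and net.subnet_of(allowed)
```

With an assumed external range of 203.0.113.0/24, `is_restricted("203.0.113.16/28", "203.0.113.0/24")` is true, while the open range 0.0.0.0/0 is rejected.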

Steps: CDP CLI

During environment registration via the CDP CLI, you can optionally enable Public Endpoint Access Gateway using the following CLI parameters:

--endpoint-access-gateway-scheme PUBLIC 
--endpoint-access-gateway-subnet-ids subnet-0232c7711cd864c7b subnet-05d4769d88d875cda 

The first parameter enables the gateway and the second one specifies the subnets that the gateway uses. The number of subnets must be the same as under --subnet-ids and the availability zones must match. For example:

cdp environments create-aws-environment \
--environment-name gk1dev \
--credential-name gk1cred \
--region "us-west-2" \
--security-access cidr=0.0.0.0/0 \
--authentication publicKeyId="gk1" \
--log-storage storageLocationBase=s3a://gk1priv-cdp-bucket,instanceProfile=arn:aws:iam::152813717728:instance-profile/mock-idbroker-admin-role \
--vpc-id vpc-037c6d94f30017c24 \
--subnet-ids subnet-0232c7711cd864c7b subnet-05d4769d88d875cda \
--endpoint-access-gateway-scheme PUBLIC \
--endpoint-access-gateway-subnet-ids subnet-0232c7711cd864c7b subnet-05d4769d88d875cda \
--free-ipa instanceCountByGroup=1
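
If you assemble the command from a script, the subnet-count rule can be enforced before the CLI is called. The following is a minimal sketch: the helper name is hypothetical, only the two gateway parameters shown above are generated, and the availability zone rule is not checked because it cannot be derived from subnet IDs alone.

```python
def gateway_cli_args(subnet_ids, gateway_subnet_ids):
    """Build the two gateway parameters as an argument list (hypothetical helper)."""
    # The number of gateway subnets must equal the number under --subnet-ids.
    if len(gateway_subnet_ids) != len(subnet_ids):
        raise ValueError(
            "expected %d gateway subnets, got %d"
            % (len(subnet_ids), len(gateway_subnet_ids))
        )
    return [
        "--endpoint-access-gateway-scheme", "PUBLIC",
        "--endpoint-access-gateway-subnet-ids", *gateway_subnet_ids,
    ]
```

The returned list can be appended to the rest of the `cdp environments create-aws-environment` arguments, for example when invoking the CLI through `subprocess.run`.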

The equivalent fragment of the CLI JSON for an environment request looks like this:

"endpointAccessGatewayScheme": "PUBLIC",
"endpointAccessGatewaySubnetIds": [
    "subnet-0232c7711cd864c7b",
    "subnet-05d4769d88d875cda"
],