Step 2) Register a CDP Environment
Before you register an environment, create the IAM roles and policies that CDP needs to operate securely. For background information, a description of what we're building and why can be found here. For this quickstart, we'll use CloudFormation to set all of this up for you.
- Download the provided CloudFormation template here.
- In the AWS console, deploy the CloudFormation template:
- In AWS Services, search for CloudFormation.
- Click Create Stack.
- Select Template is ready and then Upload a template file.
- Click Choose file and select the CloudFormation template that you downloaded.
- Click Next.
- Enter a stack name. The name can be any valid name.
Under Parameters, complete the following fields:
- S3BucketName: Choose an unused bucket name. CDP will create the bucket for you.
- AWSAccount: Your 12-digit AWS account ID number, which can be found here.
- Prefix: A short prefix of your choosing, which will be added to the names of the IAM resources we'll be creating.
Make a note of the S3BucketName and prefix that you define. You will need them later.
- Click Next.
- At the Configure Stack Options page, click Next.
- At the bottom of the Review page, under Capabilities, click the checkbox next to I acknowledge that AWS CloudFormation might create IAM resources with custom names, as that is exactly what this template does.
- Click Create Stack.
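The three stack parameters can be sanity-checked before you deploy. The following is a minimal Python sketch, assuming the parameter keys are spelled exactly as in the steps above; the function name and the sample values are hypothetical and not part of the quickstart. The returned list has the key/value shape that the AWS SDK for Python accepts for the Parameters argument of its CloudFormation create_stack call, where the CAPABILITY_NAMED_IAM capability corresponds to the acknowledgment checkbox.

```python
import re
from typing import Dict, List


def build_stack_parameters(bucket_name: str, account_id: str, prefix: str) -> List[Dict[str, str]]:
    """Validate the three template parameters and return them as key/value pairs."""
    # AWS account IDs are exactly 12 digits.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("AWSAccount must be a 12-digit AWS account ID")
    # General-purpose S3 bucket names are 3-63 characters of lowercase letters,
    # digits, dots, and hyphens, and begin and end with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", bucket_name):
        raise ValueError("S3BucketName is not a valid S3 bucket name")
    if not prefix:
        raise ValueError("Prefix must not be empty")
    values = {"S3BucketName": bucket_name, "AWSAccount": account_id, "Prefix": prefix}
    return [{"ParameterKey": key, "ParameterValue": value} for key, value in values.items()]


# Hypothetical sample values; substitute your own.
params = build_stack_parameters("my-cdp-quickstart-bucket", "123456789012", "cdpqs")
for param in params:
    print(param["ParameterKey"], "=", param["ParameterValue"])
```

The sketch only checks the format of the values. It cannot tell whether the bucket name is already taken or whether the account ID is really yours.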
- Still in the AWS console, create an SSH key in the region of your choice. If there is already an SSH key in your preferred region that you'd like to use, you can skip these steps.
- In AWS Services, search for EC2.
- In the top right corner, verify that you are in your preferred region.
- In the left-hand navigation bar, choose Key Pairs.
- On the top right of the screen, select Create Key Pair.
- Provide a name and choose the pem format. The name can be any valid name.
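If you prefer the command line to the EC2 console, the same key pair can be created with the AWS CLI. The sketch below only assembles the command so that you can review it before running it; it assumes the AWS CLI is installed and configured, and the helper name, key name, and region are hypothetical examples.

```python
import shlex
from typing import List


def create_key_pair_command(key_name: str, region: str) -> List[str]:
    """Build the AWS CLI arguments that create a pem-format EC2 key pair."""
    if not key_name or not region:
        raise ValueError("key_name and region are both required")
    return [
        "aws", "ec2", "create-key-pair",
        "--key-name", key_name,
        "--key-format", "pem",
        "--region", region,
        # Print only the private key so it can be redirected to a .pem file.
        "--query", "KeyMaterial",
        "--output", "text",
    ]


# Hypothetical key name and region; use the region you will pick for the environment.
command = create_key_pair_command("cdp-quickstart-key", "us-west-2")
print(shlex.join(command))
```

When you run the printed command, redirect its output to a file and restrict that file's permissions, because AWS returns the private key only once.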
- Return to the CDP Management Console and navigate to .
- Provide an environment name and description. The name can be any valid name.
- Choose Amazon as the cloud provider.
- Under Amazon Web Services Credential, choose the credential that you created earlier and click Next.
- Under Data Lake Settings, give your new data lake a name. The name can be any valid name. Choose the latest data lake version.
- For Data Lake Scale, choose Light Duty and click Next.
- Choose your desired region. This should be the same region in which you created the SSH key.
- Under Select Network, choose Create New Network.
- Under Security Access Settings, choose Create New
- Under SSH Settings, choose the SSH key you created earlier.
- Under Logs - Storage and Audit:
- Choose the Instance Profile titled prefix-log-access-instance-profile, where "prefix" is the prefix you defined in the Parameters section of the stack details in AWS.
- For Logs Location Base, choose S3BucketName/logs, where S3BucketName is the bucket name you defined in the Parameters section of the stack details in AWS.
- For Ranger Audit Role, choose prefix-ranger-audit-role, where "prefix" is the prefix you defined in the Parameters section of the stack details in AWS.
- Under Data Access, choose prefix-data-access-instance-profile. For Storage Location Base, enter the S3BucketName that you defined. For Data Access Role, choose prefix-datalake-admin-role.
- Optionally, under Add Tags, provide any tags that you'd like the resources to be tagged with in your AWS account.
- Under Enable S3 Guard, enter prefix-dynamodb-table.
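Every value entered in the storage, audit, and data access steps above is derived from the Prefix and S3BucketName that you noted earlier. The following Python sketch prints them together so they can be copied into the form; the function name, the dictionary labels, and the sample inputs are hypothetical, while the name patterns are the ones listed in the steps.

```python
from typing import Dict


def environment_field_values(prefix: str, bucket_name: str) -> Dict[str, str]:
    """Derive the values for the registration form from the two stack parameters."""
    if not prefix or not bucket_name:
        raise ValueError("prefix and bucket_name are both required")
    return {
        # Logs - Storage and Audit
        "Logs instance profile": f"{prefix}-log-access-instance-profile",
        "Logs Location Base": f"{bucket_name}/logs",
        "Ranger Audit Role": f"{prefix}-ranger-audit-role",
        # Data Access
        "Data Access instance profile": f"{prefix}-data-access-instance-profile",
        "Storage Location Base": bucket_name,
        "Data Access Role": f"{prefix}-datalake-admin-role",
        # Enable S3 Guard
        "S3 Guard table": f"{prefix}-dynamodb-table",
    }


# Hypothetical sample inputs; use the Prefix and S3BucketName from your stack.
for field, value in environment_field_values("cdpqs", "my-cdp-quickstart-bucket").items():
    print(f"{field}: {value}")
```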