Register an AWS environment

Once you’ve met the cloud provider requirements, register your AWS environment.

Before you begin

This assumes that you have already fulfilled the environment prerequisites described in AWS environment prerequisites.

Steps

  1. Navigate to the Management Console > Environments > Register environment.
  2. On the Register Environment page, provide the following information:
    Parameter: Description
    General Information
    Environment Name: Enter a name for your environment. This name will be used to refer to this environment in CDP.
    Description (Optional): Enter a description for your environment.
    Select Cloud Provider: Select Amazon.
    Credential
    Select Credential: Select an existing credential or select Create new credential.

    For instructions on how to create a credential, refer to Creating a role-based credential.

  3. Click Next.
  4. On the Data Lake Settings page, provide the following information:
    Parameter: Description
    Data Lake Cluster Name: Enter a name for the Data Lake cluster that will be created for this environment. This name will be used to refer to this Data Lake in CDP and on AWS (for example, in the CloudFormation console and EC2 console).
    Scale: Select the Data Lake scale. By default, “Light Duty” is used.

    For more information on Data Lake scale, refer to Data Lake scale.

  5. Click Next.
  6. On the Region, Networking and Storage page, provide the following information:
    Parameter: Description
    Region
    Select Region: Select the region that you would like to use for CDP.

    If you would like to use a specific existing virtual network, the virtual network must be located in the selected region.

    Network
    Select VPC: You have two options:
    • Select the existing virtual network where you would like to provision clusters and/or other resources. Refer to VPC and subnet for requirements.

    • Select Create new network to have a new network with three subnets created. One subnet is created for each availability zone, assuming three AZs per region. If a region has two AZs instead of three, three subnets are still created, two of them in the same AZ.

    Select Subnet: This option is only available if you choose to use an existing network. Multiple subnets must be selected, as described in VPC and subnet, and CDP distributes resources evenly across the subnets.
    Network CIDR: This option is only available if you select to create a new network.

    If you selected to create a new network, provide a Network CIDR that determines the range of private IPs that EC2 instances will use. This must be a valid private IPv4 CIDR range.

    For example, 10.10.0.0/16 is a valid CIDR. A /16 prefix is required to allow for enough IP addresses.

    Security Access Settings
    Select Security Access Type: This determines the inbound security group settings that allow connections to the Data Lake and Data Hub clusters from your organization’s computers. You have two options:
    • Create new security groups - Allows you to provide a custom CIDR IP range for all new security groups that will be created for the Data Lake and Data Hub clusters so that users from your organization can access cluster UIs and SSH to the nodes.

      This must be a valid IPv4 CIDR range. For example, 192.168.27.0/24 allows access from 192.168.27.0 through 192.168.27.255.

      If you use this setting, several security groups are created: one for each Data Lake host group, one for each FreeIPA host group, and one for RDS. Furthermore, the security group settings specified are automatically used for Data Hub, Data Warehouse, and Machine Learning clusters created as part of the environment.
    • Provide existing security groups (Only available for an existing VPC) - Allows you to select two existing security groups, one for Knox-installed nodes and another for all other nodes. If you select this option, refer to Security groups to ensure that you open all ports required for your users to access environment resources.

    SSH Settings
    New or existing SSH public key: You have two options for providing a public SSH key:
    • Select a key that already exists on your AWS account within the specific region that you would like to use.

    • Upload a public key directly from your computer.

    Logs Storage and Audits
    Logs Storage and Audits: Select the S3 location and IAM roles created in Minimal setup for cloud storage.
    Data Access
    Data Access: Select the S3 location and IAM roles created in Minimal setup for cloud storage.
    Enable S3Guard
    DynamoDB Table Name: Provide a name for a DynamoDB table that will be created automatically, or select an existing DynamoDB table that meets the requirements described in DynamoDB table. The table is used by S3Guard to provide a consistent view of the S3 object store.

    If you choose to specify a table name, ensure that a table with the specified name does not already exist within the selected region on your AWS account.

    If you choose to create your own table, ensure that no other environment is currently using this table.

  7. Click Register Environment to trigger environment registration.
  8. The environment creation takes about 60 minutes. The creation of the FreeIPA server and Data Lake cluster is triggered. You can monitor the progress from the web UI. Once the environment creation has been completed, its status will change to “Running”.
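
The CIDR values entered in step 6 can be checked before you register the environment. The following is a minimal Python sketch and is not part of CDP. It assumes that "private" means the RFC 1918 ranges, and the function names check_network_cidr and access_range are hypothetical helpers introduced only for illustration.

```python
import ipaddress

# Assumption: "private" is taken to mean the RFC 1918 ranges.
RFC1918_RANGES = [
    ipaddress.IPv4Network("10.0.0.0/8"),
    ipaddress.IPv4Network("172.16.0.0/12"),
    ipaddress.IPv4Network("192.168.0.0/16"),
]


def check_network_cidr(cidr):
    """Return the parsed Network CIDR, or raise ValueError if it is unsuitable."""
    # IPv4Network raises a ValueError subclass for malformed or non-IPv4 input.
    network = ipaddress.IPv4Network(cidr)
    if not any(network.subnet_of(private) for private in RFC1918_RANGES):
        raise ValueError(f"{cidr} is not a private IPv4 range")
    if network.prefixlen != 16:
        raise ValueError(f"{cidr} must use a /16 prefix")
    return network


def access_range(cidr):
    """Return the first and last address covered by a security access CIDR."""
    network = ipaddress.IPv4Network(cidr)
    return str(network[0]), str(network[-1])


print(check_network_cidr("10.10.0.0/16"))  # prints 10.10.0.0/16
print(access_range("192.168.27.0/24"))     # prints ('192.168.27.0', '192.168.27.255')
```

The second call reproduces the range quoted for the security access example: 192.168.27.0/24 covers 192.168.27.0 through 192.168.27.255.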
Alternatively, you can register the environment by using the CDP CLI:

  1. Navigate to the Management Console > Environments > Register environment and provide all UI parameters.
  2. On the last page of the environment wizard, use SHOW ENVIRONMENT CLI COMMAND to obtain the JSON template. This also provides the command to use.
  3. Save the template as a JSON file on your computer.
  4. In the JSON file, remove the environment name so that it is set to a blank value:
    "name":" "
  5. Use the command provided in the UI to create the environment.
  6. Next, create a data lake using the following command:
    cdp datalake create-aws-datalake --datalake-name <value> --environment-name <value>

    You can use CLI help to obtain the correct template for the CLI request.

  7. The environment creation takes about 60 minutes. The creation of the FreeIPA server and Data Lake cluster is triggered. You can monitor the progress from the web UI. Once the environment creation has been completed, its status will change to “AVAILABLE” when you list environments:
    cdp environments list-environments
    {
        "environments": [
            {
                "environmentName": "domi-test-env",
                "crn": "crn:altus:environments:us-west-1:9d74eee4-1cad-45d7-b641-7ccf9edbb71d:environment:cf1fc7cf-e6bd-479e-a117-992df9d7f202",
                "status": "AVAILABLE",
                "region": "eu-central-1",
                "cloudPlatform": "AWS",
                "credentialName": "domi-test-cred",
                "description": ""
            }
        ]
    }
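
If you script the wait in step 7, the status can be read from the command output instead of checked by eye. The following is a minimal Python sketch under stated assumptions: it assumes the output of cdp environments list-environments is a JSON object whose top-level "environments" key holds entries like the one shown above, and environment_status is a hypothetical helper, not part of the CDP CLI.

```python
import json


def environment_status(list_output, environment_name):
    """Return the status of the named environment, or None if it is not listed.

    list_output is the JSON text printed by `cdp environments list-environments`.
    Assumption: the entries sit under a top-level "environments" key.
    """
    data = json.loads(list_output)
    for env in data.get("environments", []):
        if env.get("environmentName") == environment_name:
            return env.get("status")
    return None


# Sample input trimmed from the output shown in step 7.
sample_output = """
{
    "environments": [
        {
            "environmentName": "domi-test-env",
            "status": "AVAILABLE",
            "region": "eu-central-1",
            "cloudPlatform": "AWS"
        }
    ]
}
"""

print(environment_status(sample_output, "domi-test-env"))  # prints AVAILABLE
```

A polling script could call the command at an interval, pass its output to this helper, and stop once the returned value is "AVAILABLE".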

After you finish

After your environment is running, perform the following steps: