AWS requirements for DataFlow

As the administrator for your AWS environment, ensure that the environment meets the requirements for CDP and DataFlow. Then set up your AWS cloud credential and register the environment.

Follow these steps to ensure that your AWS environment meets the CDP and DataFlow requirements:

Understand your AWS account requirements for CDP

  • Review the CDP AWS account requirements. The link is in the Related information section below.

  • Verify that your AWS account for CDP has the required resources.

  • Verify that you have the permissions to manage these resources.

Understand the DataFlow requirements

  • Verify that the following services are available in your environment for DataFlow to use:

    • Network – Amazon VPC
    • Compute – Amazon Elastic Kubernetes Service (EKS)
    • Load Balancing – Amazon ELB Classic Load Balancer
    • Persistent Instance Storage – Amazon Elastic Block Store (EBS)
    • Database – Amazon Relational Database Service (RDS)
  • Determine your networking option:

    • Use your own VPC
    • Allow CDP to create a VPC

    To understand each option, see: DataFlow Networking. The link is in the Related information section below.

  • Regions:

    • Select a CDP-supported region in which Amazon Elastic Kubernetes Service (EKS) is also available.

      For more information, see: CDP Supported AWS regions and the Region Table in AWS Regional Services. The links are in the Related information section below.

  • Ports and outbound network access:

    • Review the port requirements for the CDP default security group. See: CDP Management Console - Security groups. The link is in the Related information section below.
    • Configure ports for NiFi to access your source and destination systems in the data flow.
    • If you use a firewall or security group settings to restrict egress from the workspace, ensure that the outbound destinations required by DataFlow remain reachable. For more information, see Outbound network access destinations for AWS. The link is in the Related information section below.
    • If egress to these destinations is blocked, autoscaling cannot pull new images, and the new instances end up with broken pods.

      Follow the security group settings that AWS lists as recommended and as the minimum required. For more information, see Amazon EKS security group considerations. The link is in the Related information section below.
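
The region and outbound-access checks above can be sketched in Python. This is a minimal illustration, not part of any CDP tooling: the function names are hypothetical, and the region lists and destination hostnames are inputs you must take from the documents listed in the Related information section.

```python
import socket
from typing import Callable, Iterable, List, Tuple

Destination = Tuple[str, int]  # (hostname, port)


def candidate_regions(cdp_regions: Iterable[str],
                      eks_regions: Iterable[str]) -> List[str]:
    """Return the regions that appear in both lists, sorted by name.

    Both lists are assumptions supplied by the caller; copy them from the
    CDP supported-regions page and the AWS Regional Services table.
    """
    return sorted(set(cdp_regions) & set(eks_regions))


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try to open a TCP connection; True means the destination is reachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_destinations(
    destinations: Iterable[Destination],
    probe: Callable[[str, int], bool] = can_connect,
) -> List[Destination]:
    """Return the destinations that the probe could not reach."""
    return [(host, port) for host, port in destinations if not probe(host, port)]
```

Run unreachable_destinations from a host inside the subnet that the workspace will use, passing the (hostname, port) pairs from the outbound destinations list. An empty result means that every destination accepted a TCP connection. A TCP check does not prove that a proxy or TLS inspection layer lets image pulls through, so treat it as a first pass only.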

Set up an AWS Cloud credential

Create a role-based AWS credential that allows CDP to authenticate with your AWS account and has authorization to provision AWS resources on your behalf. Role-based authentication uses an IAM role with an attached IAM policy that has the minimum permissions required to use CDP.

To set up an AWS Cloud credential, see Creating a role based provisioning credential for AWS. The link is in the Related information section below.

After you have created the IAM role and its policy, register the role in CDP as a cloud credential. Reference this credential when you register an AWS environment in CDP, as described in the next step.
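
To make the idea of a role-based credential more concrete, the following Python sketch builds an IAM trust policy document. It assumes the standard AWS cross-account pattern, in which the role trusts another AWS account and requires an external ID. The function name, the account ID, and the external ID are hypothetical placeholders; use the values that CDP displays when you create the credential, and follow the linked procedure for the actual permissions policy.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build a cross-account trust policy for an IAM role.

    trusted_account_id: 12-digit AWS account ID allowed to assume the role
                        (placeholder here; not a real CDP value).
    external_id:        shared identifier the caller must present
                        (placeholder here; not a real CDP value).
    """
    if not (trusted_account_id.isdigit() and len(trusted_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    if not external_id:
        raise ValueError("external_id must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def to_json(policy: dict) -> str:
    """Serialize the policy so it can be pasted into the IAM console or a file."""
    return json.dumps(policy, indent=2)
```

For example, to_json(build_trust_policy("123456789012", "example-external-id")) returns a JSON document for the role's trust relationship. The permissions policy attached to the role is separate and is not covered by this sketch.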

Register an AWS environment in CDP

A CDP user must have the PowerUser role in order to register an environment. An environment determines the specific cloud provider region and virtual network in which resources can be provisioned, and includes the credential that should be used to access the cloud provider account.

To register an AWS environment in CDP, see CDP AWS Environments. The link is in the Related information section below.