Accessing buckets in a different AWS account under a managed policy

You can add read/write access to S3 buckets that belong to an AWS account other than the one that hosts the Cloudera Data Warehouse (CDW) cluster.

To give the CDW service cluster access to an S3 bucket in a different AWS account, you must first configure that bucket to allow access from the CDW cluster account. Then, you configure the CDW service account to access the bucket. You perform both of these tasks in the AWS Management Console.

Required role: DWAdmin

To configure access to external S3 buckets for your CDW cluster, you must edit the managed policy attached to the AWS instance profile.

You use the CDW cluster ID in Step 3 below.

  1. In the AWS Management Console, navigate to S3, locate the bucket that you added in the other AWS account, and then click the bucket name.
  2. On the bucket details page, click the Permissions tab, and then click the Bucket Policy sub-tab.
  3. On the Bucket Policy sub-tab, in the Bucket policy editor, add the CDW cluster ID and the permissions that you want the CDW service account to have for this bucket:

    This example policy includes the following specifications:

    • The Action section specifies what actions the Principal can perform.
    • The Resource section specifies the S3 bucket you added and want your CDW cluster to be able to access.

    For details about bucket policies, see Managing Access to Amazon S3 Buckets Using Bucket Policies in the AWS documentation.

  4. Click Save.
  5. Note the ARN of the managed policy that is attached to the Node Instance Role. This is the role that was used when the cluster was activated.
  6. Open the managed policy JSON file, for example noderole-inline-policy.json, for editing.
  7. Locate the statement with the Sid "putgetmybucketpaths".
  8. Append resources to the "Resource" section for the buckets you added.
    For example, to add access to the more-sales-data bucket, append two entries to the end of the "Resource" section: one for the bucket itself and one for the objects in it, as shown in the last two resource names:
    "Resource":[
                ...
                "arn:aws:s3:::roohi-dl-bucket/backup/*",
                "arn:aws:s3:::more-sales-data",
                "arn:aws:s3:::more-sales-data/*"
                ],
  9. Click Review policy in the lower right corner of the page, and then click Save changes.

You can now access the new bucket from your CDW service cluster. For example, you can create external Hive tables that point to the bucket.
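
If you prefer to prepare the edited policy document outside the console, the change described in Steps 7 and 8 can be sketched in Python. This is a minimal illustration, not part of the documented procedure. It assumes the policy uses the standard IAM JSON layout with a "Statement" list and that the Sid can be matched case-insensitively. The function name add_bucket_resources and the sample policy contents, including the placeholder actions, are hypothetical.

```python
import json


def add_bucket_resources(policy, bucket, sid="putgetmybucketpaths"):
    """Append the bucket ARN and its object ARN to the statement with the given Sid.

    `policy` is a dict parsed from the managed policy JSON. The Sid default
    comes from the steps above; the comparison ignores case because the
    exact casing in a real policy is an assumption here.
    """
    new_arns = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    statements = policy["Statement"]
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]

    for stmt in statements:
        if stmt.get("Sid", "").lower() == sid.lower():
            resources = stmt.get("Resource", [])
            if isinstance(resources, str):  # IAM allows a single string
                resources = [resources]
            for arn in new_arns:
                if arn not in resources:  # avoid duplicates on re-run
                    resources.append(arn)
            stmt["Resource"] = resources
            return policy

    raise KeyError(f"no statement with Sid {sid!r}")


# Hypothetical sample policy; the actions are placeholders, not from the source.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "putgetmybucketpaths",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": ["arn:aws:s3:::roohi-dl-bucket/backup/*"],
        }
    ],
}

add_bucket_resources(sample_policy, "more-sales-data")
print(json.dumps(sample_policy, indent=2))
```

The printed JSON can then be pasted into the policy editor before you click Review policy. Review the result against your existing policy first, because the sketch only touches the "Resource" list of the matching statement.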