Adding an external S3 bucket to your CDW environment

If you try to access an external S3 bucket from the Hue web interface without first adding it to the CDW environment, Impala or Hive may return an “AccessDeniedException 403” error. Make sure that your Cloudera Data Warehouse (CDW) environment has access to the S3 buckets that you want to use from Hue.

When you create a Virtual Warehouse in the CDW service, a cluster is created in your AWS account. This cluster has two buckets: one for managed data and one for external data. Access to these two buckets is controlled by AWS instance profiles.

To add read/write access to external S3 buckets, whether they reside in the same AWS account as the CDW service cluster or in a different account, see the corresponding links in the Related information section.
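As a rough illustration of what read-only versus read/write access to a bucket can mean at the IAM level, the following Python sketch builds a policy document for a single bucket. The helper name build_bucket_policy, the chosen action lists, and the bucket name in the usage note are assumptions made for this example; the policy that CDW actually attaches to the instance profile is not shown here and may differ.

```python
import json

# Assumed action sets for illustration only; the actions CDW grants may differ.
READ_ACTIONS = ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"]
WRITE_ACTIONS = ["s3:PutObject", "s3:DeleteObject"]


def build_bucket_policy(bucket_name, read_write=False):
    """Return an IAM policy document (as a dict) for one S3 bucket.

    Hypothetical helper: read-only by default, read/write when requested.
    """
    actions = READ_ACTIONS + (WRITE_ACTIONS if read_write else [])
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                # Bucket-level actions apply to the bucket ARN,
                # object-level actions to the objects under it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


def policy_as_json(bucket_name, read_write=False):
    """Render the policy as indented JSON text."""
    return json.dumps(build_bucket_policy(bucket_name, read_write), indent=2)
```

For example, policy_as_json("my-external-bucket") returns a read-only policy for a placeholder bucket name, and passing read_write=True adds the write actions.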

  1. Sign in to the CDP Management Console as an administrator.
  2. Go to Data Warehouse service > Environments and click the More… menu.
  3. Search for and locate the environment to which you want to add the S3 bucket, and click the edit icon.
    The Environment Details page is displayed.
  4. In the Add External S3 Bucket text box, specify the name of the S3 bucket that you want to configure access to.
    If the bucket belongs to another AWS account, then select the Bucket belongs to different AWS account option.
  5. Select the access mode.
    Read-only access is sufficient to import data in Hue.
  6. Click Add Bucket to save the configuration.
    A success message is displayed.
  7. Click APPLY to update the CDW environment.
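
If you want to confirm outside of Hue that a bucket exists and is reachable, a check along the following lines may help. This is a minimal sketch that assumes the boto3 library is installed and that AWS credentials are configured where the script runs; the function names are hypothetical. It reports only what the credentials running the script can see, not what the CDW cluster's instance profile can access, so the authoritative test remains accessing the bucket from Hue.

```python
def describe_status(code):
    """Map the error code returned for a HeadBucket call to a short message."""
    if code == "403":
        return "access denied"
    if code == "404":
        return "bucket not found"
    return f"unexpected error ({code})"


def check_bucket_access(bucket_name, s3_client=None):
    """Return a short status string for the given bucket.

    Hypothetical helper. If no client is passed, a boto3 S3 client is
    created from the default credential chain (assumes boto3 is installed).
    """
    if s3_client is None:
        import boto3  # imported lazily so the helper can be tested without it

        s3_client = boto3.client("s3")
    try:
        s3_client.head_bucket(Bucket=bucket_name)
    except Exception as exc:
        # botocore's ClientError carries the service response in `.response`;
        # anything without that attribute is re-raised unchanged.
        response = getattr(exc, "response", None)
        if not isinstance(response, dict):
            raise
        code = str(response.get("Error", {}).get("Code", ""))
        return describe_status(code)
    return "accessible"
```

Calling check_bucket_access("my-external-bucket"), where the bucket name is a placeholder, returns "accessible", "access denied", or "bucket not found" for the credentials in use.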