Configure S3 storage locations
After configuring access to S3 via an instance profile, you can optionally use an S3 bucket as a base storage location. This storage location is used mainly for the Hive Warehouse Directory, which stores the table data for managed tables.
Prerequisites
- You must have an existing bucket. For instructions on how to create a bucket on S3, refer to the AWS documentation.
- The instance profile that you configured must allow access to the bucket.
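As an illustration of the second prerequisite, the following minimal sketch builds an IAM policy document that grants access to a single bucket. The function name `build_bucket_policy`, the bucket name `my-test-bucket`, and the chosen set of actions are assumptions made for this example; the exact permissions your instance profile needs may differ, so treat this as a starting point rather than a required policy.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting basic access to one bucket.

    Hypothetical helper for illustration; the action list is an assumption,
    not a statement of what Cloudbreak requires.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-test-bucket" is a placeholder bucket name.
    print(json.dumps(build_bucket_policy("my-test-bucket"), indent=2))
```

The printed JSON could then be attached to the role behind the instance profile. Bucket-level and object-level actions are kept in separate statements because they apply to different resource ARNs.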
Steps
- When creating a cluster, on the Cloud Storage page in the advanced cluster wizard view, select Use existing instance profile and select the instance profile to use, as described in the documentation for configuring access to S3.
- Under Storage Locations, enable Configure Storage Locations by clicking the button.
- Provide your existing bucket name under Base Storage Location.
  Note: Make sure that the bucket already exists within the account.
- Under Path for Hive Warehouse Directory property (hive.metastore.warehouse.dir), Cloudbreak automatically suggests a location within the bucket. You may optionally update this path or select Do not configure.
  Note: This directory structure will be created in your specified bucket upon the first activity in Hive.
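To make the relationship between the bucket name and the warehouse path concrete, here is a minimal sketch that composes a value for hive.metastore.warehouse.dir from a bucket name. The helper name `warehouse_location`, the default sub-path `apps/hive/warehouse`, and the `s3a://` scheme are assumptions made for illustration; the location that Cloudbreak actually suggests may differ. The name check is a simplified form of the S3 bucket naming rules, not a complete validator.

```python
import re

# Simplified S3 bucket naming check (assumption: 3-63 characters, lowercase
# letters, digits, dots, and hyphens, starting and ending with a letter or digit).
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def warehouse_location(bucket_name: str, sub_path: str = "apps/hive/warehouse") -> str:
    """Compose a candidate value for hive.metastore.warehouse.dir.

    Hypothetical helper; the default sub_path is an assumed example,
    not necessarily the path that Cloudbreak suggests.
    """
    if not _BUCKET_NAME.fullmatch(bucket_name):
        raise ValueError(f"not a valid-looking bucket name: {bucket_name!r}")
    # Drop leading and trailing slashes so the pieces join cleanly.
    clean_path = sub_path.strip("/")
    return f"s3a://{bucket_name}/{clean_path}"


if __name__ == "__main__":
    # "my-test-bucket" is a placeholder bucket name.
    print(warehouse_location("my-test-bucket"))
    print(warehouse_location("my-test-bucket", "/custom/warehouse/"))
```

Validating the name before composing the path catches typos early, although it does not confirm that the bucket exists or that the instance profile can reach it.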