Configuring Cloud Data Access

Configure S3 storage locations

After configuring access to S3 via an instance profile, you can optionally use an S3 bucket as the base storage location. This location is used mainly for the Hive Warehouse Directory, which stores the table data for managed tables.


  • You must have an existing bucket. For instructions on how to create a bucket on S3, refer to AWS documentation.
  • The instance profile that you configured must allow access to the bucket.
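As an illustration of the second prerequisite, the following Python sketch builds a minimal IAM policy document that grants access to a single bucket. The function name bucket_access_policy and the specific set of S3 actions are assumptions made for this example; the exact permissions that Cloudbreak and Hive require are not listed here, so check them against the documentation for configuring access to S3 before attaching a policy to a role.

```python
import json


def bucket_access_policy(bucket: str) -> dict:
    """Build an illustrative IAM policy document for one S3 bucket.

    The action list below is an assumption for this sketch, not the
    definitive set of permissions required by Cloudbreak.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions are granted on the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions are granted on the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before use.
    print(json.dumps(bucket_access_policy("my-test-bucket"), indent=2))
```

The policy uses two statements because bucket-level and object-level actions apply to different resource ARNs.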


  1. When creating a cluster, on the Cloud Storage page in the advanced cluster wizard view, select Use existing instance profile and select the instance profile to use, as described in the documentation for configuring access to S3.
  2. Under Storage Locations, enable Configure Storage Locations by clicking the button.
  3. Provide your existing bucket name under Base Storage Location.

    Make sure that the bucket already exists within the account.

  4. Under Path for Hive Warehouse Directory (the hive.metastore.warehouse.dir property), Cloudbreak automatically suggests a location within the bucket. For example, if the bucket that you specified is my-test-bucket, the suggested location is my-test-bucket/apps/hive/warehouse. You can optionally update this path or select Do not configure.

    This directory structure will be created in your specified bucket upon the first activity in Hive.
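The way the suggested location in step 4 is derived from the bucket name can be sketched in Python as follows. The helper names suggested_warehouse_path and as_s3a_uri are hypothetical and are not part of Cloudbreak. The s3a:// scheme is also an assumption, based on the Hadoop S3A connector; the text above shows the location only as a bucket-relative path.

```python
def suggested_warehouse_path(bucket: str, suffix: str = "apps/hive/warehouse") -> str:
    """Return the bucket-relative location suggested for the Hive warehouse.

    Hypothetical helper: mirrors the example in step 4, where the bucket
    my-test-bucket yields my-test-bucket/apps/hive/warehouse.
    """
    # Normalize stray whitespace and slashes around the bucket name.
    bucket = bucket.strip().strip("/")
    if not bucket:
        raise ValueError("bucket name must not be empty")
    return f"{bucket}/{suffix}"


def as_s3a_uri(path: str) -> str:
    """Prefix a bucket-relative path with the s3a:// scheme (an assumption)."""
    return f"s3a://{path}"


if __name__ == "__main__":
    path = suggested_warehouse_path("my-test-bucket")
    print(path)              # my-test-bucket/apps/hive/warehouse
    print(as_s3a_uri(path))  # s3a://my-test-bucket/apps/hive/warehouse
```

If you update the path in the wizard, pass a different suffix to compute the matching location.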