Third-party object storage support for Cloudera Data Warehouse Private Cloud
Cloudera Data Warehouse (CDW) can access object storage such as AWS S3 if the CDP Private Cloud base cluster is configured to connect to the object store. You can query Hive and Impala tables stored on object stores using Hue.
By default, when you activate an environment in CDW, all the hadoop.fs.s3a configurations (fs.s3a.*) are copied from the core-site.xml file present on the base cluster to the hadoop-core-site.xml file of the Hive and Impala metastore pods, enabling CDW to establish a connection to S3. The following four key configurations must be present in the base cluster core-site.xml file:
- fs.s3a.access.key
- fs.s3a.secret.key
- fs.s3a.endpoint
- fs.s3a.connection.ssl.enabled
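As a rough illustration of the requirement above, the following Python sketch parses a core-site.xml document in the standard Hadoop property layout, collects the fs.s3a.* properties, and reports which of the four required keys are missing. The helper names (s3a_properties, missing_required_keys) and every value in the sample XML are hypothetical placeholders. This is not a Cloudera-provided tool. Treating an empty value as missing is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# The four keys the text says must be present in the base cluster core-site.xml.
REQUIRED_S3A_KEYS = (
    "fs.s3a.access.key",
    "fs.s3a.secret.key",
    "fs.s3a.endpoint",
    "fs.s3a.connection.ssl.enabled",
)

# Hypothetical sample in the standard Hadoop configuration layout.
# All values are placeholders, not real credentials or endpoints.
SAMPLE_CORE_SITE = """\
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://example-nameservice</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>EXAMPLE_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>EXAMPLE_SECRET_KEY</value>
  </property>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.example.internal:9000</value>
  </property>
</configuration>
"""


def s3a_properties(core_site_xml):
    """Return every fs.s3a.* property in the XML text as a name -> value dict."""
    root = ET.fromstring(core_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith("fs.s3a."):
            props[name] = (prop.findtext("value") or "").strip()
    return props


def missing_required_keys(core_site_xml):
    """List the required fs.s3a.* keys that are absent or have an empty value."""
    props = s3a_properties(core_site_xml)
    return [key for key in REQUIRED_S3A_KEYS if not props.get(key)]


if __name__ == "__main__":
    # The sample deliberately omits fs.s3a.connection.ssl.enabled.
    print(sorted(s3a_properties(SAMPLE_CORE_SITE)))
    print(missing_required_keys(SAMPLE_CORE_SITE))
```

To check a real file, read its contents into a string and pass it to missing_required_keys. An empty list means all four keys are present with non-empty values.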
The fs.s3a.* configurations are read-only. You can view the fs.s3a.* configurations from the CONFIGURATION tab on the Database Catalog and Virtual Warehouse details page by selecting the hadoop-core-site.xml option from the Configuration files drop-down menu.
The Third-party S3 providers in private cloud option is enabled by
default. You can disable CDW’s access to S3 by deselecting the Third-party S3
providers in private cloud option from page.