Supported object storage services for Cloudera Data Warehouse Private Cloud
HDFS is the default storage system for Cloudera Data Warehouse (CDW). However, you can enable CDW to access object storage such as AWS S3 and Azure Data Lake Storage (ADLS Gen1 and Gen2) if the CDP Private Cloud base cluster is configured to access it. You can query Hive and Impala tables stored on object stores using Hue.
When you activate an environment in CDW, all the Hadoop configuration variables (fs.s3a.* and fs.azure.*) are copied from the core-site.xml file on the base cluster to the hadoop-core-site.xml file of the Hive and Impala metastore pods, enabling CDW to connect to S3 or ADLS.
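
The prefix-based selection described above can be illustrated with a short Python sketch. This is not how CDW performs the copy internally; it only shows which entries of a core-site.xml file match the fs.s3a.* and fs.azure.* patterns. The sample XML, its placeholder values, and the function name object_store_properties are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a base cluster core-site.xml; all values are placeholders.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://ns1</value></property>
  <property><name>fs.s3a.endpoint</name><value>s3.example.com</value></property>
  <property><name>fs.s3a.connection.ssl.enabled</name><value>true</value></property>
  <property><name>fs.azure.account.oauth2.client.id</name><value>EXAMPLE_CLIENT_ID</value></property>
</configuration>
"""

# Prefixes corresponding to the fs.s3a.* and fs.azure.* patterns.
OBJECT_STORE_PREFIXES = ("fs.s3a.", "fs.azure.")


def object_store_properties(core_site_xml):
    """Return a dict of the fs.s3a.* and fs.azure.* properties in a core-site.xml string."""
    root = ET.fromstring(core_site_xml)
    selected = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(OBJECT_STORE_PREFIXES):
            selected[name] = (prop.findtext("value") or "").strip()
    return selected


if __name__ == "__main__":
    # Lists only the object storage entries; fs.defaultFS is skipped.
    for name, value in object_store_properties(SAMPLE_CORE_SITE).items():
        print(f"{name} = {value}")
```

Running a check like this against the base cluster's core-site.xml before activating an environment gives a quick preview of the entries you can expect to see in hadoop-core-site.xml.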
core-site.xml properties for connecting to S3 or S3-compatible storage providers:
- fs.s3a.access.key
- fs.s3a.secret.key
- fs.s3a.endpoint
- fs.s3a.connection.ssl.enabled
core-site.xml properties for connecting to the ADLS storage provider:
- fs.azure.account.oauth.provider.type
- fs.azure.account.oauth2.client.id
- fs.azure.account.oauth2.client.secret
- fs.azure.account.oauth2.client.endpoint
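
As a rough pre-activation check, the Python sketch below reports which of the properties listed above are absent or empty in a core-site.xml file. Treating the listed properties as the set to look for is an assumption of this sketch, as are the sample XML, its placeholder values, and the names LISTED_PROPERTIES and missing_properties; none of them are part of CDW.

```python
import xml.etree.ElementTree as ET

# Property names taken from the S3 and ADLS lists above.
LISTED_PROPERTIES = {
    "s3": [
        "fs.s3a.access.key",
        "fs.s3a.secret.key",
        "fs.s3a.endpoint",
        "fs.s3a.connection.ssl.enabled",
    ],
    "adls": [
        "fs.azure.account.oauth.provider.type",
        "fs.azure.account.oauth2.client.id",
        "fs.azure.account.oauth2.client.secret",
        "fs.azure.account.oauth2.client.endpoint",
    ],
}

# Hypothetical, deliberately incomplete core-site.xml; values are placeholders.
PARTIAL_CORE_SITE = """<configuration>
  <property><name>fs.s3a.access.key</name><value>EXAMPLE_ACCESS_KEY</value></property>
  <property><name>fs.s3a.endpoint</name><value>s3.example.com</value></property>
</configuration>
"""


def missing_properties(core_site_xml, provider):
    """Return the listed properties for `provider` that are absent or have an empty value."""
    root = ET.fromstring(core_site_xml)
    # Collect the names of all properties that carry a non-empty value.
    present = {
        (prop.findtext("name") or "").strip()
        for prop in root.iter("property")
        if (prop.findtext("value") or "").strip()
    }
    return [name for name in LISTED_PROPERTIES[provider] if name not in present]


if __name__ == "__main__":
    for provider in ("s3", "adls"):
        print(provider, "missing:", missing_properties(PARTIAL_CORE_SITE, provider))
```

Because the copied configurations are read-only in CDW, any property this check reports as missing would need to be added on the base cluster.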
The fs.s3a.* and fs.azure.* configurations are read-only. You can view these configurations from the CONFIGURATION tab on the Database Catalog and Virtual Warehouse details pages by selecting the hadoop-core-site.xml option from the Configuration files drop-down menu.