Configure secure S3A access for Apache Ozone in Cloudera Data Warehouse.
A platform administrator must complete the credential registration process by
following the step-by-step instructions in Managing S3-Compatible
Credentials.
The S3 Bucket Name specified during credential creation
must exactly match the bucket name used in your SQL queries, such as
s3a://<bucket>/. Cloudera Data Warehouse
uses this exact string to construct and inject the required Hadoop configuration
property keys.
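The matching rule above can be checked before you run a query. The following is a minimal sketch, not part of Cloudera Data Warehouse: the helper names bucket_from_s3a_uri and matches_registered_bucket are hypothetical, and the sketch assumes that "exact" means plain, case-sensitive string equality.

```python
from urllib.parse import urlparse


def bucket_from_s3a_uri(uri: str) -> str:
    """Return the bucket component of an s3a:// URI (hypothetical helper)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3a":
        raise ValueError(f"not an s3a URI: {uri!r}")
    # For s3a://<bucket>/path, the bucket is the authority part of the URI.
    return parsed.netloc


def matches_registered_bucket(uri: str, registered_bucket: str) -> bool:
    """Compare the URI bucket with the registered S3 Bucket Name.

    Assumption: the comparison is plain, case-sensitive string equality.
    """
    return bucket_from_s3a_uri(uri) == registered_bucket


if __name__ == "__main__":
    # "sales-data" is an illustrative bucket name, not a real one.
    print(matches_registered_bucket("s3a://sales-data/path/to/data", "sales-data"))
```

A mismatch here suggests that the per-bucket property keys would be built for a different bucket name than the one your query references.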
Automated credential delivery and configuration: Whenever you create or update
a Database Catalog or Virtual Warehouse, Cloudera Data Warehouse automatically
delivers and applies the credentials. Manual configuration
is not required. During this process, the system performs the following
actions:
Cloudera Data Warehouse
contacts the environment service to get the list of your configured
Ozone S3 accounts.
The system securely reads the required access keys and secret keys
directly from HashiCorp Vault.
Cloudera Manager automatically
builds a secure JCEKS keystore and applies the following credential
properties to your Virtual Warehouse pods:
fs.s3a.bucket.<bucket>.access.key
fs.s3a.bucket.<bucket>.secret.key
Cloudera Manager updates the
core-site.xml file for all relevant query
engines (Hive Metastore, HiveServer2, Impala, and Trino) with these
essential routing properties:
Path-style access: fs.s3a.bucket.<bucket>.path.style.access =
true (required for Apache Ozone routing)
Region: fs.s3a.bucket.<bucket>.endpoint.region
(applied only if a region is specified)
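To make the naming pattern concrete, the sketch below derives the per-bucket property keys listed above from a bucket name. It is an illustration only, not the implementation used by Cloudera Data Warehouse or Cloudera Manager: the function name per_bucket_s3a_properties is hypothetical, and the credential values are placeholders because the real keys are stored in the JCEKS keystore.

```python
from typing import Dict, Optional


def per_bucket_s3a_properties(bucket: str, region: Optional[str] = None) -> Dict[str, str]:
    """Build the per-bucket S3A property keys for one bucket (hypothetical helper)."""
    prefix = f"fs.s3a.bucket.{bucket}"
    props = {
        # Placeholders: real values live in the JCEKS keystore, not in plain text.
        f"{prefix}.access.key": "<from keystore>",
        f"{prefix}.secret.key": "<from keystore>",
        # Path-style access is required for Apache Ozone routing.
        f"{prefix}.path.style.access": "true",
    }
    # The region property is added only when a region is specified.
    if region:
        props[f"{prefix}.endpoint.region"] = region
    return props


if __name__ == "__main__":
    # "sales-data" and "us-east-1" are illustrative values.
    for key, value in per_bucket_s3a_properties("sales-data", "us-east-1").items():
        print(f"{key} = {value}")
```

Because every key embeds the bucket name, a bucket registered under one name yields settings that do not apply to a URI that uses a different name.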
Virtual Warehouse data access execution: After
the Virtual Warehouse pods are deployed, the analytic query engines have
transparent access to the Apache Ozone storage layer, so you can query your
data immediately.
Run queries directly against your data locations using
standard S3A URI formats, such as
s3a://<bucket>/path/to/data.
Run your queries through Hive, Impala, or Trino without any additional
setup. Because the credentials are preconfigured in the pods, you do
not need to apply session-level parameters or make manual configuration
changes.