Configuring S3Guard in Cloudera Manager
You must set permissions to restrict access to S3Guard tables.
After you have created the DynamoDB access policy, set the following S3Guard configuration parameters for each S3 bucket that you want to "guard".
- In the Cloudera Manager UI, navigate to Clusters > HDFS > Configuration > Advanced > Add property under Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml.
- Set the configuration parameters listed in the table below for each bucket that you want to "guard". To configure S3Guard for a specific bucket, replace the fs.s3a. prefix with fs.s3a.bucket.<bucketname>., where <bucketname> is the name of your bucket.
- After you have added the configuration parameters, click Save to save the configuration changes.
- Restart all affected components.
Base Parameter | Default Value | Setting for S3Guard |
---|---|---|
fs.s3a.metadatastore.impl | org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore | Set this to “org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore”. |
fs.s3a.s3guard.ddb.table.create | false | Set this to “true” to automatically create the DynamoDB table. |
fs.s3a.s3guard.ddb.table | (Unset) | Enter a name for the table that will be created in DynamoDB for S3Guard. If you leave this blank while setting fs.s3a.s3guard.ddb.table.create to “true”, a separate DynamoDB table is created for each accessed bucket, with the respective S3 bucket name used as the DynamoDB table name. |
fs.s3a.s3guard.ddb.region | (Unset) | Set this parameter to one of the AWS region values. Refer to the Amazon documentation and use the value from its “region” column. If you leave this blank, the region of the S3 bucket is used. |
fs.s3a.s3guard.ddb.table.capacity.read | 500 | Specify the read capacity for DynamoDB or use the default. You can monitor the DynamoDB workload in the DynamoDB console on the AWS portal and adjust the read and write capacities on the fly based on the workload requirements. |
fs.s3a.s3guard.ddb.table.capacity.write | 100 | Specify the write capacity for DynamoDB or use the default. You can monitor the DynamoDB workload in the DynamoDB console on the AWS portal and adjust the read and write capacities on the fly based on the workload requirements. |
Example
<property>
  <name>fs.s3a.bucket.guarded-table.metadatastore.impl</name>
  <value>org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore</value>
</property>
<property>
  <name>fs.s3a.bucket.guarded-table.s3guard.ddb.table</name>
  <value>my-table</value>
</property>
<property>
  <name>fs.s3a.bucket.guarded-table.s3guard.ddb.table.create</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.bucket.guarded-table.s3guard.ddb.region</name>
  <value>eu-west-1</value>
</property>
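The example above leaves the DynamoDB capacities at their defaults. As a minimal sketch, the two capacity parameters from the table could be set for the same bucket as shown below, assuming the same per-bucket prefix rule applies to them. The values are simply the defaults listed in the table, used here as placeholders. Appropriate values depend on the workload you observe in the DynamoDB console.

```xml
<!-- Sketch only: optional capacity settings for the bucket "guarded-table". -->
<!-- 500 and 100 are the defaults from the table above, not recommendations. -->
<property>
  <name>fs.s3a.bucket.guarded-table.s3guard.ddb.table.capacity.read</name>
  <value>500</value>
</property>
<property>
  <name>fs.s3a.bucket.guarded-table.s3guard.ddb.table.capacity.write</name>
  <value>100</value>
</property>
```

If these properties are omitted, the defaults from the table apply.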