Configuring S3Guard in Cloudera Manager

You must set permission to restrict access to S3Guard tables.

After you have created DynamoDB access policy, set the following S3Guard configuration parameters for each S3 bucket that you want to "guard".

  1. In Cloudera Manager UI, navigate to Clusters > HDFS > Configuration > Advanced > Add property under Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml.
  2. Set the following configuration parameters for each bucket that you want to "guard". To configure S3Guard for a specific bucket, replace fs.s3a. with the fs.s3a.bucket.<bucketname>. where "bucketname" is the name of your bucket.
  3. After you've added the configuration parameters, click Save to save the configuration changes.
  4. Restart all affected components.
Table 1. S3Guard Configuration Parameters
Base Parameter Default Value Setting for S3Guard
fs.s3a.metadatastore.impl org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore Set this to “org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore”.
fs.s3a.s3guard.ddb.table.create false Set this to “true” to automatically create the DynamoDB table.
fs.s3a.s3guard.ddb.table (Unset)

Enter a name for the table that will be created in DynamoDB for S3Guard.

If you leave this blank while setting fs.s3a.s3guard.ddb.table.create to “true”, a separate DynamoDB table will be created for each accessed bucket. For each bucket, the respective S3 bucket name being used as the DynamoDB table name.

fs.s3a.s3guard.ddb.region (Unset)

Set this parameter to one of the values from AWS. Refer to Amazon documentation. The “region” column value needs to be set as this parameter value.

If you leave this blank, the same region as where the S3 bucket is will be used.

fs.s3a.s3guard.ddb.table.capacity.read 500 Specify read capacity for DynamoDB or use the default. You can monitor the DynamoDB workload in the DynamoDB console on AWS portal and adjust the read/write capacities on the fly based on the workload requirements.
fs.s3a.s3guard.ddb.table.capacity.write 100 Specify write capacity for DynamoDB or use the default. You can monitor the DynamoDB workload in the DynamoDB console on AWS portal and adjust the read/write capacities on the fly based on the workload requirements.

Example

Adding the following custom properties will create a DynamoDB table called “guarded-table” in the “eu-west-1” region (where the "guarded-table" bucket is located). The configuration will be valid for a bucket called "guarded-table", which means that “guarded-table” will only be used for storing metadata related to this bucket.
<property> 
<name>fs.s3a.bucket.guarded-table.metadatastore.impl</name> 
<value>org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore</value> 
</property> 

<property> 
<name>fs.s3a.bucket.guarded-table.s3guard.ddb.table</name> 
<value>my-table</value> 
</property> 

<property> 
<name>fs.s3a.bucket.guarded-table.s3guard.ddb.table.create</name> 
<value>true</value> 
</property> 

<property> 
<name>fs.s3a.bucket.guarded-table.s3guard.ddb.region</name> 
<value>eu-west-1</value> 
</property>