How to Configure AWS Credentials

Minimum Required Role: User Administrator (also provided by Full Administrator)

Amazon S3 (Simple Storage Service) can be used in a CDH cluster managed by Cloudera Manager in the following ways:
  • As storage for Impala tables
  • As a source or destination for HDFS and Hive/Impala replication and for cluster storage
  • To enable Cloudera Navigator to extract metadata from Amazon S3 storage
  • To browse S3 data using Hue

You can use the S3Guard feature to address possible issues with the "eventual consistency" guarantee that Amazon provides for data stored in S3. To use S3Guard, you provision an Amazon DynamoDB database that CDH uses as an additional metadata store to improve performance and to guarantee that your queries return the most current data. See Configuring and Managing S3Guard.

To provide access to Amazon S3, you configure AWS Credentials that specify the authentication type (role-based, for example) and, for access key authentication, the access and secret keys. Amazon offers two types of authentication you can use with Amazon S3:

IAM Role-based Authentication

Amazon Identity and Access Management (IAM) can be used to create users, groups, and roles for use with Amazon Web Services, such as EC2 and Amazon S3. IAM role-based access provides the same level of access to all clients that use the role. All jobs on the cluster will have the same level of access to Amazon S3, so this is better suited for single-user clusters, or where all users of a cluster should have the same privileges to data in Amazon S3.

If you are setting up a peer to copy data to and from Amazon S3, using Cloudera Manager Hive or HDFS replication, select this option.

If you are configuring Amazon S3 access for a cluster deployed to Amazon Elastic Compute Cloud (EC2) instances using the IAM role for the EC2 instance profile, you do not need to configure IAM role-based authentication for services such as Impala, Hive, or Spark.

Access Key Credentials

This type of authentication requires an AWS Access Key and an AWS Secret Key that you obtain from Amazon. It is better suited for environments where you have multiple users or multi-tenancy. You must enable the Sentry service and Kerberos when using the S3 Connector service. Enabling these services allows you to configure selective access for different data paths. (The Sentry service is not required for BDR replication or access by Cloudera Navigator.)

Cloudera Manager stores these values securely and does not store them in world-readable locations. The credentials are masked in the Cloudera Manager Admin console, encrypted in the configurations passed to processes managed by Cloudera Manager, and redacted from the logs.
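
The idea of redacting credentials from logs can be pictured with a minimal sketch. This is not Cloudera Manager's implementation. The list of sensitive key names and the mask string are assumptions chosen for illustration, and `redact` is a hypothetical helper.

```python
import re

# Assumption: these key names are illustrative examples of settings whose
# values should never appear in a log. They are not Cloudera Manager's rules.
SENSITIVE_KEYS = (
    "fs.s3a.access.key",
    "fs.s3a.secret.key",
    "aws_access_key",
    "aws_secret_key",
)

# Match "<key> = <value>" or "<key>: <value>", capturing each part by name.
_PATTERN = re.compile(
    r"(?P<key>" + "|".join(re.escape(k) for k in SENSITIVE_KEYS) + r")"
    r"(?P<sep>\s*[=:]\s*)"
    r"(?P<value>\S+)"
)


def redact(line: str, mask: str = "********") -> str:
    """Return the line with the value after any sensitive key replaced by the mask."""
    return _PATTERN.sub(lambda m: m.group("key") + m.group("sep") + mask, line)


if __name__ == "__main__":
    print(redact("Starting role with fs.s3a.secret.key=EXAMPLE-SECRET retries=3"))
```

The key name and separator are kept so the log line still shows which setting was present, while the value itself is dropped.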

For more information about Amazon S3, see the Amazon S3 documentation.

The client configuration files generated by Cloudera Manager based on configured services do not include AWS credentials. These clients must manage access to these credentials outside of Cloudera Manager. Cloudera Manager uses credentials stored in Cloudera Manager for trusted clients such as the Impala daemon and Hue. For access from YARN, MapReduce or Spark, see Using S3 Credentials with YARN, MapReduce, or Spark.
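
For clients that manage credentials outside of Cloudera Manager, one possible approach is to supply the Hadoop S3A properties `fs.s3a.access.key` and `fs.s3a.secret.key` in a job-specific configuration. The sketch below assumes the keys are available in the standard AWS environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. The helper names `s3a_properties` and `as_hadoop_xml` are hypothetical. See Using S3 Credentials with YARN, MapReduce, or Spark for the supported options.

```python
import os
import xml.etree.ElementTree as ET


def s3a_properties(env=os.environ):
    """Read the AWS keys from the environment and map them to S3A property names."""
    access = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if not access or not secret:
        raise KeyError("AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be set")
    return {"fs.s3a.access.key": access, "fs.s3a.secret.key": secret}


def as_hadoop_xml(props):
    """Render a dict of properties as a Hadoop <configuration> XML fragment."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values only; real keys should never be hard-coded.
    example_env = {
        "AWS_ACCESS_KEY_ID": "EXAMPLE-ID",
        "AWS_SECRET_ACCESS_KEY": "EXAMPLE-SECRET",
    }
    print(as_hadoop_xml(s3a_properties(example_env)))
```

If you write the fragment to a file, restrict its permissions to the user who runs the job, because the secret key appears in it as plain text.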

Adding AWS Credentials

Minimum Required Role: User Administrator (also provided by Full Administrator)

To add AWS Credentials for Amazon S3:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > External Accounts.
  3. Select the AWS Credentials tab.
  4. Select one of the following:
    • Add Access Key Credentials

      This authentication mechanism requires you to obtain AWS credentials from Amazon.

      1. Enter a Name of your choosing for this account.
      2. Enter the AWS Access Key ID.
      3. Enter the AWS Secret Key.
    • Add IAM Role-Based Authentication
      1. Enter a name for your IAM Role-based authentication.
  5. Click Add.

    The Edit S3Guard dialog box displays.

    S3Guard enables a consistent view of data stored in Amazon S3 and requires that you provision a DynamoDB database from Amazon Web Services. S3Guard is optional but can help improve performance and accuracy for certain types of workflows. To configure S3Guard, see Configuring and Managing S3Guard and return to these steps after completing the configuration.

    If you do not want to enable S3Guard, click Save to finish adding the AWS Credential.

    The Connect to Amazon Web Services dialog box displays.

  6. Choose one of the following options:
    • Cloud Backup and Restore

      To configure Amazon S3 as the source or destination of a replication schedule (to back up and restore data, for example), click the Replication Schedules link. See Data Replication for details.

    • Cluster Access to S3

      To enable cluster access to S3 using the S3 Connector Service, click the Enable for Cluster Name link, which launches a wizard for adding the S3 Connector service. See Adding the S3 Connector Service for details.

    • Cloudera Navigator Access to S3

      To give Cloudera Navigator access to Amazon S3, click the Enable for Cloudera Navigator link. Restart the Cloudera Navigator Metadata Server to enable access.
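
If you prefer to script the steps above, the Cloudera Manager REST API exposes external accounts. The following sketch only builds the JSON request body and sends nothing. The field names (`name`, `displayName`, `typeName`, `accountConfigs`), the type names `AWS_ACCESS_KEY_AUTH` and `AWS_IAM_ROLES_AUTH`, and the config names `aws_access_key` and `aws_secret_key` are assumptions that you should verify against the API documentation for your Cloudera Manager version. `external_account_payload` is a hypothetical helper.

```python
import json


def external_account_payload(name, access_key_id=None, secret_key=None):
    """Build a request body for an AWS external account.

    With no keys, the payload describes IAM role-based authentication.
    With both keys, it describes access key authentication.
    All field, type, and config names are assumptions (see the lead-in).
    """
    if access_key_id is None and secret_key is None:
        return {
            "name": name,
            "displayName": name,
            "typeName": "AWS_IAM_ROLES_AUTH",
            "accountConfigs": {"items": []},
        }
    if not access_key_id or not secret_key:
        raise ValueError("provide both the access key ID and the secret key, or neither")
    return {
        "name": name,
        "displayName": name,
        "typeName": "AWS_ACCESS_KEY_AUTH",
        "accountConfigs": {
            "items": [
                {"name": "aws_access_key", "value": access_key_id},
                {"name": "aws_secret_key", "value": secret_key},
            ]
        },
    }


if __name__ == "__main__":
    # Placeholder values only; do not print real secrets.
    print(json.dumps(external_account_payload("s3-backup", "EXAMPLE-ID", "EXAMPLE-SECRET"), indent=2))
```

The body would then be sent over TLS, authenticated as a user with the User Administrator role, to the external accounts endpoint of your Cloudera Manager API version. The S3Guard and service connectivity choices in steps 5 and 6 are not covered by this sketch.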

Managing AWS Credentials

To remove AWS credentials:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > External Accounts.
  3. Select the AWS Credentials tab.
  4. Locate the row with the credentials you want to delete and click Actions > Remove.

To edit AWS Access Key credentials:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > External Accounts.
  3. Select the AWS Credentials tab.
  4. Locate the row with the Access Key Credentials you want to edit and click Actions > Edit Credential.

    The Edit Credential dialog box displays.

  5. Edit the account fields.
  6. Click Save.
  7. Restart cluster services that use these credentials. If connectivity is for Cloudera Navigator, restart the Cloudera Navigator Metadata Server.

To rename the IAM Role-Based Authentication:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > External Accounts.
  3. Select the AWS Credentials tab.
  4. Locate the row with the IAM Role-Based Authentication you want to rename and click Actions > Rename.
  5. Enter a new name.
  6. Click Save.

    The Connect to Amazon Web Services screen displays.

  7. Click the links to change any service connections or click Close to leave them unchanged.

To edit the services connected to an AWS Credentials account:
  1. Open the Cloudera Manager Admin Console.
  2. Click Administration > External Accounts.
  3. Select the AWS Credentials tab.
  4. Locate the row with the credentials you want to edit and click Actions > Edit Connectivity.

    The Connect to Amazon Web Services screen displays.

  5. Click one of the following options:
    • Cloud Backup and Restore

      To configure Amazon S3 as the source or destination of a replication schedule (to back up and restore data, for example), click the Replication Schedules link. See Data Replication for details.

    • Cluster Access to S3

      To enable cluster access to S3 using the S3 Connector Service, click the Enable for Cluster Name link, which launches a wizard for adding the S3 Connector service. See Adding the S3 Connector Service for details.

    • Cloudera Navigator Access to S3

      To give Cloudera Navigator access to Amazon S3, click the Enable for Cloudera Navigator link. Restart the Cloudera Navigator Metadata Server to enable access.