Setting up your Cloudera Streaming Analytics cluster

Before creating your cluster

Before you start creating your Cloudera Streaming Analytics cluster in Cloudera Data Hub, ensure that the environment is set up properly and that you have all the access required to use Cloudera on cloud.

  • You have Cloudera login credentials.
  • You have an available Cloudera environment.
  • You have a running Data Lake.
  • You have a Cloudera user with the predefined EnvironmentAdmin resource role.
  • Your Cloudera user is synchronized to the Cloudera on cloud environment.

As an administrator, you need to grant users or groups permission to access and perform tasks in your Cloudera Data Hub environment.

  1. Navigate to Cloudera Management Console > Environments and select your environment.
  2. Click Actions > Manage Access.
  3. Search for a user or group that needs access to the environment.
  4. Select EnvironmentUser role from the list of Resource Roles.
  5. Click Update Roles.
    The Resource Role for the selected user or group will be updated.
  6. Navigate to Cloudera Management Console > Environments, and select the environment where you want to create a cluster.
  7. Click Actions > Synchronize Users to FreeIPA.
  8. Click Synchronize Users.
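
The role assignment and user synchronization above can also be scripted. The following is a minimal sketch, assuming the CDP CLI is installed and configured; it only builds the command lines and does not run them. The helper names and the placeholder CRNs are hypothetical, and the subcommands and flags should be checked against the CDP CLI reference for your version before use.

```python
import shlex


def assign_resource_role_cmd(user_crn, role_crn, environment_crn):
    """Build the CLI call that grants a resource role on an environment.

    role_crn is the CRN of the EnvironmentUser resource role; it is assumed
    to be looked up beforehand, for example with `cdp iam list-resource-roles`.
    """
    return [
        "cdp", "iam", "assign-user-resource-role",
        "--user", user_crn,
        "--resource-role-crn", role_crn,
        "--resource-crn", environment_crn,
    ]


def sync_users_cmd(environment_name):
    """Build the CLI call that synchronizes users to FreeIPA for one environment."""
    return [
        "cdp", "environments", "sync-all-users",
        "--environment-names", environment_name,
    ]


if __name__ == "__main__":
    # All values below are placeholders, not real identifiers.
    commands = [
        assign_resource_role_cmd("<user-crn>", "<environment-user-role-crn>", "<environment-crn>"),
        sync_users_cmd("<environment-name>"),
    ]
    for cmd in commands:
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps quoting explicit and lets you review each command before running it.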

As an administrator, you must create an IDBroker mapping for a user or group to have access to the S3 cloud storage. As a part of Knox, IDBroker allows a user to exchange cluster authentication for temporary cloud credentials.

The following roles are created when registering the Cloudera environment:
  • idbroker-role: granting permissions to IDBroker instances associated with the Cloudera environment
  • datalake-admin-role: granting access to Cloudera cloud resources
  • logs-role: granting access to the logs storage location
To use Cloudera Streaming Analytics in Cloudera on cloud, make sure that the users who run Flink jobs are mapped to the ARN of the datalake-admin-role, because this role grants access to the cloud resources required to run the Flink service.
  1. Navigate to Cloudera Management Console > Environments and select your environment.
  2. Click Actions > Manage Access.
  3. Click on the IDBroker Mappings tab.
  4. Click Edit to add a new user or group and assign roles that grant write access to the cloud storage.
  5. Search for the user or group you need to map.
  6. Go to the IAM Summary page where you can find information about your cloud storage account.
  7. Copy the Role ARN.
  8. Go back to the IDBroker Mapping interface on the Cloudera Management Console page.
  9. Paste the Role ARN into the field for your selected user or group.
  10. Click Save and Sync.
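
Because the Role ARN is copied by hand between two consoles, it can help to check its shape before saving the mapping. The following is a minimal sketch, assuming the standard AWS IAM role ARN layout (`arn:<partition>:iam::<12-digit account ID>:role/<role name>`). The function name is hypothetical, and the check only validates the format; it does not confirm that the role exists or that it is the datalake-admin-role.

```python
import re

# Shape of an IAM role ARN: partition, empty region, 12-digit account ID,
# then "role/" followed by an optional path and the role name.
ROLE_ARN_PATTERN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+"
)


def is_valid_role_arn(arn):
    """Return True if the text looks like an IAM role ARN.

    Surrounding whitespace is ignored, since it is easy to pick up
    when copying from a web page.
    """
    return ROLE_ARN_PATTERN.fullmatch(arn.strip()) is not None


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not a real one.
    candidate = "arn:aws:iam::123456789012:role/datalake-admin-role"
    print(candidate, "->", is_valid_role_arn(candidate))
```

A failed check usually points to a truncated copy or to an ARN of a different resource type, such as a policy or a bucket.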

As a user, you need to set a workload password for your EnvironmentUser account so that you can access the Cloudera SQL Stream Builder nodes over SSH.

  1. Navigate to Cloudera Management Console > Environments and select your environment.
  2. Click Actions > Manage Access.
  3. Click Workload Password.
  4. Enter a workload password for your user.
  5. Confirm the password by typing it again.
  6. Click Set Workload Password.
