After creating your cluster

As an EnvironmentAdmin, you need to give users access to your environment and to the Streaming Analytics cluster by assigning user roles, adding users to Ranger policies, and creating IDBroker mappings.

The cluster you have created using the Streaming Analytics cluster definition is kerberized and secured with SSL. Users can access cluster UIs and endpoints through a secure gateway powered by Apache Knox. Before you can use Flink and SQL Stream Builder, you must provide users access to the Streaming Analytics cluster components.

Retrieving keytab file

As a user, you need to retrieve the keytab file of your profile and upload it to the Streaming SQL Console to be able to run SQL jobs.

  1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
  2. Click on your profile name.
  3. Click Profile.
  4. Click Actions > Get Keytab.
  5. Choose the environment where your Data Hub cluster is running.
  6. Click Download.
  7. Save the keytab file in a chosen location.
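
After downloading the keytab file, you can optionally inspect it from a terminal before uploading it. The following is a minimal check; <keytab_filename> is a placeholder for the name of your downloaded file:

  # List the principals and key version numbers stored in the keytab
  klist -kt <keytab_filename>.keytab

The listed principal should match your workload username.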

Uploading or unlocking your keytab

When accessing the Streaming SQL Console for the first time in Data Hub, you must upload and unlock the keytab file corresponding to your profile before you can use SQL Stream Builder (SSB).

  1. Navigate to the Streaming SQL Console.
    1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
    2. Select the Streaming Analytics cluster from the list of Data Hub clusters.
    3. Select Streaming SQL Console from the list of services.
      The Streaming SQL Console opens in a new window.
  2. Click your username on the sidebar of the Streaming SQL Console.
  3. Click Manage keytab.
    The Keytab Manager window appears.
    You can either unlock the keytab that already exists on the cluster, or directly upload your keytab file to SQL Stream Builder.
    1. Unlock your keytab by providing the Principal Name and Password, and clicking Unlock Keytab. The Principal Name and Password should be the same as the workload username and password set for the Streaming Analytics cluster.
    2. Upload your keytab by clicking on the Upload tab, uploading the keytab file directly to the Console, and clicking Unlock Keytab.
    If an error occurs when unlocking your keytab, you can get more information about the issue with the following steps:
    1. Retrieve your keytab file.
      1. Click on your profile name in the Management Console.
      2. Click Profile.
      3. Click Actions > Get Keytab.
      4. Choose the environment where your Data Hub cluster is running.
      5. Click Download.
      6. Save the keytab file in a chosen location.
    2. Manually upload your keytab to the Streaming Analytics cluster:
      scp <location>/<your_keytab_file> <workload_username>@<manager_node_FQDN>:.
      Password: <your_workload_password>
    3. Access the manager node of your Streaming Analytics cluster:
      ssh <workload_username>@<manager_node_FQDN>
      Password: <workload_password>
    4. Use kinit command to authenticate your user:
      kinit -kt <keytab_filename>.keytab <workload_username>
    5. Use the flink-yarn-session command to see if the authentication works properly:
      flink-yarn-session -d \
        -D security.kerberos.login.keytab=<keytab_filename>.keytab \
        -D security.kerberos.login.principal=<workload_username>
    If the command fails, you can review the log file for further information about the issue.
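    If log aggregation is enabled on the cluster, you can fetch the log of the failed attempt with the YARN CLI. The following is a sketch; <application_id> is a placeholder for the application ID printed by the failed flink-yarn-session submission:
      # Retrieve the aggregated logs of the failed Flink session
      yarn logs -applicationId <application_id>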

Configuring Ranger policies for Flink and SSB

You must add your workload username and the SQL Stream Builder (SSB) service user to the Ranger policies required for Kafka, Schema Registry, Hive, and Kudu. These policies provide access to the topics, schemas, and tables used by the components, and allow you to execute Flink jobs.

You need to provide access for users and the SSB service by configuring Ranger policies for the Kafka data source and for the Schema Registry, Kudu, and Hive catalog services. To be able to use Flink, you need to add the workload user or users to the required policies. For SSB, the ssb service user needs to be added to the same policies.

When adding more workload users, instead of adding them one by one, you can create user groups in Ranger, for example a flink_users group. This way you can add every user who will use the Streaming Analytics cluster to that group, and add only the group to the Ranger policies.
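
If you prefer to script this step, you can also create the group through the Ranger REST API. The following is a minimal sketch, not a verified recipe: the endpoint, port, and JSON payload are assumptions about the Ranger user/group API, so check them against your Ranger version. <ranger_host>, <admin_user>, and <admin_password> are placeholders:

  # Create a flink_users group through the Ranger user/group REST endpoint (assumed)
  curl -u <admin_user>:<admin_password> \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"name": "flink_users", "description": "Streaming Analytics users"}' \
    https://<ranger_host>:6182/service/xusers/groups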

  1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
  2. Click Ranger from the Quick Links or select Data Lake > Ranger from Services.
    You are redirected to the Ranger Admin Web user interface (UI) where you can add the workload user and SSB service user to the required policies to grant access for Flink and SSB.

Configuring Kafka policies

After accessing the Ranger Admin Web UI, the workload username or user group, and the SSB service user need to be added to the Kafka policies to be able to use Flink and SSB.

The following resource-based policies need to be configured for the Kafka service:
  • all - topic: Provides access to all topics for users
  • all - consumergroup: Provides access to all consumer groups for users
  • all - cluster: Provides access to all clusters to users
  • all - transactionalid: Provides transactionalid access to users
  • all - delegationtoken: Provides delegationtoken access to users
You need to ensure that the required workload username or user group, and the ssb service user are added to the policies of the Kafka service and to the policies of the created Streams Messaging and Streaming Analytics clusters.
  1. Click cm_kafka under Kafka service on the Service Manager page.
    You are redirected to the cm_kafka Policies page.
  2. Click on the edit button of the all-consumergroup policy.
  3. Add the user group to the Select Group field under the Allow Conditions settings.
    • Alternatively, you can also add the workload usernames to the Select User field under the Allow Conditions setting.
  4. Add the ssb service user to the Select User field under the Allow Conditions setting, if it is not configured by default.
  5. Click Save.
    You are redirected to the list of Kafka policies page.
  6. Click on + More… to check if the needed workload user is listed under the Users for the edited policy.
    Repeat the same steps for the remaining Kafka policies based on the required authorization level:
    • all - topic
    • all - transactionalid
    • all - cluster
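
Once the Kafka policies are saved, you can verify access from a cluster node by reading a topic with the Kafka console consumer. This is a minimal sketch; <broker_FQDN> and <topic_name> are placeholders, and client.properties is a file you create with the security settings of the kerberized, SSL-secured brokers:

  # client.properties should contain at least:
  #   security.protocol=SASL_SSL
  #   sasl.kerberos.service.name=kafka
  # kinit first, then read records from the topic
  kafka-console-consumer --bootstrap-server <broker_FQDN>:9093 \
    --topic <topic_name> --from-beginning \
    --consumer.config client.properties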

Configuring Schema Registry policies

After accessing the Ranger Admin Web UI, the workload username or user group, and the SSB service user need to be added to the Schema Registry policies to be able to use Flink and SSB.

The following resource-based policies need to be configured for the Schema Registry service:
  • all - export-import: Provides import and export access for users.
  • all - serde: Provides access to store metadata for the format of how the data should be read and written.
  • all - schema-group, schema-metadata: Provides access to the schema groups and schema metadata.
  • all - schema-group, schema-metadata, schema-branch: Provides access to the schema groups, schema metadata, and schema branch.
  • all - schema-group, schema-metadata, schema-branch, schema-version: Provides access to schema groups, schema metadata, schema branch, and schema version.
  • all - registry-service: Provides access to the Schema Registry service; with this access, the user can access all Schema Registry entities.
You need to ensure that the required workload username or user group, and the ssb service user are added to the policies of the created Streams Messaging clusters.
  1. Select your Streams Messaging cluster under the Schema Registry service on the Service Manager page.
    You are redirected to the list of Schema Registry policies page.
  2. Click on the edit button of the all-schema-group, schema-metadata, schema-branch, schema-version policy.
  3. Add the user group to the Select Group field under the Allow Conditions settings.
    • Alternatively, you can also add the workload usernames to the Select User field under the Allow Conditions setting.
  4. Add the ssb service user to the Select User field under the Allow Conditions setting, if it is not configured by default.
  5. Click Save.
    You are redirected to the list of Schema Registry policies page.
  6. Click on + More… to check if the needed workload user is listed under the Users for the edited policy.
    Repeat the same steps for the remaining Schema Registry policies based on the required authorization level:
    • all - export-import
    • all - serde
    • all - schema-group, schema-metadata
    • all - schema-group, schema-metadata, schema-branch
    • all - registry-service
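
Once the Schema Registry policies are saved, you can verify access with a Kerberos-authenticated REST call from a cluster node. This is a minimal sketch and assumes SPNEGO authentication is enabled; <schema_registry_FQDN> is a placeholder, and the SSL port (7790 here) can differ in your cluster:

  # kinit first, then list the schemas visible to the authenticated user
  curl --negotiate -u : \
    "https://<schema_registry_FQDN>:7790/api/v1/schemaregistry/schemas"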

Configuring Hive policies

After accessing the Ranger Admin Web UI, the workload username or user group, and the SSB service user need to be added to the Hadoop SQL policies to be able to use Flink and SSB with Hive.

The following resource-based policies need to be configured for the Hive service:
  • all - global: Provides global access to users.
  • all - database, table, column: Provides access to all databases, tables, and columns for users.
  • all - database, table: Provides access to all databases and tables for users.
  • all - database: Provides access to all databases for users.
  • all - hiveservice: Provides hiveservice access to users.
  • all - database, udf: Provides database and udf access to users.
  • all - url: Provides URL access to users.
You need to ensure that the required workload username or user group, and the ssb service user are added to the policies of the Hive service and the created Operational Database clusters.
  1. Click Hadoop SQL under Hadoop SQL service on the Service Manager page.
    You are redirected to the list of Hadoop SQL policies page.
  2. Click on the edit button of the all-global policy.
  3. Add the user group to the Select Group field under the Allow Conditions settings.
    • Alternatively, you can also add the workload usernames to the Select User field under the Allow Conditions setting.
  4. Add the ssb service user to the Select User field under the Allow Conditions setting, if it is not configured by default.
  5. Click Save.
    You are redirected to the list of Hadoop SQL policies page.
  6. Click on + More… to check if the needed workload user and the ssb service user are listed under the Users for the edited policy.
    Repeat the same steps for the remaining Hadoop SQL policies based on the required authorization level:
    • all-database, table, column
    • all-database, table
    • all-database
    • all-database, udf
    • all-hiveservice
    • all-url
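
Once the Hadoop SQL policies are saved, you can verify Hive access by connecting with Beeline from a cluster node. This is a minimal sketch; the HiveServer2 host, port, and Kerberos principal in the JDBC URL are placeholders that depend on your cluster:

  # kinit first, then connect to HiveServer2 and list the visible databases
  beeline -u "jdbc:hive2://<hiveserver2_FQDN>:10000/default;principal=hive/_HOST@<REALM>" \
    -e "SHOW DATABASES;"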

Configuring Kudu policies

After accessing the Ranger Admin Web UI, the workload username or user group, and the SSB service user need to be added to the Kudu policies to be able to use Flink and SSB.

The following resource-based policies need to be configured for the Kudu service:
  • all - database, table: Provides access to all databases and tables for users.
  • all - database, table, column: Provides access to all databases, tables, and columns for users.
  • all - database: Provides access to all databases for users.
You need to ensure that the required workload username or user group, and the ssb service user are added to the policies of the Kudu service and the created Real-Time Data Mart clusters.
  1. Click cm_kudu under Kudu service on the Service Manager page.
    You are redirected to the cm_kudu Policies page.
  2. Click on the edit button of the all-database policy.
  3. Add the user group to the Select Group field under the Allow Conditions settings.
    • Alternatively, you can also add the workload usernames to the Select User field under the Allow Conditions setting.
  4. Add the ssb service user to the Select User field under the Allow Conditions setting, if it is not configured by default.
  5. Click Save.
    You are redirected to the list of Kudu policies page.
  6. Click on + More… to check if the needed workload user is listed under the Users for the edited policy.
    Repeat the same steps for the remaining Kudu policies based on the required authorization level:
    • all - database, table
    • all - database, table, column
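
Once the Kudu policies are saved, you can verify access by listing tables with the kudu command-line tool from a cluster node. This is a minimal sketch; <kudu_master_FQDN> is a placeholder, and multiple master addresses can be given as a comma-separated list:

  # kinit first, then list the tables visible to the authenticated user
  kudu table list <kudu_master_FQDN>:7051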