Command Line Installation

(Optional) Configuring Authorization for Storm

Apache Storm supports authorization using Pluggable Authentication Modules, or PAM, with secure Hadoop clusters. Currently, Storm supports the following authorizers:

Table 19.2. Supported Authorizers

Authorizer

Description

org.apache.storm.security.auth.authorizer.SimpleACLAuthorizer

Default authorizer for the Nimbus node and all Storm nodes except DRPC.

org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer

Default authorizer for Storm DRPC nodes.

com.xasecure.authorization.storm.authorizer.XaSecureStormAuthorizer

Default authorizer for centralized authorization with Apache Ranger.


To enable authorization, perform the following steps:

  1. Configure storm.yaml for Nimbus and Storm nodes.

  2. Configure worker-launcher.cfg for worker-launcher.

  3. Configure the Storm multi-tenant job scheduler.

Configure storm.yaml for Nimbus and Storm Nodes

When authorization is enabled, Storm prevents users from seeing topologies run by other users in the Storm UI. To do this, Storm must run each topology as the operating system user who submitted it, rather than as the user that runs Storm (typically storm, which is created during installation).

Use the following procedure to configure supervisor to run Storm topologies as the user who submits the topology, rather than as the storm user:

  1. Verify that a headless user exists for supervisor, such as supervisor, on each Storm cluster node.

  2. Create a headless operating system group, such as supervisor, on each Storm cluster node.

  3. Set the following configuration properties in the storm.yaml configuration file for each node in the Storm cluster:

    Table 19.3. storm.yaml Configuration File Properties

    Configuration Property

    Description

    supervisor.run.worker.as.user

    Set to true to run topologies as the user who submits them.

    topology.auto-credentials

    Set to a list of Java plugins that pack and unpack user credentials for Storm workers. This should be set to org.apache.storm.security.auth.kerberos.AutoTGT.

    drpc.authorizer

    Set to org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer to enable the authorizer for the Storm DRPC node.

    nimbus.authorizer

    Set to org.apache.storm.security.auth.authorizer.SimpleACLAuthorizer to enable the authorizer for the Storm Nimbus node.

    storm.principal.tolocal

    Set to org.apache.storm.security.auth.KerberosPrincipalToLocal to translate Kerberos principals to local user names.

    storm.zookeeper.superACL

    Set to sasl:storm to set the ACLs on ZooKeeper nodes so that only the storm user can modify those nodes.


  4. Change the owner of worker-launcher.cfg to root and verify that only root has write permissions on the file.

  5. Change the permissions for the worker-launcher executable to 6550.

  6. Verify that all Hadoop configuration files are in the CLASSPATH for the Nimbus server.

  7. Verify that the nimbus operating system user has superuser privileges and can receive delegation tokens on behalf of users submitting topologies.

  8. Restart the Nimbus server.
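
The storm.yaml properties from step 3 can be sketched as the fragment below. This is a minimal illustration rather than a complete configuration. STAGING_DIR and the file name storm-authorization.yaml are hypothetical, chosen so that the sketch does not touch the real storm.yaml. The YAML list syntax for topology.auto-credentials is an assumption based on the property being described as a list. Review the fragment and merge it into storm.yaml on each node.

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory: the real storm.yaml is not modified here.
STAGING_DIR="${STAGING_DIR:-./storm-conf-staging}"
mkdir -p "$STAGING_DIR"

# Write the authorization-related properties listed in Table 19.3.
# The quoted heredoc delimiter keeps the text literal (no shell expansion).
cat > "$STAGING_DIR/storm-authorization.yaml" <<'EOF'
supervisor.run.worker.as.user: true
topology.auto-credentials:
  - "org.apache.storm.security.auth.kerberos.AutoTGT"
drpc.authorizer: "org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer"
nimbus.authorizer: "org.apache.storm.security.auth.authorizer.SimpleACLAuthorizer"
storm.principal.tolocal: "org.apache.storm.security.auth.KerberosPrincipalToLocal"
storm.zookeeper.superACL: "sasl:storm"
EOF

echo "Wrote $STAGING_DIR/storm-authorization.yaml"
```

Writing to a staging file first lets you compare the fragment against the existing storm.yaml before changing a running cluster.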

Configure worker-launcher.cfg

/usr/hdp/current/storm-client/bin/worker-launcher is a program that runs Storm worker nodes. You must configure worker-launcher to run Storm worker nodes as the user who submitted a topology, rather than the user running the supervisor process controller. To do this, set the following configuration properties in the /etc/storm/conf/worker-launcher.cfg configuration file on all Storm nodes:

Table 19.4. worker-launcher.cfg File Configuration Properties

Configuration Property

Description

storm.worker-launcher.group

Set this to the headless OS group that you created earlier.

min.user.id

Set this to the first real (non-system) user ID on the cluster node.
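
As a rough sketch, the two worker-launcher.cfg properties could be generated as shown below. Several details are assumptions rather than facts from this guide: the key=value file syntax, the group name supervisor (taken from the example in the earlier procedure), and the use of UID_MIN from /etc/login.defs as the first user ID, with a fallback of 1000. STAGING_DIR, LAUNCHER_GROUP, and MIN_UID are hypothetical variable names. The ownership and permission commands from steps 4 and 5 of the earlier procedure are only printed, because they require root on the real files.

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory: review the result before copying it to
# /etc/storm/conf/worker-launcher.cfg on each Storm node.
STAGING_DIR="${STAGING_DIR:-./storm-conf-staging}"
# Example group name from the earlier procedure; substitute your own.
LAUNCHER_GROUP="${LAUNCHER_GROUP:-supervisor}"

# Assumption: the first regular user ID is UID_MIN from /etc/login.defs.
# Fall back to 1000 if the file or the key is absent.
MIN_UID="${MIN_UID:-}"
if [ -z "$MIN_UID" ] && [ -r /etc/login.defs ]; then
    MIN_UID="$(awk '$1 == "UID_MIN" { print $2; exit }' /etc/login.defs)"
fi
MIN_UID="${MIN_UID:-1000}"

mkdir -p "$STAGING_DIR"
cat > "$STAGING_DIR/worker-launcher.cfg" <<EOF
storm.worker-launcher.group=$LAUNCHER_GROUP
min.user.id=$MIN_UID
EOF

# Printed, not executed: run these as root once the file is in place.
echo "chown root /etc/storm/conf/worker-launcher.cfg"
echo "chmod go-w /etc/storm/conf/worker-launcher.cfg"
echo "chmod 6550 /usr/hdp/current/storm-client/bin/worker-launcher"
```

Override MIN_UID explicitly if your nodes assign regular user IDs from a different starting point than the one detected here.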


Configure the Storm Multi-tenant Scheduler

The multi-tenant scheduler isolates topologies from one another and limits the resources that an individual user can consume on the cluster. Add the following configuration property to multitenant-scheduler.yaml and place the file in the same directory as storm.yaml.

Table 19.5. multitenant-scheduler.yaml Configuration File Properties

Configuration Property

Description

multitenant.scheduler.user.pools

Specifies the maximum number of nodes a user may use to run topologies.


The following example limits users evans and derek to ten nodes each for all their topologies:

multitenant.scheduler.user.pools:
    "evans": 10
    "derek": 10
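
When there are many users, the file can be generated instead of edited by hand. The following sketch builds multitenant-scheduler.yaml from a list of user:limit pairs and rejects limits that are empty, non-numeric, or zero. USER_POOLS and STAGING_DIR are hypothetical names introduced for illustration. Copy the reviewed file to the directory that holds storm.yaml.

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory and input format (space-separated
# user:max_nodes pairs); neither is defined by Storm itself.
STAGING_DIR="${STAGING_DIR:-./storm-conf-staging}"
USER_POOLS="${USER_POOLS:-evans:10 derek:10}"

mkdir -p "$STAGING_DIR"
out="$STAGING_DIR/multitenant-scheduler.yaml"

echo "multitenant.scheduler.user.pools:" > "$out"
for pair in $USER_POOLS; do
    user="${pair%%:*}"    # text before the first colon
    nodes="${pair##*:}"   # text after the last colon
    # Reject limits that are empty, contain a non-digit, or are zero.
    case "$nodes" in
        ''|*[!0-9]*|0)
            echo "invalid node limit for $user: $nodes" >&2
            exit 1
            ;;
    esac
    printf '    "%s": %s\n' "$user" "$nodes" >> "$out"
done

cat "$out"
```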
Note

The multi-tenant scheduler relies on Storm authentication to distinguish between individual Storm users. Verify that Storm authentication is already enabled.