6. (Optional) Configuring Authorization for Storm

Apache Storm supports authorization using Pluggable Authentication Modules, or PAM, with secure Hadoop clusters. Currently, Storm supports the following authorizers:

 

Table 18.2. Supported Authorizers

Authorizer | Description

backtype.storm.security.auth.authorizer.SimpleACLAuthorizer | Default authorizer for the Nimbus node and all Storm nodes except DRPC.

backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer | Default authorizer for Storm DRPC nodes.

com.xasecure.authorization.storm.authorizer.XaSecureStormAuthorizer | Default authorizer for centralized authorization with Apache Ranger.


To enable authorization, perform the following steps:

  1. Configure storm.yaml for Nimbus and Storm nodes.

  2. Configure worker-launcher.cfg for worker-launcher.

  3. Configure the Storm multi-tenant job scheduler.

Configure storm.yaml for Nimbus and Storm Nodes

When authorization is enabled, Storm prevents users from seeing topologies run by other users in the Storm UI. To do this, Storm must run each topology as the operating system user who submitted it rather than the user that runs Storm, typically storm, which is created during installation.

Use the following procedure to configure supervisor to run Storm topologies as the user who submits the topology, rather than as the storm user:

  1. Verify that a headless user exists for supervisor, such as supervisor, on each Storm cluster node.

  2. Create a headless operating system group, such as supervisor, on each Storm cluster node.

  3. Set the following configuration properties in the storm.yaml configuration file for each node in the Storm cluster:

     

    Table 18.3. storm.yaml Configuration File Properties

    Configuration Property | Description

    supervisor.run.worker.as.user | Set to true to run topologies as the user who submits them.

    topology.auto-credentials | Set to a list of Java plugins that pack and unpack user credentials for Storm workers. This allows Storm to access secure Hadoop services. If the Hadoop cluster uses Kerberos, set this to backtype.storm.security.auth.kerberos.AutoTGT.

    drpc.authorizer | Set to backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer to enable the authorizer for the Storm DRPC node.

    nimbus.slots.perTopology | The maximum number of slots (workers) a topology can use. This property is used only by the Nimbus node.

    nimbus.executors.perTopology | The maximum number of executors (threads) a topology can use. This property is used only by the Nimbus node.


    Note

    Topologies should also set topology.auto-credentials to backtype.storm.security.auth.hadoop.AutoHDFS in the TopologyBuilder class.

  4. Change the owner of worker-launcher.cfg to root and verify that only root has write permissions on the file.

  5. Change the permissions for the worker-launcher executable to 6550.

  6. Verify that all Hadoop configuration files are in the CLASSPATH for the Nimbus server.

  7. Verify that the nimbus operating system user has superuser privileges and can receive delegation tokens on behalf of users submitting topologies.

  8. Restart the Nimbus server.
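
As an illustration of step 3, a storm.yaml fragment that sets the properties from Table 18.3 might look like the sketch below. The property names and class names come from the table; the two numeric limits are placeholder values assumed for this example, not recommended settings, and should be sized for your own cluster.

```yaml
# storm.yaml (fragment): a sketch, not a complete configuration.

# Run each topology as the OS user who submitted it.
supervisor.run.worker.as.user: true

# Credential plugins for Storm workers; AutoTGT applies to Kerberos clusters.
topology.auto-credentials:
  - "backtype.storm.security.auth.kerberos.AutoTGT"

# Authorizer for the Storm DRPC node.
drpc.authorizer: "backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer"

# Per-topology limits, read only by Nimbus. Both values are assumed examples.
nimbus.slots.perTopology: 10
nimbus.executors.perTopology: 100
```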

Configure worker-launcher.cfg

/usr/hdp/current/storm-client/bin/worker-launcher is a program that launches Storm worker processes. You must configure worker-launcher to run workers as the user who submitted a topology, rather than as the user running the supervisor process controller. To do this, set the following configuration properties in the /etc/storm/conf/worker-launcher.cfg configuration file on all Storm nodes:

 

Table 18.4. worker-launcher.cfg File Configuration Properties

Configuration Property | Description

storm.worker-launcher.group | Set this to the headless OS group that you created earlier.

min.user.id | Set this to the first real (non-system) user ID on the cluster node.
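
For illustration, a worker-launcher.cfg that sets both properties from Table 18.4 might look like the sketch below. It assumes the headless group is named supervisor, as suggested earlier, and that regular user accounts start at UID 1000, which is common on many Linux distributions but not universal; check the actual value on your nodes. The fragment carries no inline comments because comment handling in this file is not covered here.

```
storm.worker-launcher.group=supervisor
min.user.id=1000
```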


Configure the Storm Multi-tenant Scheduler

The multi-tenant scheduler has two goals: to isolate topologies from one another, and to limit the resources that an individual user can use on the cluster. Add the following configuration property to multitenant-scheduler.yaml, and place that file in the same directory as storm.yaml.

 

Table 18.5. multitenant-scheduler.yaml Configuration File Properties

Configuration Property | Description

multitenant.scheduler.user.pools | Specifies the maximum number of nodes a user may use to run topologies.


The following example limits users evans and derek to ten nodes each for all their topologies:

multitenant.scheduler.user.pools:
    "evans": 10
    "derek": 10

Note

The multi-tenant scheduler relies on Storm authentication to distinguish between individual Storm users. Verify that Storm authentication is already enabled.

