(Optional) Configuring Authorization for Storm
Apache Storm supports authorization through pluggable authorizer plugins on secure Hadoop clusters. Currently, Storm supports the following authorizers:
Table 18.2. Supported Authorizers
Authorizer | Description |
---|---|
backtype.storm.security.auth.authorizer.SimpleACLAuthorizer | Default authorizer for the Nimbus node and all Storm nodes except DRPC. |
backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer | Default authorizer for Storm DRPC nodes. |
com.xasecure.authorization.storm.authorizer.XaSecureStormAuthorizer | Default authorizer for centralized authorization with Apache Ranger. |
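As a sketch, selecting the first two authorizers in storm.yaml might look like the following. The nimbus.authorizer and drpc.authorizer property names are standard Storm settings; the class names come from the table above:

```yaml
# Sketch: enabling the default ACL authorizers in storm.yaml.
nimbus.authorizer: "backtype.storm.security.auth.authorizer.SimpleACLAuthorizer"
drpc.authorizer: "backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer"
```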
To enable authorization, perform the following steps:
Configure storm.yaml for Nimbus and Storm nodes.
Configure worker-launcher.cfg for worker-launcher.
Configure the Storm multi-tenant job scheduler.
Configure storm.yaml for Nimbus and Storm Nodes
When authorization is enabled, Storm prevents users from seeing topologies run by other users in the Storm UI. To do this, Storm must run each topology as the operating system user who submitted it, rather than as the user that runs Storm (typically storm, which is created during installation).
Use the following procedure to configure supervisor to run Storm topologies as the user who submits the topology, rather than as the storm user:
Verify that a headless user exists for supervisor, such as supervisor, on each Storm cluster node.
Create a headless operating system group, such as supervisor, on each Storm cluster node.
Set the following configuration properties in the storm.yaml configuration file for each node in the Storm cluster:
Table 18.3. storm.yaml Configuration File Properties
Configuration Property | Description |
---|---|
supervisor.run.worker.as.user | Set to true to run topologies as the user who submits them. |
topology.auto-credentials | Set to a list of Java plugins that pack and unpack user credentials for Storm workers. This allows Storm to access secure Hadoop services. If the Hadoop cluster uses Kerberos, set this to backtype.storm.security.auth.kerberos.AutoTGT. |
drpc.authorizer | Set to backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer to enable the authorizer for the Storm DRPC node. |
nimbus.slots.perTopology | The maximum number of slots/workers a topology can use. This property is used only by the Nimbus node. |
nimbus.executors.perTopology | The maximum number of executors/threads a topology can use. This property is used only by the Nimbus node. |
Note Topologies should also set topology.auto-credentials to backtype.storm.security.auth.hadoop.AutoHDFS in the TopologyBuilder class.
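Taken together, the properties above might appear in storm.yaml as in the following sketch. The property names are from Table 18.3; the numeric limits are illustrative examples, not recommendations:

```yaml
# Sketch of the Table 18.3 properties as they might appear in storm.yaml.
supervisor.run.worker.as.user: true
topology.auto-credentials:
    - "backtype.storm.security.auth.kerberos.AutoTGT"
drpc.authorizer: "backtype.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer"
nimbus.slots.perTopology: 4        # example limit; tune for your cluster
nimbus.executors.perTopology: 8    # example limit; tune for your cluster
```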
Change the owner of worker-launcher.cfg to root and verify that only root has write permissions on the file.
Change the permissions for the worker-launcher executable to 6550.
Verify that all Hadoop configuration files are in the CLASSPATH for the Nimbus server.
Verify that the nimbus operating system user has superuser privileges and can receive delegation tokens on behalf of users submitting topologies.
Restart the Nimbus server.
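The ownership and permission steps above can be sketched as the following shell commands. This sketch operates on scratch files so it can run without root; on a real Storm node the targets would be /etc/storm/conf/worker-launcher.cfg and /usr/hdp/current/storm-client/bin/worker-launcher, and the chown step would be run as root:

```shell
# Stand-ins for worker-launcher.cfg and the worker-launcher executable,
# so the sketch runs without root on any Linux host.
conf=$(mktemp)
launcher=$(mktemp)

# Step: make root the owner of worker-launcher.cfg and ensure only root
# can write it (chown requires root, so it is shown commented out here):
#   chown root:root "$conf"
chmod 644 "$conf"                 # writable by the owner only

# Step: set the worker-launcher executable to mode 6550
# (setuid + setgid, r-xr-s---):
chmod 6550 "$launcher"
stat -c '%a' "$launcher"          # prints 6550
```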
Configure worker-launcher.cfg
/usr/hdp/current/storm-client/bin/worker-launcher is the program that launches Storm worker processes. You must configure worker-launcher to run workers as the user who submitted the topology, rather than as the user that runs the supervisor process. To do this, set the following configuration properties in the /etc/storm/conf/worker-launcher.cfg configuration file on all Storm nodes:
Table 18.4. worker-launcher.cfg File Configuration Properties
Configuration Property | Description |
---|---|
storm.worker-launcher.group | Set this to the headless OS group that you created earlier. |
min.user.id | Set this to the lowest user ID (UID) allowed to launch workers on the cluster node; requests from lower UIDs, such as system accounts, are rejected. |
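A minimal worker-launcher.cfg might therefore look like the following sketch, assuming the headless group created earlier is named supervisor and that regular user accounts start at UID 1000 (a common Linux default; adjust both values for your environment):

```
storm.worker-launcher.group=supervisor
min.user.id=1000
```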
Configure the Storm Multi-tenant Scheduler
The multi-tenant scheduler serves two goals: isolating topologies from one another, and limiting the resources that an individual user can use on the cluster. Add the following configuration property to multitenant-scheduler.yaml, and place that file in the same directory as storm.yaml.
Table 18.5. multitenant-scheduler.yaml Configuration File Properties
Configuration Property | Description |
---|---|
multitenant.scheduler.user.pools | Specifies the maximum number of nodes a user may use to run topologies. |
The following example limits users evans and derek to ten nodes each for all their topologies:
```yaml
multitenant.scheduler.user.pools:
    "evans": 10
    "derek": 10
```
Note: The multi-tenant scheduler relies on Storm authentication to distinguish between individual Storm users. Verify that Storm authentication is already enabled.
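To activate the scheduler itself, storm.yaml on the Nimbus node must point storm.scheduler at the multi-tenant implementation. A sketch, using the backtype-era class name consistent with the rest of this chapter:

```yaml
# Sketch: selecting the multi-tenant scheduler in storm.yaml on Nimbus.
storm.scheduler: "backtype.storm.scheduler.multitenant.MultitenantScheduler"
```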