(Optional) Configuring Authorization for Storm
Apache Storm supports authorization using Pluggable Authentication Modules, or PAM, with secure Hadoop clusters. Currently, Storm supports the following authorizers:
Table 18.2. Supported Authorizers
Authorizer | Description |
---|---|
org.apache.storm.security.auth.authorizer.SimpleACLAuthorizer | Default authorizer for the Nimbus node and all Storm nodes except DRPC. |
org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer | Default authorizer for Storm DRPC nodes. |
com.xasecure.authorization.storm.authorizer.XaSecureStormAuthorizer | Default authorizer for centralized authorization with Apache Ranger. |
To enable authorization, perform the following steps:
Configure storm.yaml for Nimbus and Storm nodes.
Configure worker-launcher.cfg for worker-launcher.
Configure the Storm multi-tenant job scheduler.
Configure storm.yaml for Nimbus and Storm Nodes
When authorization is enabled, Storm prevents users from seeing topologies run by other users in the Storm UI. To do this, Storm must run each topology as the operating system user who submitted it rather than the user that runs Storm, typically storm, which is created during installation.
Use the following procedure to configure supervisor to run Storm topologies as the user who submits the topology, rather than as the storm user:
Verify that a headless user exists for supervisor, such as supervisor, on each Storm cluster node.
Create a headless operating system group, such as supervisor, on each Storm cluster node.
Set the following configuration properties in the storm.yaml configuration file for each node in the Storm cluster:
Table 18.3. storm.yaml Configuration File Properties
Configuration Property | Description |
---|---|
supervisor.run.worker.as.user | Set to true to run topologies as the user who submits them. |
topology.auto-credentials | Set to a list of Java plugins that pack and unpack user credentials for Storm workers. This should be set to org.apache.storm.security.auth.kerberos.AutoTGT. |
drpc.authorizer | Set to org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer to enable the authorizer for the Storm DRPC node. |
nimbus.authorizer | Set to org.apache.storm.security.auth.authorizer.SimpleACLAuthorizer to enable the authorizer for the Storm Nimbus node. |
storm.principal.tolocal | Set to org.apache.storm.security.auth.KerberosPrincipalToLocal to enable transforming Kerberos principals to local user names. |
storm.zookeeper.superACL | Set to sasl:storm to set the ACLs on ZooKeeper nodes so that only the user storm can modify those nodes. |

Change the owner of worker-launcher.cfg to root and verify that only root has write permissions on the file.
Change the permissions for the worker-launcher executable to 6550.
Verify that all Hadoop configuration files are in the CLASSPATH for the Nimbus server.
Verify that the nimbus operating system user has superuser privileges and can receive delegation tokens on behalf of users submitting topologies.
Restart the Nimbus server.
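As an illustration, the properties in Table 18.3 could be written in storm.yaml roughly as follows. This is a minimal sketch that uses only the property names and values given in the table; the list syntax for topology.auto-credentials and the quoting are assumptions about how the file is laid out, and a real storm.yaml will contain other settings as well.

```yaml
# storm.yaml (fragment): authorization settings from Table 18.3.
# Layout and quoting are illustrative assumptions, not taken from the source.

# Run each topology as the user who submitted it.
supervisor.run.worker.as.user: true

# Plugins that pack and unpack user credentials for Storm workers.
topology.auto-credentials:
  - "org.apache.storm.security.auth.kerberos.AutoTGT"

# Authorizers for the DRPC and Nimbus nodes.
drpc.authorizer: "org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer"
nimbus.authorizer: "org.apache.storm.security.auth.authorizer.SimpleACLAuthorizer"

# Map Kerberos principals to local user names.
storm.principal.tolocal: "org.apache.storm.security.auth.KerberosPrincipalToLocal"

# Only the user storm may modify Storm's ZooKeeper nodes.
storm.zookeeper.superACL: "sasl:storm"
```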
Configure worker-launcher.cfg
/usr/hdp/current/storm-client/bin/worker-launcher is a program that runs Storm worker nodes. You must configure worker-launcher to run Storm worker nodes as the user who submitted a topology, rather than the user running the supervisor process controller. To do this, set the following configuration properties in the /etc/storm/conf/worker-launcher.cfg configuration file on all Storm nodes:
Table 18.4. worker-launcher.cfg File Configuration Properties
Configuration Property | Description |
---|---|
storm.worker-launcher.group | Set this to the headless OS group that you created earlier. |
min.user.id | Set this to the first user ID on the cluster node. |
Configure the Storm Multi-tenant Scheduler
The goal of the multi-tenant scheduler is both to isolate topologies from one another and to limit the resources that an individual user can use on the cluster. Add the following configuration property to multitenant-scheduler.yaml and place the file in the same directory as storm.yaml.
Table 18.5. multitenant-scheduler.yaml Configuration File Properties
Configuration Property | Description |
---|---|
multitenant.scheduler.user.pools | Specifies the maximum number of nodes a user may use to run topologies. |
The following example limits users evans and derek to ten nodes each for all their topologies:
    multitenant.scheduler.user.pools:
        "evans": 10
        "derek": 10
Note: The multi-tenant scheduler relies on Storm authentication to distinguish between individual Storm users. Verify that Storm authentication is already enabled.
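This section does not show how Storm is told to use the multi-tenant scheduler in the first place. In upstream Apache Storm this is typically done with the storm.scheduler property in storm.yaml; the fragment below is a sketch under that assumption, using the org.apache.storm package naming that appears elsewhere in this section. Check the class name against the Storm version in your distribution before relying on it.

```yaml
# storm.yaml (fragment): select the multi-tenant scheduler.
# Assumption: this property and class name come from upstream Apache Storm,
# not from the text above.
storm.scheduler: "org.apache.storm.scheduler.multitenant.MultitenantScheduler"
```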