Using auth-to-local rules to isolate cluster users

How to use auth-to-local rules to restrict user access to specific clusters.

By default, the Hadoop auth-to-local rules map a principal of the form <username>/<hostname>@<REALM> to <username>. This means if there are multiple clusters in the same realm, then principals associated with hosts of one cluster would map to the same user in all other clusters.

For example, if you have two clusters, cluster1-host-[1..4].example.com and cluster2-host- [1..4].example.com, that are part of the same Kerberos realm, EXAMPLE.COM, then the cluster2 principal, hdfs/cluster2-host1.example.com@EXAMPLE.COM, will map to the hdfs user even on cluster1 hosts.

To prevent this, use auth-to-local rules as follows to ensure only principals containing hostnames of cluster1 are mapped to legitimate users.

  1. Go to the HDFS Service > Configuration tab.
  2. Select Scope > HDFS (Service-Wide).
  3. Select Category > Security.
  4. In the Search field, type Additional Rules to find the Additional Rules to Map Kerberos Principals to Short Names settings.
  5. Additional mapping rules can be added to the Additional Rules to Map Kerberos Principals to Short Names property. These rules will be inserted before the rules generated from the list of trusted realms (configured above) and before the default rule.
    RULE:[2:$1/$2@$0](hdfs/cluster1-host1.example.com@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/hdfs/
    RULE:[2:$1/$2@$0](hdfs/cluster1-host2.example.com@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/hdfs/
    RULE:[2:$1/$2@$0](hdfs/cluster1-host3.example.com@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/hdfs/
    RULE:[2:$1/$2@$0](hdfs/cluster1-host4.example.com@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/hdfs/
    RULE:[2:$1/$2@$0](hdfs.*@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/nobody/

    In the example, the principal hdfs/<hostname>@REALM is mapped to the hdfs user if <hostname> is one of the cluster hosts. Otherwise it gets mapped to nobody, thus ensuring that principals from other clusters do not have access to cluster1.

    If the cluster hosts can be represented with a regular expression, that expression can be used to make the configuration easier and more conducive to scaling. For example:

    RULE:[2:$1/$2@$0](hdfs/cluster1-host[1-4].example.com@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/hdfs/
    RULE:[2:$1/$2@$0](hdfs.*@EXAMPLE.COM)s/(.*)@EXAMPLE.COM/nobody/
  6. Click Save Changes.
  7. Restart the HDFS service and any dependent services.