Configuring LDAP Group Mappings

Each host that comprises a node in a Cloudera cluster runs an operating system, such as CentOS or Oracle Linux. At the OS-level, there are user:group accounts created during installation that map to the services running on that specific node of the cluster. The default shell-based group mapping provider, org.apache.hadoop.security.ShellBasedUnixGroupsMapping, handles the mapping from the local host system (the OS) to the specific cluster service, such as HDFS. The hosts authenticate using these local OS accounts before processes are allowed to run on the node.

For clusters integrated with Kerberos for authentication, the hosts must also provide Kerberos tickets before processes can run on the node. The cluster can use the organization's LDAP directory service to provide the login credentials, including Kerberos tickets, to authenticate transparently while the system runs. That means that the local user:group accounts on each host must be mapped to LDAP accounts. To map local user:group accounts to an LDAP service:

  • Use tools such as SSSD (Systems Security Services Daemon) or Centrify Server Suite (see Identity and Access management for Cloudera).
  • The Hadoop LdapGroupsMapping group mapping mechanism. The LdapGroupsMapping library may not be as robust a solution needed for large organizations in terms of scalability and manageability, especially for organizations managing identity across multiple systems and not exclusively for Hadoop clusters. Support for the LdapGroupsMapping library is not consistent across all operating systems.
  • Do not use Winbind to map Linux user:group accounts to Active Directory. It cannot scale, impedes cluster performance, and is not supported.
  • Use the same user:group mappings across all cluster nodes, for ease of management.
  • Use either Cloudera Manager or the command-line process detailed below.

The local user:group accounts must be mapped to LDAP for group mappings in Hadoop. You must create the users and groups for your Hadoop services in LDAP.

To integrate the cluster with an LDAP service, the user:group relationships must be contained in the LDAP directory. The admin must create the user accounts and define groups for user:group relationships on each host.

The user and group names listed in the table are the default user:group values for CDH services.

Cloudera Product or Component User Group
Cloudera Manager cloudera-scm cloudera-scm
Apache Accumulo accumulo accumulo
Apache Avro (No default) (No default)
Apache Flume (sink writing to HDFS needs write privileges) flume flume
Apache HBase (Master, RegionServer processes) hbase hbase
Apache HCatalog, WebHCat service hive hive
Apache Hive (HiveServer2, Hive Metastore) hive hive
Apache Kafka kafka kafka
Apache Mahout (No default) (No default)
Apache Oozie oozie oozie
Apache Pig (No default) (No default)
Apache Sentry sentry sentry
Apache Spark spark spark
Apache Sqoop1 sqoop sqoop
Apache Sqoop2 sqoop2 sqoop, sqoop2
Apache Whirr (No default) (No default)
Apache ZooKeeper zookeeper zookeeper
Impala impala impala, hive
Cloudera Search solr solr
HDFS (NameNode, DataNodes) hdfs hdfs, hadoop
HttpFS httpfs httpfs
Hue hue hue
Hue Load Balancer (needs apache2 package) apache apache
Java KeyStore KMS kms kms
Key Trustee KMS kms kms
Key Trustee Server keytrustee keytrustee
Kudu kudu kudu
Llama llama llama
MapReduce (JobTracker, TaskTracker) mapred mapred, hadoop
Parquet (No default) (No default)
YARN yarn yarn, hadoop
In addition:
  • For Sentry with Hive, add these properties on the HiveServer2 node.
  • For Sentry with Impala, add these properties to all hosts.
See Users and Groups in Sentry for more information.

Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

Make the following changes to the HDFS service's security configuration:
  1. Open the Cloudera Manager Admin Console and go to the HDFS service.
  2. Click the Configuration tab.
  3. Select Scope > HDFS (Service Wide)
  4. Select Category > Security.
  5. Modify the following configuration properties using values from the table below:
    Configuration Property Value
    Hadoop User Group Mapping Implementation org.apache.hadoop.security.LdapGroupsMapping
    Hadoop User Group Mapping LDAP URL ldap://<server>
    Hadoop User Group Mapping LDAP Bind User Administrator@example.com
    Hadoop User Group Mapping LDAP Bind User Password ***
    Hadoop User Group Mapping Search Base dc=example,dc=com
Although the above changes are sufficient to configure group mappings for Active Directory, some changes to the remaining default configurations might be required for OpenLDAP.

Using the Command Line

Add the following properties to the core-site.xml on the NameNode:
<property>    
<name>hadoop.security.group.mapping</name>
<value>org.apache.hadoop.security.LdapGroupsMapping</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.url</name>
<value>ldap://server</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.bind.user</name>
<value>Administrator@example.com</value>
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.bind.password</name>    
<value>****</value>
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.base</name>
<value>dc=example,dc=com</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.filter.user</name>
<value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.filter.group</name>  
<value>(objectClass=group)</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.attr.member</name>    
<value>member</value>
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>    
<value>cn</value>
</property>