Overview

Authentication is a process that requires users and services to prove their identity when trying to access a system resource.

Organizations typically manage user identity and authentication through various time-tested technologies, including pluggable authentication modules (PAM), Lightweight Directory Access Protocol (LDAP) for identity, directory, and other services, such as group management, and Kerberos for authentication.

Cloudera clusters support integration with all of these technologies. For example, organizations with existing LDAP directory services such as Active Directory (included in Microsoft Windows Server as part of its suite of Active Directory Services) can leverage the organization's existing user accounts and group listings instead of creating new accounts throughout the cluster. Using an external system such as Active Directory or OpenLDAP can support the user role authorization mechanism.

As an alternative to LDAP, you can use pluggable authentication modules (PAM), which are standard Linux modules used for external authentication. In Cloudera Manager, PAM configuration is simpler than LDAP, and systems administrators can add new authentication methods by installing new PAM modules. If you have multiple LDAP servers, you can configure PAM to synchronize with LDAP using System Security Services Daemon (SSSD), as Cloudera Manager does not otherwise support multiple LDAP servers. You can also configure Cloudera Manager to use Apache Knox for authentication and PAM to look up users and roles stored in LDAP.

For authentication, Cloudera supports integration with MIT Kerberos and with Active Directory. Microsoft Active Directory supports Kerberos for authentication in addition to its identity management and directory functionality, that is, LDAP.

Kerberos provides strong authentication, strong meaning that cryptographic mechanisms—rather than passwords alone—are used in the exchange between requesting process and service during the authentication process.

These systems are not mutually exclusive. For example, Microsoft Active Directory is an LDAP directory service that also provides Kerberos authentication services, and Kerberos credentials can be stored and managed in an LDAP directory service. Cloudera Manager Server, CDP nodes, and Cloudera Enterprise components, such as Apache Hive, Hue, and Impala, can all make use of Kerberos authentication.

Configuring the cluster to use Kerberos requires having administrator privileges—and access to—the Kerberos server Key Distribution Center (KDC). The process may require debugging issues between the Cloudera Manager cluster and the KDC.

On each host operating system underlying each node in a cluster, local Linux user:group accounts are created during installation of Cloudera Server and CDP services. These accounts may also be referred to as Hadoop Users. These local accounts are blocked from normal login access, and have no password set. This means login on these user accounts is not possible. Customers are advised not to set any password on these local service accounts, as doing so would compromise security.

To apply per-node authentication and authorization mechanism consistently across all the nodes of a cluster, local user:group accounts are mapped to user accounts and groups in an LDAP-compliant directory service, such as Active Directory or OpenLDAP.

To facilitate the authentication process from each host system (node in the cluster) to the LDAP directory, Cloudera recommends using additional software mechanisms such as SSSD or Centrify Server Suite.