Authentication
Overview of authentication in Cloudera on premises.
Typically a cluster is integrated with an existing corporate directory, which simplifies credential management and aligns with well-established HR procedures for managing and maintaining both user and service accounts. Kerberos is used to authenticate all service accounts within the cluster, with credentials generated in the corporate directory (IDM/AD) and distributed by Cloudera Manager. To secure these procedures, it is important that all interactions between Cloudera Manager, the corporate directory, and the cluster hosts are encrypted using TLS. Signed certificates are distributed to each cluster host, enabling service roles to mutually authenticate. This includes the Cloudera Manager agent process, which performs a TLS handshake with the Cloudera Manager server so that configuration changes, such as the generation and distribution of Kerberos credentials, are carried out across an encrypted channel. In addition to the Cloudera Manager agent, all the cluster service roles, such as Impala Daemons, HDFS worker roles, and management roles, typically use TLS.
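To make "mutually authenticate" concrete, the following is a minimal, generic sketch of a server-side TLS context that refuses any peer without a CA-signed certificate. It uses only the Python standard library and is not how Cloudera Manager configures TLS internally; the function name and its file-path parameters are hypothetical.

```python
import ssl


def make_mutual_tls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a server-side TLS context that requires a client certificate.

    All three parameters are hypothetical paths to PEM files; they are
    optional here only so the context can be built without real certificates.
    """
    # Purpose.CLIENT_AUTH yields a context intended for server-side sockets.
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the peer presents a
    # certificate that chains to a trusted CA: this is the "mutual" part.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # CA bundle used to verify certificates presented by clients.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The server's own signed certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

In a real deployment all three files would be supplied, so that each side both presents a certificate and verifies the other's.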
Kerberos
With Kerberos enabled, all cluster roles are able to authenticate each other,
provided they have a valid Kerberos ticket. The authentication tickets are issued by the KDC
upon presentation of valid credentials. The KDC is typically a local Active Directory Domain
Controller, FreeIPA, or an MIT Kerberos server with a trust established with the corporate
Kerberos infrastructure. Cloudera Manager generates and distributes these credentials to
each of the service roles using an elevated privilege that is securely maintained within its
database. Typically the administration privilege enables the creation and deletion of
Kerberos principals within a specific organisational unit (OU) within the corporate directory
(see Delegating Administration by Using OU Objects). Good practice is to first enable
TLS between the Cloudera Manager server and agents to ensure the
Kerberos keytab files are transported over an encrypted connection.
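Service principals in Kerberos conventionally take the form primary/instance@REALM, where the instance is usually the host name. As an illustration only, the sketch below splits such a name into its parts. The `Principal` type, the `parse_principal` function, and the example host and realm are hypothetical, and escaped characters in principal names are not handled.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # service or user name, e.g. "hdfs"
    instance: Optional[str]  # usually the host name; None for user principals
    realm: str               # Kerberos realm, e.g. "EXAMPLE.COM"


def parse_principal(name: str) -> Principal:
    """Split 'primary/instance@REALM' (or 'primary@REALM') into its parts."""
    # The realm follows the last '@'.
    local, at, realm = name.rpartition("@")
    if not at or not local or not realm:
        raise ValueError(f"principal has no realm: {name!r}")
    # The optional instance follows the first '/'.
    primary, slash, instance = local.partition("/")
    if not primary or (slash and not instance):
        raise ValueError(f"malformed principal: {name!r}")
    return Principal(primary, instance if slash else None, realm)


service = parse_principal("hdfs/worker01.example.com@EXAMPLE.COM")
user = parse_principal("alice@EXAMPLE.COM")
```

Here `service` carries the host name as its instance, while `user` has no instance component.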
Impersonation
Within a Cloudera system, two methods of impersonation are supported. The first is a
simple "doAs" system, where certain services are trusted to assert the identity of
connecting users. For example, Oozie, Hue, and Hive are trusted within the Cloudera
environment to act on behalf of the user who connected to them. This is known as
impersonation and is configured by the hadoop.proxyuser.* set of parameters.
For example, a user connects to Hue or HiveServer2, which is trusted to identify that user correctly, and Hue or HiveServer2 then acts on that user's behalf.
Note: When configuring third-party integrations or add-ons, they may require
hadoop.proxyuser.* configurations so that they can also impersonate users who have
connected.
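As a rough illustration of what the hadoop.proxyuser.* parameters look like, the sketch below builds the `hosts` and `groups` properties for one trusted service and renders them as a Hadoop-style configuration fragment. The helper names, the host `hue01.example.com`, and the group `analysts` are hypothetical. In a managed cluster these values would normally be set through Cloudera Manager rather than generated by hand.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(service_user, hosts, groups):
    """Return the proxyuser properties that let `service_user` impersonate
    members of `groups` when connecting from `hosts`."""
    return {
        f"hadoop.proxyuser.{service_user}.hosts": ",".join(hosts),
        f"hadoop.proxyuser.{service_user}.groups": ",".join(groups),
    }


def to_configuration_xml(props):
    """Render a property mapping as a <configuration> XML fragment."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical values: allow the "hue" service user, connecting from one
# host, to act on behalf of users in the "analysts" group.
props = proxyuser_properties("hue", ["hue01.example.com"], ["analysts"])
fragment = to_configuration_xml(props)
```

Restricting both the hosts and the groups keeps the trust placed in the impersonating service as narrow as possible.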
The second method, supported by some implementations of Kerberos, is known as
Constrained Delegation. For further details on constrained delegation, see Accessing Secure
Cluster from Web Applications.