HiveServer2 Security Configuration

HiveServer2 supports authentication of the Thrift client using the following methods:
  • Kerberos authentication
  • LDAP authentication

    Starting with CDH 5.7 and later, clusters running LDAP-enabled HiveServer2 deployments also accept Kerberos authentication. This ensures that users are not forced to enter usernames/passwords manually, and are able to take advantage of the multiple authentication schemes SASL offers.

Kerberos authentication is supported between the Thrift client and HiveServer2, and between HiveServer2 and secure HDFS. LDAP authentication is supported only between the Thrift client and HiveServer2.

To configure HiveServer2 to use one of these authentication modes, configure the hive.server2.authentication configuration property.

Enabling Kerberos Authentication for HiveServer2

If you configure HiveServer2 to use Kerberos authentication, HiveServer2 acquires a Kerberos ticket during startup. HiveServer2 requires a principal and keytab file specified in the configuration. Client applications (for example, JDBC or Beeline) must have a valid Kerberos ticket before initiating a connection to HiveServer2.

Configuring JDBC Clients for Kerberos Authentication with HiveServer2

JDBC-based clients must include principal=<hive.server2.authentication.principal> in the JDBC connection string. For example:

String url = "jdbc:hive2://node1:10000/default;principal=hive/HiveServer2Host@YOUR-REALM.COM"
Connection con = DriverManager.getConnection(url);

where hive is the principal configured in hive-site.xml and HiveServer2Host is the host where HiveServer2 is running.

Download the Cloudera JDBC/ODBC database drivers from the Cloudera downloads web page.

Using Beeline to Connect to a Secure HiveServer2

Use the following command to start beeline and connect to a secure HiveServer2 process. In this example, the HiveServer2 process is running on localhost at port 10000:

$ /usr/lib/hive/bin/beeline
beeline> !connect jdbc:hive2://localhost:10000/default;principal=hive/HiveServer2Host@YOUR-REALM.COM
0: jdbc:hive2://localhost:10000/default>

For more information about the Beeline CLI, see Using the Beeline CLI.

For instructions on encrypting communication with the ODBC/JDBC drivers, see Configuring Encrypted Communication Between HiveServer2 and Client Drivers.

Using LDAP Username/Password Authentication with HiveServer2

As an alternative to Kerberos authentication, you can configure HiveServer2 to use user and password validation backed by LDAP. The client sends a username and password during connection initiation. HiveServer2 validates these credentials using an external LDAP service.

You can enable LDAP Authentication with HiveServer2 using Active Directory or OpenLDAP.

Enabling LDAP Authentication with HiveServer2 using Active Directory

  1. In the Cloudera Manager Admin Console, click Hive in the list of components, and then select the Configuration tab.
  2. Type "ldap" in the Search text box to locate the LDAP configuration fields.
  3. Check Enable LDAP Authentication.
  4. Enter the LDAP URL in the format ldap[s]://<host>:<port>
  5. Enter the Active Directory Domain for your environment.
  6. Click Save Changes.

Enabling LDAP Authentication with HiveServer2 using OpenLDAP

  1. In the Cloudera Manager Admin Console, click Hive in the list of components, and then select the Configuration tab.
  2. Type "ldap" in the Search text box to locate the LDAP configuration fields.
  3. Check Enable LDAP Authentication.
  4. Enter the LDAP URL in the format ldap[s]://<host>:<port>
  5. Enter the LDAP BaseDN for your environments. For example: ou=People,dc=example,dc=com
  6. Click Save Changes.

Configuring JDBC Clients for LDAP Authentication with HiveServer2

The JDBC client requires a connection URL that includes the user name and password in the JDBC connection string. For example:

String url ="jdbc:hive2://#<host>:#<port>/#<dbName>; \
          user=<user name>; \
          password=<password>"
          Connection con = DriverManager.getConnection(url);
        

Enabling LDAP Authentication for HiveServer2 in Hue

Enable LDAP authentication with HiveServer2 by setting the following properties under the [beeswax] section in hue.ini.
auth_username LDAP username of Hue user to be authenticated.
auth_password

LDAP password of Hue user to be authenticated.

Hive uses these login details to authenticate to LDAP. The Hive service trusts that Hue has validated the user being impersonated.

Configuring LDAPS Authentication with HiveServer2

HiveServer2 supports LDAP username/password authentication for clients. Clients send LDAP credentials to HiveServer2 which in turn verifies them against the configured LDAP provider, such as OpenLDAP or Microsoft Active Directory. Most implementations now support LDAPS (LDAP over TLS/SSL), an authentication protocol that uses TLS/SSL to encrypt communication between the LDAP service and its client (in this case, HiveServer2) to avoid sending LDAP credentials in cleartext.

To configure the LDAPS service with HiveServer2:

  1. Import the LDAP server CA certificate or the server certificate into a truststore on the HiveServer2 host. If you import the CA certificate, HiveServer2 will trust any server with a certificate issued by the LDAP server's CA. If you only import the server certificate, HiveServer2 trusts only that server. See Understanding Keystores and Truststores for more details.
  2. Make sure the truststore file is readable by the hive user.
  3. Set the hive.server2.authentication.ldap.url configuration property in hive-site.xml to the LDAPS URL. For example, ldaps://sample.myhost.com.
  4. In Cloudera Manager, go to the Hive service and select Configuration. Under Category, select Security. In the right panel, search for HiveServer2 TLS/SSL Certificate Trust Store File, and add the path to the truststore file that you created in step 1.

  5. Restart HiveServer2.

Pluggable Authentication

Pluggable authentication allows you to provide a custom authentication provider for HiveServer2.

To enable pluggable authentication:

  1. In the Cloudera Manager Admin Console, click Hive in the list of components, and then select the Configuration tab.
  2. Type hive-site.xml in the Search text box to locate the configuration fields.
  3. Add entries similar to the following to Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml:
    <property>
      <name>hive.server2.authentication</name>
      <value>CUSTOM</value>
      <description>Client authentication types.
      NONE: no authentication check
      LDAP: LDAP/AD based authentication
      KERBEROS: Kerberos/GSSAPI authentication
      CUSTOM: Custom authentication provider
      (Use with property hive.server2.custom.authentication.class)
      </description>
    </property>
    
    <property>
      <name>hive.server2.custom.authentication.class</name>
      <value>pluggable-auth-class-name</value>
      <description>
      Custom authentication class. Used when property
      'hive.server2.authentication' is set to 'CUSTOM'. Provided class
      must be a proper implementation of the interface
      org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2
      will call its Authenticate(user, passed) method to authenticate requests.
      The implementation may optionally extend the Hadoop's
      org.apache.hadoop.conf.Configured class to grab Hive's Configuration object.
      </description>
    </property>
  4. Make the class available in the CLASSPATH of HiveServer2.

Trusted Delegation with HiveServer2

HiveServer2 determines the identity of the connecting user from the authentication subsystem (Kerberos or LDAP). Any new session started for this connection runs on behalf of this connecting user. If the server is configured to proxy the user at the Hadoop level, then all MapReduce jobs and HDFS accesses will be performed with the identity of the connecting user. If Apache Sentry is configured, then this connecting userid can also be used to verify access rights to underlying tables and views.

Users with Hadoop superuser privileges can request an alternate user for the given session. HiveServer2 checks that the connecting user can proxy the requested userid, and if so, runs the new session as the alternate user. For example, the Hadoop superuser hue can request that a connection's session be run as user bob.

Alternate users for new JDBC client connections are specified by adding the hive.server2.proxy.user=alternate_user_id property to the JDBC connection URL. For example, a JDBC connection string that lets user hue run a session as user bob would be as follows:
# Login as super user Hue 
kinit hue -k -t hue.keytab hue@MY-REALM.COM

# Connect using following JDBC connection string 
# jdbc:hive2://myHost.myOrg.com:10000/default;principal=hive/_HOST@MY-REALM.COM;hive.server2.proxy.user=bob

The connecting user must have Hadoop-level proxy privileges over the alternate user.

HiveServer2 Impersonation

HiveServer2 impersonation lets users execute queries and access HDFS files as the connected user rather than as the super user. Access policies are applied at the file level using the HDFS permissions specified in ACLs (access control lists). Enabling HiveServer2 impersonation bypasses Sentry from the end-to-end authorization process. Specifically, although Sentry enforces access control policies on tables and views within the Hive warehouse, it does not control access to the HDFS files that underlie the tables. This means that users without Sentry permissions to tables in the warehouse may nonetheless be able to bypass Sentry authorization checks and execute jobs and queries against tables in the warehouse as long as they have permissions on the HDFS files supporting the table.

To configure Sentry correctly, restrict ownership of the Hive warehouse to hive:hive and disable Hive impersonation.

To enable impersonation in HiveServer2:

  1. Add the following property to the /etc/hive/conf/hive-site.xml file and set the value to true. (The default value is false.)
    <property>
      <name>hive.server2.enable.impersonation</name>
      <description>Enable user impersonation for HiveServer2</description>
      <value>true</value>
    </property>
  2. In HDFS or MapReduce configurations, add the following property to the core-site.xml file:
    <property>
      <name>hadoop.proxyuser.hive.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hive.groups</name>
      <value>*</value>
    </property>

See also File System Permissions.

Securing the Hive Metastore

To prevent users from accessing the Hive metastore and the Hive metastore database using any method other than through HiveServer2, the following actions are recommended:

  • Add a firewall rule on the metastore service host to allow access to the metastore port only from the HiveServer2 host. You can do this using iptables.
  • Grant access to the metastore database only from the metastore service host. This is specified for MySQL as:
    GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'metastorehost';
    where metastorehost is the host where the metastore service is running.
  • Make sure users who are not admins cannot log on to the host on which HiveServer2 runs.