Kerberos configurations for HWC

Learn which parameters to set to configure a Kerberos-secured Hive Warehouse Connector (HWC) connection for querying the Hive metastore from Spark.

The Hive Warehouse Connector (HWC) must connect to HiveServer (HS2) to execute writes or to execute reads in read modes other than Direct Reader. You need to set the following configuration properties to connect HWC to a Kerberos-enabled HiveServer:

  • Property: spark.sql.hive.hiveserver2.jdbc.url.principal

    Value: Set this value to the value of "hive.server2.authentication.kerberos.principal".

  • Property: spark.security.credentials.hiveserver2.enabled

    Value: Set this value to "true".

You do not need to explicitly provide other authentication configurations, such as the authentication type and principal. When Spark opens a secure connection to the Hive metastore, it automatically picks up the authentication configurations from the hive-site.xml file on the Spark application classpath. For example, when you execute queries in Direct Reader mode through HWC, Spark opens a secure connection to the Hive metastore, and this authentication occurs automatically.

You can set the properties using the --conf option of spark-submit or spark-shell.
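
As a minimal sketch, the two properties listed above could be passed to spark-shell as follows. The principal hive/_HOST@EXAMPLE.COM is a hypothetical placeholder, not a value from this topic; substitute the value of hive.server2.authentication.kerberos.principal from your own cluster. The variable names are illustrative only, and the command is echoed as a dry run so that you can review it before launching.

```shell
# Hypothetical placeholder: replace with the value of
# hive.server2.authentication.kerberos.principal from your hive-site.xml
HS2_PRINCIPAL='hive/_HOST@EXAMPLE.COM'

# Assemble the two Kerberos-related HWC properties as --conf options
HWC_KERBEROS_CONF="--conf spark.sql.hive.hiveserver2.jdbc.url.principal=${HS2_PRINCIPAL} --conf spark.security.credentials.hiveserver2.enabled=true"

# Dry run: print the full command for review.
# Remove 'echo' to actually launch spark-shell with these options.
echo spark-shell ${HWC_KERBEROS_CONF}
```

The same --conf options apply to spark-submit. Other options that an HWC session typically needs, such as the HWC JAR and the HiveServer JDBC URL, are outside the scope of this sketch.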