Spark Authentication
Minimum Required Role: Security Administrator (also provided by Full Administrator)
Spark currently supports two methods of authentication: Kerberos and a shared secret. When using Spark on YARN, Cloudera recommends Kerberos authentication because it is the stronger security measure.
Configuring Kerberos Authentication for Spark Using the Command Line
Create the Spark Principal and Keytab File
- Create the spark principal and spark.keytab file:
kadmin: addprinc -randkey spark/fully.qualified.domain.name@YOUR-REALM.COM
kadmin: xst -k spark.keytab spark/fully.qualified.domain.name
- Move the file into the Spark configuration directory and restrict its access exclusively to the spark user:
$ mv spark.keytab /etc/spark/conf/
$ chown spark /etc/spark/conf/spark.keytab
$ chmod 400 /etc/spark/conf/spark.keytab
For more details on creating Kerberos principals and keytabs, see Step 4: Create and Deploy the Kerberos Principals and Keytab Files.
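
As a quick sanity check after the steps above, the following sketch verifies that a keytab file exists and has mode 400. The function name `check_keytab_mode` is hypothetical, and `stat -c` assumes GNU coreutils (Linux). The sketch does not check ownership or the keytab's contents.

```shell
# Hypothetical helper: succeeds only if the file exists and its mode is exactly 400.
# Assumes GNU stat (Linux); BSD/macOS stat takes different flags.
check_keytab_mode() {
  local keytab="$1"
  local mode

  if [ ! -f "$keytab" ]; then
    echo "missing: $keytab" >&2
    return 1
  fi

  # %a prints the permission bits in octal, for example 400 or 644.
  mode="$(stat -c '%a' "$keytab")"
  if [ "$mode" != "400" ]; then
    echo "unexpected mode $mode on $keytab (expected 400)" >&2
    return 1
  fi

  echo "ok: $keytab"
}
```

A typical call would be `check_keytab_mode /etc/spark/conf/spark.keytab`. To list the principals stored in a keytab, MIT Kerberos provides `klist -kt <keytab>`.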
Configure the Spark History Server to Use Kerberos
Using Cloudera Manager
- Open the Cloudera Manager Administration Console and navigate to the Spark service.
- Click the Configuration tab.
- Select .
- Select .
- Edit the History Server Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh property to add the following properties:
SPARK_HISTORY_OPTS="-Dspark.history.kerberos.enabled=true \
-Dspark.history.kerberos.principal=spark/fully.qualified.domain.name@YOUR-REALM.COM \
-Dspark.history.kerberos.keytab=/etc/spark/conf/spark.keytab"
- Click Save Changes to commit the changes.
Using the Command Line
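- On the host running the Spark History Server, add the following properties to the spark-env.sh file in the Spark configuration directory (/etc/spark/conf/):
SPARK_HISTORY_OPTS="-Dspark.history.kerberos.enabled=true \
-Dspark.history.kerberos.principal=spark/fully.qualified.domain.name@YOUR-REALM.COM \
-Dspark.history.kerberos.keytab=/etc/spark/conf/spark.keytab"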
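
The placeholder fully.qualified.domain.name has to be replaced per host, so it can help to generate the value rather than type it. The sketch below is one possible way to do that. The function name `build_history_opts` and its parameters are hypothetical, and the keytab path defaults to the location used earlier in this section.

```shell
# Hypothetical helper: prints a SPARK_HISTORY_OPTS value for a given host and realm.
# Arguments: <fqdn> <realm> [keytab-path]
build_history_opts() {
  local host="$1"
  local realm="$2"
  local keytab="${3:-/etc/spark/conf/spark.keytab}"

  # The three -D options are joined with single spaces into one string.
  printf '%s %s %s\n' \
    "-Dspark.history.kerberos.enabled=true" \
    "-Dspark.history.kerberos.principal=spark/${host}@${realm}" \
    "-Dspark.history.kerberos.keytab=${keytab}"
}

# Example with the placeholder values from this page.
build_history_opts fully.qualified.domain.name YOUR-REALM.COM
```

In spark-env.sh the result could be assigned as `SPARK_HISTORY_OPTS="$(build_history_opts "$(hostname -f)" YOUR-REALM.COM)"`, assuming `hostname -f` returns the name used in the principal. The double quotes matter: in a shell file, an unquoted value ends at the first space.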
Running Spark Applications on a Secure Cluster
You can submit compiled Spark applications with the spark-submit script. Specify the following additional command-line options when running the spark-submit script on a secure cluster using the form: --option value.
Option | Description |
---|---|
--keytab | The full path to the file that contains the keytab for the principal. This keytab is copied to the node running the ApplicationMaster using the Secure Distributed Cache, for periodically renewing the login tickets and the delegation tokens. For information on setting up the principal and keytab, see Configuring a Cluster with Custom Kerberos Principals and Spark Authentication. |
--principal | The principal used to log in to the KDC when running on secure HDFS. |
--proxy-user | Allows the spark-submit script to impersonate client users when submitting jobs. |
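
To illustrate how the options in the table combine on one command line, here is a minimal dry-run sketch. The wrapper name `secure_submit`, the principal `alice@YOUR-REALM.COM`, the keytab path, the class `com.example.MyApp`, and `my-app.jar` are all hypothetical. `--master yarn` is an assumption based on the YARN deployment described on this page.

```shell
# Hypothetical wrapper: prints the spark-submit invocation instead of running it,
# so the options can be reviewed first. Remove the "echo" to actually submit.
# Arguments: <principal> <keytab-path> [further spark-submit arguments...]
secure_submit() {
  local principal="$1"
  local keytab="$2"
  shift 2

  echo spark-submit \
    --master yarn \
    --principal "$principal" \
    --keytab "$keytab" \
    "$@"
}

# Example call with placeholder values.
secure_submit alice@YOUR-REALM.COM /home/alice/alice.keytab \
  --class com.example.MyApp my-app.jar
```

Keeping the principal and keytab as explicit arguments avoids hard-coding credentials locations in the wrapper itself.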
Configuring Spark Authentication With a Shared Secret Using Cloudera Manager
Minimum Required Role: Security Administrator (also provided by Full Administrator)
Shared-secret authentication is configured with the spark.authenticate configuration property. The authentication process verifies that Spark and the application hold the same shared secret. If the secrets do not match, authentication fails.
- Go to the tab.
- In the Search field, type spark authenticate to find the Spark Authentication settings.
- Check the checkbox for the Spark Authentication property.
- Click Save Changes.
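
After saving, one way to confirm the setting on a host is to look for spark.authenticate in a properties-style Spark configuration file. The sketch below does that. The function name `spark_auth_enabled` is hypothetical, and the assumption that the property ends up in a file such as /etc/spark/conf/spark-defaults.conf should be checked against your deployment.

```shell
# Hypothetical helper: succeeds if a properties-style file sets spark.authenticate to true.
# Accepts both "key value" and "key=value" lines; the last matching line wins.
spark_auth_enabled() {
  local conf="$1"
  local value

  # Split each line on runs of spaces, tabs, or "=" and remember the value
  # from lines whose key is exactly spark.authenticate.
  value="$(awk -F'[ \t=]+' '$1 == "spark.authenticate" { v = $2 } END { print v }' "$conf")"
  [ "$value" = "true" ]
}
```

For example, `spark_auth_enabled /etc/spark/conf/spark-defaults.conf && echo "shared-secret authentication is on"`. Commented-out lines and related keys with a longer name are ignored because the key must match exactly.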