Spark Authentication
Minimum Required Role: Security Administrator (also provided by Full Administrator)
Spark currently supports two methods of authentication: Kerberos or a shared secret. When using Spark on YARN, Cloudera recommends Kerberos authentication because it is the stronger security measure.
Configuring Kerberos Authentication for Spark
Create the Spark Principal and Keytab File
- Create the spark principal and spark.keytab file:
kadmin: addprinc -randkey spark/fully.qualified.domain.name@YOUR-REALM.COM
kadmin: xst -k spark.keytab spark/fully.qualified.domain.name
- Move the file into the Spark configuration directory and restrict its access exclusively to the spark user:
$ mv spark.keytab /etc/spark/conf/
$ chown spark /etc/spark/conf/spark.keytab
$ chmod 400 /etc/spark/conf/spark.keytab
For more details on creating Kerberos principals and keytabs, see Step 4: Create and Deploy the Kerberos Principals and Keytab Files.
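After moving the keytab, it can help to confirm that the file mode is what the steps above set. The following is a minimal sketch, assuming a Linux host with GNU coreutils `stat`; the function name `check_keytab_mode` is hypothetical and not part of any Cloudera or Spark tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the keytab exists and its mode is exactly 400.
check_keytab_mode() {
  local path="$1"
  if [ ! -f "$path" ]; then
    echo "missing: $path" >&2
    return 1
  fi
  local mode
  # Assumption: GNU stat (Linux). BSD/macOS stat uses different flags.
  mode="$(stat -c '%a' "$path")"
  if [ "$mode" != "400" ]; then
    echo "unexpected mode $mode on $path" >&2
    return 1
  fi
}
```

For example, `check_keytab_mode /etc/spark/conf/spark.keytab && klist -kt /etc/spark/conf/spark.keytab` would check the mode and then list the principals stored in the keytab.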
Configure the Spark History Server to Use Kerberos
Using Cloudera Manager
- Open the Cloudera Manager Administration Console and navigate to the Spark service.
- Click the Configuration tab.
- Select .
- Select .
- Edit the History Server Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh property to add the following properties:
SPARK_HISTORY_OPTS=-Dspark.history.kerberos.enabled=true \
-Dspark.history.kerberos.principal=spark/fully.qualified.domain.name@YOUR-REALM.COM \
-Dspark.history.kerberos.keytab=/etc/spark/conf/spark.keytab
- Click Save Changes to commit the changes.
Using the Command Line
SPARK_HISTORY_OPTS=-Dspark.history.kerberos.enabled=true \
-Dspark.history.kerberos.principal=spark/fully.qualified.domain.name@YOUR-REALM.COM \
-Dspark.history.kerberos.keytab=/etc/spark/conf/spark.keytab
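If this setting is placed in a script that a shell sources, such as spark-env.sh, the value likely needs to be quoted so that all three `-D` options become part of a single assignment. The sketch below assumes that situation; the variable names `SPARK_PRINCIPAL` and `SPARK_KEYTAB` are hypothetical conveniences, and the values are the placeholders used in the steps above.

```shell
#!/usr/bin/env bash
# Placeholders from the steps above; substitute your own principal and keytab path.
SPARK_PRINCIPAL="spark/fully.qualified.domain.name@YOUR-REALM.COM"
SPARK_KEYTAB="/etc/spark/conf/spark.keytab"

# Quote the whole value so the three -D options form one assignment.
# Inside double quotes, a trailing backslash joins the next line.
SPARK_HISTORY_OPTS="-Dspark.history.kerberos.enabled=true \
-Dspark.history.kerberos.principal=${SPARK_PRINCIPAL} \
-Dspark.history.kerberos.keytab=${SPARK_KEYTAB}"
export SPARK_HISTORY_OPTS
```

Keeping the principal and keytab path in separate variables means they are written once and can be reused elsewhere in the same script.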
Running Spark Applications on a Secure Cluster
You can submit compiled Spark applications with the spark-submit script. Specify the following additional command-line options when running the spark-submit script on a secure cluster using the form: --option value.
Option | Description |
---|---|
--keytab | The full path to the file that contains the keytab for the principal. This keytab is copied to the node running the ApplicationMaster using the Secure Distributed Cache, for periodically renewing the login tickets and the delegation tokens. For information on setting up the principal and keytab, see Configuring a Cluster with Custom Kerberos Principals and Spark Authentication. |
--principal | Principal to be used to log in to the KDC, while running on secure HDFS. |
--proxy-user | This option allows you to use the spark-submit script to impersonate client users when submitting jobs. |
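As an illustration of the `--option value` form, the sketch below assembles a spark-submit command line that passes `--principal` and `--keytab`. It prints the command rather than running it, so the arguments can be reviewed first. The function name `build_submit_cmd`, the class `com.example.MyApp`, and the JAR path are hypothetical; the principal and keytab are the placeholders used earlier in this section.

```shell
#!/usr/bin/env bash
# Placeholders; substitute the principal and keytab for your cluster.
PRINCIPAL="spark/fully.qualified.domain.name@YOUR-REALM.COM"
KEYTAB="/etc/spark/conf/spark.keytab"

# Hypothetical helper: print a spark-submit command line for a secure YARN cluster.
# Arguments: principal, keytab path, main class, application JAR.
build_submit_cmd() {
  printf '%s ' spark-submit \
    --master yarn \
    --principal "$1" \
    --keytab "$2" \
    --class "$3" \
    "$4"
  printf '\n'
}

# com.example.MyApp and /path/to/my-app.jar are illustrative only.
build_submit_cmd "$PRINCIPAL" "$KEYTAB" com.example.MyApp /path/to/my-app.jar
```

Once the printed command looks right, it can be copied and run from a host that has the Spark client configuration deployed.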
Configuring Spark Authentication Using a Shared Secret
Authentication using a shared secret can be configured using the spark.authenticate configuration parameter. The authentication process is essentially a handshake between Spark and the other party to ensure they have the same shared secret and can be allowed to communicate. If the shared secret does not match, they will not be allowed to communicate.
- Go to the tab.
- In the Search field, type spark authenticate to find the Spark Authentication settings.
- Check the checkbox for the Spark Authentication property.
- Click Save Changes.
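Outside Cloudera Manager, the same `spark.authenticate` parameter is typically set in a Spark properties file. The following is a minimal sketch, assuming a writable spark-defaults.conf whose path you supply; the function name `enable_spark_auth` is hypothetical. It appends the property only if no `spark.authenticate` line exists yet, and it does not change an existing value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add "spark.authenticate true" to a Spark properties file
# unless a spark.authenticate line (with any value) is already present.
enable_spark_auth() {
  local conf="$1"
  # Match the exact key followed by whitespace or "=", so that
  # related keys such as spark.authenticate.secret do not count.
  if grep -Eq '^[[:space:]]*spark\.authenticate[[:space:]=]' "$conf" 2>/dev/null; then
    return 0
  fi
  printf 'spark.authenticate true\n' >> "$conf"
}
```

Calling the function twice on the same file leaves a single entry, which makes it safe to include in a provisioning script that may run repeatedly.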