Configuring authentication for long-running Spark Streaming jobs
Long-running applications such as Spark Streaming jobs must be able to write data continuously,
which means that they may need valid delegation tokens beyond the tokens' default lifetime.
This type of workload requires passing a Kerberos principal and keytab to the spark-submit
script using the --principal and --keytab parameters.
The keytab is copied to the host running the ApplicationMaster, and the Kerberos login is renewed
periodically by using the principal and keytab to generate the delegation tokens required for
HDFS.
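
As an illustration, a submission that passes both parameters might look like the following sketch. The principal name, keytab path, application class, and JAR name are hypothetical placeholders, and the use of YARN cluster deploy mode is an assumption rather than a requirement stated here. Only --principal and --keytab come from the description above.

```shell
#!/bin/sh
# All four values below are hypothetical placeholders; substitute your own.
PRINCIPAL="streaming-user@EXAMPLE.COM"
KEYTAB="/etc/security/keytabs/streaming-user.keytab"
APP_CLASS="com.example.StreamingApp"
APP_JAR="streaming-app.jar"

# Build the spark-submit command line as the positional parameters.
# --master yarn and --deploy-mode cluster are assumptions for this sketch.
set -- spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal "$PRINCIPAL" \
  --keytab "$KEYTAB" \
  --class "$APP_CLASS" \
  "$APP_JAR"

# Keep a flat copy of the command for logging.
SUBMIT_CMD="$*"

# Dry run by default: print the command. Set RUN=1 to actually submit.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  echo "$SUBMIT_CMD"
fi
```

Run as-is, the script only prints the assembled command. Setting RUN=1 submits the job, which assumes a host where spark-submit is on the PATH and the keytab file is readable by the submitting user.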