Shell action for Spark 3
Learn how to execute Spark 3's spark3-submit through Oozie's Shell action.
Similar to the support for executing Spark 2's spark-submit through Oozie's Shell action, Cloudera also provides full support for executing Spark 3's spark3-submit through Oozie's Shell action.
Oozie uses delegation tokens instead of Kerberos tickets in its YARN applications. If you intend
to run spark3-submit without relying on Oozie's default delegation tokens, it is recommended to
unset the HADOOP_TOKEN_FILE_LOCATION environment variable in your Shell script before executing
spark3-submit, because spark3-submit might not function properly when both delegation tokens and
Kerberos tickets are present. However, for your Shell action to complete successfully, the
HADOOP_TOKEN_FILE_LOCATION environment variable must be restored after your custom Shell script
segment has run. Running that segment in a subshell achieves this, because changes made inside
the subshell do not affect the parent environment. The following example illustrates how you can
accomplish
this:

    #!/usr/bin/env bash
    # By executing the commands within brackets,
    # we can ensure that the parent environment remains untouched
    (
      unset HADOOP_TOKEN_FILE_LOCATION
      kinit -kt /var/keytabs/user.keytab user
      /usr/bin/spark3-submit --master yarn --deploy-mode cluster \
        tableUsingSpark3FromShellAction
      /usr/bin/spark3-submit --master yarn --deploy-mode cluster \
        tableUsingSpark3FromShellAction
    )
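
The subshell behavior that the script relies on can be checked locally without a cluster. The following is a minimal sketch, not part of the Cloudera example: the helper name run_without_delegation_token and the token file path are hypothetical and chosen only for illustration. In a real Shell action, Oozie sets HADOOP_TOKEN_FILE_LOCATION for you.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not provided by Oozie or Spark): runs the given command
# with HADOOP_TOKEN_FILE_LOCATION unset. The parentheses start a subshell, so
# the unset does not reach the calling shell.
run_without_delegation_token() {
  (
    unset HADOOP_TOKEN_FILE_LOCATION
    "$@"
  )
}

# Placeholder value for illustration only; Oozie sets the real one.
export HADOOP_TOKEN_FILE_LOCATION="/tmp/example_container_tokens"

# The wrapped command does not see the variable.
inside=$(run_without_delegation_token bash -c 'echo "${HADOOP_TOKEN_FILE_LOCATION:-unset}"')
echo "inside subshell: ${inside}"

# The parent shell still has the original value afterwards.
echo "after subshell:  ${HADOOP_TOKEN_FILE_LOCATION}"
```

Because the exit status of a subshell is the exit status of the last command it ran, a failing command wrapped this way still returns a non-zero status to the caller.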