Run sample spark-submit command

After you activate the profile using the cdp-env tool, you can run your spark-submit commands on CDE without completely rewriting your existing spark-on-yarn command lines.

  • Sample spark-submit commands you can run on the CDE workloads.
    $ spark-submit \
    --name pt_rpt_streams \
    --master=yarn --deploy-mode=cluster \
    --driver-memory 4G \
    --executor-memory 4G --executor-cores 3 \
    --num-executors 4 \
    --files "$HOME/spark-sql.py" \
    --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.io.compression.codec=org.apache.spark.io.LZ4CompressionCodec" \
    $HOME/spark-sql.py
    $ spark3-submit \
    --name pt_rpt_streams \
    --master=yarn --deploy-mode=cluster \
    --driver-memory 4G \
    --executor-memory 4G --executor-cores 3 \
    --num-executors 4 \
    --files "$HOME/spark-sql.py" \
    --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.io.compression.codec=org.apache.spark.io.LZ4CompressionCodec" \
    $HOME/spark-sql.py
  • Sample spark-submit commands with an inline profile configuration you can run on the CDE workloads.
    $ CDE_CONFIG_PROFILE=yarn \
    spark-submit \
    --name pt_rpt_streams --master=yarn \
    --deploy-mode=cluster --driver-memory 4G \
    --executor-memory 4G --executorcores 3 \
    --num-executors 4 --files "$HOME/spark-sql.py" \
    --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.io.compression.codec=org.apache.spark.io.LZ4CompressionCodec" \
    $HOME/spark-sql.py
    $ CDE_CONFIG_PROFILE=vc-1 \
    spark3-submit \
    --name pt_rpt_streams \
    --master=yarn --deploy-mode=cluster \
    --driver-memory 4G --executor-memory 4G \
    --executor-cores 3 --num-executors 4 \
    --files "$HOME/spark-sql.py" \
    --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf
    "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/home/hdpsparkprd/spark-hdpsparkprdkeytab-jaas.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=true" \
    --conf "spark.io.compression.codec=org.apache.spark.io.LZ4CompressionCodec" \
    $HOME/spark-sql.py