Configuring Spark Applications
You can specify Spark application configuration properties as follows:
- Pass properties using the --conf command-line switch; for example:

      spark-submit \
        --class com.cloudera.example.YarnExample \
        --master yarn \
        --deploy-mode cluster \
        --conf "spark.eventLog.dir=hdfs:///user/spark/eventlog" \
        lib/yarn-example.jar \
        10
- Specify properties in spark-defaults.conf (a sample entry follows this list).
- Pass properties directly to the SparkConf used to create the SparkContext in your Spark application; for example:

  - Scala:

        val conf = new SparkConf().set("spark.dynamicAllocation.initialExecutors", "5")
        val sc = new SparkContext(conf)

  - Python:

        from pyspark import SparkConf, SparkContext
        from pyspark.sql import SQLContext

        conf = SparkConf().setAppName('Application name')
        conf.set('spark.hadoop.avro.mapred.ignore.inputs.without.extension', 'false')
        sc = SparkContext(conf=conf)
        sqlContext = SQLContext(sc)
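
For reference, each entry in spark-defaults.conf is a property name and value on its own line, separated by whitespace. The entries below reuse properties from the examples above; the values are illustrative only:

    spark.eventLog.dir                        hdfs:///user/spark/eventlog
    spark.dynamicAllocation.initialExecutors  5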
The order of precedence for configuration properties, from highest to lowest, is:
1. Properties passed to SparkConf.
2. Arguments passed to spark-submit, spark-shell, or pyspark.
3. Properties set in spark-defaults.conf.
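
For example, if the same property were set in all three places, the value set programmatically on SparkConf would win. A minimal sketch, using spark.executor.memory with hypothetical values (not taken from the examples above):

    // Suppose spark-defaults.conf sets spark.executor.memory to 1g and
    // spark-submit passes --conf "spark.executor.memory=2g".
    // The value set here on SparkConf takes precedence, so executors get 4g.
    val conf = new SparkConf().set("spark.executor.memory", "4g")
    val sc = new SparkContext(conf)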
For more information, see Spark Configuration.