Configuring Spark Applications
You can specify Spark application configuration properties in any of the following ways:
- Pass properties using the --conf command-line switch; for example:
spark-submit \
  --class com.cloudera.example.YarnExample \
  --master yarn \
  --deploy-mode cluster \
  --conf "spark.eventLog.dir=hdfs:///user/spark/eventlog" \
  lib/yarn-example.jar \
  10
- Specify properties in spark-defaults.conf. See Configuring Spark Application Properties in spark-defaults.conf.
- Pass properties directly to the SparkConf used to create the SparkContext in your Spark application; for example:
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().set("spark.dynamicAllocation.initialExecutors", "5")
val sc = new SparkContext(conf)
When the same property is set in more than one place, the order of precedence, highest first, is:
- Properties passed to SparkConf.
- Arguments passed to spark-submit, spark-shell, or pyspark.
- Properties set in spark-defaults.conf.
For more information, see Spark Configuration.
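The precedence order above can be modeled in a few lines of shell. This is only a toy illustration of the lookup order, not how Spark implements it: the files sparkconf.layer, cli.layer, and defaults.layer and the function resolve_property are hypothetical stand-ins for the three property sources.

```shell
#!/bin/sh
# Toy model of the precedence order; it does not invoke Spark.
# The three .layer files are hypothetical stand-ins for the three sources.
layers=$(mktemp -d)

# Lowest precedence: values from spark-defaults.conf.
printf '%s\n' 'spark.executor.memory 2g' 'spark.eventLog.enabled true' \
  > "$layers/defaults.layer"
# Middle: values passed on the command line with --conf.
printf '%s\n' 'spark.executor.memory 4g' > "$layers/cli.layer"
# Highest: values set on SparkConf in application code.
printf '%s\n' 'spark.dynamicAllocation.initialExecutors 5' \
  > "$layers/sparkconf.layer"

# Print the value of property $1 from the highest-precedence layer defining it.
resolve_property() {
  for layer in sparkconf cli defaults; do
    value=$(awk -v key="$1" '$1 == key { print $2; exit }' "$layers/$layer.layer")
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

resolve_property spark.executor.memory    # prints 4g: the command line beats the defaults file
resolve_property spark.eventLog.enabled   # prints true: only the defaults file sets it
```

In practice this means a value in spark-defaults.conf acts as a fallback that a --conf flag or a SparkConf.set call can override for a single application.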
Configuring Spark Application Properties in spark-defaults.conf
Specify properties in the spark-defaults.conf file, one per line, as the property name followed by whitespace and then its value.
You create a comment by putting a hash mark ( # ) at the beginning of a line. You cannot add comments to the end or middle of a line.
spark.master spark://mysparkmaster.acme.com:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs:///user/spark/eventlog
# Set spark executor memory
spark.executor.memory 2g
spark.logConf true
Cloudera recommends placing configuration properties that you want to use for every application in spark-defaults.conf. See Application Properties for more information.
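To make the file format described above concrete, the following sketch writes a small spark-defaults.conf to a temporary directory and lists the properties it contains, skipping comment lines. The temporary directory and the list_properties function are assumptions made for illustration; this is not the parser that Spark itself uses.

```shell
#!/bin/sh
# Write a sample file to a temporary directory (a stand-in for SPARK_HOME/conf).
conf_dir=$(mktemp -d)
cat > "$conf_dir/spark-defaults.conf" <<'EOF'
spark.master spark://mysparkmaster.acme.com:7077
spark.eventLog.enabled true
# Set spark executor memory
spark.executor.memory 2g
EOF

# Print each property as name=value, skipping blank lines and lines that
# start with a hash mark. Illustrative only, not Spark's own parser.
list_properties() {
  awk '!/^[ \t]*#/ && NF >= 2 {
    key = $1
    sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # strip the name and the separator
    print key "=" $0
  }' "$1"
}

list_properties "$conf_dir/spark-defaults.conf"
```

The output has three lines, one per property; the comment line does not appear.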
Configuring Properties in spark-defaults.conf Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
You configure properties for all Spark applications in spark-defaults.conf as follows:
- Go to the Spark service.
- Click the Configuration tab.
- Select .
- Select .
- Locate the Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf property.
- Specify properties described in Application Properties.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.
- Click Save Changes to commit the changes.
- Deploy the client configuration.
Configuring Properties in spark-defaults.conf Using the Command Line
To configure properties for all Spark applications using the command line, edit the file SPARK_HOME/conf/spark-defaults.conf.
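Editing the file by hand is usually enough. If you prefer to script the change, a sketch along the following lines sets a property without leaving duplicate entries behind. The function set_spark_property is hypothetical, and the sketch works on a file in a temporary directory so that it can be tried safely; substitute SPARK_HOME/conf/spark-defaults.conf to change a real installation.

```shell
#!/bin/sh
# Work on a scratch copy (a stand-in for SPARK_HOME/conf/spark-defaults.conf).
conf_dir=$(mktemp -d)
conf_file="$conf_dir/spark-defaults.conf"
: > "$conf_file"

# Set property $1 to value $2: drop any existing line for that property,
# then append the new one. Comments and other properties are kept.
set_spark_property() {
  tmp=$(mktemp)
  awk -v key="$1" '$1 != key' "$conf_file" > "$tmp"
  printf '%s %s\n' "$1" "$2" >> "$tmp"
  cat "$tmp" > "$conf_file"   # overwrite in place to keep the file's permissions
  rm -f "$tmp"
}

set_spark_property spark.executor.memory 2g
set_spark_property spark.eventLog.enabled true
set_spark_property spark.executor.memory 4g   # replaces the earlier 2g entry
cat "$conf_file"
```

After the three calls the file holds exactly one spark.executor.memory line, with the value 4g.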
Configuring Spark Application Logging Properties
You configure Spark application logging properties in a log4j.properties file.
Configuring Logging Properties Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
- Go to the Spark service.
- Click the Configuration tab.
- Select .
- Select .
- Locate the Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/log4j.properties property.
- Specify log4j properties.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.
- Click Save Changes to commit the changes.
- Deploy the client configuration.
Configuring Logging Properties Using the Command Line
To specify logging properties for all users on a machine using the command line, edit the file SPARK_HOME/conf/log4j.properties. To set them only for yourself or for a specific application, copy SPARK_HOME/conf/log4j.properties.template to log4j.properties in your working directory or in any directory in your application's classpath.
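The per-application setup can be sketched as follows. To run anywhere, the sketch builds a stand-in installation directory with a one-line template; the directory names are hypothetical, and the log4j.rootCategory line is an assumption about what the template contains, so check your own log4j.properties.template before relying on it.

```shell
#!/bin/sh
# Stand-in for SPARK_HOME so the sketch runs without a Spark installation.
# The one-line template below is an assumed, minimal example.
spark_home=$(mktemp -d)
mkdir -p "$spark_home/conf"
printf '%s\n' 'log4j.rootCategory=INFO, console' \
  > "$spark_home/conf/log4j.properties.template"

# Copy the template into the application's working directory.
work_dir=$(mktemp -d)
cp "$spark_home/conf/log4j.properties.template" "$work_dir/log4j.properties"

# Lower the root logger from INFO to WARN in the private copy only.
sed 's/^log4j\.rootCategory=INFO/log4j.rootCategory=WARN/' \
  "$work_dir/log4j.properties" > "$work_dir/log4j.properties.new"
mv "$work_dir/log4j.properties.new" "$work_dir/log4j.properties"

cat "$work_dir/log4j.properties"   # prints: log4j.rootCategory=WARN, console
```

Because only the copy is edited, the template and the machine-wide log4j.properties stay unchanged for other users.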