Configuring and Upgrading Apache Spark
Note: Instructions in this section are specific to HDP-2.3.4 and later. For earlier versions of HDP, refer to the documentation that corresponds to your version. For information about Spark version support, see the Spark - HDP Version Support table in the Spark Guide Introduction.
Install Spark 1.5.2 (and, optionally, Python) on the node where you want the Spark History Server to run:
su - root
yum install spark_spark_2_3_4_0_3371-master -y
To use Python:
yum install spark_spark_2_3_4_0_3371-python
conf-select create-conf-dir --package spark --stack-version spark_2_3_4_0_3371 --conf-version 0
cp /etc/spark/spark_2_3_4_0_3371/0/* /etc/spark/spark_2_3_4_0_3371/0/
conf-select set-conf-dir --package spark --stack-version spark_2_3_4_0_3371 --conf-version 0
hdp-select set spark-client spark_2_3_4_0_3371
hdp-select set spark-historyserver spark_2_3_4_0_3371
Stop the Spark history-server. If you are using the Spark thrift-server, stop the thrift-server.
su - spark -c "$SPARK_HOME/sbin/stop-history-server.sh"
su - spark -c "$SPARK_HOME/sbin/stop-thriftserver.sh"
It is recommended that you run the Spark History Server on top of HDFS, not YARN ATS. Modify the Spark configuration files as follows:
As the hdfs service user, create an HDFS directory called spark-history, owned by user spark and group hadoop, with permissions set to 777:
hdfs dfs -mkdir /spark-history
hdfs dfs -chown -R spark:hadoop /spark-history
hdfs dfs -chmod -R 777 /spark-history
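The directory setup above can be wrapped in a small script so that the commands can be previewed before they touch HDFS. This is a minimal sketch, not part of the HDP tooling: the `HISTORY_DIR` and `DRY_RUN` variables and the `run` and `create_history_dir` functions are hypothetical names introduced here for illustration. It also uses `-mkdir -p`, which differs from the command above in that it does not fail if the directory already exists.

```shell
#!/bin/sh
# Sketch: create the Spark event-log directory in HDFS.
# Run as the hdfs service user. HISTORY_DIR and DRY_RUN are
# hypothetical variables introduced for this example.
HISTORY_DIR="${HISTORY_DIR:-/spark-history}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

create_history_dir() {
  # -p is an addition to the documented command: it tolerates an existing directory.
  run hdfs dfs -mkdir -p "$HISTORY_DIR"
  run hdfs dfs -chown -R spark:hadoop "$HISTORY_DIR"
  run hdfs dfs -chmod -R 777 "$HISTORY_DIR"
}

create_history_dir
```

With the default `DRY_RUN=1` the script only prints the three `hdfs dfs` commands. Setting `DRY_RUN=0` executes them.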
Edit the spark-defaults.conf file. Add the following properties and values:
spark.eventLog.dir to hdfs:///spark-history
spark.eventLog.enabled to true
spark.history.fs.logDirectory to hdfs:///spark-history
Edit the spark-thrift-sparkconf.conf file. Add the following properties and values:
spark.eventLog.dir to hdfs:///spark-history
spark.eventLog.enabled to true
spark.history.fs.logDirectory to hdfs:///spark-history
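Because the same three properties go into both configuration files, the edit can be scripted. The sketch below assumes the usual Spark properties-file layout of one `key value` pair per line, separated by whitespace. The `set_spark_prop` and `apply_history_props` functions are hypothetical helpers written for this example, and the file path is passed in by the caller. Any existing line for a key is replaced, so running the script twice leaves no duplicates.

```shell
#!/bin/sh
# Sketch: set the History Server properties in a Spark properties file.
# set_spark_prop and apply_history_props are hypothetical helpers.

# set_spark_prop FILE KEY VALUE
# Removes any existing line whose first field is KEY, then appends "KEY VALUE".
set_spark_prop() {
  file=$1 key=$2 value=$3
  touch "$file"
  tmp=$(mktemp)
  # Keep every line whose first field differs from the key
  # (comment lines and blank lines are preserved).
  awk -v k="$key" '$1 != k' "$file" > "$tmp"
  printf '%s %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# apply_history_props FILE
apply_history_props() {
  set_spark_prop "$1" spark.eventLog.dir hdfs:///spark-history
  set_spark_prop "$1" spark.eventLog.enabled true
  set_spark_prop "$1" spark.history.fs.logDirectory hdfs:///spark-history
}

# Example usage (replace the paths with the locations of your files):
# apply_history_props /path/to/spark-defaults.conf
# apply_history_props /path/to/spark-thrift-sparkconf.conf
```

Replacing rather than blindly appending matters here, since a leftover `spark.eventLog.enabled false` line would otherwise sit alongside the new value.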
Restart the history-server:
su - spark -c "/usr/hdp/current/spark-historyserver/sbin/start-history-server.sh"
If you are using the Spark thrift-server, restart the thrift-server. See (Optional) Starting the Spark Thrift Server.
Validate the Spark installation. As user spark, run the Spark Pi example in the Spark Guide.
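For orientation, a Spark Pi submission for a Spark 1.x client on YARN typically looks like the sketch below. The Spark Guide remains the authoritative source for the exact command. The default `SPARK_HOME` of `/usr/hdp/current/spark-client`, the examples jar location under `lib/`, and the resource settings are assumptions, and `DRY_RUN`, `run`, and `run_spark_pi` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch: submit the SparkPi example in yarn-client mode as user spark.
# The SPARK_HOME default and the jar location are assumptions; adjust them
# to match your installation.
SPARK_HOME="${SPARK_HOME:-/usr/hdp/current/spark-client}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_spark_pi() {
  # The trailing 10 is the number of partitions SparkPi uses for its estimate.
  run "$SPARK_HOME/bin/spark-submit" \
    --class org.apache.spark.examples.SparkPi \
    --master yarn-client \
    --num-executors 1 \
    --driver-memory 512m \
    --executor-memory 512m \
    --executor-cores 1 \
    "$SPARK_HOME"/lib/spark-examples*.jar 10
}

run_spark_pi
```

With `DRY_RUN=0` the job is submitted. A successful run prints a line beginning with `Pi is roughly`, and the completed application should then appear in the Spark History Server.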