Chapter 9. Troubleshooting Spark
When you run a Spark job, you will see a standard set of console messages.
In addition, the following information is available:
A list of running applications, where you can retrieve the application ID and check the application log:
yarn application -list
yarn logs -applicationId <app_id>
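When many applications are listed, the ID can be pulled out of the listing with a small filter. The following is a minimal sketch, assuming the usual tabular output of yarn application -list, in which each application row starts with an ID of the form application_<timestamp>_<sequence>. The function name extract_app_ids is hypothetical and not part of any YARN tooling.

```shell
# Hypothetical helper: print the application IDs found in the output of
# `yarn application -list`, read from standard input.
# Assumption: each application row begins with an ID such as
# application_1433435235210_0001, and header lines do not.
extract_app_ids() {
  awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Typical use on a cluster node (requires the yarn CLI):
#   yarn application -list | extract_app_ids
#   yarn logs -applicationId "$(yarn application -list | extract_app_ids | head -n 1)"
```

Reading from standard input keeps the filter independent of the cluster, so it can also be run against a saved copy of the listing.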
For information about a specific job, check the Spark web UI:
http://<host>:8088/proxy/<job_id>/environment/
The following paragraphs describe specific issues and possible solutions.
Issue: Spark jobs on YARN don’t seem to start. The YARN ResourceManager logs show an application with "bad substitution" errors.
Solution: Make sure that your $SPARK_HOME/conf/spark-defaults.conf file includes your HDP version. For example:
spark.driver.extraJavaOptions -Dhdp.version=2.3.0.0-2557
spark.yarn.am.extraJavaOptions -Dhdp.version=2.3.0.0-2557
To check the HDP version for an Ambari-managed cluster, navigate to http://$AMBARI_SERVER:8080/#/main/admin/stack/versions, where $AMBARI_SERVER is the host name of your Ambari server.
To check the version via bash, run the following command:
hdp-select status hadoop-client | sed 's/hadoop-client - \(.*\)/\1/'
2.3.0.0-2557
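To confirm that both properties in spark-defaults.conf carry a version, a check along the following lines may help. This is a sketch under stated assumptions: the function name check_hdp_version_conf is hypothetical, and it expects each property name and its value to be separated by whitespace, as in the example above.

```shell
# Hypothetical helper: report whether a spark-defaults.conf file sets
# -Dhdp.version for both the driver and the YARN application master.
# Assumption: each line has the form "<property> <value>", separated by whitespace.
check_hdp_version_conf() {
  local conf_file="$1" missing=0 key
  for key in spark.driver.extraJavaOptions spark.yarn.am.extraJavaOptions; do
    # Succeeds only if some line names this property and contains -Dhdp.version=
    if ! awk -v key="$key" \
         '$1 == key && index($0, "-Dhdp.version=") { found = 1 } END { exit !found }' \
         "$conf_file"; then
      echo "missing -Dhdp.version in ${key}"
      missing=1
    fi
  done
  return "$missing"
}

# Typical use:
#   check_hdp_version_conf "$SPARK_HOME/conf/spark-defaults.conf"
```

The function only checks that a version is present. It does not compare the value with the version reported for the cluster.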
Issue: A job stays in the ACCEPTED state and does not run. This can happen when the job requests more memory or cores than are currently available.
Solution: Assess the cluster workload to see whether any resources can be released. You might need to stop unresponsive jobs to make room for the new job.
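The two steps, finding jobs that are waiting and stopping a job that is holding resources, can be sketched with the standard YARN CLI. The wrapper names list_waiting_apps and stop_app are hypothetical. Stopping an application is disruptive, so confirm that it is really unresponsive first.

```shell
# Hypothetical helpers wrapping two standard YARN CLI commands.

# Show applications that are still waiting for resources.
list_waiting_apps() {
  yarn application -list -appStates ACCEPTED
}

# Stop one application by ID so that its memory and cores are released.
stop_app() {
  local app_id="$1"
  if [ -z "$app_id" ]; then
    echo "usage: stop_app <app_id>" >&2
    return 1
  fi
  yarn application -kill "$app_id"
}

# Typical use on a cluster node (requires the yarn CLI):
#   list_waiting_apps
#   stop_app <app_id>
```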
Issue: Insufficient HDFS access. This can lead to errors such as the following:
Loading data to table default.testtable Failed with exception Unable to move source hdfs://blue1:8020/tmp/hive-spark/hive_2015-06-04_12-45-42_404_3643812080461575333-1/-ext-10000/kv1.txt to destination hdfs://blue1:8020/apps/hive/warehouse/testtable/kv1.txt
Solution: Make sure the user or group running the job has sufficient HDFS privileges on the destination location.
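One way to check the privileges is to look at the owner, group, and mode that hdfs dfs -ls -d reports for the destination directory. The following is a simplified sketch, not a full permission check: the function name can_write is hypothetical, it looks only at the write bit of the basic owner/group/other mode, and it ignores ACLs, superuser rights, and secondary group membership.

```shell
# Hypothetical helper: read one line of `hdfs dfs -ls -d <dir>` output from
# standard input and succeed if the given user (with the given group) has the
# write bit on that directory.
# Simplification: only the basic mode bits are considered, not ACLs.
can_write() {
  awk -v u="$1" -v g="$2" '
    {
      perms = $1; owner = $3; group = $4
      if (owner == u)      pos = 3   # owner write bit
      else if (group == g) pos = 6   # group write bit
      else                 pos = 9   # other write bit
      ok = (substr(perms, pos, 1) == "w")
    }
    END { exit !ok }'
}

# Typical use (user and group names are placeholders):
#   hdfs dfs -ls -d /apps/hive/warehouse/testtable | can_write <user> <group>
```

If the check fails, an HDFS administrator can adjust ownership or mode with hdfs dfs -chown or hdfs dfs -chmod, in line with the security policy of the cluster.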
Issue: Beeline is pointed at the wrong host and reports an invalid URL error:
Error: Invalid URL: jdbc:hive2://localhost:10001 (state=08S01,code=0)
Solution: Specify the correct host in the Beeline JDBC connection URL.
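As an illustration, the connection URL can be passed to Beeline with its -u option. The wrapper below is a minimal sketch. The function name connect_beeline is hypothetical, and the host and port are placeholders for the values of your own Thrift server.

```shell
# Hypothetical helper: open a Beeline session against a given host and port.
# The host and port are placeholders; use the values of your Thrift server.
connect_beeline() {
  local host="$1" port="$2"
  # Reject a missing host and a missing or non-numeric port before calling Beeline.
  case "$port" in
    ''|*[!0-9]*) echo "usage: connect_beeline <host> <port>" >&2; return 1 ;;
  esac
  if [ -z "$host" ]; then
    echo "usage: connect_beeline <host> <port>" >&2
    return 1
  fi
  beeline -u "jdbc:hive2://${host}:${port}"
}

# Typical use (requires the beeline CLI):
#   connect_beeline <host> <port>
```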
Issue: Error: closed SQLContext.
Solution: Restart the Thrift server.
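On a cluster where the Thrift server is started from the Spark distribution's own scripts, a restart might look like the sketch below. It assumes that SPARK_HOME is set and that sbin/stop-thriftserver.sh and sbin/start-thriftserver.sh are present there. The function name restart_thriftserver is hypothetical. If the service is managed by Ambari, restarting it from Ambari is likely the better route.

```shell
# Hypothetical helper: stop and then start the Spark Thrift server using the
# scripts shipped in $SPARK_HOME/sbin. Any arguments are passed through to
# start-thriftserver.sh unchanged.
restart_thriftserver() {
  local spark_home="${SPARK_HOME:?SPARK_HOME must be set}"
  # A failed stop (for example, no server running) should not block the start.
  "$spark_home/sbin/stop-thriftserver.sh" \
    || echo "stop script reported an error; continuing" >&2
  "$spark_home/sbin/start-thriftserver.sh" "$@"
}

# Typical use, run as the user that owns the Thrift server process:
#   restart_thriftserver <start options used on your cluster>
```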