Spark QuickStart Guide

Chapter 9. Troubleshooting Spark

When you run a Spark job, you will see a standard set of console messages.

In addition, the following information is available:

  • A list of running applications, where you can retrieve the application ID and check the application log:

    yarn application -list

    yarn logs -applicationId <app_id>

  • For information about a specific job, check the Spark web UI:

    http://<host>:8088/proxy/<job_id>/environment/

The following paragraphs describe specific issues and possible solutions.

 

Issue: Spark YARN jobs do not start; the YARN ResourceManager shows the application failing with "bad substitution" errors in its logs.

Solution: Make sure that your $SPARK_HOME/conf/spark-defaults.conf file specifies your HDP version. For example:

   spark.driver.extraJavaOptions -Dhdp.version=2.3.0.0-2557
   spark.yarn.am.extraJavaOptions -Dhdp.version=2.3.0.0-2557
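
If you prefer not to edit spark-defaults.conf, the same options can be passed on the spark-submit command line. The following sketch uses the standard SparkPi example and the example HDP version from above; substitute your own class, jar, and version:

   spark-submit --class org.apache.spark.examples.SparkPi \
       --master yarn-client \
       --conf spark.driver.extraJavaOptions=-Dhdp.version=2.3.0.0-2557 \
       --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.3.0.0-2557 \
       lib/spark-examples*.jar 10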

To check the HDP version for an Ambari-managed cluster, navigate to http://$AMBARI_SERVER:8080/#/main/admin/stack/versions, where $AMBARI_SERVER is the host name of your Ambari server.

To check the version via bash, run the following command:

   bash-4.1# hdp-select status hadoop-client | sed 's/hadoop-client - \(.*\)/\1/'
   2.3.0.0-2557
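
If you want to script the fix, a minimal sketch (assuming $SPARK_HOME points to your Spark client directory and the two properties are not already set in the file) is to capture the version and append the properties:

   # Capture the HDP version and append the required properties to spark-defaults.conf
   HDP_VERSION=$(hdp-select status hadoop-client | sed 's/hadoop-client - \(.*\)/\1/')
   echo "spark.driver.extraJavaOptions -Dhdp.version=${HDP_VERSION}" >> $SPARK_HOME/conf/spark-defaults.conf
   echo "spark.yarn.am.extraJavaOptions -Dhdp.version=${HDP_VERSION}" >> $SPARK_HOME/conf/spark-defaults.conf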

 

Issue: The job stays in the ACCEPTED state and never runs. This can happen when the job requests more memory or cores than are available.

Solution: Assess the cluster workload to see whether any resources can be released. You might need to stop unresponsive jobs to free enough resources for the new job.
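
For example, you can use the YARN CLI to see which applications are holding resources and, if necessary, stop one; <app_id> is a placeholder for an application ID returned by the list command:

   yarn application -list -appStates RUNNING,ACCEPTED
   yarn application -kill <app_id>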

Issue: Insufficient HDFS access. This can lead to errors such as the following:

   “Loading data to table default.testtable
   Failed with exception 
   Unable to move sourcehdfs://blue1:8020/tmp/hive-spark/hive_2015-06-04_
   12-45-42_404_3643812080461575333-1/-ext-10000/kv1.txt to destination 
   hdfs://blue1:8020/apps/hive/warehouse/testtable/kv1.txt”

Solution: Make sure the user or group running the job has sufficient HDFS privileges for the target location.
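
For example, you can inspect and, if appropriate, adjust ownership or permissions on the target directory with the HDFS CLI. The path below is taken from the example error above; the user and group are placeholders:

   hdfs dfs -ls /apps/hive/warehouse/testtable
   hdfs dfs -chown -R <user>:<group> /apps/hive/warehouse/testtable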

 

Issue: Beeline points to the wrong host, and the connection fails with an invalid URL error:

   Error: Invalid URL: jdbc:hive2://localhost:10001 (state=08S01,code=0)

Solution: Specify the correct host and port in the Beeline connection URL.
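
For example, you can pass the connection URL directly when starting Beeline; the host and port are placeholders and should match where your Thrift server is actually listening:

   beeline -u "jdbc:hive2://<host>:<port>/default"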

 

Issue: Queries fail with a closed SQLContext error.

Solution: Restart the Thrift server.
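
For a manually started Thrift server, a minimal restart sketch (assuming $SPARK_HOME points to the Spark client directory and the server runs as the spark user) is:

   su spark
   cd $SPARK_HOME
   ./sbin/stop-thriftserver.sh
   ./sbin/start-thriftserver.sh --master yarn-client

If the Thrift server is managed by Ambari, restart it from the Ambari web UI instead.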