Using Spark history server to troubleshoot Spark jobs

The Spark history server is a monitoring tool that displays information about completed Spark applications. It provides debugging information such as Spark configurations, DAG execution, driver and executor resource utilization, application logs, and job-, stage-, and task-level details.

To view Spark history server information for a job run:

  1. In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The CDE Home page displays.
  2. Click Jobs in the left navigation menu.
  3. From the drop-down in the upper left-hand corner, select the Virtual Cluster that contains the job you want to troubleshoot.
  4. Select the job that you want to troubleshoot.
  5. Click Job Runs in the left menu, and then click the Run ID of the job run whose information you want to view.
  6. Click the Spark UI tab to access the Spark History Server.
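
The same job, stage, and task details shown in the Spark UI tab are also modeled by the Spark monitoring REST API (paths under /api/v1). The following is a minimal sketch, not a CDE-specific procedure: it builds the REST paths for one application and filters a stage listing for failed stages. The base URL, the application ID, and the sample payload are hypothetical, and whether your Virtual Cluster exposes this API to clients, and with what authentication, is an assumption you would need to verify for your environment.

```python
import json
from urllib.parse import quote

# Path prefix used by the Spark monitoring REST API.
API_PREFIX = "/api/v1"


def endpoint(base_url, app_id, resource):
    """Build the URL for a per-application resource such as 'jobs' or 'stages'."""
    return f"{base_url.rstrip('/')}{API_PREFIX}/applications/{quote(app_id, safe='')}/{resource}"


def failed_stages(stages):
    """Return a short summary of every stage whose status is FAILED."""
    return [
        {
            "stageId": s["stageId"],
            "attemptId": s.get("attemptId"),
            "name": s.get("name"),
            "failureReason": s.get("failureReason"),
        }
        for s in stages
        if s.get("status") == "FAILED"
    ]


# Hypothetical sample shaped like a stage listing; real responses carry many more fields.
SAMPLE_STAGES = json.loads("""
[
  {"stageId": 0, "attemptId": 0, "name": "load", "status": "COMPLETE"},
  {"stageId": 1, "attemptId": 0, "name": "join", "status": "FAILED",
   "failureReason": "Job aborted due to stage failure"},
  {"stageId": 2, "attemptId": 0, "name": "write", "status": "SKIPPED"}
]
""")

if __name__ == "__main__":
    # Hypothetical history server address and application ID.
    print(endpoint("https://history.example.invalid/", "spark-0123", "stages"))
    for stage in failed_stages(SAMPLE_STAGES):
        print(stage)
```

In practice you would fetch the stage listing from the URL returned by endpoint() with an HTTP client and the credentials your environment requires, then pass the decoded JSON to failed_stages().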