Troubleshooting Failed Data Engineering Jobs (Hadoop Administrators)

Use Workload Manager to quickly troubleshoot failed data engineering jobs.

  1. On the Data Engineering Jobs page, click the Health Checks drop-down list and select Failed to Finish. This filters the list to show only the jobs that did not complete.



  2. In the list of jobs, click the job name to view more detailed information.



  3. On the Jobs details page, click Health Checks to view details for the Failed to Finish health check. In this example, the health check indicates that the failure occurred in the Reduce Stage of job execution.



    Click Reduce Stage, and then click Execution Details.

  4. In the Summary box, view the failed tasks.



  5. Click a failed task to see the error message from each failed attempt. In this example, the error message "Task KILL is received. Killing attempt!" is not very descriptive. To gather more information about the task failure, open the associated log file and analyze the root cause.
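
Once the log file for a failed attempt has been opened or saved locally, scanning it for error markers can speed up root-cause analysis. The following is a minimal sketch only, not part of Workload Manager: the helper names find_error_lines and scan_log_file are hypothetical, and the patterns it searches for (ERROR, FATAL, Exception, Caused by:) are assumptions about a typical Java-based task log that may need adjusting for your environment.

```python
import re
from typing import Iterable, List, Tuple

# Assumed markers of a failure in a Java-based task log.
# Adjust these patterns to match the log format in your cluster.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b|Exception|Caused by:")


def find_error_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path: str) -> List[Tuple[int, str]]:
    """Scan a locally saved task log (hypothetical path) for error lines."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return find_error_lines(log_file)
```

Calling scan_log_file with the path of the saved log returns the matching lines with their line numbers, which gives a starting point for reading the surrounding stack trace in the full log.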