Determining the Cause of Slow and Failed Queries
Identifying the cause of slow query run times and queries that fail to complete.
The following steps, with examples, explain how to investigate and troubleshoot the cause of a slow or failed query.
In a supported web browser, log in to the Workload XM UI by doing the following:
- In the web browser URL field, enter wxm.cloudera.com and press Enter.
- In the Email Address and Password fields, enter the email address and password associated with your Workload XM user credentials.
- Click Log In.
In the Clusters page do one of the following:
- In the Search field, enter the name of the cluster whose workloads you want to analyze.
- From the Cluster Name column, locate and click on the name of the cluster whose workloads you want to analyze.
From the time-range list in the Cluster Summary page, select a time period that meets your requirements.
From the Trend widget, select the tab of an engine whose
jobs you wish to analyze and then click its Total Jobs
The engine's Jobs page opens.
From the Health Check list in the Jobs page, select
Task Wait Time, which filters the list to display
jobs with longer than average wait times before the process is executed.
To view more details, from the Job column, select a
job's name and then click the Health Checks tab.
The Baseline Health checks are displayed.
From the Health Checks panel, select the Task
Wait Time health check.
For example, as shown in the following image, the long wait time occurred in the Map Stage of the job process due to insufficient resources:
To display more information about the Map Stage tasks that are experiencing
longer than average wait times before they can execute, click one of the tasks
listed under Outlier Tasks.
In the following example, the Task Details show that the task's wait time is above average. However, comparing the Wait Duration value with the Successful Attempt Duration value shows that once the task does run, it finishes significantly faster than average. A long wait combined with a fast run indicates that insufficient resources are allocated for this job.
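The comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how Workload XM computes its health checks: the `TaskAttempt` record, its field names, and the sample values are all hypothetical, and the sketch assumes you have copied the Wait Duration and Successful Attempt Duration values out of the Task Details by hand.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    """Hypothetical record; the field names are illustrative, not a Workload XM schema."""
    task_id: str
    wait_seconds: float  # time the task waited before it could run
    run_seconds: float   # duration of the successful attempt


def resource_starved(tasks: list[TaskAttempt]) -> list[str]:
    """Return IDs of tasks that waited longer than average but ran no slower than average."""
    avg_wait = mean(t.wait_seconds for t in tasks)
    avg_run = mean(t.run_seconds for t in tasks)
    return [
        t.task_id
        for t in tasks
        if t.wait_seconds > avg_wait and t.run_seconds <= avg_run
    ]


# Made-up sample values for four map tasks.
tasks = [
    TaskAttempt("map_0", wait_seconds=4.0, run_seconds=30.0),
    TaskAttempt("map_1", wait_seconds=5.0, run_seconds=32.0),
    TaskAttempt("map_2", wait_seconds=90.0, run_seconds=12.0),
    TaskAttempt("map_3", wait_seconds=6.0, run_seconds=31.0),
]
print(resource_starved(tasks))  # ['map_2']
```

A task that both waits and runs longer than average is not flagged, because a slow run points to a problem in the task itself (for example, data skew) rather than to a shortage of resources.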