Key Features of Workload XM

Cluster Report Emails

Enable Cluster Report emails to get daily updates on cluster analytics, which you can use to monitor queries, jobs, and the users that are running queries. These analytics are sent to your email address so you can check yesterday's statistics without having to log in to your cluster.

See Cluster Report Emails for detailed information about what the reports contain and how to enable them.

Default Time Range

If you have not specified a time range, Workload Experience Manager (Workload XM) displays data for the last 24 hours by default. If no data is available for the last 24 hours, Workload XM instead displays the full range of available data.
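The fallback rule described above can be illustrated with a minimal Python sketch. The function name `default_time_range` and its inputs are hypothetical, and this is not Workload XM's actual implementation; it only restates the rule as code.

```python
from datetime import datetime, timedelta

def default_time_range(timestamps, now):
    """Return the (start, end) range to display when the user picked none.

    timestamps: datetimes of the available data points (hypothetical input).
    now: the current time, passed in so the function is easy to test.
    """
    if not timestamps:
        return None  # nothing to display at all
    window_start = now - timedelta(hours=24)
    # Prefer the last 24 hours when any data falls inside that window.
    if any(window_start <= t <= now for t in timestamps):
        return (window_start, now)
    # Otherwise fall back to the full range of available data.
    return (min(timestamps), max(timestamps))
```

Passing `now` as a parameter, rather than reading the clock inside the function, keeps the sketch deterministic.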

Data Warehouse Tables Widget

The Tables widget in the Data Warehouse Summary page gives you a quick overview of the tables that are accessed most often in queries. The Data Read Distribution section of the widget shows the distribution of the amount of data that was read by the queries. If two or more tables are in a single row in the widget, it is because they were joined together. In that case, the Data Read Distribution statistics represent the total data read across all the tables in that row.

For example, in the image below, the parrot.employees table was accessed in 1% of the total queries. 75% of the queries read 209.4 MiB of data or less, 95% of the queries read 395.2 MiB or less, and the query that read the most data read 861.5 MiB.
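To make the percentile figures concrete, here is a small Python sketch that derives the same three statistics from a list of per-query read sizes. It assumes the nearest-rank percentile definition; the method Workload XM actually uses is not stated here, and the function names are hypothetical.

```python
def nearest_rank_percentile(values, pct):
    """Smallest value such that at least pct% of the values are <= it.

    pct is an integer percentage (assumption: nearest-rank method).
    """
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, clamped so the rank is at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def data_read_distribution(mib_read_per_query):
    """Summarize per-query data read as 75th percentile, 95th percentile, max."""
    return {
        "p75": nearest_rank_percentile(mib_read_per_query, 75),
        "p95": nearest_rank_percentile(mib_read_per_query, 95),
        "max": max(mib_read_per_query),
    }
```

Read this way, a "p75" of 209.4 MiB means that three quarters of the queries read 209.4 MiB or less.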

Cluster Analytics Page

By default, the Cluster Analytics page sorts clusters by the date they were last updated. The Last Updated column shows the date and time that the cluster was last updated. You can also sort by cluster name. The Email Report column indicates whether you are subscribed to Cluster Reports for that cluster, and the Actions menu contains an Enable Cluster Report Emails option. The image below shows the Cluster Analytics page:

Data Engineering Jobs Layout

The way that baseline information is presented on the Data Engineering Jobs page matches the style of the Job Comparison page.

Metrics are sorted by their header, and are in alphabetical order. The layout is shown in the screenshot below:

Data Warehouse Summary Layout

The Data Warehouse Summary page consists of the following separate widgets:

  • Queries
  • Workloads
  • Usage Analysis

Previously, the information in these widgets was contained in separate tabs within the Outliers widget. The screenshot below shows the layout of the Data Warehouse Summary page:

The following widgets appear in the Data Warehouse Summary page: Trend, Queries, Workloads, Usage Analysis, Tables, Statement Types.

File Size Reporting

File size reporting helps you identify databases and tables in which data is stored inefficiently, in small files or partitions. When data is stored inefficiently, you may experience performance issues.

For information about how to enable file size reporting, see Enabling File Size Reporting.

For information about viewing file size metadata, see File Size Reporting.
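As a rough illustration of what "stored inefficiently" can mean, the following Python sketch flags tables whose average file size falls below a threshold. The input keys (`name`, `total_mib`, `file_count`) and the 64 MiB threshold are assumptions chosen for the example, not Workload XM defaults or its reporting format.

```python
def tables_with_small_files(tables, min_avg_file_mib=64):
    """Return (name, average MiB per file) for tables below the threshold.

    tables: list of dicts with hypothetical keys "name", "total_mib",
    and "file_count".
    min_avg_file_mib: illustrative threshold, not a Workload XM default.
    """
    flagged = []
    for t in tables:
        if t["file_count"] == 0:
            continue  # empty table: no files to evaluate
        avg = t["total_mib"] / t["file_count"]
        if avg < min_avg_file_mib:
            flagged.append((t["name"], round(avg, 2)))
    # Smallest average first, so the worst offenders lead the list.
    return sorted(flagged, key=lambda item: item[1])
```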

Compare a Job with the Previous Run

When a job is flagged as slow, the job page shows a Compare with Previous Run link. The link opens the Job Comparison tool and compares the current run of the job with the previous run.

The image below shows the location of the link:

The Compare with Previous Run link is in the Overview tab.

For more information about the Job Comparison tool, see Troubleshooting with the Job Comparison Feature.

Spark RDD Health Check

The Spark RDD health check lets you know if you have a redundant RDD cache. Workload XM tells you the location of the cache so that you can remove it to save executor memory.

For more information about health checks, see Data Engineering (Apache Hive, Spark, MapReduce) Health Checks.

Quickly Analyze Workloads with Auto-Generated Workload Views

Workload XM recommends workload views that you can immediately use to analyze workloads on your cluster. Recommendations are based on the following criteria that occur most frequently with queries:

  • tables accessed
  • resource pools used
  • users who initiated the query
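The frequency-based criteria listed above can be sketched in Python with `collections.Counter`. The query record layout (`tables`, `pool`, `user`) and the function name `recommend_views` are hypothetical; the sketch only shows the idea of ranking criteria by how often they occur, not how Workload XM computes its recommendations.

```python
from collections import Counter

def recommend_views(queries, top_n=1):
    """Return the most frequent values for each criterion.

    queries: list of dicts with hypothetical keys "tables" (list of str),
    "pool" (str), and "user" (str).
    """
    table_counts = Counter(t for q in queries for t in q["tables"])
    pool_counts = Counter(q["pool"] for q in queries)
    user_counts = Counter(q["user"] for q in queries)
    return {
        "tables": [name for name, _ in table_counts.most_common(top_n)],
        "pools": [name for name, _ in pool_counts.most_common(top_n)],
        "users": [name for name, _ in user_counts.most_common(top_n)],
    }
```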

To use auto-generated workload recommendations, select Workloads in the left menu under Data Warehouse, and click Define New:

Then, in the Define New drop-down menu, select Select recommended views.

Using auto-generated workload views saves you time because you do not need to perform the initial analysis to determine which criteria to use to create a workload view. For details about how you can use workload views to perform analysis, see Classifying Workloads for Analysis with Workload Views.

Workload Classification for Deep-dive Analysis

Break down workloads by specific criteria to perform deep-dive analysis on the queries. For example, you can use the Workload Classification feature to determine which users are executing workloads that do not adhere to SLAs. You can also examine how queries being sent to specific databases or that use specific pools are performing against SLAs. For details about how to use this feature, see Classifying Workloads for Analysis with Workload Views.
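The kind of breakdown described above, such as finding which users run queries that miss an SLA, might look like the following Python sketch. The record keys `user` and `duration_s` and the single duration-based SLA are assumptions made for illustration; Workload XM performs this analysis in its own interface.

```python
from collections import defaultdict

def sla_miss_rate_by_user(queries, sla_seconds):
    """Fraction of each user's queries whose duration exceeded the SLA.

    queries: list of dicts with hypothetical keys "user" and "duration_s".
    """
    totals = defaultdict(int)
    misses = defaultdict(int)
    for q in queries:
        totals[q["user"]] += 1
        if q["duration_s"] > sla_seconds:
            misses[q["user"]] += 1
    return {user: misses[user] / totals[user] for user in totals}
```

The same grouping could be keyed on database or resource pool instead of user.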

To access this feature, select Workloads under Data Warehouse in the left menu:

Troubleshoot Issues with the Job Comparison Feature

The Job Comparison feature makes it easy to compare two different runs of the same Data Engineering job. This is especially useful when something changes unexpectedly. For example, if a job that consistently completes within a specific amount of time starts taking longer, you want to know why. The Job Comparison feature lets you quickly see the differences between the two runs so you can troubleshoot the cause. For details about how to use this feature, see Troubleshooting with the Job Comparison Feature.
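Conceptually, comparing two runs comes down to a per-metric difference. The Python sketch below shows that idea with hypothetical metric names; it is not the Job Comparison tool's implementation or its metric set.

```python
def compare_runs(previous, current):
    """Return the per-metric change between two runs of the same job.

    previous, current: dicts mapping hypothetical metric names to numbers.
    Only metrics present in both runs are compared.
    """
    changes = {}
    for name in sorted(previous.keys() & current.keys()):
        before, after = previous[name], current[name]
        delta = after - before
        # Percentage change is undefined when the earlier value is zero.
        pct = None if before == 0 else round(100 * delta / before, 1)
        changes[name] = {"delta": delta, "pct": pct}
    return changes
```

A large change in one metric, for example input size or spilled data, is a natural starting point when a run takes longer than usual.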

To access this feature, select Jobs under Data Engineering in the left menu:

Download SQL Commands to Address "Corrupt Table Statistics" and "Missing Table Statistics" Query Health Checks

If your queries trigger the Corrupt Table Statistics or the Missing Table Statistics health checks, Workload XM generates the SQL code you can copy and run on your cluster to address these issues.

To download SQL code for creating or repairing table statistics:

  1. Under Data Warehouse, select Queries.
  2. On the Queries page, select the time period you want to investigate in the Range column.
  3. In the Health Check column, select either Corrupt Table Statistics or Missing Table Statistics. This filters out queries that do not trigger these health checks.
  4. Click the query to view its details.
  5. In the Performance Issues region of the query details page, click the Health Check Violations tab. This tab lists the health checks that were triggered for this query and shows the SQL code that you can copy and run to repair the table statistics issues.
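For orientation, the Python sketch below builds the kind of statement that refreshes table statistics, assuming the queries run on Impala, where `COMPUTE STATS` is the statement for gathering table statistics. The SQL that Workload XM generates may differ, so treat this as an illustration and copy the code from the Health Check Violations tab for actual use. The helper name and its input format are hypothetical.

```python
def compute_stats_statements(tables):
    """Build one Impala COMPUTE STATS statement per distinct table.

    tables: iterable of fully qualified names such as "db.table"
    (hypothetical input; Workload XM supplies its own SQL).
    """
    statements = []
    for name in sorted(set(tables)):
        db, _, table = name.partition(".")
        if not db or not table:
            raise ValueError(f"expected 'db.table', got {name!r}")
        # Backtick-quote each identifier so unusual names remain valid.
        statements.append(f"COMPUTE STATS `{db}`.`{table}`;")
    return statements
```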

New Log and Query Redaction Configuration Properties for Telemetry Publisher

You can configure log and query redaction for the Telemetry Publisher service in Cloudera Manager. By default, redaction is enabled. For more information, see Log and Query Redaction for the Telemetry Publisher Service.

Proxy Server Support for Telemetry Publisher

You can now configure the Telemetry Publisher service to send metrics, configuration files, and log files to Workload XM by way of a proxy server for database and Altus metrics uploads. For more information, see Configuring Telemetry Publisher to Use a Proxy Server.

Multiple Usability Improvements

The following usability improvements enhance the Workload XM user experience:

  • Support for parsing Spark 2.3 application history logs.

  • Job history files and Spark event logs can be downloaded from the Execution Detail tab in the Job detail page:

    Download Job History Files

    Download Spark Event Logs

  • Query Detail page. You can download the query profile for Impala queries and view the total number of joins performed for a specific query:

  • Concurrency chart in the Data Warehouse Summary page. This chart shows query concurrency in the cluster during a selected time range. You can use this chart to gain insights such as identifying potential resource contention or finding the busiest time of day on your cluster.
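Query concurrency, as plotted in the chart described above, can be derived from query start and end times. The Python sketch below finds the peak number of overlapping queries with a simple sweep over sorted events. The input format and function name are hypothetical, and this is not how Workload XM builds its chart.

```python
def peak_concurrency(intervals):
    """Return (peak_count, time_of_peak) for a list of (start, end) pairs.

    Times can be any comparable values, for example epoch seconds. A query
    that ends at time t is treated as finished before another starts at t.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # query begins
        events.append((end, -1))    # query finishes
    # Sorting tuples puts -1 before +1 at equal times, so ends come first.
    events.sort()
    running = peak = 0
    peak_time = None
    for time, change in events:
        running += change
        if running > peak:
            peak, peak_time = running, time
    return peak, peak_time
```

The time of the peak points to the busiest moment in the range, which is where resource contention is most likely to appear.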