Step 3: Capture Your Existing Baselines

Describes how to create baselines that enable you to address performance problems and compare the performance of your workloads, jobs, and queries before and after migration.

Workload Manager and Workload XM baseline metrics measure the current performance of a job against the average performance of its previous runs. They use performance data from the 30 most recent runs of a job and require a minimum of three runs, so baseline comparisons start with the fourth run of a job.
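The baseline rule described above can be illustrated with a minimal sketch. This is not Workload Manager or Workload XM code; the function name `baseline_comparison`, the use of run duration in seconds as the metric, and the choice to exclude the current run from the average are assumptions made only for illustration.

```python
from statistics import mean

WINDOW = 30    # number of most recent runs considered (from the text above)
MIN_RUNS = 3   # minimum number of previous runs required (from the text above)


def baseline_comparison(previous_durations, current_duration):
    """Compare the current run against the average of previous runs.

    previous_durations: durations in seconds, oldest first (hypothetical input;
    values are assumed to be positive).
    Returns (baseline, percent_change), or None when there are fewer than
    MIN_RUNS previous runs and no baseline can be formed yet.
    """
    window = previous_durations[-WINDOW:]  # keep only the most recent runs
    if len(window) < MIN_RUNS:
        return None
    baseline = mean(window)
    percent_change = (current_duration - baseline) / baseline * 100.0
    return baseline, percent_change


# A fourth run compared against three earlier runs.
print(baseline_comparison([100, 110, 90], 120))
# Only two earlier runs: no baseline yet.
print(baseline_comparison([100, 110], 120))
```

In this sketch, a positive percentage means the current run was slower than its baseline, and runs older than the 30 most recent ones no longer influence the result.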

Cloudera highly recommends that, before you migrate, you establish and capture your workload, job, and query baselines with Workload Manager or Workload XM. Baselines enable you to identify and address performance problems, and to compare the performance of your workloads, jobs, and queries before and after migration.

For more information on how to establish and display baselines and create job comparisons, see Related Information.

The following images show some of the baseline metrics that Workload Manager and Workload XM provide:
  • This image shows a few of the performance metrics listed in the Baseline tab:


  • This image shows the comparison between the baseline performance metrics and the current job run:


To capture a job's baseline performance:
  1. In a supported browser, log in to the Workload Manager or the Workload XM web UI.
  2. In the Clusters page, select the cluster required for analysis.
  3. From the Usage Analysis chart widget, select the required engine, such as Spark.
  4. From the Slow Jobs chart widget, click a job and record its baseline and any prescriptive improvements that can be made.
  5. If a baseline is not displayed, click Compare with Previous Run.
  6. Capture the Job Comparison page and record any health issues.

You can also compare two different runs of the same job with the Job Comparison feature.

To compare two different runs of the same job:
  1. In a supported browser, log in to the Workload Manager or the Workload XM web UI.
  2. In the Clusters page, select the cluster required for analysis.
  3. In the Trend chart widget, select the tab of an engine whose jobs you want to analyze and then click its Total Jobs value.

    The engine's Jobs page opens.

  4. To list and display details of all the runs of a specific job, select one of the job runs and then, in the Jobs details page, click the Trends tab.
  5. To compare two job runs, select the check boxes adjacent to the job runs you require and then click Compare.

    The Job Comparison page opens, displaying more details about each job run.

Before moving to CDP, Cloudera highly recommends capturing the jobs that have the greatest impact on your CDH or HDP clusters and establishing their baselines. After migrating to CDP, capture those job baselines again with Workload Manager or Workload XM. If you see a significant deviation between the CDH or HDP baselines and the CDP baselines, drill down further in Workload Manager or Workload XM to understand the effects of the migration.
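If you record the baselines by hand before and after migration, a short script can help pick out the jobs that deserve a closer look. The following is a minimal sketch, not a Workload Manager or Workload XM feature: the function name `flag_deviations`, the job names, the durations, and the 20 percent threshold are all hypothetical, and what counts as a significant deviation is for you to decide.

```python
def flag_deviations(before, after, threshold_pct=20.0):
    """Return jobs whose baseline changed by at least threshold_pct.

    before, after: dicts mapping a job name to its recorded baseline
    duration in seconds (hand-recorded values; hypothetical format).
    Jobs missing from `after` or with a non-positive `before` value are skipped.
    """
    flagged = {}
    for job, old in before.items():
        new = after.get(job)
        if new is None or old <= 0:
            continue
        change_pct = (new - old) / old * 100.0
        if abs(change_pct) >= threshold_pct:
            flagged[job] = round(change_pct, 1)
    return flagged


# Hypothetical baselines recorded on CDH or HDP, then again on CDP.
pre_migration = {"etl_daily": 600.0, "report_hourly": 120.0, "ingest": 300.0}
post_migration = {"etl_daily": 630.0, "report_hourly": 180.0}

print(flag_deviations(pre_migration, post_migration))
```

Jobs returned by this sketch are candidates for drilling down further in Workload Manager or Workload XM. A positive value means the job became slower after migration, and a negative value means it became faster.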

You can also identify trends as well as baselines by analyzing your engine’s or cluster’s performance trends from the Trends chart widget and the Trend tab, as described in the next topic.