Using SmartSense

Workload dashboard (MR, Pig, Tez, Hive)

The Workload dashboard (MR, Pig, Tez, Hive) provides key information about workloads that use MapReduce or Tez for execution.

This dashboard includes the following paragraphs:

  • Longest Running Jobs

  • Most Resource Intensive Jobs

  • Most Resource Wasting Jobs

  • Workloads With Highest HDFS Operations

  • Workloads Creating Max HDFS Files

  • Workloads With Largest HDFS Writes

  • Workloads With Highest CPU Consumption

  • Workloads With Most Inefficient Data Read

  • Workloads With Most Input Data Explosion

  • Job Distribution By Type

  • Job Submission Trend By Day.Hour

Most of these paragraphs have titles that are self-explanatory. A few of them are described below to provide more context:

Paragraph Description
Most Resource Wasting Jobs

Resource waste is calculated as the difference between the memory a job requested and the memory it actually used.

For example, if a job requests 100 containers of 8 GB each but uses only 5 GB per container, 3 GB per container is considered wasted. Waste is calculated per job, and the 10 jobs with the most waste are listed.
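The calculation above can be sketched in a few lines of Python. This is only an illustration, not SmartSense's implementation: the `JobMemory` record, its field names, and the `wasted_gb` and `top_wasters` helpers are hypothetical, and multiplying the per-container waste by the container count to get a per-job total is an assumption about how the per-job figure is aggregated.

```python
from dataclasses import dataclass


@dataclass
class JobMemory:
    """Hypothetical per-job record; field names are illustrative only."""
    job_id: str
    containers: int
    requested_gb_per_container: float
    used_gb_per_container: float


def wasted_gb(job: JobMemory) -> float:
    """Memory requested but not used, summed over all containers of the job.

    Assumption: per-job waste = per-container waste * number of containers.
    """
    per_container = max(
        job.requested_gb_per_container - job.used_gb_per_container, 0.0
    )
    return per_container * job.containers


def top_wasters(jobs, limit=10):
    """Return the jobs with the most wasted memory, largest first."""
    return sorted(jobs, key=wasted_gb, reverse=True)[:limit]


# The example from the text: 100 containers, 8 GB requested, 5 GB used.
example = JobMemory("job_a", 100, 8.0, 5.0)
print(wasted_gb(example))  # 300.0 (3 GB wasted in each of 100 containers)
```

Under these assumptions, a job that requests far more memory per container than it uses ranks high even when its total runtime is short, which is why the ranking is useful for tuning container sizes.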

Job Submission Trend By Day.Hour

This paragraph shows the number of jobs submitted by day and hour, using the notation <day>.<hour>, where the hour is given in 24-hour format. For example:

• Monday.1 - 1am on Monday

• Monday.20 - 8pm on Monday

The goal of this paragraph is to identify job submission hotspots across the days of the week and the hours of the day. You can use this information to find the best time to schedule resource-intensive jobs.
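To make the <day>.<hour> notation concrete, the following Python sketch buckets job submission times into the same labels and counts them. It is an illustration only: the `day_hour_label` and `submission_trend` helpers and the sample timestamps are hypothetical, and SmartSense may compute the trend differently.

```python
from collections import Counter
from datetime import datetime

# Fixed English day names, so the labels do not depend on the system locale.
DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_hour_label(submitted_at: datetime) -> str:
    """Format a timestamp as <day>.<hour>, with the hour in 24-hour format."""
    return f"{DAY_NAMES[submitted_at.weekday()]}.{submitted_at.hour}"


def submission_trend(submit_times) -> Counter:
    """Count job submissions per <day>.<hour> bucket."""
    return Counter(day_hour_label(t) for t in submit_times)


# Hypothetical submission times; January 1, 2024 was a Monday.
samples = [
    datetime(2024, 1, 1, 1, 5),
    datetime(2024, 1, 1, 20, 30),
    datetime(2024, 1, 1, 20, 45),
]
trend = submission_trend(samples)
print(trend.most_common(1))  # [('Monday.20', 2)]
```

Buckets with the highest counts correspond to the hotspots described above; buckets with low counts are candidate windows for scheduling resource-intensive jobs.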