Cluster Utilization Reports

The Cluster Utilization Report screens in Cloudera Manager display aggregated utilization information for YARN and Impala jobs. The reports display CPU utilization, memory utilization, resource allocations made due to the YARN fair scheduler, and Impala queries. The report displays aggregated utilization for the entire cluster and also breaks out utilization by tenant, which is either a user or a resource pool. You can configure the report to display utilization for a range of dates, specific days of the week, and time ranges.

The report displays the current utilization of CPU and memory resources and the resources that were allocated using the Cloudera Manager resource management features. See Resource Management.

Using the information displayed in the Cluster Utilization Report, a CDH cluster administrator can verify that sufficient resources are available for the number and types of jobs running in the cluster. An administrator can use the reports to tune resource allocations so that resources are used efficiently and meet business requirements. Tool tips in the report pages provide suggestions about how to improve performance based on the information displayed in the report. Hover over a label to see these suggestions and other information.

You can tune the following:

CPU and memory allocations
Weights for each pool
Scheduling rules
Preemption thresholds
Maximum number of running and queued Impala queries
Maximum timeout for the queue of Impala queries
Placement rules
Number of hosts in a cluster
Memory capacity of hosts
Impala Admission Control pool and queue configurations

The Cluster Utilization Report is only available with Cloudera Manager 5.7 and higher and CDH 5.7 and higher.

If you want to create your own reports with similar functionality, or if you want to export the report results, see Creating a Custom Cluster Utilization Report.

Continue reading:

Configuring the Cluster Utilization Report
Using the Cluster Utilization Report to Manage Resources
Downloading Cluster Utilization Reports Using the Cloudera Manager API

Configuring the Cluster Utilization Report

This topic describes the prerequisites and configurations required to use the Cluster Utilization Report.

Enabling the Cluster Utilization Report

By default, the Cluster Utilization Report displays aggregated CPU and memory utilization for an entire CDH cluster and for YARN and Impala. You can also view this utilization by tenant. Tenants include Linux users and Dynamic Resource Pools. To see utilization for a tenant, you must configure the tenant and define resource limits for it.

You must configure several parameters to enable the Cluster Utilization Report:

Enable YARN utilization metrics collection.
1. Go to the YARN service.
2. Click the Configuration tab.
3. Select Category > Monitoring.
4. Type container in the Search box.
5. Select the Enable Container Usage Metrics Collection parameter.
6. Enter a username for the MapReduce job that collects the metrics in the Container Usage MapReduce Job User parameter. The username you enter must be a Linux user on all the cluster hosts. If you are using an Active Directory KDC, the username must also exist in Active Directory. For secure clusters, the user must not be banned or below the minimum user ID. You can view the list of banned users (banned.users) and the minimum user ID (min.user.id) by selecting Clusters > <YARN service> > Configuration.
  Note: The user that is configured with the Container Usage MapReduce Job User property in the YARN service requires permissions to read the subdirectories of the HDFS directory specified with the Cloudera Manager Container Usage Metrics Directory property. The default umask of 022 allows any user to read from that directory. However, if a more strict umask (for example, 027) is used, then those directories are not readable by any user. In that case the user specified with the Container Usage MapReduce Job User property should be added to the same group that owns the subdirectories.
  For example, if the /tmp/cmYarnContainerMetrics/20161010 subdirectory is owned by user and group yarn:hadoop, the user specified in Container Usage MapReduce Job User should be added to the hadoop group.
  
  Note: The directories you specify with the Cloudera Manager Container Usage Metrics Directory and Container Usage Output Directory properties should not be located in encryption zones.
7. (Optional) Enter the resource pool in which the container usage collection MapReduce job runs in the Container Usage MapReduce Job Pool parameter. Cloudera recommends that you dedicate a resource pool for running this MapReduce job.
  Note: If you specify a custom resource pool, ensure that the placement rules for the cluster allow for it. The first rule must be for resource pools to be specified at run time with the Create pool if it does not exist option selected. Alternatively, ensure that the pool you specify already exists. If the placement rule is not properly configured or the resource pool does not already exist, the job may run in a different pool.
8. Click Save Changes to commit the changes.
9. Click the Actions drop-down list and select Create CM Container Usage Metrics Dir.
10. Restart the YARN service:
  1. Go to the YARN service.
  2. Select Actions > Restart.
Enable Impala utilization collection.
1. Go to the Impala service.
2. Click the Configuration tab.
3. Select Category > Admission Control.
4. Select both the Enable Impala Admission Control checkbox and the Enable Dynamic Resource Pools checkbox.
5. Click Save Changes to commit the changes.
6. Restart the Impala service.
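
The umask note in step 6 of the YARN procedure comes down to a small piece of permission arithmetic. The following is a minimal sketch, assuming the usual POSIX-style rule that a new directory's mode is 0777 masked by the umask. The helper names are hypothetical and the code does not touch HDFS. It only shows why a umask of 027 leaves the metrics subdirectories unreadable to users outside the owning group.

```python
import stat

def dir_mode(umask: int) -> int:
    """Mode a newly created directory gets under the given umask (assumed POSIX rule)."""
    return 0o777 & ~umask

def others_can_read(mode: int) -> bool:
    """True if users outside the owning user and group can list and traverse the directory."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed

def group_can_read(mode: int) -> bool:
    """True if members of the owning group can list and traverse the directory."""
    needed = stat.S_IRGRP | stat.S_IXGRP
    return (mode & needed) == needed

for umask in (0o022, 0o027):
    mode = dir_mode(umask)
    print(f"umask {umask:03o}: mode {mode:03o}, "
          f"others can read: {others_can_read(mode)}, "
          f"group can read: {group_can_read(mode)}")
```

With umask 027 only the owning group keeps read access, which is why the note recommends adding the Container Usage MapReduce Job User to that group.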

Configuring the Cluster Utilization Report

To access the Cluster Utilization Report, go to Clusters and then select Utilization Report for the cluster. The Overview tab displays when you first open the report.

The upper-right part of the page has two controls that you use to configure the Cluster Utilization Report:

You can apply a configuration and date range that applies to all tabs in the report:

Click the Configuration drop-down menu.
Select one of the configured options, or create a new configuration:
1. Click Create New Configuration.
2. Enter a Configuration Name.
3. Select the Tenant Type, either Pool or User.
4. Select the days of the week for which you want to report utilization.
5. Select All Day, or use the drop-down menus to specify a utilization time range for the report.
6. Click Create.
The configuration you created is now available from the Configuration drop-down menu.
Select a date range for the report:
1. Click the date range button.
2. Select one of the range options (Today, Yesterday, Last 7 Days, Last 30 Days, or This Month) or click Custom Range and select the beginning and ending dates for the date range.

Using the Cluster Utilization Report to Manage Resources

To access the Cluster Utilization Report, go to Clusters and then select Utilization Report for the cluster. The Overview tab of the report displays.

Cluster Utilization Report Overview Tab

The Cluster Utilization Report is divided into the following tabs: Overview, YARN, and Impala.

Overview Tab

The Overview tab provides a summary of CPU and memory utilization for the entire cluster and also for only YARN applications and Impala queries. Two sections, CPU Utilization and Memory Utilization, display the following information:

CPU Utilization

Overall Cluster Utilization
- Total CPU Cores – Average number of CPU cores available during the reporting window.
- Average Utilization – Average CPU utilization for the entire cluster, including resources consumed by user applications and CDH services.
- Maximum Utilization – Maximum CPU utilization for the entire cluster during the reporting window, including resources consumed by user applications and CDH services. If this value is high, consider adding more hosts to the cluster.
  Click the drop-down menu next to the date and select View YARN Applications Running at the Time or View Impala Queries Running at the Time to view details about jobs running when maximum utilization occurred.
- Average Daily Peak – Average daily peak CPU consumption for the entire cluster during the reporting window, including resources consumed by user applications and CDH services. The number is computed by averaging the maximum resource consumption for each day of the reporting period.
  Click View Time Series Chart to view a chart of peak utilization.
YARN + Impala Utilization
- Average Utilization – Average resource consumption by YARN applications and Impala queries that ran on the cluster.
- Maximum Utilization – Maximum resource consumption by YARN applications and Impala queries that ran on the cluster.
  Click the drop-down menu next to the date and select View YARN Applications Running at the Time or View Impala Queries Running at the Time to view details about jobs running when maximum utilization occurred.
- Average Daily Peak – Average daily peak resource consumption by YARN applications and Impala queries during the reporting window. The number is computed by finding the maximum resource consumption per day and calculating the mean.
  Click View Time Series Chart to view a chart of peak utilization.
Utilization by Tenant
  Displays overall utilization for each tenant. Tenants can be either pools or users. See Configuring the Cluster Utilization Report.

Memory Utilization

Overall Cluster Utilization
- Total Physical Memory – Average physical memory available in the cluster during the reporting window.
- Average Utilization – Average memory consumption for the entire cluster, including resources consumed by user applications and CDH services.
- Maximum Utilization – Maximum memory consumption for the entire cluster during the reporting window, including resources consumed by user applications and CDH services. If this value is high, consider adding more hosts to the cluster.
  Click the drop-down menu next to the date and select View YARN Applications Running at the Time or View Impala Queries Running at the Time to view details about jobs running when maximum utilization occurred.
- Average Daily Peak – Average daily peak memory consumption for the entire cluster during the reporting window, including resources consumed by user applications and CDH services. The number is computed by averaging the maximum memory utilization for each day of the reporting period.
  Click View Time Series Chart to view a chart of peak utilization.
YARN + Impala Utilization
- Average Utilization – Average memory consumption by YARN applications and Impala queries that ran on the cluster.
- Maximum Utilization – Maximum memory consumption by YARN applications and Impala queries that ran on the cluster.
  Click the drop-down menu next to the date and select View YARN Applications Running at the Time or View Impala Queries Running at the Time to view details about jobs running when maximum utilization occurred.
- Average Daily Peak – Average daily peak memory consumption by YARN applications and Impala queries during the reporting window. The number is computed by finding the maximum resource consumption per day and then calculating the mean.
  Click View Time Series Chart to view a chart of peak utilization.
Utilization by Tenant
  Displays overall utilization for each tenant. Tenants can be either pools or users. See Configuring the Cluster Utilization Report.
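
The Average Daily Peak figure is described as the mean of the per-day maximums. A minimal sketch of that calculation follows. It assumes the input is a list of timestamped utilization samples. The function name and the sample data are illustrative and are not taken from Cloudera Manager.

```python
from datetime import datetime
from statistics import mean

def average_daily_peak(samples):
    """Mean of the per-day maximum values.

    samples: iterable of (datetime, utilization) pairs.
    """
    daily_max = {}
    for ts, value in samples:
        day = ts.date()
        # Keep the largest value seen so far for this calendar day.
        daily_max[day] = max(value, daily_max.get(day, value))
    return mean(daily_max.values())

# Hypothetical CPU utilization samples (percent) over three days.
samples = [
    (datetime(2016, 10, 10, 9, 0), 40.0),
    (datetime(2016, 10, 10, 15, 0), 70.0),
    (datetime(2016, 10, 11, 9, 0), 55.0),
    (datetime(2016, 10, 11, 15, 0), 65.0),
    (datetime(2016, 10, 12, 12, 0), 90.0),
]
print(average_daily_peak(samples))  # daily peaks are 70, 65, and 90
```

The single Maximum Utilization value for the same data would be 90, so the two figures answer different questions: the worst moment versus a typical busy moment.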

YARN Tab

The YARN tab displays CPU and memory utilization for YARN applications on three tabs: Utilization, Capacity Planning, and Preemption Tuning.

For information about managing YARN resources, see:

Utilization Tab

CPU Utilization

Summary
- Average Utilization – Average number of vcores used by YARN applications. The percentage reported is of the total number of vcores configured for YARN.
- Maximum Utilization – Maximum number of vcores used by YARN applications. The percentage reported is of the total number of vcores configured for YARN.
  Click the drop-down menu next to the date and select View YARN Applications Running at the Time to view details about jobs running when maximum utilization occurred.
- Average Daily Peak – Average daily peak vcores used by YARN applications. The number is computed by finding the maximum resource consumption per day and calculating the mean. The percentage reported is of the total number of vcores configured for YARN.
  Click View Time Series Chart to view a chart of peak utilization.
Utilization by Tenant
  Displays overall utilization for each tenant. The tenants can be either pools or users. See Configuring the Cluster Utilization Report.
  Utilization by tenant is displayed in a table with the following columns:
- Tenant Name
- Average Allocation – Average number of vcores allocated to YARN applications of the tenant. The percentage reported is of the total number of vcores configured for YARN.
- Average Utilization – Average number of vcores used by YARN applications. The percentage reported is of the total number of vcores configured for YARN.
- Unused Capacity – Average unused vcores for the tenant. If this number is high, consider allocating fewer resources for the applications run by this tenant.
  Click a column header to sort the table by that column. Click the icon in the header row of the table to view utilization charts for all tenants. Click in a row to view CPU utilization for a single tenant.

Memory Utilization

Summary
- Average Utilization – Average memory used by YARN applications. The percentage reported is of the total container memory configured for YARN.
- Maximum Utilization – Maximum memory used by YARN applications. The percentage reported is of the total container memory configured for YARN.
  Click the drop-down menu next to the date and select View YARN Applications Running at the Time to view details about jobs running when maximum utilization occurred.
- Average Daily Peak – Average daily peak memory used by YARN applications. The number is computed by finding the maximum resource consumption per day and calculating the mean. The percentage reported is of the total container memory configured for YARN.
  Click View Time Series Chart to view a chart of peak utilization.
Utilization by Tenant
  Displays overall utilization for each tenant. The tenants can be either pools or users. See Configuring the Cluster Utilization Report.
  Utilization by tenant is displayed in a table with the following columns:
- Tenant Name
- Average Allocation – Average memory allocated to YARN applications of the tenant. The percentage reported is of the total container memory configured for YARN.
- Average Utilization – Average memory used by YARN applications. The percentage reported is of the total container memory configured for YARN.
- Unused Capacity – Average unused memory for the tenant. If this number is high, consider allocating fewer resources for the applications run by this tenant.
  Click a column header to sort the table by that column. Click the icon in the header row of the table to view utilization charts for all tenants. Click in a row to view memory utilization for a single tenant.
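
To make the tenant columns concrete, the sketch below treats Unused Capacity as Average Allocation minus Average Utilization. That relationship is an assumption made for illustration, because the report description does not spell out the formula. The class, function names, and numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TenantUsage:
    name: str
    avg_allocated_vcores: float
    avg_used_vcores: float

def unused_capacity(tenant: TenantUsage) -> float:
    """Assumed definition: allocated minus used, never below zero."""
    return max(tenant.avg_allocated_vcores - tenant.avg_used_vcores, 0.0)

def as_percent(vcores: float, total_vcores: float) -> float:
    """Express a vcore count as a percentage of the vcores configured for YARN."""
    return 100.0 * vcores / total_vcores

# Hypothetical tenant on a cluster with 200 vcores configured for YARN.
etl = TenantUsage(name="etl", avg_allocated_vcores=50.0, avg_used_vcores=20.0)
unused = unused_capacity(etl)
print(unused, as_percent(unused, total_vcores=200.0))
```

A tenant whose unused share stays large over a long reporting window is a candidate for a smaller allocation.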

Adjusting YARN Resources

To adjust YARN resources, go to the YARN service, select Configuration > Category > Resource Management, and configure the following properties:

vcores: Container Virtual CPU Cores
Memory: Container Memory

Capacity Planning Tab

The Capacity Planning tab displays a table showing how the weights assigned to YARN Dynamic Resource Pools affect CPU and memory allocations. The table displays the following columns:

Tenant Name
CPU Steady Fair Share – Displays the average number of CPU vcores allocated for each tenant based on the weights assigned to dynamic resource pools.
Memory Steady Fair Share – Displays the average memory allocated for each tenant based on the weights assigned to dynamic resource pools.
Wait Ratio During Contention – The wait ratio is the average percentage of containers in the YARN pool that were pending when there was at least one pending container in the pool. If a pool running critical applications has a high wait ratio, consider increasing the weight of that pool. If several pools in the cluster have a high wait ratio, consider adding more hosts to the cluster.

Click a column header to sort the table by that column.
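
The Wait Ratio During Contention definition can be sketched as follows. The sketch assumes that each sample records the pending and running container counts for a pool, and that "containers in the pool" means pending plus running. Both assumptions are made for illustration, and the function name is hypothetical.

```python
def wait_ratio_during_contention(samples):
    """Average percentage of pending containers, over samples with contention.

    samples: iterable of (pending_containers, running_containers) pairs.
    Samples with no pending containers are not contention and are skipped.
    """
    ratios = [pending / (pending + running)
              for pending, running in samples
              if pending > 0]
    if not ratios:
        return 0.0
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical samples: the first has no pending containers, so it is ignored.
samples = [(0, 10), (5, 15), (10, 10)]
print(wait_ratio_during_contention(samples))
```

Because uncontended samples are excluded, a pool that is rarely busy but badly starved when it is busy still shows a high ratio.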

Preemption Tuning Tab

The Preemption Tuning tab displays graphs for each tenant that compare the average steady fair share with the average instantaneous fair share and the average overall allocation, for both CPU and memory.

The CPU section shows the average allocated vcores, instantaneous fair share of vcores, and steady fair share of vcores whenever the YARN pool was facing contention with resources (times when there was at least one pending container). If the allocated vcores are less than the steady fair share during contention, consider making preemption more aggressive by doing the following:

Enable fair scheduler preemption.
Reduce the fair scheduler preemption utilization threshold.
If you have configured a preemption timeout for a pool on the Dynamic Resource Pool Configuration page (Clusters > cluster name > Resource Management > Dynamic Resource Pool), reduce the length of the timeout for pools with a high wait ratio. See Dynamic Resource Pools.

See Enabling and Disabling Fair Scheduler Preemption.

The Memory section shows the average allocated memory, instantaneous fair share of memory, and steady fair share of memory whenever the YARN pool was facing contention with resources (times when there was at least one pending container). If the allocated memory is less than the Steady Fair Share during contention, consider making preemption more aggressive, as described previously for CPU.
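
The rule described for both the CPU and Memory sections is a comparison of the average allocation against the steady fair share during contention. A minimal sketch of that check follows. The record type, field names, and values are hypothetical, and the numbers would have to be read off the report or its exported data.

```python
from typing import NamedTuple

class ContentionAverages(NamedTuple):
    tenant: str
    allocated_vcores: float
    steady_fair_share_vcores: float
    allocated_memory_gb: float
    steady_fair_share_memory_gb: float

def preemption_candidates(rows):
    """Return (tenant, reasons) for tenants allocated less than their steady fair share."""
    flagged = []
    for row in rows:
        reasons = []
        if row.allocated_vcores < row.steady_fair_share_vcores:
            reasons.append("cpu")
        if row.allocated_memory_gb < row.steady_fair_share_memory_gb:
            reasons.append("memory")
        if reasons:
            flagged.append((row.tenant, reasons))
    return flagged

rows = [
    ContentionAverages("etl", 12.0, 20.0, 64.0, 48.0),        # short on CPU only
    ContentionAverages("reporting", 30.0, 25.0, 40.0, 40.0),  # at or above fair share
]
print(preemption_candidates(rows))
```

Tenants returned by the check are the ones for which the preemption changes listed above are worth considering.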

Impala Tab

The Impala tab displays CPU and memory utilization for Impala queries using three tabs: Queries, Peak Memory Usage, and Spilled Memory.

For information about managing Impala resources, see:

Queries Tab

The Queries tab displays information about Impala queries.

The top part of the page displays summary information about Impala queries for the entire cluster. The table in the lower part displays the same information by tenant. Both sections display the following:

Total – Total number of queries.
Click the link with the total to view details and charts about the queries.
Avg Wait Time in Queue – Average time, in milliseconds, spent by a query in an Impala pool while waiting for resources. If this number is high, consider increasing the resources allocated to the pool. If this number is high for several pools, consider increasing the number of hosts in the cluster.
Successful – The number and percentage of queries that finished successfully.
Click the link with the total to view details and charts about the queries.
Memory Limit Exceeded – Number and percentage of queries that failed due to insufficient memory. If there are such queries, consider increasing the memory allocated to the pool. If there are several pools with such queries, consider increasing the number of hosts in the cluster.
Timed Out in Queue – Number of queries that timed out while waiting for resources in a pool. If there are such queries, consider increasing the maximum number of running queries allowed for the pool. If there are several pools with such queries, consider increasing the number of hosts in the cluster.
Rejected – Number of queries that were rejected by Impala because the pool was full. If this number is high, consider increasing the maximum number of queued queries allowed for the pool. See Admission Control and Query Queuing.

Click a column header to sort the table by that column.
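
The tuning hints attached to each column can be collected into one check per pool. The sketch below is illustrative only. The dictionary keys and the two thresholds for what counts as "high" are hypothetical, because the report text does not define numeric limits.

```python
def impala_pool_suggestions(stats, high_wait_ms=60_000, high_rejected=10):
    """Map per-pool query counters to the tuning hints described for the Queries tab.

    stats: dict with optional keys avg_wait_ms, memory_limit_exceeded,
    timed_out_in_queue, and rejected (all hypothetical names).
    """
    suggestions = []
    if stats.get("avg_wait_ms", 0) > high_wait_ms:
        suggestions.append("increase the resources allocated to the pool")
    if stats.get("memory_limit_exceeded", 0) > 0:
        suggestions.append("increase the memory allocated to the pool")
    if stats.get("timed_out_in_queue", 0) > 0:
        suggestions.append("increase the maximum number of running queries")
    if stats.get("rejected", 0) > high_rejected:
        suggestions.append("increase the maximum number of queued queries")
    return suggestions

pool_stats = {"avg_wait_ms": 1_500, "memory_limit_exceeded": 3, "rejected": 25}
for hint in impala_pool_suggestions(pool_stats):
    print(hint)
```

When the same hints appear for several pools at once, the report text suggests adding hosts to the cluster instead of tuning each pool.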

Peak Memory Usage Tab

This report shows how Impala consumes memory at peak utilization. If utilization is high for a pool, consider adding resources to the pool. If utilization is high for several pools, consider adding more hosts to the cluster.

The Summary section of this page displays aggregated peak memory usage information for the entire cluster and the Utilization by Tenant section displays peak memory usage by tenant. Both sections display the following:

Max Allocated
- Peak Allocation Time – The time when Impala reserved the maximum amount of memory for queries.
  Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.
- Max Allocated – The maximum memory that was reserved by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.
- Utilized at the Time – The amount of memory used by Impala for running queries at the time when maximum memory was reserved.
  Click View Time Series Chart to view a chart of peak memory allocations.
- Histogram of Allocated Memory at Peak Allocation Time – Distribution of memory reserved per Impala daemon for executing queries at the time Impala reserved the maximum memory. If some Impala daemons have reserved memory close to the configured limit, consider adding more physical memory to the hosts.
  Note: This histogram is generated from the minute-level metrics for Impala daemons. If the minute-level metrics for the timestamp at which peak allocation happened are no longer present in the Cloudera Service Monitor Time-Series Storage, the histogram shows no data. To maintain a longer history for the minute-level metrics, increase the value of the Time-Series Storage property for the Cloudera Service Monitor. (Go to the Cloudera Management Service > Configuration and search for Time-Series Storage.)
Max Utilized
- Peak Usage Time – The time when Impala used the maximum amount of memory for queries.
  Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.
- Max Utilized – The maximum memory that was used by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.
- Reserved at the Time – The amount of memory reserved by Impala at the time when it was using the maximum memory for executing queries.
  Click View Time Series Chart to view a chart of peak memory utilization.
- Histogram of Utilized Memory at Peak Usage Time – Distribution of memory used per Impala daemon for executing queries at the time Impala used the maximum memory. If some Impala daemons are using memory close to the configured limit, consider adding more physical memory to the hosts.
  Note: This histogram is generated from the minute-level metrics for Impala daemons. If the minute-level metrics for the timestamp at which peak usage happened are no longer present in the Cloudera Service Monitor Time-Series Storage, the histogram shows no data. To maintain a longer history for the minute-level metrics, increase the value of the Time-Series Storage property for the Cloudera Service Monitor. (Go to the Cloudera Management Service > Configuration and search for Time-Series Storage.)

Spilled Memory Tab

The Spilled Memory tab displays information about Impala memory that was spilled to disk. Disk spills can significantly degrade the performance of Impala queries. This report shows the amount of disk spill for Impala queries by tenant. If disk spill is high for a pool, consider adding resources to the pool. If disk spill is high for several pools, consider adding more hosts to the cluster.

For each tenant, the following are displayed:

Average Spill – Average spill per query
Maximum Spill – Maximum memory spilled per hour

Downloading Cluster Utilization Reports Using the Cloudera Manager API

You can download the Cluster Utilization Reports as a JSON file using the Cloudera Manager API. Three new API endpoints have been added.

See:

Cluster Utilization: http://cloudera.github.io/cm_api/apidocs/v18/path__clusters_-clusterName-_utilization.html
Impala Utilization: http://cloudera.github.io/cm_api/apidocs/v18/path__clusters_-clusterName-_impalaUtilization.html
YARN Utilization: http://cloudera.github.io/cm_api/apidocs/v18/path__clusters_-clusterName-_yarnUtilization.html
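
A download of the cluster-wide report might look like the sketch below. It is based on the v18 path documented at the first link above. The host name, cluster name, and credentials are placeholders, and the tenantType query parameter is an assumption that should be checked against the linked API documentation before use. The fetch function is defined but not called, so the example runs without a Cloudera Manager server.

```python
import base64
import json
import urllib.parse
import urllib.request

def utilization_url(base_url, cluster_name, **params):
    """Build the URL of the cluster utilization endpoint (API v18)."""
    cluster = urllib.parse.quote(cluster_name, safe="")
    path = f"/api/v18/clusters/{cluster}/utilization"
    query = urllib.parse.urlencode(params)
    return base_url.rstrip("/") + path + ("?" + query if query else "")

def download_report(url, user, password):
    """Fetch the report with HTTP basic authentication and parse the JSON body."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Placeholder host and cluster name; tenantType is an assumed parameter.
url = utilization_url("http://cm-host.example.com:7180", "Cluster 1", tenantType="POOL")
print(url)
# report = download_report(url, "admin", "admin-password")  # requires a reachable server
```

The Impala and YARN reports would use the impalaUtilization and yarnUtilization paths from the other two links in the same way.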
