Configuring Resource Parameters
After enabling cgroups, you can restrict and limit the resource consumption of roles (or role groups) on a per-resource basis.
Minimum Required Role: Cluster Administrator (also provided by Full Administrator). This feature is not available when using Cloudera Manager to manage Data Hub clusters.
- In the Cloudera Manager Admin Console, go to the service where you want to configure resources.
- Click the Configuration tab.
- Select .
- Locate the configuration parameters that begin with "Cgroup". You
can specify resource allocations for each type of resource (CPU,
memory, etc.) for all roles of the service using these
parameters, or you can specify different allocations for each role by
clicking the Edit Individual Values link. Edit
any of the following parameters:
- CPU Shares - The more CPU shares given to a role, the
larger its share of the CPU when under contention. Until processes
on the host (including both roles managed by Cloudera Manager and
other system processes) are contending for all of the CPUs, this
will have no effect. When there is contention, those processes
with higher CPU shares will be given more CPU time. The effect is
linear: a process with 4 CPU shares will be given roughly twice as
much CPU time as a process with 2 CPU shares.
Updates to this parameter are dynamically reflected in the running role.
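To make the linear relationship concrete, here is a minimal Python sketch (the role names and share values are hypothetical, chosen only to mirror the 4-versus-2 example above):

```python
def expected_cpu_fractions(shares):
    """Given cgroup CPU shares per role, return the fraction of CPU
    time each role can expect when all of them are contending."""
    total = sum(shares.values())
    return {name: s / total for name, s in shares.items()}

# A role with 4 shares gets roughly twice the CPU time of a role
# with 2 shares, as described above.
print(expected_cpu_fractions({"role_a": 4, "role_b": 2}))
# -> {'role_a': 0.666..., 'role_b': 0.333...}
```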
- I/O Weight - The greater the I/O weight, the higher
priority will be given to I/O requests made by the role when I/O
is under contention (either by roles managed by Cloudera Manager
or by other system processes).
This only affects read requests; write requests are not prioritized. Because writes are buffered, the Linux I/O scheduler controls when they are flushed to disk based on time and quantity thresholds, and it continually flushes buffered writes from many sources rather than favoring those of any particular process.
Updates to this parameter are dynamically reflected in the running role.
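On hosts using cgroups v1, this weight corresponds to the blkio controller's blkio.weight file. The following is a sketch only, assuming the controller is mounted at the conventional /sys/fs/cgroup/blkio path and using a hypothetical cgroup name; Cloudera Manager creates and manages the real per-role cgroups itself:

```python
from pathlib import Path

# Assumption: cgroups v1 blkio controller at the conventional mount
# point; "example_role" is a hypothetical cgroup name.
cgroup = Path("/sys/fs/cgroup/blkio/example_role")

def get_io_weight():
    # blkio.weight holds the relative I/O weight (range 10-1000).
    return int((cgroup / "blkio.weight").read_text())

def set_io_weight(weight):
    (cgroup / "blkio.weight").write_text(str(weight))
```

Halving a cgroup's weight roughly halves the priority of its read requests relative to other cgroups when the disk is under contention.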
- Memory Soft Limit - When the limit is reached, the kernel
will reclaim pages charged to the process if and only if the host
is facing memory pressure. If reclaiming fails, the kernel may
kill the process. Both anonymous pages and page cache pages
count toward the limit.
After updating this parameter, you must restart the role for changes to take effect.
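The corresponding cgroups v1 control file is memory.soft_limit_in_bytes. A minimal sketch, again assuming a v1 mount and a hypothetical cgroup name:

```python
from pathlib import Path

# Assumption: cgroups v1 memory controller; hypothetical cgroup name.
cgroup = Path("/sys/fs/cgroup/memory/example_role")

def set_memory_soft_limit(limit_bytes):
    # The soft limit is enforced only when the host is under memory
    # pressure; until then the group may exceed it freely.
    (cgroup / "memory.soft_limit_in_bytes").write_text(str(limit_bytes))

set_memory_soft_limit(1 * 1024**3)  # for example, 1 GiB
```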
- Memory Hard Limit - When a role's resident set size (RSS)
exceeds the value of this parameter, the kernel will swap out some
of the role's memory. If it is unable to do so, it will kill the
process. The kernel measures memory consumption in a manner that
does not necessarily match what top or ps report for RSS, so
expect this limit to be a rough approximation.
After updating this parameter, you must restart the role for changes to take effect.
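To see why the limit is only approximate, you can compare the VmRSS that ps and top report for a process with the kernel's own accounting for its cgroup. A sketch, assuming cgroups v1 and a hypothetical cgroup name; the two figures usually differ because the cgroup number also includes page cache charged to the group:

```python
from pathlib import Path

def proc_rss_bytes(pid):
    # VmRSS from /proc, which is what ps and top report.
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1]) * 1024  # reported in kB
    return 0

def cgroup_usage_bytes(cgroup="/sys/fs/cgroup/memory/example_role"):
    # The kernel's per-cgroup accounting, which includes page cache
    # and so rarely matches the VmRSS of the member processes.
    return int(Path(cgroup, "memory.usage_in_bytes").read_text())
```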
- Click Save Changes.
- Restart the service or roles where you changed values.
See Restarting a Cloudera Runtime Service or Starting, Stopping, and Restarting Role Instances.
Example: Protecting Production MapReduce Jobs from Impala Queries
- The cluster is using homogeneous hardware
- Each worker host has two cores
- Each worker host has 8 GB of RAM
- Each worker host is running a DataNode, TaskTracker, and an Impala Daemon
- Each role type is in a single role group
- Cgroups-based resource management has been enabled on all hosts
Action | Procedure |
---|---|
CPU | Lower the Impala Daemon role group's Cgroup CPU Shares well below the shares of the DataNode and TaskTracker role groups, so that under contention Impala receives about a fifth of the cluster's CPU. |
Memory | Limit each Impala Daemon to 1 GB of RAM, and cap the DataNode, TaskTracker, and MapReduce child Java heap sizes at 1 GB each. |
I/O | Lower the Impala Daemon role group's Cgroup I/O Weight well below the weights of the DataNode and TaskTracker role groups, so that Impala read requests receive about a fifth of the priority of HDFS and MapReduce reads. |
- When MapReduce jobs are running, all Impala queries together will consume up to a fifth of the cluster's CPU resources.
- Individual Impala Daemons will not consume more than 1 GB of RAM. If this figure is exceeded, new queries will be cancelled.
- DataNodes and TaskTrackers can consume up to 1 GB of RAM each.
- We expect up to 3 MapReduce tasks at a given time, each with a maximum heap size of 1 GB of RAM. That's up to 3 GB for MapReduce tasks.
- The remainder of each host's available RAM (2 GB) is reserved for other host processes.
- When MapReduce jobs are running, read requests issued by Impala queries will receive a fifth of the priority of either HDFS read requests or MapReduce read requests.
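A quick arithmetic check of the memory budget, using only the figures from the bullets above:

```python
total_ram_gb = 8
impala_daemon_gb = 1
datanode_gb = 1
tasktracker_gb = 1
mapreduce_tasks_gb = 3 * 1  # up to 3 tasks, 1 GB heap each

committed = impala_daemon_gb + datanode_gb + tasktracker_gb + mapreduce_tasks_gb
print(committed)                 # 6 GB committed to managed roles
print(total_ram_gb - committed)  # 2 GB left for other host processes
```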
Operating System Support for Cgroups
Cgroups are a feature of the Linux kernel, and as such, support depends on the host's Linux distribution and version. If a distribution lacks support for a given parameter, changes to the parameter have no effect.
The exact level of support can be found in the Cloudera Manager Agent log file, shortly after the Agent has started. In the log file, look for an entry like this:
Found cgroups capabilities: {
'has_memory': True,
'default_memory_limit_in_bytes': 9223372036854775807,
'writable_cgroup_dot_procs': True,
'has_cpu': True,
'default_blkio_weight': 1000,
'default_cpu_shares': 1024,
'has_blkio': True}
The has_cpu
and similar entries correspond directly to
support for the CPU, I/O, and memory parameters.
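Because the capabilities entry is printed as a Python-style dict literal, it can be pulled out of the log programmatically. A sketch; the log path shown is the agent's usual default, but verify it on your hosts:

```python
import ast
import re

def parse_cgroup_capabilities(log_text):
    """Return the first 'Found cgroups capabilities' entry in the
    agent log text as a dict, or None if no entry is present."""
    match = re.search(r"Found cgroups capabilities:\s*(\{.*?\})",
                      log_text, re.DOTALL)
    # The braces hold a Python literal, so literal_eval can parse it.
    return ast.literal_eval(match.group(1)) if match else None

with open("/var/log/cloudera-scm-agent/cloudera-scm-agent.log") as f:
    caps = parse_cgroup_capabilities(f.read())
if caps:
    print(caps["has_cpu"], caps["has_memory"], caps["has_blkio"])
```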