Linux Control Groups
Cloudera Manager supports the Linux control groups (cgroups) kernel feature. With cgroups, administrators can impose per-resource restrictions and limits on services and roles. Allocating resources this way isolates compute frameworks from one another. Resource allocation is implemented by setting properties for the services and roles.
Linux Distribution Support
Distribution | CPU Shares | I/O Weight | Memory Soft Limit | Memory Hard Limit |
---|---|---|---|---|
Red Hat Enterprise Linux (or CentOS) 5 | ||||
Red Hat Enterprise Linux (or CentOS) 6 | ||||
SUSE Linux Enterprise Server 11 | ||||
Ubuntu 10.04 LTS | ||||
Ubuntu 12.04 LTS | ||||
Debian 6.0 | ||||
Debian 7.0 | ||||
If a distribution lacks support for a given parameter, changes to the parameter have no effect.
The exact level of support can be found in the Cloudera Manager Agent log file, shortly after the Agent has started. See Viewing Cloudera Manager Server and Agent Logs to find the Agent log. In the log file, look for an entry like this:
Found cgroups capabilities: {'has_memory': True, 'default_memory_limit_in_bytes': 9223372036854775807, 'writable_cgroup_dot_procs': True, 'has_cpu': True, 'default_blkio_weight': 1000, 'default_cpu_shares': 1024, 'has_blkio': True}
The has_memory and similar entries correspond directly to support for the CPU, I/O, and memory parameters.
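As a rough illustration, the capabilities entry can be read programmatically. The sketch below assumes the entry ends with the Python-style dictionary exactly as shown above, and it assumes that has_memory covers both the soft and hard memory limits. The function names parse_capabilities and supported_parameters are hypothetical and are not part of Cloudera Manager.

```python
import ast

# Text that precedes the dictionary in the Agent log entry quoted above.
PREFIX = "Found cgroups capabilities: "


def parse_capabilities(log_line):
    """Return the capabilities dict from a log line, or None if it is absent."""
    start = log_line.find(PREFIX)
    if start == -1:
        return None
    # Assumption: everything after the prefix is a Python dict literal.
    payload = log_line[start + len(PREFIX):].strip()
    return ast.literal_eval(payload)


def supported_parameters(caps):
    """Map the has_* flags to the four resource parameters (assumed mapping)."""
    return {
        "CPU Shares": caps.get("has_cpu", False),
        "I/O Weight": caps.get("has_blkio", False),
        "Memory Soft Limit": caps.get("has_memory", False),
        "Memory Hard Limit": caps.get("has_memory", False),
    }


sample = (
    "Found cgroups capabilities: {'has_memory': True, "
    "'default_memory_limit_in_bytes': 9223372036854775807, "
    "'writable_cgroup_dot_procs': True, 'has_cpu': True, "
    "'default_blkio_weight': 1000, 'default_cpu_shares': 1024, "
    "'has_blkio': True}"
)

caps = parse_capabilities(sample)
for name, ok in supported_parameters(caps).items():
    print(f"{name}: {'supported' if ok else 'not supported'}")
```

Because the search uses find, any timestamp or logger text that precedes the entry on the same line is ignored.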
Further Reading
- http://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
- http://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt
- http://www.kernel.org/doc/Documentation/cgroups/memory.txt
- https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/index.html
Resource Management with Control Groups
To use cgroups, you must enable cgroup-based resource management under the Host > Resource Management configuration properties. If you configure static service pools, this property is set as part of that process.
Enabling Resource Management
- If you've upgraded from a version of Cloudera Manager older than Cloudera Manager 4.5, restart every Cloudera Manager Agent before using cgroups-based resource management:
  - Stop all services, including the Cloudera Management Service.
  - On each cluster host, run as root:
    $ service cloudera-scm-agent hard_restart
  - Start all services.
- Click the Hosts tab.
- Optionally click the link for the host where you want to enable cgroups.
- Select .
- Select the Resource Management category.
- Select the Enable Cgroup-based Resource Management checkbox.
- Restart all roles on the host or hosts.
Limitations
- Cgroup-based resource management parameters that are overridden at the role group or role instance level must be saved one at a time. Otherwise, some of the changes that should be reflected dynamically will be ignored.
- The role group abstraction is an imperfect fit for resource management parameters, where the goal is often to take a numeric value for a host resource and distribute it among running roles. The role group represents a "horizontal" slice: the same role across a set of hosts. However, the cluster is often viewed in terms of "vertical" slices, each being a combination of worker roles (such as TaskTracker, DataNode, Region Server, Impala daemon, and so on). Nothing in Cloudera Manager guarantees that these disparate horizontal slices are "aligned" (meaning that the role assignment is identical across hosts). If they are unaligned, some of the role group values will be incorrect on unaligned hosts. For example, a host whose role groups have been configured with memory limits but that is missing a role will probably have unassigned memory.
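The alignment problem can be illustrated with a small calculation. The sketch below is not a Cloudera Manager API; the host names, role memory limits, and function names are all hypothetical, chosen only to show how a host that is missing a role ends up with unassigned memory.

```python
def unassigned_memory(host_memory_gb, role_limits_gb, host_roles):
    """Return, per host, the memory not covered by any role's limit."""
    result = {}
    for host, roles in host_roles.items():
        assigned = sum(role_limits_gb[role] for role in roles)
        result[host] = host_memory_gb - assigned
    return result


def unaligned_hosts(host_roles, expected_roles):
    """Return hosts whose role assignment differs from the expected set."""
    expected = set(expected_roles)
    return sorted(h for h, roles in host_roles.items() if set(roles) != expected)


# Hypothetical role group memory limits, in GB.
role_limits_gb = {"DataNode": 1, "TaskTracker": 4, "Impala daemon": 1}

# Hypothetical role assignment; host2 has no Impala daemon.
host_roles = {
    "host1": ["DataNode", "TaskTracker", "Impala daemon"],
    "host2": ["DataNode", "TaskTracker"],
}

leftover = unassigned_memory(8, role_limits_gb, host_roles)
misaligned = unaligned_hosts(host_roles, role_limits_gb.keys())
print(leftover)    # memory left over on each host, in GB
print(misaligned)  # hosts that do not run the full set of roles
```

In this made-up layout, host2 runs one fewer role than host1, so the memory that the role groups intended for the Impala daemon is simply not assigned to anything on that host.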
Configuring Resource Parameters
- CPU Shares - The more CPU shares given to a role, the larger its share of the CPU when under contention. Until processes on the host (including both roles managed by Cloudera Manager and other system processes) are contending for all of the CPUs, this will have no effect. When there is contention, those processes with higher CPU shares will be given more CPU time. The effect is linear: a process with 4 CPU shares will be given roughly twice as much CPU time as a process with 2 CPU shares.
Updates to this parameter will be dynamically reflected in the running role.
- I/O Weight - The greater the I/O weight, the higher priority will be given to I/O requests made by the role when I/O is under contention (either by roles managed by Cloudera Manager or by other system processes). This only affects read requests; write requests remain unprioritized.
Updates to this parameter will be dynamically reflected in the running role.
- Memory Soft Limit - When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous pages and page cache pages count toward the limit.
After updating this parameter, the role must be restarted before changes take effect.
- Memory Hard Limit - When a role's resident set size (RSS) exceeds the value of this parameter, the kernel will swap out some of the role's memory. If it's unable to do so, it will kill the process. Note that the kernel measures memory consumption in a way that doesn't necessarily match what top or ps report for RSS, so treat this limit as a rough approximation.
After updating this parameter, the role must be restarted before changes take effect.
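The linear effect of CPU shares described above can be sketched as a simple proportion. This is a simplified model that assumes every listed process is runnable and competing for all CPUs; it is not how the kernel scheduler is implemented, and the function and role names are hypothetical.

```python
def cpu_fractions(shares):
    """Approximate each process's fraction of CPU time under full contention."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# The example from the text: 4 shares versus 2 shares.
fractions = cpu_fractions({"role_a": 4, "role_b": 2})
ratio = fractions["role_a"] / fractions["role_b"]
print(fractions)
print(f"role_a receives about {ratio:.1f}x the CPU time of role_b")
```

Only the ratio between share values matters in this model, so 4 and 2 behave the same as 1024 and 512. When the CPUs are not fully contended, the shares have no effect and a process can use more than its computed fraction.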
Example: Protecting Production MapReduce Jobs from Impala Queries
This example assumes the following:
- The cluster is using homogeneous hardware
- Each worker host has two cores
- Each worker host has 8 GB of RAM
- Each worker host is running a DataNode, TaskTracker, and an Impala daemon
- Each role type is in a single role group
- Cgroups-based resource management has been enabled on all hosts
Action | Procedure |
---|---|
CPU |
|
Memory |
|
I/O |
|
With these settings in place, the expected behavior is:
- When MapReduce jobs are running, all Impala queries together will consume up to a fifth of the cluster's CPU resources.
- Individual Impala daemons won't consume more than 1 GB of RAM. If this figure is exceeded, new queries will be cancelled.
- DataNodes and TaskTrackers can consume up to 1 GB of RAM each.
- We expect up to 3 MapReduce tasks at a given time, each with a maximum heap size of 1 GB of RAM. That's up to 3 GB for MR tasks.
- The limits above account for 6 GB of each host's 8 GB of RAM. The remaining 2 GB is reserved for other host processes.
- When MapReduce jobs are running, read requests issued by Impala queries will receive a fifth of the priority of either HDFS read requests or MapReduce read requests.
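The per-host memory figures in this example can be tallied with a few lines of arithmetic. The sketch below only adds up the numbers stated above. It treats the 1 GB maximum heap size of each MapReduce task as that task's total memory use, which is an approximation, and the variable names are hypothetical.

```python
HOST_RAM_GB = 8  # each worker host has 8 GB of RAM

# Per-host memory ceilings taken from the example, in GB.
limits_gb = {
    "Impala daemon": 1,
    "DataNode": 1,
    "TaskTracker": 1,
    "MapReduce tasks (3 x 1 GB heap)": 3 * 1,
}

allocated_gb = sum(limits_gb.values())
remaining_gb = HOST_RAM_GB - allocated_gb

for name, gb in limits_gb.items():
    print(f"{name}: {gb} GB")
print(f"Allocated: {allocated_gb} GB of {HOST_RAM_GB} GB")
print(f"Left for other host processes: {remaining_gb} GB")
```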