Integrated Resource Management with YARN

You can limit the CPU and memory resources used by Impala, to manage and prioritize workloads on clusters that run jobs from many Hadoop components. (Currently, there is no limit or throttling on the I/O for Impala queries.) In CDH 5, Impala can use the underlying Apache Hadoop YARN resource management framework, which allocates the required resources for each Impala query. Impala estimates the resources required by the query on each host of the cluster, and requests the resources from YARN.

Requests from Impala to YARN go through an intermediary service called Llama. When the resource requests are granted, Impala starts the query and places all relevant execution threads into the cgroup containers and sets up the memory limit on each host. If sufficient resources are not available, the Impala query waits until other jobs complete and the resources are freed. During query processing, as the need for additional resources arises, Llama can "expand" already-requested resources, to avoid over-allocating at the start of the query.

After a query is finished, Llama caches the resources (for example, leaving memory allocated) in case they are needed for subsequent Impala queries. This caching mechanism avoids the latency involved in making a whole new set of resource requests for each query. If the resources are needed by YARN for other types of jobs, Llama returns them.

Although waiting for resources might make individual queries seem less responsive on a heavily loaded cluster, the resource management feature makes the overall performance of the cluster smoother and more predictable, without sudden spikes in utilization caused by memory paging, CPUs pegged at 100%, and so on.

Continue reading:

The Llama Daemon
Controlling Resource Estimation Behavior
Checking Resource Estimates and Actual Usage
How Resource Limits Are Enforced
Enabling Resource Management for Impala
Limitations of Resource Management for Impala

The Llama Daemon

Llama is a system that mediates resource management between Impala and Hadoop YARN. Llama enables Impala to reserve, use, and release resource allocations in a Hadoop cluster. Llama is only required if resource management is enabled in Impala.

By default, YARN allocates resources bit-by-bit as needed by MapReduce jobs. Impala needs all resources available at the same time, so that intermediate results can be exchanged between cluster nodes, and queries do not stall partway through waiting for new resources to be allocated. Llama is the intermediary process that ensures all requested resources are available before each Impala query actually begins.

For management through Cloudera Manager, see The Impala Llama ApplicationMaster.

Controlling Resource Estimation Behavior

By default, Impala consults the table statistics and column statistics for each table in a query, and uses those figures to construct estimates of needed resources for each query. See COMPUTE STATS Statement for the statement to collect table and column statistics for a table.

To avoid problems with inaccurate or missing statistics, which can lead to inaccurate estimates of resource consumption, Impala allows you to set default estimates for CPU and memory resource consumption. As a query grows to require more resources, Impala will request more resources from Llama (this is called "expanding" a query reservation). When the query is complete, those resources are returned to YARN as normal. To enable this feature, use the command-line option -rm_always_use_defaults when starting impalad, and optionally -rm_default_memory=size and -rm_default_cpu_cores. Cloudera recommends always running with -rm_always_use_defaults enabled when using resource management, because if the query needs more resources than the default values, the resource requests are expanded dynamically as the query runs. See impalad Startup Options for Resource Management for details about each option.
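As an illustration of how these options combine, the following minimal Python sketch assembles an impalad argument list that turns on resource management with default estimates. The helper name build_rm_flags and the sample values are hypothetical, and the -flag=value spelling is an assumption about how the options are passed; only the option names come from this page.

```python
def build_rm_flags(default_memory=None, default_cpu_cores=None):
    """Return impalad flags that enable resource management with default estimates.

    Hypothetical helper for illustration; only the option names come from the docs.
    """
    # -enable_rm must be on, or the other resource management options have no effect.
    flags = ["-enable_rm=true", "-rm_always_use_defaults=true"]
    # The two overrides are optional; leaving them out relies on the built-in defaults.
    if default_memory is not None:
        flags.append(f"-rm_default_memory={default_memory}")
    if default_cpu_cores is not None:
        flags.append(f"-rm_default_cpu_cores={default_cpu_cores}")
    return flags


# Recommended form: rely on the built-in default estimates.
print(" ".join(["impalad"] + build_rm_flags()))
# Explicit overrides (sample values only).
print(" ".join(["impalad"] + build_rm_flags(default_memory="4GB", default_cpu_cores=2)))
```

Omitting both overrides matches the recommendation to rely on the built-in default values.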

Checking Resource Estimates and Actual Usage

To make resource usage easier to verify, the output of the EXPLAIN SQL statement now includes information about estimated memory usage, whether table and column statistics are available for each table, and the number of virtual cores that a query will use. You can get this information through the EXPLAIN statement without actually running the query. The extra information requires setting the query option EXPLAIN_LEVEL=verbose; see EXPLAIN Statement for details. The same extended information is shown at the start of the output from the PROFILE statement in impala-shell. The detailed profile information is only available after running the query. You can take appropriate actions (gathering statistics, adjusting query options) if you find that queries fail or run with suboptimal performance when resource management is enabled.

How Resource Limits Are Enforced

CPU limits are enforced by the Linux cgroups mechanism. YARN grants resources in the form of containers that correspond to cgroups on the respective machines.
Memory limits are enforced through Impala's own per-query memory limit mechanism. Once a reservation request has been granted, Impala sets the query memory limit to the granted amount of memory before executing the query.

Enabling Resource Management for Impala

To enable resource management for Impala, first you set up the YARN and Llama services for your CDH cluster. Then you add startup options and customize resource management settings for the Impala services.

Required CDH Setup for Resource Management with Impala

YARN is the general-purpose service that manages resources for many Hadoop components within a CDH cluster. Llama is a specialized service that acts as an intermediary between Impala and YARN, translating Impala resource requests to YARN and coordinating with Impala so that queries only begin executing when all needed resources have been granted by YARN.

For information about setting up the YARN and Llama services, see the instructions for Cloudera Manager.

Using Impala with a Llama High Availability Configuration

Impala can take advantage of the Llama high availability (HA) feature, with additional Llama servers that step in if the primary one becomes unavailable. (Only one Llama server at a time services all resource requests.) Before using this feature from Impala, read the background information about Llama HA, its main features, and how to set it up.

Command-line method for systems without Cloudera Manager:

Setting up the Impala side in a Llama HA configuration involves setting the impalad configuration options -llama_addresses (mandatory) and optionally -llama_max_request_attempts, -llama_registration_timeout_secs, and -llama_registration_wait_secs. See the next section impalad Startup Options for Resource Management for usage instructions for those options.

The impalad daemon on the coordinator host registers with the Llama server for each query, receiving a handle that is used for subsequent resource requests. If a Llama server becomes unavailable, all running Impala queries are cancelled. Subsequent queries register with the next specified Llama server. This registration only happens when a query or similar request causes an impalad to request resources through Llama. Therefore, when a Llama server becomes unavailable, that fact might not be reported immediately in the Impala status information such as the metrics page in the debug web UI.

Cloudera Manager method: See Llama High Availability.

impalad Startup Options for Resource Management

The following startup options for impalad enable resource management and customize its parameters for your cluster configuration:

-enable_rm: Whether to enable resource management or not, either true or false. The default is false. None of the other resource management options have any effect unless -enable_rm is turned on.
-llama_host: Hostname or IP address of the Llama service that Impala should connect to. The default is 127.0.0.1.
-llama_port: Port of the Llama service that Impala should connect to. The default is 15000.
-llama_callback_port: Port that Impala should start its Llama callback service on. Llama reports when resources are granted or preempted through that service.
-cgroup_hierarchy_path: Path where YARN and Llama will create cgroups for granted resources. Impala assumes that the cgroup for an allocated container is created in the path 'cgroup_hierarchy_path + container_id'.
-rm_always_use_defaults: If this Boolean option is enabled, Impala ignores computed estimates and always obtains the default memory and CPU allocation from Llama at the start of the query. These default estimates are approximately 2 CPUs and 4 GB of memory, possibly varying slightly depending on cluster size, workload, and so on. Cloudera recommends enabling -rm_always_use_defaults whenever resource management is used, and relying on these default values (that is, leaving out the two following options).
-rm_default_memory=size: Optionally sets the default estimate for memory usage for each query. You can use suffixes such as MB and GB, the same as with the MEM_LIMIT query option. Only has an effect when -rm_always_use_defaults is also enabled.
-rm_default_cpu_cores: Optionally sets the default estimate for number of virtual CPU cores for each query. Only has an effect when -rm_always_use_defaults is also enabled.
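The cgroup path convention described for -cgroup_hierarchy_path can be sketched as follows. This is an illustration under stated assumptions, not Impala's implementation: it assumes the two parts are joined with a path separator, and it assumes the Linux cgroups v1 convention of writing a thread ID to the cgroup's tasks file to move a thread into it. The function names and the container ID are hypothetical, and a temporary directory stands in for the real cgroup mount point.

```python
import os
import tempfile


def container_cgroup_path(cgroup_hierarchy_path, container_id):
    # Assumption: the hierarchy path and container ID are joined with a separator.
    return os.path.join(cgroup_hierarchy_path, container_id)


def assign_thread(cgroup_path, thread_id):
    """Record a thread ID in the cgroup's tasks file (cgroups v1 convention)."""
    with open(os.path.join(cgroup_path, "tasks"), "a") as tasks_file:
        tasks_file.write(f"{thread_id}\n")


# Demo: a temporary directory plays the role of the cgroup hierarchy.
root = tempfile.mkdtemp()
path = container_cgroup_path(root, "container_0001")  # hypothetical container ID
os.makedirs(path)
assign_thread(path, 12345)
with open(os.path.join(path, "tasks")) as tasks_file:
    print(tasks_file.read().strip())
```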

The following options fine-tune the interaction of Impala with Llama when Llama high availability (HA) is enabled. The -llama_addresses option is only applicable in a Llama HA environment. -llama_max_request_attempts, -llama_registration_timeout_secs, and -llama_registration_wait_secs work whether or not Llama HA is enabled, but are most useful in combination when Llama is set up for high availability.

-llama_addresses: Comma-separated list of hostname:port items, specifying all the members of the Llama availability group. Defaults to "127.0.0.1:15000".
-llama_max_request_attempts: Maximum number of times a request to reserve, expand, or release resources is retried until the request is cancelled. Attempts are only counted after Impala is registered with Llama. That is, a request survives at most llama_max_request_attempts-1 re-registrations. Defaults to 5.
-llama_registration_timeout_secs: Maximum number of seconds that Impala will attempt to register or re-register with Llama. If registration is unsuccessful, Impala cancels the action with an error, which could result in an impalad startup failure or a cancelled query. A setting of -1 means try indefinitely. Defaults to 30.
-llama_registration_wait_secs: Number of seconds to wait between attempts during Llama registration. Defaults to 3.
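To show how the registration options interact, here is a minimal Python sketch of a registration loop driven by -llama_addresses, -llama_registration_timeout_secs, and -llama_registration_wait_secs. It is not Impala's actual logic: the function names, the try_register callable, and the policy of cycling through the addresses in order are all assumptions made for illustration. The defaults mirror the documented ones (127.0.0.1:15000, 30 seconds, 3 seconds).

```python
import itertools
import time


def parse_llama_addresses(value="127.0.0.1:15000"):
    """Split a -llama_addresses value into (host, port) pairs."""
    pairs = []
    for item in value.split(","):
        host, _, port = item.strip().rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def register(addresses, try_register, timeout_secs=30, wait_secs=3,
             clock=time.monotonic, sleep=time.sleep):
    """Try each address in turn until one accepts or the timeout elapses.

    try_register is a hypothetical callable returning True on success.
    A timeout_secs of -1 means keep trying indefinitely.
    """
    start = clock()
    # Assumption: servers are tried round-robin in the order listed.
    for host, port in itertools.cycle(addresses):
        if try_register(host, port):
            return (host, port)
        # Give up if another wait would push us past the timeout.
        if timeout_secs != -1 and clock() - start + wait_secs > timeout_secs:
            raise TimeoutError("could not register with any Llama server")
        sleep(wait_secs)


# Demo with a stand-in server that always accepts the registration.
print(register(parse_llama_addresses(), lambda host, port: True))
```

Passing clock and sleep as parameters keeps the loop testable without real delays.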

impala-shell Query Options for Resource Management

Before issuing SQL statements through the impala-shell interpreter, you can use the SET command to configure the following parameters related to resource management:

Limitations of Resource Management for Impala

Currently, Impala in CDH 5 has the following limitations for resource management of Impala queries:

Table statistics are required, and column statistics are highly valuable, for Impala to produce accurate estimates of how much memory to request from YARN. See Overview of Table Statistics and Overview of Column Statistics for instructions on gathering both kinds of statistics, and EXPLAIN Statement for the extended EXPLAIN output where you can check that statistics are available for a specific table and set of columns.
If the Impala estimate of required memory is lower than is actually required for a query, Impala dynamically expands the amount of requested memory. Queries might still be cancelled if the reservation expansion fails, for example if there are insufficient remaining resources for that pool, if the expansion request takes long enough that it exceeds the query timeout interval, or because of YARN preemption. You can see the actual memory usage after a failed query by issuing a PROFILE command in impala-shell. Specify a larger memory figure with the MEM_LIMIT query option and retry the query.

The MEM_LIMIT query option, and the other resource-related query options, are settable through the ODBC or JDBC interfaces in Impala 2.0 and higher. This is a former limitation that is now lifted.
