3.3.3. Node Manager

The Node Manager is the per-node agent that manages an individual compute node in a Hadoop cluster. Its duties include staying in sync with the Resource Manager, managing the life-cycle of application Containers, monitoring the resource usage (memory, CPU) of individual Containers, monitoring node health, and managing logs and other auxiliary services that YARN applications can use.

On start-up, the Node Manager registers with the Resource Manager. It then sends periodic heartbeats that report its status and waits for instructions in return. Its primary goal is to manage the application Containers that the Resource Manager assigns to it.
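
The register-then-heartbeat exchange can be sketched in a few lines of Java. This is a minimal illustration, not the YARN implementation: the types HeartbeatSketch, ResourceManager, ScriptedResourceManager, KillInstruction, and NodeManager are hypothetical stand-ins invented for this example, and the real protocol records carry considerably more state.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: every type here is hypothetical and much simpler
// than the real YARN protocol records.
class HeartbeatSketch {

    /** An instruction the Resource Manager may return in a heartbeat response. */
    static final class KillInstruction {
        final String containerId;
        KillInstruction(String containerId) { this.containerId = containerId; }
    }

    /** Stand-in for the Resource Manager side of the exchange. */
    interface ResourceManager {
        void register(String nodeId, int memoryMb, int vcores);
        List<KillInstruction> heartbeat(String nodeId, List<String> runningContainers);
    }

    /** Scripted stand-in: records registrations and asks for one Container to be killed. */
    static final class ScriptedResourceManager implements ResourceManager {
        final List<String> registeredNodes = new ArrayList<>();
        private final String containerToKill;

        ScriptedResourceManager(String containerToKill) {
            this.containerToKill = containerToKill;
        }

        @Override
        public void register(String nodeId, int memoryMb, int vcores) {
            registeredNodes.add(nodeId);
        }

        @Override
        public List<KillInstruction> heartbeat(String nodeId, List<String> runningContainers) {
            List<KillInstruction> instructions = new ArrayList<>();
            if (runningContainers.contains(containerToKill)) {
                instructions.add(new KillInstruction(containerToKill));
            }
            return instructions;
        }
    }

    /** Stand-in for the Node Manager: registers once, then heartbeats. */
    static final class NodeManager {
        private final String nodeId;
        private final ResourceManager rm;
        private final List<String> running = new ArrayList<>();

        NodeManager(String nodeId, ResourceManager rm) {
            this.nodeId = nodeId;
            this.rm = rm;
        }

        /** Registration happens once, at start-up. */
        void start(int memoryMb, int vcores) {
            rm.register(nodeId, memoryMb, vcores);
        }

        void launch(String containerId) {
            running.add(containerId);
        }

        /** One heartbeat round: report status, then apply the returned instructions. */
        void heartbeatOnce() {
            List<KillInstruction> instructions = rm.heartbeat(nodeId, new ArrayList<>(running));
            for (KillInstruction instruction : instructions) {
                running.remove(instruction.containerId);
            }
        }

        List<String> runningContainers() {
            return new ArrayList<>(running);
        }
    }

    public static void main(String[] args) {
        ScriptedResourceManager rm = new ScriptedResourceManager("container_2");
        NodeManager nm = new NodeManager("node-1", rm);
        nm.start(8192, 4);
        nm.launch("container_1");
        nm.launch("container_2");
        nm.heartbeatOnce();
        System.out.println(nm.runningContainers());
    }
}
```

In the sketch, instructions travel back only as the response to a heartbeat, which mirrors the "report status, then wait for instructions" pattern described above. A real Node Manager would run heartbeatOnce on a timer rather than calling it by hand.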

YARN Containers are described by a Container Launch Context (CLC). This record includes a map of environment variables, dependencies stored in remotely accessible storage, security tokens, payloads for Node Manager services, and the command necessary to create the process. After validating the authenticity of the Container lease, the Node Manager configures the environment for the Container, including initializing its monitoring subsystem with the resource constraints specified by the application. The Node Manager will also kill Containers as directed by the Resource Manager.
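
To make the shape of a CLC and the launch sequence more concrete, the following Java sketch models the record and the steps the Node Manager takes before starting a Container. It is an illustration under stated assumptions rather than YARN's own code: LaunchContext, Lease, MemoryMonitor, PreparedContainer, and prepare are hypothetical names, the lease check is reduced to a boolean flag, and monitoring is reduced to a single memory limit. Only java.lang.ProcessBuilder is a real library class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these types are hypothetical simplifications,
// not the records used by YARN itself.
class ContainerLaunchSketch {

    /** Simplified stand-in for the fields the text lists for a CLC. */
    static final class LaunchContext {
        final Map<String, String> environment;     // environment variables
        final Map<String, String> dependencies;    // local name -> remote URI
        final byte[] securityTokens;               // opaque credentials
        final Map<String, byte[]> servicePayloads; // data for Node Manager services
        final List<String> command;                // command that creates the process

        LaunchContext(Map<String, String> environment, Map<String, String> dependencies,
                      byte[] securityTokens, Map<String, byte[]> servicePayloads,
                      List<String> command) {
            this.environment = environment;
            this.dependencies = dependencies;
            this.securityTokens = securityTokens;
            this.servicePayloads = servicePayloads;
            this.command = command;
        }
    }

    /** Hypothetical lease; authenticity is assumed to have been verified elsewhere. */
    static final class Lease {
        final String containerId;
        final int memoryLimitMb;
        final boolean authentic;

        Lease(String containerId, int memoryLimitMb, boolean authentic) {
            this.containerId = containerId;
            this.memoryLimitMb = memoryLimitMb;
            this.authentic = authentic;
        }
    }

    /** Minimal monitor initialized with the Container's resource constraint. */
    static final class MemoryMonitor {
        final int limitMb;
        MemoryMonitor(int limitMb) { this.limitMb = limitMb; }
        boolean exceeds(int usedMb) { return usedMb > limitMb; }
    }

    static final class PreparedContainer {
        final ProcessBuilder process;
        final MemoryMonitor monitor;
        PreparedContainer(ProcessBuilder process, MemoryMonitor monitor) {
            this.process = process;
            this.monitor = monitor;
        }
    }

    /** Validate the lease, configure the environment, and set up monitoring. */
    static PreparedContainer prepare(Lease lease, LaunchContext clc) {
        if (!lease.authentic) {
            throw new SecurityException("invalid Container lease: " + lease.containerId);
        }
        ProcessBuilder process = new ProcessBuilder(clc.command);
        process.environment().putAll(clc.environment);
        // The process is configured but deliberately not started in this sketch.
        return new PreparedContainer(process, new MemoryMonitor(lease.memoryLimitMb));
    }

    public static void main(String[] args) {
        LaunchContext clc = new LaunchContext(
                Collections.singletonMap("APP_HOME", "/opt/app"),
                Collections.singletonMap("app.jar", "hdfs:///apps/app.jar"),
                new byte[0],
                Collections.<String, byte[]>emptyMap(),
                Arrays.asList("java", "-jar", "app.jar"));
        PreparedContainer container = prepare(new Lease("container_1", 2048, true), clc);
        System.out.println(container.process.command());
        System.out.println(container.monitor.exceeds(4096));
    }
}
```

The ordering in prepare follows the paragraph above: the lease is checked first, so a Container with an invalid lease never has its environment configured. Fetching the dependencies from remote storage and handing the service payloads to auxiliary services are omitted here.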