Troubleshooting Linux Container Executor
A list of numeric error codes communicated by the container-executor to the NodeManager that appear in the /var/log/hadoop-yarn NodeManager log.
Numeric Code |
Name |
Description |
---|---|---|
1 |
INVALID_ARGUMENT_NUMBER |
|
2 |
INVALID_USER_NAME |
The user passed to the container-executor does not exist. |
3 |
INVALID_COMMAND_PROVIDED |
The container-executor does not recognize the command it was asked to run. |
5 |
INVALID_NM_ROOT |
The passed NodeManager root does not match the configured NodeManager root
( |
6 |
SETUID_OPER_FAILED |
Either could not read the local groups database, or could not set UID or GID |
7 |
UNABLE_TO_EXECUTE_CONTAINER_SCRIPT |
The container-executor could not run the container launcher script. |
8 |
UNABLE_TO_SIGNAL_CONTAINER |
The container-executor could not signal the container it was passed. |
9 |
INVALID_CONTAINER_PID |
The PID passed to the container-executor was negative or 0. |
18 |
OUT_OF_MEMORY |
The container-executor couldn't allocate enough memory while reading the container-executor.cfg file, or while getting the paths for the container launcher script or credentials files. |
20 |
INITIALIZE_USER_FAILED |
Couldn't get, stat, or secure the per-user NodeManager directory. |
21 |
UNABLE_TO_BUILD_PATH |
The container-executor couldn't concatenate two paths, most likely because it ran out of memory. |
22 |
INVALID_CONTAINER_EXEC_PERMISSIONS |
The container-executor binary does not have the correct permissions set. |
24 |
INVALID_CONFIG_FILE |
The container-executor.cfg file is missing, malformed, or has incorrect permissions. |
25 |
SETSID_OPER_FAILED |
Could not set the session ID of the forked container. |
26 |
WRITE_PIDFILE_FAILED |
Failed to write the value of the PID of the launched container to the PID file of the container. |
255 |
Unknown Error |
This error has several possible causes. Some common causes are:
|
Numeric Code |
Name |
Description |
---|---|---|
0 |
SUCCESS |
Container has finished succesfully. |
-1000 |
INVALID |
Initial value of the container exit code. A container that does not have a COMPLETED state will always return this status. |
-100 |
ABORTED |
Containers killed by the framework, either due to being released by the application or being 'lost' due to node failures, for example. |
-101 |
DISKS_FAILED |
Container exited due to local disks issues in the NodeManager node. This occurs when the number of good nodemanager-local-directories or nodemanager-log-directories drops below the health threshold. |
-102 |
PREEMPTED |
Containers preempted by the framework. This does not count towards a container failure in most applications. |
-103 |
KILLED_EXCEEDED_VMEM |
Container terminated because of exceeding allocated virtual memory limit. |
-104 |
KILLED_EXCEEDED_PMEM |
Container terminated because of exceeding allocated physical memory limit. |
-105 |
KILLED_BY_APPMASTER |
Container was terminated on request of the application master. |
-106 |
KILLED_BY_RESOURCEMANAGER |
Container was terminated by the resource manager. |
-107 |
KILLED_AFTER_APP_COMPLETION |
Container was terminated after the application finished. |