Docker on YARN example: DistributedShell
Learn how to run an arbitrary shell command through a DistributedShell YARN application.
- Prepare a UNIX-based Docker image, for example ubuntu:18.04.
- In Cloudera Manager, select the YARN service.
- Click the Configuration tab.
- Search for docker.trusted.registries and find the Trusted Registries for Docker Containers property.
- Add library to the list of trusted registries to allow ubuntu:18.04.
- Click Save Changes.
- Restart the YARN service using Cloudera Manager.
- Locate the hadoop-yarn-applications-distributedshell JAR on a Cloudera Manager host.
- Set the YARN_JAR environment variable to the path of the hadoop-yarn-applications-distributedshell JAR. For example, using the default value:
  YARN_JAR=/opt/cloudera/parcels/CDH/jars/hadoop-yarn-applications-distributedshell-<jar version number>.jar
- Choose an arbitrary shell command. For example, "cat /etc/*-release", which displays OS-related information on UNIX-based systems.
- Run the DistributedShell job, providing the shell command in the -shell_command option:
  sudo -u hdfs hadoop org.apache.hadoop.yarn.applications.distributedshell.Client \
    -jar $YARN_JAR \
    -shell_command "cat /etc/*-release" \
    -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
    -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=library/ubuntu:18.04
- Check the output of the command using the yarn logs command line tool:
  sudo -u yarn yarn logs -applicationId <id of the DistributedShell application> -log_files stdout
  For the ubuntu image, the output should look like the following:
  DISTRIB_ID=Ubuntu
  DISTRIB_RELEASE=18.04
  DISTRIB_CODENAME=bionic
  DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
  NAME="Ubuntu"
  VERSION="18.04.3 LTS (Bionic Beaver)"
  ...
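The JAR lookup and the YARN_JAR assignment from the steps above can be scripted. The following is a minimal sketch, assuming a Bash-compatible shell on a cluster host. The function name find_distributedshell_jar is hypothetical, and the default directory is the parcel path quoted in the steps. The version suffix is matched with a glob because it differs between releases.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Hadoop or Cloudera Manager):
# print the first DistributedShell JAR found in the given directory.
# The default directory is the parcel path used in the steps above.
find_distributedshell_jar() {
  local jar_dir="${1:-/opt/cloudera/parcels/CDH/jars}"
  local candidate
  # The glob expands in alphabetical order; the first existing match wins.
  for candidate in "$jar_dir"/hadoop-yarn-applications-distributedshell-*.jar; do
    if [ -e "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # Nothing matched: signal failure so the caller can stop early.
  return 1
}

# Usage on a cluster host:
#   YARN_JAR="$(find_distributedshell_jar)" || echo "DistributedShell JAR not found" >&2
#   export YARN_JAR
```

If the JAR lives elsewhere on your hosts, pass that directory as the first argument instead of relying on the default.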

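The yarn logs step needs the ID of the DistributedShell application. The sketch below shows one way to pick it out of saved client output. It assumes that the output contains an ID in the standard application_<cluster timestamp>_<sequence> form. The function name extract_application_id and the file name client.log are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read text on stdin and print the first YARN
# application ID (application_<cluster timestamp>_<sequence>) found in it.
# Prints nothing if the input contains no such ID.
extract_application_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Usage on a cluster host (client.log is an arbitrary file name):
#   sudo -u hdfs hadoop org.apache.hadoop.yarn.applications.distributedshell.Client \
#     -jar "$YARN_JAR" \
#     -shell_command "cat /etc/*-release" \
#     -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
#     -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=library/ubuntu:18.04 \
#     > client.log 2>&1
#   app_id="$(extract_application_id < client.log)"
#   sudo -u yarn yarn logs -applicationId "$app_id" -log_files stdout
```

If the client output on your cluster does not include the ID, look it up in the YARN ResourceManager UI instead and pass it to yarn logs directly.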