Docker on YARN example: MapReduce job
Learn how to run the Pi MapReduce example job in a Docker image. In a clean cluster where no fine-grained permissions are set, issues can occur.
- Prepare a UNIX-based Docker image with Java, preferably JDK8. For example, ibmjava:8.
- In Cloudera Manager, select the YARN service.
- Click the Configuration tab.
-
Search for
docker.trusted.registriesand find the Trusted Registries for Docker Containers property. -
Add
libraryto the list of trusted registries to allow the ibmjava. -
Search for
map.output. - Find the Compression Codec of MapReduce Map Output property.
-
Change its value to
org.apache.hadoop.io.compress.DefaultCodec. - Click Save Changes.
- Restart the YARN service using Cloudera Manager.
-
Search for the
hadoop-mapreduce-examplejar in a Cloudera Manager manager host. -
Set the
YARN_JARenvironment variable to the path of thehadoop-mapreduce-examplejar.For example, using the default value:YARN_JAR=/opt/cloudera/parcels/CDH/jars/hadoop-mapreduce-examples-<jar version number>.jar -
Start the Pi MapReduce job with the following command:
yarn jar $YARN_JAR pi \ -Dmapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=library/ibmjava:8" \ -Dmapreduce.reduce.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=library/ibmjava:8" \ 1 40000
