Chapter 7. Running Multiple MapReduce Versions Using the YARN Distributed Cache
Beginning in HDP 2.2, multiple versions of the MapReduce framework can be deployed using the YARN Distributed Cache. By setting the appropriate configuration properties, you can run jobs using a different version of the MapReduce framework than the one currently installed on the cluster.
Distributed cache ensures that the MapReduce job framework version is consistent throughout the entire job lifecycle. This enables you to maintain consistent results from MapReduce jobs during a rolling upgrade of the cluster. Without using Distributed Cache, a MapReduce job might start with one framework version, but finish with the new (upgrade) version, which could lead to unpredictable results.
YARN Distributed Cache enables you to efficiently distribute large read-only files (text files, archives, .jar files, etc) for use by YARN applications. Applications use URLs (hdfs://) to specify the files to be cached, and the Distributed Cache framework copies the necessary files to the applicable nodes before any tasks for the job are executed. Its efficiency stems from the fact that the files are copied only once per job, and archives are extracted after they are copied to the applicable nodes. Note that Distributed Cache assumes that the files to be cached (and specified via hdfs:// URLs) are already present on the HDFS file system and are accessible by every node in the cluster.
Configuring MapReduce for the YARN Distributed Cache
Copy the tarball that contains the version of MapReduce you would like to use into an HDFS directory that applications can access.
$HADOOP_HOME/bin/hdfs dfs -put mapreduce.tar.gz /mapred/framework/
In the
mapred-site.xml
file, set the value of themapreduce.application.framework.path
property URL to point to the archive file you just uploaded. The URL allows you to create an alias for the archive if a URL fragment identifier is specified. In the following example,${hdp.version}
should be replaced with the applicable HDP version, andmr-framework
is specified as the alias:<property> <name>mapreduce.application.framework.path</name> <value>hdfs:/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value> </property>
In the
mapred-site.xml
file, the default value of themapreduce.application.classpath
uses the${hdp.version}
environment variable to reference the currently installed version of HDP:<property> <name>mapreduce.application.classpath</name> <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar</value> </property>
Change the value of the
mapreduce.application.classpath
property to reference the applicable version of the MapReduce framework .jar files. In this case we need to replace${hdp.version}
with the applicable HDP version, which in our example is2.4.0.0-$BUILD
. Note that in the following example themr-framework
alias is used in the path references.<property> <name>mapreduce.application.classpath</name> <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.4.0.0-$BUILD/hadoop/lib/hadoop-lzo-0.6.0.2.4.0.0-$BUILD.jar</value> </property>
With this configuration in place, MapReduce jobs will run on the version 2.4.0.0-$BUILD framework referenced in the
mapred-site.xml
file.You can upload multiple versions of the MapReduce framework to HDFS and create a separate
mapred-site.xml
file to reference each version of the framework. Users can then run jobs against a specific version by referencing the applicablemapred-site.xml
file. The following example would run a MapReduce job on version 2.1 of the MapReduce framework:hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi -conf etc/hdp-2.1.0.0/mapred-site.xml 10 10
You can use the ApplicationMaster log file to confirm that the job ran on the localized version of MapReduce on the Distributed Cache. For example:
2014-06-10 08:19:30,199 INFO [main] org.mortbay.log: Extract jar: file:/<nm-local-dirs>/filecache/10/hadoop-2.4.0.tar.gz/hadoop-2.4.0/share/hadoop/yarn/hadoop-yarn-common-2.4.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_42544_mapreduce____.pryk9q/webapp
Limitations
Support for deploying the MapReduce framework via the YARN Distributed Cache currently does not address the job client code used to submit and query jobs. It also does not address the ShuffleHandler code that runs as an auxiliary service within each NodeManager. Therefore, the following limitations apply to MapReduce versions that can be successfully deployed via the Distributed Cache:
The MapReduce version must be compatible with the job client code used to submit and query jobs. If it is incompatible, the job client must be upgraded separately on any node on which jobs are submitted using the new MapReduce version.
The MapReduce version must be compatible with the configuration files used by the job client submitting the jobs. If it is incompatible with that configuration (that is, a new property must be set, or an existing property value must be changed), the configuration must be updated before submitting jobs.
The MapReduce version must be compatible with the ShuffleHandler version running on the cluster nodes. If it is incompatible, the new ShuffleHandler code must be deployed to all nodes in the cluster, and the NodeManagers must be restarted to pick up the new ShuffleHandler code.
Troubleshooting Tips
You can use the ApplicationMaster log file to check the version of MapReduce being used by a running job. For example:
2014-11-20 08:19:30,199 INFO [main] org.mortbay.log: Extract jar: file:/<nm-local-dirs>/filecache/{...}/hadoop-2.6.0.tar.gz/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-common-2.6.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_42544_mapreduce____.pryk9q/webapp
If shuffle encryption is enabled, MapReduce jobs may fail with the following exception:
2014-10-10 02:17:16,600 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to junping-du-centos6.x-3.cs1cloud.internal:13562 with 1 map outputs javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174) at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731) at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241) at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235) at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1206) at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136) at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593) at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529) at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925) at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170) at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197) at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181) at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:81) at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:61) at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:584) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1193) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379) at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427) ....
To fix this problem, create a sub-directory under
$HADOOP_CONF
($HADOOP_HOME/etc/hadoop
by default), and copy thessl-client.xml
file to that directory. Add this new directory path (/etc/hadoop/conf/secure) to the MapReduce classpath specified inmapreduce.application.classpath
in themapred-site.xml
file.