Configure YARN and MapReduce

After you upgrade Hadoop, complete the following steps to update your configs.

	Note
	The `su` commands in this section use keywords to represent the Service user. For example, "hdfs" is used to represent the HDFS Service user. If you are using another name for your Service users, you need to substitute your Service user name in each of the `su` commands.

	Important
	In secure mode, you must have Kerberos credentials for the hdfs user.

Upload the MapReduce tarball to HDFS. As the HDFS user (for example, hdfs), run:
su - hdfs -c "hdfs dfs -mkdir -p /hdp/apps/2.5.5.0-<$version>/mapreduce/"
su - hdfs -c "hdfs dfs -put /usr/hdp/2.5.5.0-<$version>/hadoop/mapreduce.tar.gz /hdp/apps/2.5.5.0-<$version>/mapreduce/"
su - hdfs -c "hdfs dfs -chown -R hdfs:hadoop /hdp"
su - hdfs -c "hdfs dfs -chmod -R 555 /hdp/apps/2.5.5.0-<$version>/mapreduce"
su - hdfs -c "hdfs dfs -chmod -R 444 /hdp/apps/2.5.5.0-<$version>/mapreduce/mapreduce.tar.gz"
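
Because the same version string appears in every path, it can help to generate the commands from one argument and review them before running anything. The following is a minimal sketch, assuming a POSIX shell; `print_mr_upload_cmds` is a hypothetical helper name and is not part of HDP. It only prints the commands listed above and does not execute them.

```shell
#!/bin/sh
# Sketch only: print (do not run) the upload commands for one HDP build.
# print_mr_upload_cmds is a hypothetical helper, not an HDP tool.
print_mr_upload_cmds() {
  ver="$1"                                       # full version string, as used under /usr/hdp
  dest="/hdp/apps/${ver}/mapreduce"              # target directory in HDFS
  src="/usr/hdp/${ver}/hadoop/mapreduce.tar.gz"  # local tarball shipped with the build
  printf 'su - hdfs -c "hdfs dfs -mkdir -p %s/"\n' "$dest"
  printf 'su - hdfs -c "hdfs dfs -put %s %s/"\n' "$src" "$dest"
  printf 'su - hdfs -c "hdfs dfs -chown -R hdfs:hadoop /hdp"\n'
  printf 'su - hdfs -c "hdfs dfs -chmod -R 555 %s"\n' "$dest"
  printf 'su - hdfs -c "hdfs dfs -chmod -R 444 %s/mapreduce.tar.gz"\n' "$dest"
}
```

Calling the function with your full version string in place of 2.5.5.0-<$version> prints the five commands so that you can check the paths first. If your HDFS Service user is not hdfs, adjust the `su` target accordingly.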

Make sure that the following properties, including mapreduce.application.framework.path, are in /etc/hadoop/conf/mapred-site.xml:

<property>
 <name>mapreduce.application.framework.path</name> 
 <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
</property>
                  
<property>
 <name>yarn.app.mapreduce.am.admin-command-opts</name>
 <value>-Dhdp.version=${hdp.version}</value>
</property>

	Note
	You do not need to modify ${hdp.version}.

Modify the following existing properties to include ${hdp.version}:

<property>
 <name>mapreduce.admin.user.env</name>
 <value>LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64</value>
</property>
 
<property>
 <name>mapreduce.admin.map.child.java.opts</name>
 <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
 <final>true</final>
</property>
 
<property>
 <name>mapreduce.admin.reduce.child.java.opts</name>
 <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
 <final>true</final>
</property>
 
<property>
 <name>mapreduce.application.classpath</name> 
 <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*,
   $PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*,
   $PWD/mr-framework/hadoop/share/hadoop/common/*,
   $PWD/mr-framework/hadoop/share/hadoop/common/lib/*,
   $PWD/mr-framework/hadoop/share/hadoop/yarn/*,
   $PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*,
   $PWD/mr-framework/hadoop/share/hadoop/hdfs/*,
   $PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*,
  /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar,
  /etc/hadoop/conf/secure</value>
</property>

	Note
	You do not need to modify ${hdp.version}.
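
To confirm that mapred-site.xml contains the properties listed above, a simple text check is often enough. The following is a minimal sketch, assuming a POSIX shell and that each property name is written as `<name>...</name>` on a single line with no whitespace inside the tags; `check_props` is a hypothetical helper name. It only checks that the names are present and does not validate the values.

```shell
#!/bin/sh
# check_props FILE NAME...
# Prints "missing: NAME" for each property name not found in FILE.
# Returns 0 when every name is present, 1 otherwise.
# Hypothetical helper; matches the literal text <name>NAME</name> only.
check_props() {
  file="$1"
  shift
  missing=0
  for name in "$@"; do
    if ! grep -Fq -- "<name>${name}</name>" "$file"; then
      echo "missing: ${name}"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (not executed here):
# check_props /etc/hadoop/conf/mapred-site.xml \
#   mapreduce.application.framework.path \
#   yarn.app.mapreduce.am.admin-command-opts \
#   mapreduce.admin.user.env \
#   mapreduce.admin.map.child.java.opts \
#   mapreduce.admin.reduce.child.java.opts \
#   mapreduce.application.classpath
```

A plain-text match keeps the sketch free of dependencies; an XML-aware tool would be more robust if your files are formatted differently.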

	Note
	If you plan to use Spark in yarn-client mode, configure Spark to work in yarn-client mode with 2.5.5.0-<$version>.

Make sure the following property is in /etc/hadoop/conf/yarn-site.xml:

<property>
 <name>yarn.application.classpath</name> 
 <value>$HADOOP_CONF_DIR,/usr/hdp/${hdp.version}/hadoop-client/*,
  /usr/hdp/${hdp.version}/hadoop-client/lib/*,
  /usr/hdp/${hdp.version}/hadoop-hdfs-client/*,
  /usr/hdp/${hdp.version}/hadoop-hdfs-client/lib/*,
  /usr/hdp/${hdp.version}/hadoop-yarn-client/*,
  /usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*</value>
</property>

On secure clusters only, add the following properties to /etc/hadoop/conf/yarn-site.xml:

<property>
 <name>yarn.timeline-service.recovery.enabled</name>
 <value>true</value>
</property>

<property>
 <name>yarn.timeline-service.state-store-class</name>
 <value>org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore</value>
</property>

<property>
 <name>yarn.timeline-service.leveldb-state-store.path</name>
 <value>(the same as the default of yarn.timeline-service.leveldb-timeline-store.path)</value>
</property>

For secure clusters, you must create and configure the container-executor.cfg configuration file:
- Create the container-executor.cfg file in /usr/hdp/2.5.5.0-<$version>/hadoop-yarn/bin/container-executor.
- Insert the following properties:
```
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred
min.user.id=1000
```
  - yarn.nodemanager.linux-container-executor.group - The group of the container executor. This must match the value of yarn.nodemanager.linux-container-executor.group in yarn-site.xml.
  - banned.users - Comma-separated list of users who cannot run container-executor.
  - min.user.id - Minimum user ID allowed. This prevents system users from running container-executor.
  - allowed.system.users - Comma-separated list of allowed system users.
- Set the permissions on /etc/hadoop/conf/container-executor.cfg so that only root can read it:
```
chown root:hadoop /etc/hadoop/conf/container-executor.cfg
chmod 400 /etc/hadoop/conf/container-executor.cfg
```
- Set the container-executor program so that only root or hadoop group users can run it:
```
chown root:hadoop /usr/hdp/2.5.5.0-<$version>/hadoop-yarn-server-nodemanager/bin/container-executor
chmod 6050 /usr/hdp/2.5.5.0-<$version>/hadoop-yarn-server-nodemanager/bin/container-executor
```
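
Since the group in container-executor.cfg must match the value in yarn-site.xml, it can be useful to read the settings back after editing the file. The following is a minimal sketch, assuming a POSIX shell with awk and a file made of plain key=value lines as shown above; `cfg_get` and `group_matches` are hypothetical helper names. The expected group is passed in by the caller and is not read from yarn-site.xml.

```shell
#!/bin/sh
# cfg_get FILE KEY
# Prints the value of KEY from a key=value file, with trailing whitespace removed.
# Prints nothing when the key is absent. Hypothetical helper.
cfg_get() {
  awk -F= -v key="$2" '
    $1 == key {
      sub(/^[^=]*=/, "")     # drop "KEY="
      sub(/[ \t]+$/, "")     # drop trailing spaces or tabs
      print
      exit
    }' "$1"
}

# group_matches FILE EXPECTED_GROUP
# Returns 0 when the container executor group in FILE equals EXPECTED_GROUP.
group_matches() {
  [ "$(cfg_get "$1" yarn.nodemanager.linux-container-executor.group)" = "$2" ]
}

# Example calls (not executed here):
# cfg_get /etc/hadoop/conf/container-executor.cfg min.user.id
# group_matches /etc/hadoop/conf/container-executor.cfg hadoop && echo "group OK"
```

Run these as root, because the file is readable only by root after the permission change above.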
