Configure YARN and MapReduce

After you install Hadoop, modify the YARN and MapReduce configuration files as described in the following steps.

Upload the MapReduce tarball to HDFS. Run the following commands as the HDFS user (for example, hdfs):
hdfs dfs -mkdir -p /hdp/apps/2.5.0.0-<$version>/mapreduce/
hdfs dfs -put /usr/hdp/2.5.0.0-<$version>/hadoop/mapreduce.tar.gz /hdp/apps/2.5.0.0-<$version>/mapreduce/
hdfs dfs -chown -R hdfs:hadoop /hdp
hdfs dfs -chmod -R 555 /hdp/apps/2.5.0.0-<$version>/mapreduce
hdfs dfs -chmod -R 444 /hdp/apps/2.5.0.0-<$version>/mapreduce/mapreduce.tar.gz
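
The upload steps above can be previewed before they are run. The following is a minimal sketch, assuming a hypothetical helper named print_mr_upload_cmds (it is not part of HDP) and a made-up build string 2.5.0.0-1245. It only prints the commands with the build string filled in and does not contact HDFS.

```shell
# Hypothetical helper (not part of HDP): print the upload commands for one
# build string so they can be reviewed before running them as the hdfs user.
print_mr_upload_cmds() {
  build="$1"                              # full HDP build string
  dest="/hdp/apps/${build}/mapreduce"     # target directory in HDFS
  echo "hdfs dfs -mkdir -p ${dest}/"
  echo "hdfs dfs -put /usr/hdp/${build}/hadoop/mapreduce.tar.gz ${dest}/"
  echo "hdfs dfs -chown -R hdfs:hadoop /hdp"
  echo "hdfs dfs -chmod -R 555 ${dest}"
  echo "hdfs dfs -chmod -R 444 ${dest}/mapreduce.tar.gz"
}

# Example with a made-up build number; substitute the one installed on your hosts.
print_mr_upload_cmds "2.5.0.0-1245"
```

Printing the commands first makes it easy to confirm that the same build string appears in both the local path and the HDFS path.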

Make the following changes to mapred-site.xml:

Add the following properties:

<property>
 <name>mapreduce.application.framework.path</name>
 <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
</property>
                  
<property>
 <name>yarn.app.mapreduce.am.admin-command-opts</name> 
 <value>-Dhdp.version=${hdp.version}</value>
</property>

	Note
	You do not need to modify ${hdp.version}.
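
The placeholder can stay as written because Hadoop's configuration loader expands ${...} references at run time from Java system properties, and the -Dhdp.version option supplies that property. To preview what a value resolves to, the substitution can be imitated with sed. This is an illustration only: resolve_hdp_version is a hypothetical helper, not a Hadoop command, and 2.5.0.0-1245 is a made-up build string.

```shell
# Illustration only: imitate the ${hdp.version} substitution with sed to
# preview a resolved property value. Not a Hadoop command.
resolve_hdp_version() {
  template="$1"   # property value containing the literal ${hdp.version}
  build="$2"      # build string to substitute
  # \$ and \. match a literal dollar sign and dot; braces are literal in basic regex.
  printf '%s\n' "$template" | sed 's|\${hdp\.version}|'"$build"'|g'
}

# Single quotes keep the shell from trying to expand ${hdp.version} itself.
resolve_hdp_version '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' '2.5.0.0-1245'
```

The resolved path should match the location of the tarball uploaded to HDFS earlier.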

Modify the following existing properties to include ${hdp.version}:

<property>
 <name>mapreduce.admin.user.env</name>
 <value>LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64</value>
</property>
 
<property>
 <name>mapreduce.admin.map.child.java.opts</name>
 <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
 <final>true</final>
</property>
 
<property>
 <name>mapreduce.admin.reduce.child.java.opts</name>
 <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
 <final>true</final>
</property>
 
<property>
 <name>mapreduce.application.classpath</name> 
 <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*,
   $PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*,
   $PWD/mr-framework/hadoop/share/hadoop/common/*,
   $PWD/mr-framework/hadoop/share/hadoop/common/lib/*,
   $PWD/mr-framework/hadoop/share/hadoop/yarn/*,
   $PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*,
   $PWD/mr-framework/hadoop/share/hadoop/hdfs/*,
   $PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*,
  /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar,
  /etc/hadoop/conf/secure</value>
</property>

	Note
	You do not need to modify ${hdp.version}.

Add the following property to yarn-site.xml:

<property>
 <name>yarn.application.classpath</name>
 <value>$HADOOP_CONF_DIR,/usr/hdp/${hdp.version}/hadoop-client/*,
   /usr/hdp/${hdp.version}/hadoop-client/lib/*,
   /usr/hdp/${hdp.version}/hadoop-hdfs-client/*,
   /usr/hdp/${hdp.version}/hadoop-hdfs-client/lib/*,
   /usr/hdp/${hdp.version}/hadoop-yarn-client/*,
   /usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*</value>
</property>

For secure clusters, you must create and configure the container-executor.cfg configuration file:
- Create the container-executor.cfg file in /etc/hadoop/conf/.
- Insert the following properties:
```
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred
min.user.id=1000
```
  - yarn.nodemanager.linux-container-executor.group - The group configured for the container executor. This value must match the value of yarn.nodemanager.linux-container-executor.group in yarn-site.xml.
  - banned.users - Comma-separated list of users who cannot run container-executor.
  - min.user.id - Minimum allowed user ID. This prevents system users, which typically have low IDs, from running container-executor.
  - allowed.system.users - Optional; not set in the example above. Comma-separated list of system users who are allowed to run container-executor.
- Set the permissions on /etc/hadoop/conf/container-executor.cfg so that only root can read it:
  chown root:hadoop /etc/hadoop/conf/container-executor.cfg
  chmod 400 /etc/hadoop/conf/container-executor.cfg
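- Set the ownership and permissions of the container-executor program so that only root or hadoop group users can run it. In these shell commands, replace ${hdp.version} with your installed HDP version, because the shell does not expand it: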
  chown root:hadoop /usr/hdp/${hdp.version}/hadoop-yarn-server-nodemanager/bin/container-executor
  chmod 6050 /usr/hdp/${hdp.version}/hadoop-yarn-server-nodemanager/bin/container-executor
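
Before restarting the NodeManager, it can help to confirm that container-executor.cfg contains the required keys. The following is a minimal sketch, assuming a hypothetical function named check_container_executor_cfg (it is not shipped with Hadoop). It checks only that the keys are present, not that their values are valid.

```shell
# Hypothetical pre-flight check (not shipped with Hadoop): confirm that the
# required keys are present in a container-executor.cfg file.
check_container_executor_cfg() {
  cfg="$1"
  for key in yarn.nodemanager.linux-container-executor.group banned.users min.user.id; do
    # Take the text before the first "=" on each line and look for an exact match.
    if ! cut -d= -f1 "$cfg" | grep -qxF "$key"; then
      echo "missing: $key"
      return 1
    fi
  done
  echo "ok"
}

# Example against a scratch copy, so the real file under /etc/hadoop/conf is untouched.
tmp_cfg="$(mktemp)"
printf '%s\n' \
  'yarn.nodemanager.linux-container-executor.group=hadoop' \
  'banned.users=hdfs,yarn,mapred' \
  'min.user.id=1000' > "$tmp_cfg"
check_container_executor_cfg "$tmp_cfg"
rm -f "$tmp_cfg"
```

To check the real file, run the function as root, because the file is readable only by root after the chmod 400 step.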
