3. Configure YARN and MapReduce

After you install Hadoop, modify the YARN and MapReduce configuration files as follows.

  1. Upload the MapReduce tarball to HDFS. Run the following commands as the HDFS user (for example, hdfs):

    su $HDFS_USER
    hdfs dfs -mkdir -p /hdp/apps/<hdp_version>/mapreduce/
    hdfs dfs -put /usr/hdp/current/hadoop-client/mapreduce.tar.gz /hdp/apps/<hdp_version>/mapreduce/
    hdfs dfs -chown -R hdfs:hadoop /hdp
    hdfs dfs -chmod -R 555 /hdp/apps/<hdp_version>/mapreduce
    hdfs dfs -chmod -R 444 /hdp/apps/<hdp_version>/mapreduce/mapreduce.tar.gz

    Where $HDFS_USER is the HDFS user, for example hdfs, and <hdp_version> is the current HDP version, for example 2.2.0.0.
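
To confirm the upload before moving on, a check along the following lines may help. It is a minimal sketch: the HDP_VERSION variable and its default of 2.2.0.0 are assumptions made for illustration, and the HDFS commands run only if the hdfs client is on the PATH.

```shell
# Hypothetical helper variable; set it to your actual HDP version.
HDP_VERSION="${HDP_VERSION:-2.2.0.0}"
MR_DIR="/hdp/apps/${HDP_VERSION}/mapreduce"
MR_TARBALL="${MR_DIR}/mapreduce.tar.gz"

if command -v hdfs >/dev/null 2>&1; then
    # -test -e exits 0 only if the path exists in HDFS.
    if hdfs dfs -test -e "$MR_TARBALL"; then
        echo "found ${MR_TARBALL}"
        # List the directory to review owner, group, and permissions.
        hdfs dfs -ls "$MR_DIR"
    else
        echo "missing ${MR_TARBALL}" >&2
    fi
else
    echo "hdfs client not found; skipping check" >&2
fi
```

The listing should show the tarball owned by hdfs:hadoop with read-only permissions, matching the chown and chmod commands in this step.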

  2. Copy mapred-site.xml from the companion files and make the following changes to it:

    • Add the following property:

      <property>
           <name>mapreduce.admin.map.child.java.opts</name>
           <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
           <final>true</final>
      </property>

      Note

      You do not need to modify ${hdp.version}.

    • Modify the following existing properties to include ${hdp.version}:

      <property>
           <name>mapreduce.admin.user.env</name>
           <value>LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64</value>
      </property>
      
      <property>
           <name>mapreduce.application.framework.path</name>
           <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
      </property>
      
      <property>
           <name>mapreduce.application.classpath</name>
           <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure</value>
      </property>

      Note

      You do not need to modify ${hdp.version}.
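
The ${hdp.version} token is left as written because it appears to be resolved when the configuration is read; the first property in this step passes -Dhdp.version to the task JVM for that purpose. To preview what a value would look like for a particular release, the substitution can be imitated in the shell. This is only an illustrative sketch: expand_hdp_version is a hypothetical helper, not part of Hadoop or HDP, and 2.2.0.0 is the example version from step 1.

```shell
# Hypothetical helper: replace every literal ${hdp.version} in $1 with $2.
# The bracket expressions keep "$" and "." from being treated specially.
expand_hdp_version() {
    printf '%s\n' "$1" | sed "s|[$]{hdp[.]version}|$2|g"
}

# Preview the framework path for the example version.
expand_hdp_version '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' 2.2.0.0
```

The printed path should correspond to the location the tarball was uploaded to in step 1. The files themselves should still contain the unexpanded token.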

  3. Copy yarn-site.xml from the companion files and modify the following property:

    <property>
         <name>yarn.application.classpath</name> 
         <value>$HADOOP_CONF_DIR,/usr/hdp/${hdp.version}/hadoop-client/*,/usr/hdp/${hdp.version}/hadoop-client/lib/*,/usr/hdp/${hdp.version}/hadoop-hdfs-client/*,/usr/hdp/${hdp.version}/hadoop-hdfs-client/lib/*,/usr/hdp/${hdp.version}/hadoop-yarn-client/*,/usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*</value>
    </property>

  4. For secure clusters, create and configure the container-executor.cfg file:

    • Create the container-executor.cfg file in /etc/hadoop/conf/.

    • Insert the following properties:

      yarn.nodemanager.linux-container-executor.group=hadoop
      banned.users=hdfs,yarn,mapred
      min.user.id=1000

    • Set the ownership and permissions of /etc/hadoop/conf/container-executor.cfg so that only root can read it:

      chown root:hadoop /etc/hadoop/conf/container-executor.cfg
      chmod 400 /etc/hadoop/conf/container-executor.cfg

    • Set the ownership and permissions of the container-executor program so that only root or members of the hadoop group can execute it:

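      chown root:hadoop /usr/hdp/<hdp_version>/hadoop-yarn/bin/container-executor
      chmod 6050 /usr/hdp/<hdp_version>/hadoop-yarn/bin/container-executor

      Where <hdp_version> is the current HDP version, as in step 1. A literal ${hdp.version} is not valid shell syntax and would fail with a substitution error.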
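
The octal modes in step 4 can be read as follows: 400 grants read access to the owner only, and 6050 sets the setuid and setgid bits (6), gives the owner no permissions (0), gives the group read and execute (5), and gives others nothing (0). A quick way to confirm the result is sketched below. It assumes GNU stat (the -c option), and check_mode is a hypothetical helper, not an HDP tool.

```shell
# Hypothetical helper: succeed only if file $1 has octal mode $2.
# Assumes GNU coreutils stat, which supports -c '%a'.
check_mode() {
    actual=$(stat -c '%a' "$1") || return 1
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 has mode $actual"
    else
        echo "unexpected: $1 has mode $actual, expected $2" >&2
        return 1
    fi
}

# Example call, using the configuration file path from step 4:
# check_mode /etc/hadoop/conf/container-executor.cfg 400
```

The same helper can be pointed at the container-executor binary with an expected mode of 6050.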
