Command Line Installation

Configure YARN and MapReduce

After you install Hadoop, update the MapReduce and YARN configuration files as described in the following steps.

  1. As the HDFS user, for example 'hdfs', upload the MapReduce tarball to HDFS.

    su - $HDFS_USER
    hdfs dfs -mkdir -p /hdp/apps/<hdp_version>/mapreduce/
    hdfs dfs -put /usr/hdp/current/hadoop-client/mapreduce.tar.gz /hdp/apps/<hdp_version>/mapreduce/
    hdfs dfs -chown -R hdfs:hadoop /hdp
    hdfs dfs -chmod -R 555 /hdp/apps/<hdp_version>/mapreduce
    hdfs dfs -chmod 444 /hdp/apps/<hdp_version>/mapreduce/mapreduce.tar.gz

    Where $HDFS_USER is the HDFS user, for example hdfs, and <hdp_version> is the current HDP version, for example 2.5.3.0.

  2. Copy mapred-site.xml from the companion files and make the following changes to it:

    • Add:

      <property>
           <name>mapreduce.admin.map.child.java.opts</name>
           <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
           <final>true</final>
      </property>

      Note: Leave ${hdp.version} as written. You do not need to replace it with a version number.

    • Modify the following existing properties to include ${hdp.version}:

      <property>
           <name>mapreduce.admin.user.env</name>
           <value>LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64</value>
      </property>
      
      <property>
           <name>mapreduce.application.framework.path</name>
           <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
      </property>
      
      <property>
           <name>mapreduce.application.classpath</name>
           <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure</value>
      </property>
      
      Note: Leave ${hdp.version} as written. You do not need to replace it with a version number.

  3. Copy yarn-site.xml from the companion files and modify the yarn.application.classpath property as follows:

    <property>
         <name>yarn.application.classpath</name> 
         <value>$HADOOP_CONF_DIR,/usr/hdp/${hdp.version}/hadoop-client/*,
         /usr/hdp/${hdp.version}/hadoop-client/lib/*,
         /usr/hdp/${hdp.version}/hadoop-hdfs-client/*,
         /usr/hdp/${hdp.version}/hadoop-hdfs-client/lib/*,
         /usr/hdp/${hdp.version}/hadoop-yarn-client/*,
         /usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*</value>
    </property>

  4. For secure clusters, you must create and configure the container-executor.cfg configuration file:

    • Create the container-executor.cfg file in /etc/hadoop/conf/.

    • Insert the following properties:

      yarn.nodemanager.linux-container-executor.group=hadoop
      banned.users=hdfs,yarn,mapred
      min.user.id=1000

    • Set the permissions on /etc/hadoop/conf/container-executor.cfg so that only root can read the file:

      chown root:hadoop /etc/hadoop/conf/container-executor.cfg
      chmod 400 /etc/hadoop/conf/container-executor.cfg

    • Set the container-executor program so that only root or hadoop group users can execute it:

      chown root:hadoop /usr/hdp/<hdp_version>/hadoop-yarn/bin/container-executor
      chmod 6050 /usr/hdp/<hdp_version>/hadoop-yarn/bin/container-executor
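
The file modes and property names used in the steps above can be spot-checked with a few small shell helpers. The following is a minimal sketch, not part of HDP or Hadoop: the function names mr_framework_path, check_mode, and has_property are hypothetical, check_mode assumes GNU coreutils stat (as found on typical Linux hosts), and has_property does a plain text match rather than parsing the XML.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not HDP or Hadoop commands.

# Print the HDFS path of the MapReduce framework tarball for an HDP version,
# following the /hdp/apps/<hdp_version>/mapreduce/ layout used in step 1.
mr_framework_path() {
    printf '/hdp/apps/%s/mapreduce/mapreduce.tar.gz\n' "$1"
}

# Succeed if the octal mode of FILE ($1) equals EXPECTED ($2), e.g. 400 or 6050.
# Assumes GNU coreutils stat, which supports -c '%a'.
check_mode() {
    actual=$(stat -c '%a' "$1") || return 1
    [ "$actual" = "$2" ]
}

# Succeed if FILE ($1) contains a <name> element for property NAME ($2).
# This is a fixed-string text match, not an XML parse.
has_property() {
    grep -q -F "<name>$2</name>" "$1"
}
```

Possible usage on a configured node, assuming mapred-site.xml was copied to /etc/hadoop/conf/ and using 2.5.3.0 as the example version:

    check_mode /etc/hadoop/conf/container-executor.cfg 400
    has_property /etc/hadoop/conf/mapred-site.xml mapreduce.application.framework.path
    hdfs dfs -ls "$(mr_framework_path 2.5.3.0)"

Each helper returns a nonzero exit status on a mismatch, so the calls can be chained with && in a larger check script.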