Configure the NFS Gateway

You must ensure that the proxy user for the NFS Gateway can proxy all the users accessing the NFS mounts. In addition, you must configure settings specific to the Gateway.

  1. Ensure that the proxy user for the NFS Gateway can proxy all the users accessing the NFS mounts.
    In non-secure mode, the user running the Gateway is the proxy user; in secure mode, the user associated with the Kerberos keytab is the proxy user.
    For example, if the user nfsserver runs the Gateway and the users accessing the NFS mounts belong to the groups nfs-users1 and nfs-users2, set the following values in core-site.xml on the NameNode:
    <property>
      <name>hadoop.proxyuser.nfsserver.groups</name>
      <value>nfs-users1,nfs-users2</value>
      <description>
        The 'nfsserver' user is allowed to proxy all members of the
        'nfs-users1' and 'nfs-users2' groups. Set this to '*' to allow
        the 'nfsserver' user to proxy any group.
      </description>
    </property>
    
    <property>
      <name>hadoop.proxyuser.nfsserver.hosts</name>
      <value>nfs-client-host1.com</value>
      <description>
        This is the host where the nfs gateway is running. Set this to
        '*' to allow requests from any hosts to be proxied.
      </description>
    </property>
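    After editing core-site.xml, the proxy-user settings can usually be reloaded without a full NameNode restart. A minimal sketch, assuming the hdfs CLI is on the PATH and you run it as the HDFS superuser:

    ```shell
    # Reload the proxy-user (superuser group) configuration on the
    # NameNode without restarting it.
    hdfs dfsadmin -refreshSuperUserGroupsConfiguration
    ```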
    For a Kerberized cluster, set the following properties in hdfs-site.xml:
    <property>
      <name>dfs.nfs.keytab.file</name>
      <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the 
          nfs gateway keytab -->
    </property>
    
    <property>
      <name>dfs.nfs.kerberos.principal</name>
      <value>nfsserver/_HOST@YOUR-REALM.COM</value>
    </property>
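    Before starting the Gateway on a Kerberized cluster, it can help to confirm that the keytab actually contains the expected principal. A quick check, assuming the keytab path used above:

    ```shell
    # List the principals stored in the NFS Gateway keytab; expect entries
    # like nfsserver/<gateway-host>@YOUR-REALM.COM
    klist -kt /etc/hadoop/conf/nfsserver.keytab
    ```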
  2. Configure settings for the NFS Gateway.
    The NFS Gateway uses the same settings that are used by the NameNode and DataNode. Configure various properties based on your application's requirements:
    1. Edit the hdfs-site.xml file on your NFS Gateway machine.
      <property>
        <name>dfs.namenode.accesstime.precision</name>
        <value>3600000</value>
        <description>
          The access time for an HDFS file is precise up to this value.
          The default value is 1 hour. Setting a value of 0 disables
          access times for HDFS.
        </description>
      </property>
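      To confirm the value the Gateway will actually see, you can query the effective configuration. A sketch, assuming the hdfs CLI is available on the Gateway machine:

      ```shell
      # Print the effective access-time precision in milliseconds;
      # 3600000 corresponds to the 1-hour default.
      hdfs getconf -confKey dfs.namenode.accesstime.precision
      ```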
    2. Add the dfs.nfs3.dump.dir property to hdfs-site.xml.
      NFS clients often reorder writes, so the Gateway temporarily saves out-of-order writes in this directory before flushing them to HDFS. Make sure the directory has enough disk space.
      
      <property>    
          <name>dfs.nfs3.dump.dir</name>    
          <value>/tmp/.hdfs-nfs</value> 
      </property>
    3. Update the value of the dfs.nfs.exports.allowed.hosts property in hdfs-site.xml to control which hosts can mount the exports. The default value below, '* rw', allows read-write access from any host.
      
      <property>    
          <name>dfs.nfs.exports.allowed.hosts</name>    
          <value>* rw</value> 
      </property>
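      To restrict access instead, the value can hold multiple entries separated by semicolons, each naming a host, wildcard, or CIDR subnet followed by an access mode. A hypothetical tightened example (the subnet and host name are placeholders):

      ```xml
      <property>
          <name>dfs.nfs.exports.allowed.hosts</name>
          <!-- Read-write for one subnet, read-only for a single host -->
          <value>192.168.0.0/22 rw ; nfs-client-host1.com ro</value>
      </property>
      ```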
    4. Restart the NFS Gateway.
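      The exact restart commands vary by distribution and Hadoop version. One possible sketch for a stock Apache Hadoop 2.x layout ($HADOOP_HOME and the service user are assumptions; substitute your own):

      ```shell
      # Stop and start the nfs3 daemon; run as the user that owns the
      # Gateway process (paths assume a stock Apache Hadoop 2.x install).
      $HADOOP_HOME/sbin/hadoop-daemon.sh --script $HADOOP_HOME/bin/hdfs stop nfs3
      $HADOOP_HOME/sbin/hadoop-daemon.sh --script $HADOOP_HOME/bin/hdfs start nfs3
      ```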
    5. Optional: Customize log settings by modifying the log4j.properties file.
      To change the trace level, add the following:

      log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG

      To view more information about ONCRPC requests, add the following:

      log4j.logger.org.apache.hadoop.oncrpc=DEBUG

  3. Specify JVM heap space (HADOOP_NFS3_OPTS) for the NFS Gateway.
    You can increase the JVM heap allocation for the NFS Gateway using this option. To set this option, specify the following in hadoop-env.sh:

    export HADOOP_NFS3_OPTS=<memory-setting(s)>

    The following example specifies a 2GB process heap (2GB starting size and 2GB maximum):
    export HADOOP_NFS3_OPTS="-Xms2048m -Xmx2048m"
  4. To improve the performance of large file transfers, you can increase the values of the dfs.nfs.rtmax and dfs.nfs.wtmax properties.
    These properties are configuration settings on the NFS Gateway server that set the maximum read and write request sizes, in bytes, supported by the Gateway. The default value for both settings is 1MB (1048576 bytes).
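    For example, to raise both limits from the 1MB default to 8MB, you could add the following to hdfs-site.xml on the Gateway and restart it (the 8MB figure is illustrative, not a recommendation):

    ```xml
    <property>
        <name>dfs.nfs.rtmax</name>
        <!-- Maximum read request size in bytes; 8MB here, default 1MB -->
        <value>8388608</value>
    </property>

    <property>
        <name>dfs.nfs.wtmax</name>
        <!-- Maximum write request size in bytes; 8MB here, default 1MB -->
        <value>8388608</value>
    </property>
    ```

    NFS clients typically negotiate their rsize/wsize mount options against these limits, so larger values only help if the client's mount options allow comparably large requests.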