Configure the HDFS NFS Gateway
To configure the HDFS NFS gateway, complete the following steps.
The user running the NFS gateway must be able to proxy all users that are using NFS mounts.
For example, if user "nfsserver" is running the gateway and users belong to groups "nfs-users1" and "nfs-users2", then set the following values in the core-site.xml file on the NameNode.
Note: Replace "nfsserver" with the user account that will start the gateway in your cluster.
<property>
  <name>hadoop.proxyuser.nfsserver.groups</name>
  <value>nfs-users1,nfs-users2</value>
  <description>
    The 'nfsserver' user is allowed to proxy all members of the 'nfs-users1' and 'nfs-users2' groups. Set this to '*' to allow nfsserver user to proxy any group.
  </description>
</property>
<property>
  <name>hadoop.proxyuser.nfsserver.hosts</name>
  <value>nfs-client-host1.com</value>
  <description>
    This is the host where the nfs gateway is running. Set this to '*' to allow requests from any hosts to be proxied.
  </description>
</property>
The preceding properties are the only required configuration settings for the NFS gateway in non-secure mode.
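As a quick sanity check on the proxy-user settings above, the following minimal Python sketch parses a Hadoop-style configuration and reports which of the two proxy-user keys are missing for a given gateway user. The helper names (read_properties, missing_proxyuser_keys) and the inline SAMPLE text are illustrative assumptions and are not part of Hadoop. In practice you would read your own core-site.xml instead.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_proxyuser_keys(props, gateway_user):
    """List proxy-user keys that are absent or empty for gateway_user."""
    required = [
        f"hadoop.proxyuser.{gateway_user}.groups",
        f"hadoop.proxyuser.{gateway_user}.hosts",
    ]
    return [key for key in required if not props.get(key)]


# Hypothetical sample mirroring the example values used in this section.
SAMPLE = """<configuration>
  <property>
    <name>hadoop.proxyuser.nfsserver.groups</name>
    <value>nfs-users1,nfs-users2</value>
  </property>
  <property>
    <name>hadoop.proxyuser.nfsserver.hosts</name>
    <value>nfs-client-host1.com</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    # An empty list means both keys are present and non-empty.
    print(missing_proxyuser_keys(read_properties(SAMPLE), "nfsserver"))
```

The check only confirms that the keys exist. It does not verify that the listed groups or hosts are correct for your cluster.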
Configuring the HDFS NFS Gateway on a Kerberized Cluster
For a Kerberized cluster, set the following properties in the hdfs-site.xml file:
<property>
  <name>dfs.nfsgateway.keytab.file</name>
  <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab -->
</property>
<property>
  <name>dfs.nfsgateway.kerberos.principal</name>
  <value>nfsserver/_HOST@YOUR-REALM.COM</value>
</property>
Configure settings for the HDFS NFS gateway:
The NFS gateway uses the same settings that are used by the NameNode and DataNode. Configure the following properties based on your application's requirements:
Edit the hdfs-site.xml file on your NFS gateway machine. Modify the following property:
<property>
  <name>dfs.namenode.accesstime.precision</name>
  <value>3600000</value>
  <description>
    The access time for HDFS file is precise up to this value. The default value is 1 hour. Setting a value of 0 disables access times for HDFS.
  </description>
</property>
Note: If the export is mounted with access time updates allowed, make sure this property is not disabled (set to 0) in the configuration file. Only the NameNode needs to be restarted after this property is changed. If you have disabled access time updates by mounting with "noatime", you do not need to change this property or restart the NameNode.
Add the following property to the hdfs-site.xml file:
<property>
  <name>dfs.nfs3.dump.dir</name>
  <value>/tmp/.hdfs-nfs</value>
</property>
Note: The NFS client often reorders writes, so sequential writes can arrive at the NFS gateway in random order. This directory is used to temporarily save out-of-order writes before they are written to HDFS. Make sure the directory has enough space. For example, if the application uploads 10 files of 100 MB each, this directory should have 1 GB of space in case a worst-case write reorder happens to every file.
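The sizing rule in the note can be written out as a short calculation. The following Python sketch assumes the worst case described above, where every file being uploaded is buffered in full. The function names are hypothetical helpers, not part of Hadoop.

```python
import shutil


def worst_case_dump_mb(file_sizes_mb):
    """Worst case: every in-flight file is buffered in full, so sizes add up."""
    return sum(file_sizes_mb)


def has_enough_space(dump_dir, needed_mb):
    """Compare free space on the file system holding dump_dir with needed_mb."""
    free_mb = shutil.disk_usage(dump_dir).free // (1024 * 1024)
    return free_mb >= needed_mb


# The example from the note: 10 concurrent uploads of 100 MB each.
uploads_mb = [100] * 10
print(worst_case_dump_mb(uploads_mb))  # 1000 MB, roughly 1 GB
```

To compare the estimate with the free space available, call has_enough_space with the directory configured in dfs.nfs3.dump.dir. The directory must already exist for that call to work.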
Update the following property in the hdfs-site.xml file:
<property>
  <name>dfs.nfs.exports.allowed.hosts</name>
  <value>* rw</value>
</property>
Note: By default, the export can be mounted by any client. You must update this property to control access. The value string contains the machine name and access privilege, separated by whitespace characters. The machine name can be in single host, wildcard, or IPv4 network format. The access privilege uses rw or ro to specify read-write or read-only access to exports. If you do not specify an access privilege, the default machine access to exports is read-only. Separate machine entries with a semicolon (;). For example: 192.168.0.0/22 rw ; host*.example.com ; host1.test.org ro;
Restart the NFS gateway after this property is updated.
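To make the value format concrete, the following Python sketch splits an export string into (machine, privilege) pairs according to the rules described in the note: entries separated by semicolons, fields separated by whitespace, and read-only as the default privilege. It illustrates the format only and is not the parser used by the gateway. The function name parse_exports is a hypothetical helper.

```python
def parse_exports(value):
    """Split an allowed-hosts string into (machine, privilege) pairs."""
    entries = []
    for raw in value.split(";"):
        parts = raw.split()
        if not parts:
            continue  # skip empty pieces, such as after a trailing ';'
        machine = parts[0]
        # When no privilege is given, the documented default is read-only.
        privilege = parts[1].lower() if len(parts) > 1 else "ro"
        if privilege not in ("rw", "ro"):
            raise ValueError(f"unknown access privilege: {parts[1]!r}")
        entries.append((machine, privilege))
    return entries


example = "192.168.0.0/22 rw ; host*.example.com ; host1.test.org ro;"
for machine, privilege in parse_exports(example):
    print(machine, privilege)
```

For the example string, the second entry has no explicit privilege and is therefore treated as read-only.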
(Optional) Customize log settings by modifying the log4j.properties file. To change the trace level, add the following:
log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG
To view more information about ONCRPC requests, add the following:
log4j.logger.org.apache.hadoop.oncrpc=DEBUG
Specify JVM heap space (HADOOP_NFS3_OPTS) for the NFS Gateway. You can increase the JVM heap allocation for the NFS gateway using this option.
To set this option, specify the following in the hadoop-env.sh file:
export HADOOP_NFS3_OPTS=<memory-setting(s)>
The following example specifies a 2 GB process heap (2 GB starting size and 2 GB maximum):
export HADOOP_NFS3_OPTS="-Xms2048m -Xmx2048m"
The dfs.nfs.rtmax and dfs.nfs.wtmax properties are HDFS configuration settings on the HDFS NFS gateway server. These options change the maximum read and write request size supported by the gateway. The default value for both settings is 1 MB. Increasing these values might improve the performance of large file transfers. The defaults are expected to work well for most deployments.
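If you do decide to raise these limits, the following Python sketch shows one way to generate the corresponding property elements. It assumes that the values are expressed in bytes, so that the 1 MB default corresponds to 1048576. The 4 MB figure is a hypothetical example, not a recommendation, and property_xml is an illustrative helper.

```python
import xml.etree.ElementTree as ET

MB = 1024 * 1024  # assumption: the properties take a size in bytes


def property_xml(name, value):
    """Render one Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


# Hypothetical example: raise both request-size limits from 1 MB to 4 MB.
for name in ("dfs.nfs.rtmax", "dfs.nfs.wtmax"):
    print(property_xml(name, 4 * MB))
```

The printed elements would go into the hdfs-site.xml file on the NFS gateway server. Check the documentation for your Hadoop version before changing these values.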