Adding and Configuring an NFS Gateway
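The NFS gateway lets a client mount HDFS and work with it as part of the local file system. With HDFS mounted, you can:
- Browse the HDFS file system as though it were part of the local file system.
- Upload files from the local file system to HDFS and download files from HDFS to the local file system.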
- Stream data directly to HDFS through the mount point.
File append is supported, but random write is not.
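As an illustration of the append-only behavior, the following minimal sketch appends lines to a file through the mount point. The function name append_line and the example path are hypothetical and not part of the product; the sketch assumes HDFS is already mounted on /hdfs_nfs_mount.

```shell
# Hypothetical helper: append one line of text ($2) to a file ($1).
# ">>" opens the file in append mode, which the gateway supports.
# Rewriting bytes in the middle of an existing file (random write) is not supported.
append_line() {
    printf '%s\n' "$2" >> "$1"
}

# Example use on a client (the path is an assumption):
#   append_line /hdfs_nfs_mount/tmp/events.log "job finished"
```

Tools that seek backward and overwrite existing data, such as in-place editors, are likely to fail against the mount because they depend on random writes.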
Adding and Configuring an NFS Gateway Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
The NFS Gateway role implements an NFSv3 gateway. It is an optional role for a CDH 5 HDFS service.
Requirements and Limitations
- The NFS gateway works only with the following operating systems and Cloudera Manager and CDH versions:
- With Cloudera Manager 5.0.1 or later and CDH 5.0.1 or later, the NFS gateway works on all operating systems supported by Cloudera Manager.
- With Cloudera Manager 5.0.0 or CDH 5.0.0, the NFS gateway only works on RHEL and similar systems.
- The NFS gateway is not supported on versions earlier than Cloudera Manager 5.0.0 and CDH 5.0.0.
- If any NFS server is already running on the NFS Gateway host, it must be stopped before the NFS Gateway role is started.
- Two configuration options apply to the NFS Gateway role: Temporary Dump Directory and Allowed Hosts and Privileges. The NFS Gateway role creates the Temporary Dump Directory automatically, but the property should be configured before the role is started.
- The Access Time Precision property in the HDFS service must be enabled.
Adding and Configuring the NFS Gateway Role
- Go to the HDFS service.
- Click the Instances tab.
- Click Add Role Instances.
- Click the text box below the NFS Gateway field. The Select Hosts dialog box displays.
- Select the host on which to run the role and click OK.
- Click Continue.
- Click the NFS Gateway role.
- Click the Configuration tab.
- Select .
- Select .
- Ensure that the requirements on the directory set in the Temporary Dump Directory property are met.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties.
- Optionally edit Allowed Hosts and Privileges.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties.
- Click Save Changes to commit the changes.
- Click the Instances tab.
- Check the checkbox next to the NFS Gateway role and select .
Configuring an NFSv3 Gateway Using the Command Line
The subsections that follow provide information on installing and configuring the gateway.
Upgrading from a CDH 5 Beta Release
If you are upgrading from a CDH 5 Beta release, you must first remove the hadoop-hdfs-portmap package. Proceed as follows.
- Unmount existing HDFS gateway mounts. For example, on each client, assuming the file system is mounted on /hdfs_nfs_mount:
$ umount /hdfs_nfs_mount
- Stop the services:
$ sudo service hadoop-hdfs-nfs3 stop
$ sudo service hadoop-hdfs-portmap stop
- Remove the hadoop-hdfs-portmap package.
- On a RHEL-compatible system:
$ sudo yum remove hadoop-hdfs-portmap
- On a SLES system:
$ sudo zypper remove hadoop-hdfs-portmap
- On an Ubuntu or Debian system:
$ sudo apt-get remove hadoop-hdfs-portmap
- Install the new version:
- On a RHEL-compatible system:
$ sudo yum install hadoop-hdfs-nfs3
- On a SLES system:
$ sudo zypper install hadoop-hdfs-nfs3
- On an Ubuntu or Debian system:
$ sudo apt-get install hadoop-hdfs-nfs3
- Start the system default portmapper service:
$ sudo service portmap start
- Now proceed with Starting the NFSv3 Gateway, and then remount the HDFS gateway mounts.
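The remove and install steps above differ only in the package tool. The following sketch shows one way a script might pick the tool and print the matching commands. Both function names are hypothetical, and the commands are printed rather than executed so that they can be reviewed first.

```shell
# Hypothetical helper: print the first supported package tool found on PATH.
# Returns non-zero if none of yum, zypper, or apt-get is available.
detect_pkg_tool() {
    for tool in yum zypper apt-get; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: print (not run) the remove and install commands
# from the steps above for the given package tool.
portmap_swap_cmds() {
    case "$1" in
        yum|zypper|apt-get)
            echo "sudo $1 remove hadoop-hdfs-portmap"
            echo "sudo $1 install hadoop-hdfs-nfs3"
            ;;
        *)
            echo "unsupported package tool: $1" >&2
            return 1
            ;;
    esac
}

# Example use:
#   portmap_swap_cmds "$(detect_pkg_tool)"
```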
Installing the Packages for the First Time
On RHEL and similar systems, install the following packages:
- nfs-utils
- nfs-utils-lib
- hadoop-hdfs-nfs3
Use the following command:
$ sudo yum install nfs-utils nfs-utils-lib hadoop-hdfs-nfs3
On SLES:
$ sudo zypper install nfs-utils
On an Ubuntu or Debian system:
$ sudo apt-get install nfs-common
Configuring the NFSv3 Gateway
- Add the following property to hdfs-site.xml on the NameNode:
<property>
  <name>dfs.namenode.accesstime.precision</name>
  <value>3600000</value>
  <description>The access time for an HDFS file is precise up to this value. The default value is 1 hour. Setting a value of 0 disables access times for HDFS.</description>
</property>
- Add the following property to hdfs-site.xml on the NFS server:
<property>
  <name>dfs.nfs3.dump.dir</name>
  <value>/tmp/.hdfs-nfs</value>
</property>
- Configure the user running the gateway (normally the hdfs user as in this example) to be a proxy for other users. To allow the hdfs user to be a proxy for all other users, add the following entries to core-site.xml on the NameNode:
<property>
  <name>hadoop.proxyuser.hdfs.groups</name>
  <value>*</value>
  <description>Set this to '*' to allow the gateway user to proxy any group.</description>
</property>
<property>
  <name>hadoop.proxyuser.hdfs.hosts</name>
  <value>*</value>
  <description>Set this to '*' to allow requests from any hosts to be proxied.</description>
</property>
- Restart the NameNode.
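Before restarting, it can help to read the edited properties back from the configuration files. The following is a minimal sketch of such a check. The function name get_hadoop_property is hypothetical, and the file location in the usage comment is an assumption about a typical package installation. The function is a plain-text scan, not an XML parser.

```shell
# Hypothetical helper: print the <value> of property $2 in Hadoop config file $1.
# Assumes the usual layout in which <value> follows <name> inside each <property>.
# Prints nothing if the property is absent. Dots in the name match any character.
get_hadoop_property() {
    # Join the file into one line, then split it again after each </property>
    # so that every property sits on its own line, whatever the original layout.
    tr '\n' ' ' < "$1" |
        sed 's|</property>|&\
|g' |
        sed -n "s|.*<name>[[:space:]]*$2[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Example use (the path is an assumption):
#   get_hadoop_property /etc/hadoop/conf/hdfs-site.xml dfs.nfs3.dump.dir
```

On a host with the Hadoop client installed, hdfs getconf -confKey dfs.namenode.accesstime.precision reports the value that the client configuration resolves to, which is a useful cross-check.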
Starting the NFSv3 Gateway
Do the following on the NFS server.
- First, stop the default NFS services, if they are running:
$ sudo service nfs stop
- Start the HDFS-specific services:
$ sudo service hadoop-hdfs-nfs3 start
Verifying that the NFSv3 Gateway is Working
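List the RPC services registered on the NFS server:
$ rpcinfo -p <nfs_server_ip_address>
You should see output similar to the following: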
program vers proto   port
100005    1   tcp   4242  mountd
100005    2   udp   4242  mountd
100005    2   tcp   4242  mountd
100000    2   tcp    111  portmapper
100000    2   udp    111  portmapper
100005    3   udp   4242  mountd
100005    1   udp   4242  mountd
100003    3   tcp   2049  nfs
100005    3   tcp   4242  mountd
Then list the file systems that the NFS server exports:
$ showmount -e <nfs_server_ip_address>
You should see output similar to the following:
Exports list on <nfs_server_ip_address>:
/ (everyone)
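The rpcinfo check can also be scripted. The following sketch reads rpcinfo -p output on standard input and succeeds only if the portmapper, mountd, and NFS version 3 services all appear. The function name gateway_services_registered is hypothetical, and the column positions are taken from the sample output shown in this section.

```shell
# Hypothetical helper: read `rpcinfo -p` output on stdin and exit 0 only if
# portmapper, mountd, and nfs (version 3) are all registered.
# Columns assumed: program, version, protocol, port, service name.
gateway_services_registered() {
    awk '
        $5 == "portmapper"       { pm = 1 }
        $5 == "mountd"           { md = 1 }
        $5 == "nfs" && $2 == "3" { nfs = 1 }
        END {
            if (pm && md && nfs) exit 0
            exit 1
        }
    '
}

# Example use:
#   rpcinfo -p <nfs_server_ip_address> | gateway_services_registered && echo "gateway services registered"
```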
Mounting HDFS on an NFS Client
$ mount -t nfs -o vers=3,proto=tcp,nolock <nfs_server_hostname>:/ /hdfs_nfs_mount
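When several clients need the same mount, the command above can be generated from the host name and mount point. The following is a minimal sketch. The function name hdfs_nfs_mount_cmd and the example host name are hypothetical, and the function only prints the command.

```shell
# Hypothetical helper: print the mount command for gateway host $1 and
# local mount point $2, using the options shown in this section.
hdfs_nfs_mount_cmd() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: hdfs_nfs_mount_cmd <nfs_server_hostname> <mount_point>" >&2
        return 1
    fi
    echo "mount -t nfs -o vers=3,proto=tcp,nolock $1:/ $2"
}

# Example use (the host name is an assumption; the mount point must exist):
#   sudo mkdir -p /hdfs_nfs_mount
#   sudo sh -c "$(hdfs_nfs_mount_cmd nfs-gateway.example.com /hdfs_nfs_mount)"
```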