HDFS NFS Gateway User Guide

Access HDFS

To access HDFS, first mount the export "/". Only NFS v3 is currently supported, with TCP as the transport protocol.

  1. Mount the HDFS namespace as follows:

    mount -t nfs -o vers=3,proto=tcp,nolock,sync,rsize=1048576,wsize=1048576 $server:/ $mount_point

    You can then access HDFS as part of the local file system. Hard links, symbolic links, and random writes are not supported in this release.

    Note

    Because the Network Lock Manager (NLM) protocol is not supported, the nolock mount option is required.

    Use the sync mount option when writing large files. It improves the performance and reliability of writing large files to HDFS through the NFS gateway. When sync is specified, the NFS client machine flushes write operations to the NFS gateway before returning control to the client application. A useful side effect is that the client does not issue reordered writes, which reduces buffering requirements on the NFS gateway.

    sync is specified on the client machine when mounting the NFS share.
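
    After mounting, it can be useful to confirm that the options discussed above are actually in effect. The following is a minimal sketch, not part of the NFS gateway: the helper name has_mount_opt is hypothetical, and the commented usage assumes a Linux client where /proc/mounts lists active mounts and $mount_point is already set.

```shell
# Hypothetical helper: succeed if option $2 appears as a whole entry
# in the comma-separated mount option list $1.
has_mount_opt() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example use on a Linux client (assumption: /proc/mounts is available):
#   opts=$(awk -v mp="$mount_point" '$2 == mp { print $4 }' /proc/mounts)
#   has_mount_opt "$opts" sync   || echo "warning: sync is not set"
#   has_mount_opt "$opts" nolock || echo "warning: nolock is not set"
```

    Wrapping the list in commas makes the match exact, so an option such as async is not mistaken for sync.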

    Here is additional information about sync, rtmax/wtmax, and HADOOP_NFS3_OPTS (for gateway heap space):

    User authentication and mapping:

    The NFS gateway uses AUTH_UNIX-style authentication, which means that the login user on the client is the same user that NFS passes to HDFS. For example, if the current user on the NFS client is admin, the NFS gateway accesses HDFS as user admin when that user accesses the mounted directory. To access HDFS as the hdfs user, first switch the current user to hdfs on the client system, and then access the mounted directory.
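
    Before working in the mounted directory, it can help to check which local identity the current shell is running as. The following is a minimal sketch; the function name client_identity is hypothetical, and the commented sudo example assumes that sudo is configured and that an hdfs account exists on the client.

```shell
# Hypothetical helper: print "name:uid" for the user running this shell.
# The gateway resolves the client's UID to a username, so both values
# matter when accessing the mount.
client_identity() {
  printf '%s:%s\n' "$(id -un)" "$(id -u)"
}

client_identity

# To access the mount as the hdfs user instead (assumptions: sudo is
# available, a local hdfs account exists, $mount_point is set):
#   sudo -u hdfs ls "$mount_point"
```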

  2. Set up client machine users to interact with HDFS through NFS.

    The NFS gateway converts the User Identifier (UID) to a username, and HDFS uses the username to check permissions.

    The system administrator must ensure that the user on the NFS client machine has the same name and UID as on the NFS gateway machine. This is usually not a problem if you use a single user management system, such as LDAP or NIS, to create and deploy users to the HDP nodes and the client node.

    If the user is created manually, you might need to modify the UID on either the client or the NFS gateway host so that the two match:

    usermod -u 123 $myusername
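
    Before changing a UID, you may want to confirm that the two hosts really disagree. The following is a minimal sketch that compares the UID field of two passwd-format lines; the helper names uid_from_passwd_line and same_uid are hypothetical, and the commented usage assumes ssh access to the gateway through a $gateway variable that is not defined in this guide.

```shell
# Hypothetical helper: print the UID (third colon-separated field) of a
# passwd-format line, such as the output of `getent passwd NAME`.
# -s suppresses input that contains no ':' delimiter at all.
uid_from_passwd_line() {
  printf '%s\n' "$1" | cut -s -d: -f3
}

# Hypothetical helper: succeed only if both lines carry the same,
# non-empty UID.
same_uid() {
  a=$(uid_from_passwd_line "$1")
  b=$(uid_from_passwd_line "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example use (assumption: $gateway names the NFS gateway host and is
# reachable over ssh):
#   local_line=$(getent passwd "$myusername")
#   remote_line=$(ssh "$gateway" getent passwd "$myusername")
#   same_uid "$local_line" "$remote_line" || echo "UID mismatch for $myusername"
```

    An empty lookup result on either side is treated as a mismatch, so a user that is missing on one host is also reported.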

    The following diagram illustrates how the UID and name are communicated between the NFS client, NFS gateway, and NameNode.