3. Access HDFS

To access HDFS, first mount the export "/".

Currently, NFS v3 is supported, with TCP as the transport protocol.

  1. Mount the HDFS namespace as follows:

    mount -t nfs -o vers=3,proto=tcp,nolock $server:/ $mount_point

    You can then access HDFS as part of the local file system, except that hard links, symbolic links, and random writes are not supported in this release.

    Note: Because NLM (Network Lock Manager) is not supported, the nolock mount option is required.

    The following additional mount options are supported:

    Option: sync

    Description: The sync mount option to the NFS client improves the performance and reliability of writing large files to HDFS via the NFS gateway. If the sync option is specified, the NFS client machine flushes write operations to the NFS gateway before returning control to the client application. A useful side effect of sync is that the client does not issue reordered writes, which reduces buffering requirements on the NFS gateway.

    Note: sync is specified on the client machine when mounting the NFS share.

    Option: rtmax, wtmax

    Description: The dfs.nfs.rtmax and dfs.nfs.wtmax properties are HDFS configuration settings on the HDFS NFS gateway server. They set the maximum read and write request sizes supported by the gateway. The default value for both settings is 1 MB. Increasing these values may improve the performance of large file transfers, but the defaults are expected to work well for most deployments.
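    As an illustration of the sync option described above, the following sketch assembles the mount command with optional extra options appended to the required ones. The helper name nfs_mount_cmd, the host nfsgw.example.com, and the mount point /mnt/hdfs are hypothetical placeholders, not names from this guide. The helper only prints the command so that it can be reviewed before it is run as root.

```shell
# Hypothetical helper: build the mount command for the HDFS NFS gateway.
# $1 = gateway host, $2 = local mount point, $3 = optional extra options (e.g. "sync").
nfs_mount_cmd() {
  # Options required by the gateway: NFS v3 over TCP, no NLM locking.
  opts="vers=3,proto=tcp,nolock"
  if [ -n "${3:-}" ]; then
    opts="$opts,$3"
  fi
  printf 'mount -t nfs -o %s %s:/ %s\n' "$opts" "$1" "$2"
}

# Print the command with sync enabled (placeholder host and mount point).
nfs_mount_cmd nfsgw.example.com /mnt/hdfs sync
```

    Printing rather than executing keeps the sketch safe to run without root privileges; copy the printed line into a root shell to perform the actual mount.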

    User authentication and mapping:

    The NFS gateway uses AUTH_UNIX-style authentication, which means that the login user on the client is the same user that NFS passes to HDFS. For example, if the current user on the NFS client is admin, the NFS gateway accesses HDFS as user admin when that user accesses the mounted directory. To access HDFS as the hdfs user, you must first switch the current user to hdfs on the client system before accessing the mounted directory.
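    To see which identity the gateway will act on, it can help to print the current user before touching the mounted directory. This is a minimal sketch; the helper name current_identity is hypothetical, and the commented sudo line assumes that an hdfs account exists on the client and that sudo is configured for it.

```shell
# Hypothetical helper: print the user name and numeric UID of the current shell.
# With AUTH_UNIX the client sends the numeric UID, so both values are worth checking.
current_identity() {
  printf '%s %s\n' "$(id -un)" "$(id -u)"
}

current_identity

# To access the mount as the hdfs user instead, run the command under that
# account, for example:
#   sudo -u hdfs ls "$mount_point"
```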

  2. Set up client machine users to interact with HDFS through NFS.

    The NFS gateway converts the User Identifier (UID) to a user name, and HDFS uses the user name to check permissions.

    The system administrator must ensure that each user on the NFS client machine has the same name and UID as on the NFS gateway machine. This is usually not a problem if you use the same user management system, such as LDAP or NIS, to create and deploy users to the HDP nodes and to the client node.

    If the user was created manually, you might need to modify the UID on either the client or the NFS gateway host so that the two match:

    usermod -u 123 $myusername
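    Before changing a UID, it is worth confirming whether the two hosts actually disagree. The sketch below compares the local UID of an account with a UID obtained from the other host (for example by running id -u there). The helper name uid_matches is hypothetical, and root with UID 0 is used only because that account exists on most Linux systems.

```shell
# Hypothetical helper: succeed only if the local account has the expected UID.
# $1 = user name, $2 = UID reported for that user on the other host.
# Returns 0 on a match, 1 on a mismatch, 2 if the user does not exist locally.
uid_matches() {
  actual="$(id -u "$1" 2>/dev/null)" || return 2
  [ "$actual" = "$2" ]
}

# Example check using the root account as a stand-in for a real user.
if uid_matches root 0; then
  echo "UIDs match"
else
  echo "UID mismatch or unknown user: align the UID with usermod -u"
fi
```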

    The following diagram illustrates how the UID and name are communicated between the NFS client, NFS gateway, and NameNode.