Configure mountable HDFS

CDP includes a FUSE (Filesystem in Userspace) interface to HDFS. The hadoop-hdfs-fuse package lets you mount your HDFS cluster and use it as if it were a traditional filesystem on Linux.

You must have a working HDFS cluster and know the hostname and port that your NameNode exposes.
  1. Depending on the operating system running on your CDP host, use the appropriate command to install hadoop-hdfs-fuse.
    • To install hadoop-hdfs-fuse on Red Hat-compatible systems:
      sudo yum install hadoop-hdfs-fuse
    • To install hadoop-hdfs-fuse on Ubuntu systems:
      sudo apt-get install hadoop-hdfs-fuse
    • To install hadoop-hdfs-fuse on SLES systems:
      sudo zypper install hadoop-hdfs-fuse
  2. Set up and test your mount point depending on whether you have an HA or a non-HA configuration.
    • To set up and test your mount point in a non-HA installation:
      mkdir -p <mount_point>
      hadoop-fuse-dfs dfs://<namenode_hostname>:<namenode_port> <mount_point>

      where <namenode_port> is the NameNode's RPC port, as set in dfs.namenode.servicerpc-address.

    • To set up and test your mount point in an HA installation:
      mkdir -p <mount_point>
      hadoop-fuse-dfs dfs://<nameservice_id> <mount_point>
      where <nameservice_id> is the nameservice ID configured in fs.defaultFS. In this case, the port defined for dfs.namenode.rpc-address.[nameservice ID].[name node ID] is used automatically.

    You can now run file operations against HDFS through your mount point, as you would on a local filesystem.
  3. Clean up your test mount by using the umount command.
    umount <mount_point>
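
The steps above can be sketched as a few small shell helpers. This is a minimal illustration, not a CDP tool: the function names (fuse_install_cmd, hdfs_fuse_uri, fuse_test_plan) are hypothetical, and the ls command stands in for whatever operation you use to test the mount. The helpers only print commands, so nothing is installed or mounted until you run the printed lines yourself.

```shell
#!/bin/sh
# Sketch only. All function names here are hypothetical helpers, not CDP commands.

# Print the install command for the first supported package manager on PATH.
fuse_install_cmd() {
    for pm in yum apt-get zypper; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "sudo $pm install hadoop-hdfs-fuse"
            return 0
        fi
    done
    echo "no supported package manager found" >&2
    return 1
}

# Build the dfs:// URI for hadoop-fuse-dfs.
#   one argument  (nameservice ID)        -> HA form
#   two arguments (NameNode host, port)   -> non-HA form
hdfs_fuse_uri() {
    case "$#" in
        1) echo "dfs://$1" ;;
        2) echo "dfs://$1:$2" ;;
        *) echo "usage: hdfs_fuse_uri <nameservice_id> | <host> <port>" >&2
           return 2 ;;
    esac
}

# Print the mount, test, and cleanup commands for a URI and mount point.
# The commands are only printed, so they can be reviewed before being run.
fuse_test_plan() {
    uri=$1
    mount_point=$2
    echo "mkdir -p $mount_point"
    echo "hadoop-fuse-dfs $uri $mount_point"
    echo "ls $mount_point"          # example test operation (an assumption)
    echo "umount $mount_point"
}
```

For example, fuse_test_plan "$(hdfs_fuse_uri nameservice1)" /mnt/hdfs prints the HA command sequence; nameservice1 and /mnt/hdfs are placeholder values, so substitute the ones from your cluster.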
You can now add a permanent HDFS mount that persists through reboots.
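
Before making the mount permanent, it can help to confirm that the test mount was actually released. The following is a minimal sketch, assuming the mountpoint utility from util-linux is available on the host. The helper names is_mounted and check_released are hypothetical.

```shell
#!/bin/sh
# Sketch only: is_mounted and check_released are hypothetical helper names.

# mountpoint(1) from util-linux exits 0 when the path is a mount point.
is_mounted() {
    mountpoint -q "$1"
}

# Report whether the test mount point has been released.
check_released() {
    if is_mounted "$1"; then
        echo "$1 is still mounted; run: umount $1"
        return 1
    fi
    echo "$1 is not mounted"
}
```

Calling check_released <mount_point> returns a non-zero status while the FUSE mount is still active, which makes it usable as a guard in a larger script.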