Scaling Namespaces and Optimizing Data Storage

Considerations for working with ViewFs mount table entries

After specifying the ViewFs mount table name, you must define the mount table entries that map physical locations in the federation to their corresponding mount points. When defining these entries, consider factors such as data access, mount levels, and application requirements.

  • Define the ViewFs mount table entries for a cluster in a separate file and reference the file using XInclude in core-site.xml.
  • For access to data across clusters, ensure that a cluster configuration contains mount table entries for all the clusters in your environment.
  • For a nested directory path, you can define separate ViewFs mounts to point to the top-level directory and the sub-directories based on requirements.

    For example, for a directory /user that contains sub-directories such as joe and jane, you can define separate ViewFs mount points for /user, /user/joe, and /user/jane.

  • For applications that work across clusters and store persistent file paths, consider defining mount paths of type viewfs://cluster/path.

    Such a path definition insulates users from data movement within the cluster, provided that the data movement is transparent.

  • When moving files from one NameNode to another inside a cluster, update the mount point corresponding to the files so that the mount point specifies the correct NameNode.

    Consider an example where the cluster administrator moves /user and /data, originally present on a common NameNode, to separate NameNodes. If the mount points for both /user and /data specified the same NameNode, namenodeContainingUserAndData, before the move, the administrator must update the mount points after the move so that they specify the separate NameNodes namenodeContainingUser and namenodeContainingData, respectively.

  • A client reads the cluster mount table when submitting a job. The XInclude in core-site.xml is expanded only at the time of job submission. Therefore, if you change the mount table entries, you must resubmit the jobs for the changes to take effect.
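
As a rough illustration of the considerations above, the following sketch shows a core-site.xml that references a separate mount table file through XInclude, followed by the mount table file itself. The mount table name clusterX and the file name mountTable.xml are hypothetical. The NameNode host names are placeholders based on the /user and /data example above, written without ports, and would need to be replaced with the actual NameNode addresses.

```xml
<!-- core-site.xml (sketch): reference the mount table file with XInclude -->
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>fs.defaultFS</name>
    <!-- "clusterX" is a hypothetical mount table name -->
    <value>viewfs://clusterX</value>
  </property>
  <!-- "mountTable.xml" is a hypothetical file name -->
  <xi:include href="mountTable.xml" />
</configuration>
```

```xml
<!-- mountTable.xml (sketch): one link entry per mount point -->
<configuration>
  <property>
    <!-- /user after the move: placeholder NameNode host name -->
    <name>fs.viewfs.mounttable.clusterX.link./user</name>
    <value>hdfs://namenodeContainingUser/user</value>
  </property>
  <property>
    <!-- /data after the move: placeholder NameNode host name -->
    <name>fs.viewfs.mounttable.clusterX.link./data</name>
    <value>hdfs://namenodeContainingData/data</value>
  </property>
</configuration>
```

In this sketch, both values would have pointed to the common NameNode (for example, hdfs://namenodeContainingUserAndData/user) before the move. Clients keep using paths such as viewfs://clusterX/user and viewfs://clusterX/data throughout, and only the link values change.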