Network File System (NFS)

A Network File System (NFS) is a protocol for accessing storage over a network in a way that emulates access to a local file system. Cloudera Machine Learning requires an NFS server for storing project files and folders, and the NFS export must be configured before you provision the first Cloudera Machine Learning Workspace in the cluster.
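As an illustration, an export for project storage might be defined in `/etc/exports` on the NFS server as follows. The export path, client subnet, and mount options here are assumptions; adjust them to match your environment and your NFS server's documentation.

```shell
# /etc/exports on the NFS server (illustrative values, not a mandated layout)
# Export a directory for project files to the cluster subnet,
# with read-write access and root access preserved for the clients.
/export/cml  10.10.0.0/16(rw,sync,no_root_squash,no_subtree_check)
```

After editing `/etc/exports`, re-export the file systems with `exportfs -ra` and confirm the export is visible with `showmount -e <nfs-server-host>`.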

Many different products and packages can provide an NFS server in your private network. A Kubernetes cluster can host an internal NFS server, or an external NFS server can be installed on another cluster that is accessible to the private cloud cluster nodes. NFS storage is used only for storing project files and folders, and not for any other Cloudera Machine Learning data, such as the PostgreSQL database and livelog files.

Cloudera Machine Learning does not support shared volumes, such as Portworx shared volumes, for storing project files. A read-write-once (RWO) persistent volume must be allocated to the internal NFS server (for example, NFS server provisioner) as the persistence layer. The NFS server uses the volume to dynamically provision read-write-many (RWX) NFS volumes for the Cloudera Machine Learning clients.
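To make the RWO/RWX distinction concrete, a client workload obtains a shared volume from the NFS provisioner through an ordinary PersistentVolumeClaim with the `ReadWriteMany` access mode. This is only a sketch: the storage class name `nfs` and the requested size are assumptions, and you would use whatever class your NFS provisioner actually registers.

```shell
# Sketch: request an RWX volume from an NFS-backed storage class.
# "storageClassName: nfs" is an assumption; substitute the class name
# registered by your NFS provisioner.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cml-project-files
spec:
  accessModes:
    - ReadWriteMany          # RWX: mountable read-write by many pods
  storageClassName: nfs
  resources:
    requests:
      storage: 100Gi
EOF
```

The provisioner itself, by contrast, sits on a single `ReadWriteOnce` volume and carves the RWX NFS volumes out of it on demand.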

An external NFS server is currently the recommended option for Cloudera Private Cloud production workloads. If you do not specify an external NFS server for your Cloudera Machine Learning Workspace, the workspace uses a deprecated internal NFS provisioner, which should only be used for small proof-of-concept deployments. There are several options for setting up an internal NFS provisioner, described in the appendix. The Cloudera Private Cloud Administrator is responsible for setting up an NFS for use by your cluster.