NFS Options for Private Cloud

Cloudera Machine Learning on Private Cloud requires a Network File System (NFS) server for storing project files and folders.

The recommended approach is to use an NFS server that is external to the cluster, such as a NetApp Filer appliance. In this case, you must manually create a directory for each workspace.
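As a minimal sketch of the manual step above, the following Python helper creates one directory per workspace under an export that is already mounted on the machine running the script. The function name `create_workspace_dir`, the export path, the workspace name, and the default mode are all hypothetical and chosen for illustration. The ownership and permissions your deployment actually requires are not stated here, so treat `uid`, `gid`, and `mode` as placeholders to be filled in from your own requirements.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def create_workspace_dir(export_root: str, workspace: str,
                         uid: Optional[int] = None,
                         gid: Optional[int] = None,
                         mode: int = 0o750) -> Path:
    """Create one directory per workspace under a locally mounted NFS export.

    export_root, workspace, uid, gid and mode are illustrative assumptions,
    not values prescribed by the product documentation.
    """
    # Reject names that could escape the export root (for example "../other").
    if (not workspace or workspace in (".", "..")
            or Path(workspace).name != workspace):
        raise ValueError(f"invalid workspace name: {workspace!r}")

    path = Path(export_root) / workspace
    # The export root must already exist; only the workspace directory is created.
    path.mkdir(parents=False, exist_ok=True)
    # mkdir's mode argument is filtered by the umask, so set the mode explicitly.
    os.chmod(path, mode)
    # Changing ownership usually requires elevated privileges on the NFS host.
    if uid is not None and gid is not None:
        os.chown(path, uid, gid)
    return path


if __name__ == "__main__":
    # Demonstration against a temporary directory instead of a real export.
    with tempfile.TemporaryDirectory() as fake_export:
        created = create_workspace_dir(fake_export, "workspace-a")
        print(f"created {created}")
```

Running the helper a second time with the same name leaves the existing directory in place, so it can be used in a repeatable provisioning script.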

The NFS server must be configured before deploying the first CML workspace in the cluster. One important limitation is that CML does not support using shared volumes for storing project files. When an internal NFS server (for example, NFS server provisioner) is used, a read-write-once (RWO) persistent volume must be allocated to it as the persistence layer. The NFS server uses this volume to dynamically provision read-write-many (RWX) NFS volumes for the CML clients.
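To illustrate the difference between the two access modes mentioned above, here is a small sketch that builds Kubernetes PersistentVolumeClaim manifests in Python and prints them as JSON. The helper `pvc_manifest`, the claim names, sizes, and storage class names are hypothetical. How the backing volume is actually allocated depends on the NFS provisioner and its installer, so this shows only the shape of an RWO claim versus an RWX claim, not a prescribed deployment procedure.

```python
import json
from typing import Optional


def pvc_manifest(name: str, access_mode: str, size: str,
                 storage_class: Optional[str] = None) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dictionary.

    All argument values used below are illustrative assumptions.
    """
    if access_mode not in ("ReadWriteOnce", "ReadWriteMany"):
        raise ValueError(f"unsupported access mode: {access_mode!r}")

    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": size}},
    }
    # Omit storageClassName to fall back to the cluster's default class.
    if storage_class is not None:
        spec["storageClassName"] = storage_class

    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # RWO claim: persistence layer for the internal NFS server (hypothetical names).
    backing = pvc_manifest("nfs-server-data", "ReadWriteOnce", "500Gi",
                           storage_class="block-storage")
    # RWX claim: the kind of volume the NFS server provisions for clients.
    client = pvc_manifest("workspace-projects", "ReadWriteMany", "100Gi",
                          storage_class="nfs")
    print(json.dumps(backing, indent=2))
    print(json.dumps(client, indent=2))
```

The only structural difference between the two manifests is the `accessModes` entry: the backing volume is mounted by a single node, while the NFS volumes built on top of it can be mounted read-write by many clients at once.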

An alternative approach is to use an internal NFS server that is deployed into the cluster. This method relies on a deprecated internal NFS provisioner and should be used only for small, proof-of-concept deployments. On OpenShift, solutions include NFS over Ceph using NFS Server Provisioner (NFS Ganesha). On ECS, Cloudera manages and deploys an NFS server that can be used for CML. The storage space for each workspace is transparently managed by the internal NFS server.