Network File System

Network File System (NFS) is a protocol for accessing storage over a network as if it were a local file system. CML requires an NFS server for storing project files and folders, and the NFS export must be configured before you provision the first CML workspace in the cluster.

Many different products or packages can provide an NFS server in your private network. A Kubernetes cluster can host an internal NFS server, or an external NFS server can run outside the cluster, provided it is accessible from the private cloud cluster nodes. NFS storage is used only for storing project files and folders, not for any other CML data such as the PostgreSQL database or livelog files.

CML does not support shared volumes, such as Portworx shared volumes, for storing project files. A read-write-once (RWO) persistent volume must be allocated to the internal NFS server (for example, NFS server provisioner) as the persistence layer. The NFS server uses the volume to dynamically provision read-write-many (RWX) NFS volumes for the CML clients.
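
As a quick illustration, you can inspect the access mode of the provisioner's backing volume; this is a sketch, assuming the cml-nfs namespace used in the deployment example below:

  $ kubectl get pvc -n cml-nfs -o custom-columns=NAME:.metadata.name,MODES:.spec.accessModes
  # Expect ReadWriteOnce here; the NFS volumes provisioned for workspaces
  # (in each workspace's own namespace) are ReadWriteMany.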

NFS Options for Private Cloud

The recommended approach is an internal NFS server which is deployed into the cluster. Solutions include NFS over Ceph or Portworx using NFS Server Provisioner (NFS Ganesha). The storage space for each workspace is transparently managed by the internal NFS server.

An alternative is to use an NFS server that is external to the cluster, such as a NetApp Filer appliance. In this case, you must manually create a directory for each workspace.

Deploying NFS Server Provisioner on Rook Ceph

As an example, you can deploy the NFS Server Provisioner using the Helm chart provided here.

  • For the nfs storage class, set allowVolumeExpansion=true.
  • For the underlying persistent volume, specify a size of 1 TiB.
  • On the block storage class, rook-ceph-block in this case, set allowVolumeExpansion=true, as shown in the sketch after this list.
  • Download two yaml files here: nfs-server-provisioner.yml and nfs-scc.yml.
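
Volume expansion is controlled by the allowVolumeExpansion field on each StorageClass. A minimal sketch of enabling it with oc patch, assuming the storage class names nfs and rook-ceph-block from the bullets above, run once the storage classes exist:

    $ oc patch storageclass nfs -p '{"allowVolumeExpansion": true}'
    $ oc patch storageclass rook-ceph-block -p '{"allowVolumeExpansion": true}'
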
  1. Install Path 1: Installing using the oc command and yaml files:

     If you do not have Tillerless Helm v2 set up, you can simply apply the nfs-server-provisioner.yml file as follows:

       $ oc create -f nfs-server-provisioner.yml -n cml-nfs
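     As an optional sanity check (a sketch; the nfs storage class name follows the bullet above), confirm that the provisioner pod is running and its storage class exists:

       $ oc get pods -n cml-nfs
       $ oc get storageclass nfs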

  2. Install Path 2: Installing using the oc command and Tillerless Helm v2:
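
     A sketch of this path, assuming the nfs-scc.yml file from above and the upstream stable/nfs-server-provisioner chart; the chart reference and the --set overrides are assumptions, not commands taken from this guide:

       # Apply the SecurityContextConstraints required by the provisioner
       $ oc create -f nfs-scc.yml
       # Install the chart with Tillerless Helm v2; the persistence values mirror
       # the sizing guidance above (1 TiB backed by rook-ceph-block)
       $ helm tiller run cml-nfs -- helm install stable/nfs-server-provisioner \
           --name cml-nfs --namespace cml-nfs \
           --set persistence.enabled=true \
           --set persistence.storageClass=rook-ceph-block \
           --set persistence.size=1Ti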


Using an External NFS Server

As an alternative, you can install an NFS server that is external to the cluster. This is not the recommended approach.

Currently, CML works only with NFS versions 3 and 4.1. The NFS client within CML must be able to mount the NFS storage with default options, and it also assumes these export options:
rw,sync,no_root_squash,no_all_squash,no_subtree_check

The no_root_squash option has security implications, which is a strong reason to choose internal NFS instead.
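
For illustration, on a generic Linux NFS server the export might be declared as follows; the client subnet 10.10.0.0/16 is a placeholder, and the mount test simply verifies the export from a cluster node:

  # /etc/exports entry with the options CML assumes
  /workspace1  10.10.0.0/16(rw,sync,no_root_squash,no_all_squash,no_subtree_check)

  # Re-export, then verify with a test mount (vers=4.1 or vers=3)
  $ exportfs -ra
  $ sudo mount -t nfs -o vers=4.1 nfs_server:/workspace1 /mnt/test && sudo umount /mnt/test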

Before creating a CML workspace, the storage administrator must create a directory that will be exported to the cluster for storing ML project files for that workspace. Either a dedicated NFS export path or a subdirectory in an existing export must be specified for each workspace.

Each CML workspace needs a unique directory that does not contain files from a different or previous workspace. For example, if 10 CML workspaces are expected, the storage administrator must create 10 unique directories: either one NFS export with 10 subdirectories inside it, or 10 separate exports. A sketch of the subdirectory layout follows.
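
A minimal sketch of the one-export-with-subdirectories layout, using placeholder paths and the ownership and permission settings described in the steps below:

  # Hypothetical: prepare 10 workspace subdirectories under one export
  $ for i in $(seq 1 10); do
      mkdir -p /export/workspace$i
      chown 8536:8536 /export/workspace$i
      chmod g+srwx /export/workspace$i
    done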

For example, to use a dedicated NFS share for a workspace named "workspace1" from NFS server "nfs_server", do the following:

  1. Create the NFS export directory /workspace1.

  2. Change ownership of the exported directory. CML accesses this directory as a user with a UID and GID of 8536, so run:

     chown 8536:8536 /workspace1

  3. Make the export directory group-writable and set the GID:

     chmod g+srwx /workspace1

  4. Provide the NFS export path nfs_server:/workspace1 when prompted by the CML Control Plane App while creating the workspace. The server-side steps are consolidated in the sketch after this list.
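
Taken together, the server-side preparation for the dedicated share is, as a sketch using the example paths above:

  $ mkdir /workspace1
  $ chown 8536:8536 /workspace1
  $ chmod g+srwx /workspace1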

To use a subdirectory in an existing NFS share, say nfs_server:/export, do the following instead:

  1. Create a subdirectory /export/workspace1.

  2. Change ownership: chown 8536:8536 /export/workspace1

  3. Set the GID and make the directory group-writable: chmod g+srwx /export/workspace1

  4. Provide the export path nfs_server:/export/workspace1 when prompted by the CML Control Plane App.


Backing up Project Files and Folders

The block device backing the NFS server data must be backed up to protect the CML project files and folders. The backup mechanism varies depending on the underlying block storage system and the backup policies in place.

  1. To identify the underlying block storage to back up, first determine the NFS PV:

     $ echo `kubectl get pvc -n cml-nfs -o jsonpath='{.items[0].spec.volumeName}'`
     pvc-bec1de27-753d-11ea-a287-4cd98f578292
  2. For Rook Ceph, the RBD volume/image name is the name of the dynamically created persistent volume (for example, pvc-bec1de27-753d-11ea-a287-4cd98f578292 above).

Ensure this volume is backed up using an appropriate backup policy.
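
To locate the corresponding RBD image before backing it up, a sketch using the Rook toolbox; the rook-ceph namespace, the rook-ceph-tools deployment, and the replicapool pool name are assumptions based on a default Rook Ceph install:

  $ kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rbd ls replicapool | grep pvc-bec1de27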

Uninstalling the NFS Server

Uninstall the NFS server provisioner using one of the following procedures, depending on how it was installed.

Use these commands if the NFS server provisioner was installed using oc and yaml files:
$ oc delete scc nfs-scc
$ oc delete clusterrole cml-nfs-nfs-server-provisioner 
$ oc delete clusterrolebinding cml-nfs-nfs-server-provisioner 
$ oc delete namespace cml-nfs
Use these commands if the NFS server provisioner was installed using Helm:
$ helm tiller run cml-nfs -- helm delete cml-nfs --purge
$ oc delete scc nfs-scc
securitycontextconstraints.security.openshift.io "nfs-scc" deleted