Other NFS Options
As an alternative to Azure Files NFS or Azure NetApp Files, you can configure an NFS server that is external to the Cloudera Machine Learning cluster. This is not the recommended approach.
Make sure to check the following points:
- Cloudera Machine Learning requires an NFS service that is backed by a fully POSIX compliant file system. NFS service implemented over S3-like blob storage, such as NFS v3 over ADLS, is not supported.
- Currently, Cloudera Machine Learning supports NFS versions 4.1 and 3. The NFS client within Cloudera Machine Learning must be able to
mount the NFS storage with default options, and also assumes these export options:
rw,sync,no_root_squash,no_all_squash,no_subtree_check
To prevent data loss on disk failure, it is recommended that the NFS export directory resides on a RAID disk. Before attempting to provision your Cloudera Machine Learning Workspace, create a Linux VM in the subnet you will use for Cloudera Machine Learning. - Before creating a Cloudera Machine Learning Workspace, the storage administrator must create a directory that will be exported to the cluster for storing Cloudera Machine Learning project files for that workspace. Either a dedicated NFS export path, or a subdirectory in an existing export must be specified for each workspace. If the directory was previously used for another workspace, there may be extraneous files remaining there. Make sure that all such remaining files are deleted.
- Each Cloudera Machine Learning Workspace needs a unique directory that does not have files in it from a different or previous workspace. For example, if 10 Cloudera Machine Learning Workspace are expected, the storage administrator will need to create 10 unique directories. Either one NFS export and 10 subdirectories within it need to be created, or 10 unique exports need to be created.
Create a workspace in a dedicated NFS share
For example, to use a dedicated NFS share for a workspace named “workspace1” from NFS server “nfs_server”, do the following:
-
Create NFS export directory “/workspace1”.
-
Change ownership for the exported directory
-
Cloudera Machine Learning accesses this directory as a user with a UID and GID of 8536. So run chown 8536:8536 /workspace1
- Make the export directory group-writeable and set GID: chmod g+srwx /workspace1
-
-
Provide the NFS export path nfs_server:/workspace1 when prompted by the Cloudera Machine Learning Control Plane App while creating the workspace.
-
To use a subdirectory in an existing NFS share, say nfs_server:/export, do the following:
-
Create a subdirectory /export/workspace1
-
Change ownership: chown 8536:8536 /export/workspace1
-
Set GID and make directory group writeable: chmod g+srwx /export/workspace1
-
Provide the export path nfs_server:/export/workspace1 when prompted by the Cloudera Machine Learning Control Plane App.
-