Data connections that point to data sources outside of Cloudera Data Platform or require custom
configurations can be created and made available to end users with Custom Data Connections.
These Python implementations of the Cloudera Machine Learning Data library are stored in the Data Connections
Registry. Workspace users can track and connect to any data source and connection
implementation a Cloudera Machine Learning Administrator makes available.
Consider the following:
- Custom connections can only be created in projects created by the
Administrator.
- The project source selection list in the Data Connection creation dialogue only
displays projects created by the user.
- Team projects and projects with multiple collaborators are not displayed either;
only projects created directly by the user appear.
- Custom connections at workspace level can only be edited by their creator, not by
other Administrator users. Attempts by other users to edit workspace-level custom
connections result in an error.
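The project visibility rules above can be summarized as a small predicate. This is only an illustrative model of the documented behavior, not Cloudera Machine Learning code: the `Project` class, its fields, and `selectable_projects` are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    # Hypothetical model of a project; not a Cloudera Machine Learning type.
    name: str
    creator: str
    is_team_project: bool = False
    collaborators: list = field(default_factory=list)


def selectable_projects(projects, user):
    """Return the projects a user would see in the project source list.

    Mirrors the rules stated above: the project must have been created
    directly by the user, must not be a Team project, and must not have
    additional collaborators.
    """
    return [
        p for p in projects
        if p.creator == user and not p.is_team_project and not p.collaborators
    ]


projects = [
    Project("pg-connection", creator="admin1"),
    Project("shared-connection", creator="admin1", collaborators=["dev1"]),
    Project("team-connection", creator="admin1", is_team_project=True),
    Project("other-connection", creator="admin2"),
]
visible = [p.name for p in selectable_projects(projects, "admin1")]
```

In this example only `pg-connection` passes all three checks for `admin1`, which matches the restriction that custom connection code must live in a project the Administrator created personally.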
Before setting up a custom connection, you might want to create a dedicated Cloudera Machine Learning Team
to collaborate on external connections. A good practice is to keep the
connection code in separate projects and configure collaborators at the Team level to build
and maintain the connection code.
- Develop your own custom data connection (see Developing a Custom Data
  Connection) in a Cloudera Machine Learning project, or clone an existing custom data connection files
  directory into a Cloudera Machine Learning project.
- In , select New Connection.
- Enter the connection name. You cannot have duplicate names for data connections
  within a workspace or within a given project.
- Select the connection type: Custom Connection.
- Enter the Type Display name. This should be a descriptive label to help Cloudera Machine Learning
  project owners identify what this custom connection could be used for.
- Select the Cloudera Machine Learning Project and Project directory which contains your custom
  connection implementation.
  - Connection files must be in a directory and not in the root of your
    project.
  - A snapshot of all implementation files in the directory will be uploaded
    to the Cloudera Machine Learning Custom Data Connection registry located in the
    workspace.
  - These uploaded files are safe from any changes to the originating
    project. To make changes to the files, create a new
    custom data connection.
- (Optional) Enter any custom parameters. These are available during a
  session and can be validated or overridden depending on the interface
  implementation for the custom data connection. Refer to the implementation of
  your custom data connection for specific details on required keys and
  values.
- Click Create.
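The optional custom parameters in the steps above are key-value pairs that a connection implementation may validate or override. As a rough illustration of that idea, the following minimal sketch merges the parameters entered at connection creation with session-level overrides and checks for required keys. It does not use the Cloudera Machine Learning Data library; `resolve_parameters` and the keys `HOST` and `PORT` are assumptions made up for this example, and a real implementation defines its own keys and rules.

```python
def resolve_parameters(connection_params, session_overrides, required_keys):
    """Merge parameters and verify that every required key has a value.

    connection_params: values entered when the connection was created.
    session_overrides: values supplied later, which take precedence.
    required_keys: keys this hypothetical implementation insists on.
    """
    # Later values win, so session overrides replace connection-level values.
    merged = {**connection_params, **session_overrides}

    # Treat missing keys and empty values the same way.
    missing = sorted(key for key in required_keys if not merged.get(key))
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return merged


# Hypothetical keys; check your own implementation for the real ones.
params = resolve_parameters(
    {"HOST": "db.example.com", "PORT": "5432"},
    {"PORT": "6432"},
    ["HOST", "PORT"],
)
```

Failing early with a clear message about the missing key tends to be easier for project users to act on than an opaque driver error raised later in a session.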
The data connection is now available to all users. To change availability, click the
Available switch. This switch determines whether the data
connection is displayed in Projects created within the workspace. Refer to Data
connection management for details on the availability of your newly created custom
connections in new and existing Cloudera Machine Learning Projects.