Set up a Custom Data Connection

Data connections that point to data sources outside of CDP or require custom configurations can be created and made available to end users with Custom Data Connections. These Python implementations of the CML Data library are stored in the Data Connections Registry. Workspace users can track and connect to any data source and connection implementation a CML Administrator makes available.

Keep in mind the following:

  • Custom connections can only be created in projects created by the Administrator.
  • The project source selection list in the Data Connection creation dialogue only displays projects created by the user.
  • Team projects or projects with multiple collaborators will also not be displayed, only those directly created by the user.
  • Custom connections at workspace level can only be edited by the creator, not other Administrator users. Attempts at editing workspace level custom connections will result in an error.

Before setting up a custom connection, you might want to create a dedicated CML Team to collaborate on external connections. A good practice is to separate the connection code projects and and configure collaborators on the Team level to build and maintain the connection code.

  1. Develop your own custom data connection (see Developing a Custom Data Connection) in a CML project, or clone an existing custom data connection files directory into a CML project.
  2. In Site Administration > Data Connections, select New Connection.
  3. Enter the connection name. You cannot have duplicate names for data connections within a workspace or within a given project.
  4. Select the connection type: Custom Connection
  5. Enter the Type Display name. This should be a descriptive label to help CML project owners identify what this custom connection could be used for.
  6. Select the CML Project and Project directory which contains your custom connection implementation
    1. Connection files must be in a directory and not in the root of your project.
    2. A snapshot of all implementation files in the directory will be uploaded to the CML Custom Data Connection registry located in the workspace.
    3. These uploaded files are safe from any changes to the originating project. To make changes to the files, create a new custom data connection.
  7. (Optional) Enter any custom parameters. These are available during a session and can be validated or overridden depending on the interface implementation for the custom data connection. Refer to the implementation of your custom data connection for specific details on required keys and values.
  8. Click Create.
The data connection is now available to all users. To change availability, click the Available switch. This switch determines if the data connection is displayed in Projects created within the workspace. Refer to Data connection management for availability of your newly created custom connections in new and existing CML Projects (https://docs.cloudera.com/machine-learning/cloud/mlde/topics/ml-mlde-data-conn-mgmt.html)