Configure Databricks Metadata Source in Cloudera Octopai

Learn how to configure the Databricks Metadata Source in Cloudera Octopai using either user authentication with Personal Access Tokens or machine-to-machine authentication with service principals.

Cloudera Octopai Data Lineage supports two authentication methods for connecting to Databricks:

  • User authentication using a Personal Access Token
  • Machine-to-machine (M2M) authentication using a service principal

Option 1: User authentication token (Personal Access Token)



Configure the following settings when using the Personal Access Token authentication method:
  1. Unity Catalog Options
    • HMS only – when Databricks uses Hive Metastore without Unity Catalog.
    • Unity Catalog (can contain HMS) – when Databricks uses Unity Catalog. Hive Metastore can be used alongside it but is not required.
  2. Connection Name

    Assign a clear and meaningful name to the connection. This name is shown to users in the Cloudera Octopai platform.

  3. Databricks Server URL

    Enter the URL of your Databricks workspace.

    Example: https://abc-1234.5.azuredatabricks.net

  4. Token

    Enter the Personal Access Token generated under Settings > Developer > Access Tokens (Manage) in Databricks.

  5. HTTP Path

    Paste the HTTP Path copied from the Databricks SQL Warehouse > Connection Details field.

    Example: /sql/1.0/warehouses/abc123xyz

  6. Workspace ID (for Unity Catalog only)
  7. Account ID (for Unity Catalog only, optional)
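
Before entering the Option 1 values in the form, it can help to sanity-check their shape. The following is a minimal sketch, not part of Cloudera Octopai: the `PatSettings` class and the `check_pat_settings` function are hypothetical names, and the check assumes that a SQL Warehouse HTTP Path follows the `/sql/1.0/warehouses/<id>` pattern shown in the example above.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

# Assumption: SQL Warehouse HTTP Paths share this prefix, as in the example above.
WAREHOUSE_PREFIX = "/sql/1.0/warehouses/"


@dataclass
class PatSettings:
    """Hypothetical container; the fields mirror the form labels for Option 1."""
    connection_name: str
    server_url: str
    token: str
    http_path: str


def check_pat_settings(settings: PatSettings) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not settings.connection_name.strip():
        problems.append("Connection Name is empty")
    parsed = urlparse(settings.server_url)
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append("Databricks Server URL should start with https://")
    if not settings.token.strip():
        problems.append("Token is empty")
    path = settings.http_path
    if not path.startswith(WAREHOUSE_PREFIX) or len(path) == len(WAREHOUSE_PREFIX):
        problems.append("HTTP Path should look like /sql/1.0/warehouses/<id>")
    return problems


if __name__ == "__main__":
    example = PatSettings(
        connection_name="Databricks production",
        server_url="https://abc-1234.5.azuredatabricks.net",
        token="placeholder-token",  # never hard-code a real token
        http_path="/sql/1.0/warehouses/abc123xyz",
    )
    print(check_pat_settings(example))
```

The check only looks at the format of each value. It does not contact Databricks, so a well-formed but expired token still passes.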

Option 2: Machine-to-machine authentication (service principal)



Configure the following settings when using the service principal authentication method:
  1. Unity Catalog Options
    • HMS only – when Databricks uses Hive Metastore without Unity Catalog.
    • Unity Catalog (may include HMS) – when Databricks uses Unity Catalog. Hive Metastore can be used alongside it but is not required.
  2. Connection Name

    Assign a clear and meaningful name to the connection. This name is shown to users in the Cloudera Octopai platform.

  3. Databricks Server URL

    Enter the URL of your Databricks workspace.

    Example: https://abc-1234.5.azuredatabricks.net

  4. Client ID

    Enter the Client ID of the service principal created in Databricks.

  5. Client Secret

    Enter the OAuth secret generated for the service principal in Databricks.

  6. HTTP Path (for Unity Catalog only)

    Paste the HTTP Path copied from the Databricks SQL Warehouse > Connection Details field.

    Example: /sql/1.0/warehouses/abc123xyz

  7. Workspace ID (for Unity Catalog only)
  8. Account ID (for Unity Catalog only, optional)
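
To confirm that a Client ID and Client Secret pair is valid before entering it, you can request a token from Databricks yourself. The following is a minimal sketch under stated assumptions: it uses the OAuth client-credentials flow against the workspace-level `/oidc/v1/token` endpoint with the `all-apis` scope. Whether Cloudera Octopai uses this exact flow internally is not stated here, and `build_token_request` is a hypothetical helper name.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(server_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    # Assumption: workspace-level token endpoint for Databricks OAuth M2M.
    token_url = server_url.rstrip("/") + "/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The Client ID and Client Secret travel as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_token_request(
        "https://abc-1234.5.azuredatabricks.net",
        "example-client-id",      # placeholder value
        "example-client-secret",  # placeholder value
    )
    print(request.get_method(), request.full_url)
```

The sketch builds the request without sending it. Passing the result to `urllib.request.urlopen` would perform the call; under the same assumptions, a successful response is a JSON document containing an `access_token` field, and an HTTP error suggests that the Client ID or Client Secret is wrong.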