Creating a Data Share with CDP CLI

Learn how to register external clients in Cloudera on cloud and create Data Shares using CDP CLI commands. This process involves provisioning credentials for external users and managing data sharing through a series of CLI commands. Ensure prerequisites are met and follow the steps to securely share data assets with external users.

Resource owners or Data Share administrators who want to share their Iceberg tables in Cloudera with external clients must first register the external client in the Cloudera on cloud environment using the cdp datacatalog create-external-users CDP CLI command. This provisions a CLIENT_ID and CLIENT_SECRET for the external user.

After registering the external user, the resource owner creates a Data Share using the cdp datacatalog create-data-share CDP CLI command. The command packages specified data assets (Iceberg tables) into a shareable unit and optionally grants access to registered external users at creation time using the --external-users parameter.

The CDP CLI also provides commands to manage the full Data Share lifecycle, including listing, updating, activating, deactivating, and deleting shares, as well as managing asset membership and external user access.

  • Users who run the token generation commands must be part of the Knox admin users and groups configuration. For more information, see Knox configuration in gateway-site.xml. The DataShareAdmin resource role includes the knoxAdmin role. For more information, see Providing access to users.
  • Run all commands within the network of your Cloudera Runtime or through a VPN.
  • For Cloudera on cloud environments, you can alternatively register external users using the Cloudera Data Catalog user interface. For more information, see Creating external users.
  • CDP CLI is installed and configured. For more information, see CLI client setup.
Ensure that you have the following information before performing the steps:
Share Admin user and password
Username and password of the Cloudera Administrator. For more information, see Cloudera account administrator.
Data Lake name
Go to Management Console > Environments > [***YOUR_ENVIRONMENT_NAME***] > Data Lake Details and make a note of the Data Lake name.
Figure 1. Data Lake Details

Data Share management commands

The following CDP CLI commands are available for Data Share management:

  • cdp datacatalog create-external-users — Creates external user accounts for individuals outside Cloudera, generating a CLIENT_ID and CLIENT_SECRET for each user.
  • cdp datacatalog list-external-users — Lists external users registered for data sharing, with optional filtering and pagination.
  • cdp datacatalog revoke-external-user-credentials — Revokes the active credentials for an external user.
  • cdp datacatalog regenerate-external-user-credentials — Issues a new set of credentials for an external user, invalidating the old ones.
  • cdp datacatalog delete-external-user — Permanently deletes an external user and removes their access to all data shares.
  • cdp datacatalog create-data-share — Creates a new data share and packages specified data assets into a shareable unit.
  • cdp datacatalog list-data-shares — Lists all available data shares within a specified Data Lake.
  • cdp datacatalog get-data-share — Retrieves the full details of a specific data share, including its assets and user access list.
  • cdp datacatalog update-data-share — Updates the metadata for an existing data share, such as its name, keywords, or expiration.
  • cdp datacatalog delete-data-share — Permanently deletes a data share.
  • cdp datacatalog share-data-share — Activates a data share, making its assets available to the configured external users.
  • cdp datacatalog unshare-data-share — Deactivates a data share, making its assets temporarily unavailable.
  • cdp datacatalog add-assets-to-data-share — Adds new data assets, such as tables or views, to an existing data share.
  • cdp datacatalog remove-assets-from-data-share — Removes one or more assets from an existing data share by resource ID.
  • cdp datacatalog grant-access-to-external-users-on-data-share — Grants one or more external users access to a data share, with an optional expiration.
  • cdp datacatalog update-access-of-external-users-on-data-share — Adds external users to a data share or updates their access expiration time.
  • cdp datacatalog remove-access-of-external-users-on-data-share — Removes one or more external users' access from a specific data share.
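
This page describes the parameters of only two of the commands listed above. For the others, the CLI's built-in help is the safest source for the exact flags. The following is a minimal sketch that prints the help invocation for each share lifecycle command. It assumes the usual cdp <service> <command> help form, which you should confirm against your CLI version. The function name is hypothetical.

```shell
# Sketch: print (not run) the help invocation for each lifecycle command.
# Assumption: the CDP CLI accepts "help" as the last word of a command.
lifecycle_help_commands() {
    for cmd in list-data-shares get-data-share update-data-share \
               share-data-share unshare-data-share delete-data-share; do
        printf 'cdp datacatalog %s help\n' "$cmd"
    done
}

# Print the commands, then copy the one you need into your terminal.
lifecycle_help_commands
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine where the CLI is not yet configured.
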
  1. Instead of using the Cloudera Data Catalog user interface, you can register external users directly with the cdp datacatalog create-external-users CDP CLI command.

    Run the following command to create one or more external users:

    cdp datacatalog create-external-users \
        --datalake-crn "[***DATALAKE-CRN***]" \
        --environment-crn "[***ENVIRONMENT-CRN***]" \
        --external-users email=[***EMAIL***],username=[***USERNAME***],companyName=[***COMPANY***]

    The command accepts the following parameters for each entry in --external-users:

    email
    Email address of the external user.
    username
    Username for the external user account.
    companyName
    Name of the external user's organization.

    On success, the command returns the created user objects including the generated credentials:

    {
        "externalUsers": [
            {
                "userId": 51,
                "username": "[***USERNAME***]",
                "email": "[***EMAIL***]",
                "companyName": "[***COMPANY***]",
                "clientId": "[***CLIENT_ID***]",
                "secret": "[***SECRET***]",
                "createdAt": "2025-07-29T14:07:05.742000+00:00",
                "error": ""
            }
        ]
    }
    • To verify via Ranger Audits:

      1. To verify that the CLIENT_ID (Token ID) was generated successfully, go to Cloudera Management Console > [***ENVIRONMENT-NAME***] > [***DATALAKE-NAME***] > Ranger > Audits > Admin. External user creation events are shown as User created [***CLIENT_ID***].
        Figure 2. Client ID verification in Ranger
    • To verify via Data Catalog user interface:

      1. In Cloudera Data Catalog, go to Manage Users.
      2. Select the target Data Lake from the dropdown at the top of the page. The user list shows all existing external users for the selected Data Lake, including their Client ID, email address, company name, associated shares, and registration date.
        Figure 3. List of external users
    • External user Client IDs are also visible in Knox under Cloudera Manager > Environment > [***YOUR_DATALAKE***] > Token Integration > Token Management.
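
The userId values returned in step 1 are needed later as externalUserId when granting access. The following is a minimal sketch for pulling them out of a saved response. It assumes the response has the pretty-printed shape shown above and was redirected to a file. The file name and both function names are hypothetical. A JSON-aware tool such as jq would be more robust if it is available.

```shell
# Sketch: extract userId values from a saved create-external-users response.
# The response also contains the client secret, so treat the file as sensitive.

extract_user_ids() {
    # Match '"userId": <digits>' and keep only the digits, one ID per line.
    grep -o '"userId"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$1" \
        | grep -o '[0-9][0-9]*$'
}

to_external_user_entries() {
    # Turn IDs on stdin into externalUserId=<id> shorthand entries.
    sed 's/^/externalUserId=/'
}

# Usage (hypothetical file name):
#   cdp datacatalog create-external-users ... > external-users.json
#   extract_user_ids external-users.json | to_external_user_entries
```

This relies on pattern matching rather than real JSON parsing, so it only holds while userId remains a plain integer field in the output.
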

  2. After verifying the Client IDs, create a Data Share using the CDP CLI command:
    cdp datacatalog create-data-share \
        --datalake-crn "[***DATALAKE-CRN***]" \
        --environment-crn "[***ENVIRONMENT-CRN***]" \
        --data-share-name "[***DATA-SHARE-NAME***]" \
        --assets databaseName=[***DATABASE-NAME***],tableName=[***TABLE-NAME***],guid=[***ASSET-GUID***]

    The command accepts the following required parameters:

    --datalake-crn
    The CRN of the source Data Lake.
    --environment-crn
    The CRN of the associated CDP environment.
    --data-share-name
    A unique name for the new data share (maximum 512 characters).
    --assets
    The list of data tables to include in the share, specified as databaseName=[***DATABASE-NAME***],tableName=[***TABLE-NAME***],guid=[***ASSET-GUID***]. Separate multiple entries with spaces.

    You can also specify the following optional parameters: --summary, --terms-of-use, --keywords, --expiry-time, and --external-users. Use --external-users to grant access to one or more external users at creation time. Each entry uses the shorthand format externalUserId=[***USER-ID***],expiryTime=[***YYYY-MM-DDTHH:MM:SS***]. Separate multiple entries with spaces. The externalUserId is the userId returned when creating external users with cdp datacatalog create-external-users.

    cdp datacatalog create-data-share \
        --datalake-crn "crn:cdp:datalake:us-west-1:..." \
        --environment-crn "crn:cdp:environments:us-west-1:..." \
        --data-share-name "q3_marketing_data" \
        --assets databaseName=marketing,tableName=campaign_results,guid=5eb2d66b-d47d-402b-a26c-de77b403ef6a \
        --summary "Q3 marketing campaign results" \
        --expiry-time 2025-08-20T06:30:00 \
        --external-users externalUserId=344

    On success, the command returns a JSON object with the identifiers for the new share:

    {
        "dataShareId": 1,
        "dataShareName": "q3_marketing_data",
        "datalakeCrn": "crn:cdp:datalake:us-west-1:..."
    }
    • To verify via Ranger Audits:

      1. To verify that the Data Share was created successfully, go to Cloudera Management Console > [***ENVIRONMENT-NAME***] > [***DATALAKE-NAME***] > Ranger > Audits > Admin. Data Share creation is recorded as three consecutive Ranger audit events:
        • DataShare in Dataset created
        • Data Share created [***DATASET SERVICE NAME***]
        • Dataset create [***DATASHARE NAME***]
        Figure 4. Data Share verification in Ranger
    • To verify via Data Catalog user interface:

      1. In Cloudera Data Catalog, go to Data Sharing. The new Data Share is listed on the All Shares page in Not Shared status until you publish it.
        Figure 5. All Shares page
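
The shorthand entries for --assets and --external-users in step 2 are easy to mistype when a share covers several tables or users. The following is a minimal sketch of helpers that build each entry and check the expiry timestamp layout before the command runs. All function names are hypothetical. The flags and entry formats are the ones described in step 2.

```shell
# Sketch: build shorthand entries for cdp datacatalog create-data-share.

asset_entry() {
    # Usage: asset_entry DATABASE TABLE GUID
    if [ "$#" -ne 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "usage: asset_entry DATABASE TABLE GUID" >&2
        return 1
    fi
    printf 'databaseName=%s,tableName=%s,guid=%s\n' "$1" "$2" "$3"
}

valid_expiry() {
    # Checks only the YYYY-MM-DDTHH:MM:SS layout, not calendar validity.
    printf '%s\n' "$1" \
        | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}$'
}

external_user_entry() {
    # Usage: external_user_entry USER_ID [EXPIRY_TIME]
    if [ -z "${1:-}" ]; then
        echo "usage: external_user_entry USER_ID [EXPIRY_TIME]" >&2
        return 1
    fi
    if [ -n "${2:-}" ]; then
        valid_expiry "$2" || { echo "bad expiry time: $2" >&2; return 1; }
        printf 'externalUserId=%s,expiryTime=%s\n' "$1" "$2"
    else
        printf 'externalUserId=%s\n' "$1"
    fi
}

# Usage (multiple entries are separated by spaces, as described in step 2):
#   cdp datacatalog create-data-share \
#       --datalake-crn "[***DATALAKE-CRN***]" \
#       --environment-crn "[***ENVIRONMENT-CRN***]" \
#       --data-share-name "q3_marketing_data" \
#       --assets "$(asset_entry marketing campaign_results 5eb2d66b-d47d-402b-a26c-de77b403ef6a)" \
#       --external-users "$(external_user_entry 344 2025-08-20T06:30:00)"
```

Building each entry in one place means a malformed timestamp fails locally with a clear message instead of surfacing as a CLI error.
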