Creating a Data Share

Learn how resource owners or Data Share administrators can share Iceberg tables in Cloudera by registering external clients in Cloudera on cloud and configuring Ranger policies.

Resource owners or Data Share administrators who want to share their Iceberg tables in Cloudera with external clients must first register the client in the Cloudera on cloud environment. After that, the resource owner must configure Ranger policies that allow access for the external client. By creating Ranger groups for your registered users and attaching Ranger policies for the shared tables to these groups, you create a logical unit called a Data Share.

Registering external clients in Cloudera on cloud

Learn how to register external clients in Cloudera on cloud to provision a CLIENT_ID and CLIENT_SECRET.

Ensure that you have the following information before performing the steps:
Share Admin user and password
Username and password of the Cloudera Administrator. For more information, see Cloudera account administrator.
Knox hostname
To get the Knox hostname, go to Cloudera Manager > Knox > Instances, and copy the hostname for the Knox Gateway role.
Data Lake name
Go to Management Console > Environments > <***YOUR_ENVIRONMENT_NAME***> > Data Lake Details and copy and make a note of the Data Lake name.
  1. Create the CLIENT_ID and SECRET in Knox by running the following command:
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] "https://[***KNOX-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=[***ADDITIONAL INFO***]&md_contact=[***EMAIL_ID***]&md_role=[***ROLE OF THE CLIENT***]&md_type=[***CLIENT_ID***]" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
    • doAs=external.user - Requests the token on behalf of external.user
    • comment - Additional comments on the client
    • md_contact - Client contact metadata, for example, an email address
    • md_role - Role for the CLIENT_ID
    • md_type - Sets the value of CLIENT_ID
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] "https://my-datalake-name.int.cldr.work:8443/my-datalake-name/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=carriers&md_contact=client_name@company.com&md_role=UnitedAirlinesRole&md_type=[***CLIENT_ID***]" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
    
    CLIENT_ID: 462babd3-fe5a-4abf-8b47-526897677ad5 SECRET: TkRZeVltRmlaRE10Wm1VMVlTMDBZV0ptTFRoaU5EY3ROVEkyT0RrM05qYzNZV1ExOjpaalV4WWpReFkyWXRObVV6WlMwME4yTm1MVGcyWWpFdE9XSXhZekE0T1RZMU9XTmw=
    For Enterprise Data Lakes, use the following command:
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] "https://[***LOAD BALANCER***]/[***DATALAKE-NAME***]/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=[***ADDITIONAL INFO***]&md_contact=[***EMAIL_ID***]&md_role=[***ROLE OF THE CLIENT***]&md_type=[***CLIENT_ID***]" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] "https://my-loadbalancer-1745504451479-b5765eaee5b22d08.elb.us-west-2.amazonaws.com/dldamedi-28qgc9/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=carriers&md_contact=client_name@company.com&md_role=UnitedAirlinesRole&md_type=[***CLIENT_ID***]" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
  1. After verifying the CLIENT_ID, create a Ranger group that represents the Data Share. Use the CLIENT_ID as the name of the group and create it with the following command:
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "[***CLIENT_ID***]", "description": "group representing a share for a CLIENT_ID"}'
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-datalake.int.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "462babd3-fe5a-4abf-8b47-526897677ad5", "description": "group representing a share for a CLIENT_ID"}'
    For Enterprise Data Lakes, use the following command:
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "[***CLIENT_ID***]", "description": "group representing a share for a CLIENT_ID"}'
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-gateway-hostname.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "462babd3-fe5a-4abf-8b47-526897677ad5", "description": "group representing a share for a CLIENT_ID"}'
  2. Create a new Role and add the created Group to the Role in Ranger with the following command:
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "[***CLIENT_ROLE***]", "description": "[***CLIENT_ROLE DESCRIPTION***]", "groups": [ { "name": "[***CLIENT_ID CREATED***]", "isAdmin": false } ] }'

    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-ranger-hostname.int.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "SalesDeptRole", "description": "SalesRole description", "groups": [ { "name": "462babd3-fe5a-4abf-8b47-526897677ad5", "isAdmin": false } ] }'
    For Enterprise Data Lakes, use the following command:
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***GATEWAY-HOST-NAME***]/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "[***CLIENT_ROLE***]", "description": "[***CLIENT_ROLE DESCRIPTION***]", "groups": [ { "name": "[***CLIENT_ID CREATED***]", "isAdmin": false } ] }'
    curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-gateway-hostname.int.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "TestRole1", "description": "TestRole1 description", "groups": [ { "name": "test-group1", "isAdmin": false } ] }'

    After running the commands, you can see the roles in Ranger:

    Figure 1. Cloudera Management Console > Environments > [***YOUR_ENVIRONMENT***] > Apache Ranger > Settings > Roles
  3. Optional: Add the Group to an existing Role with the following commands:
    • Get the RoleId for the role name from Ranger Admin through the API:

      curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X GET "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/name/[***ROLE_NAME***]" | jq -r '"RoleId: \(.id)"'

    • Add the Group to the Role using the RoleId:

      curl -k -u [***CLOUDERA_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X PUT "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/[***ROLE_ID***]" -d '{ "name": "[***CLIENT_ROLE***]", "description": "[***CLIENT_ROLE DESCRIPTION***]", "groups": [ { "name": "[***CLIENT_ID CREATED***]", "isAdmin": false } ] }'
      

The registration process provisions a CLIENT_ID and CLIENT_SECRET, creates a Ranger role, and adds the CLIENT_ID as a group to that role. You can verify the creation of your Ranger groups and users in Cloudera Management Console > Environments > [***YOUR_ENVIRONMENT***] > Apache Ranger > Audits > Admin.
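
The registration steps feed into each other: the CLIENT_ID printed by the token command becomes the Ranger group name, and that group is then added to a role. The following is a minimal sketch of that chaining, not part of the product tooling. The function names (group_payload, role_payload), the variable names, and the sample output line (including its placeholder secret) are hypothetical; only the JSON fields and the "CLIENT_ID: ... SECRET: ..." output shape come from the commands above. The sketch only assembles and prints the request bodies and sends nothing.

```shell
# JSON body for the Ranger group named after the CLIENT_ID.
# Values are inserted as-is, so they must not contain double quotes.
group_payload() {
  printf '{"name": "%s", "description": "group representing a share for a CLIENT_ID"}\n' "$1"
}

# JSON body for a role that contains the group.
# Arguments: role name, role description, group name.
role_payload() {
  printf '{"name": "%s", "description": "%s", "groups": [{"name": "%s", "isAdmin": false}]}\n' \
    "$1" "$2" "$3"
}

# Made-up sample of the one-line output produced by the jq filter.
output='CLIENT_ID: 462babd3-fe5a-4abf-8b47-526897677ad5 SECRET: c2FtcGxlLXNlY3JldA=='

# Split the line into two variables with POSIX parameter expansion.
rest=${output#CLIENT_ID: }          # drop the leading label
CLIENT_ID=${rest%% SECRET: *}       # keep everything before " SECRET: "
CLIENT_SECRET=${output##* SECRET: } # keep everything after " SECRET: "

# Print the bodies for the group and role requests (the secret is not printed).
group_payload "$CLIENT_ID"
role_payload SalesDeptRole "SalesRole description" "$CLIENT_ID"
```

The printed bodies could then be passed to the -d option of the curl commands above, for example -d "$(group_payload "$CLIENT_ID")", which avoids copying the CLIENT_ID by hand between steps.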

Managing Ranger policies

Learn how to manage your Ranger policies to authorize access for your external users.

The Ranger Administrator must maintain policies on the databases and tables to be shared so that the Ranger role and group have read access to these assets.

  1. In the Allow Conditions, grant the SELECT and SHOW permissions on the databases or tables to be shared to provide read-only access.
  2. Go to Apache Ranger > Resource Policies > Hadoop SQL and create a policy for the assets to be shared.
Create a Data Share policy in Ranger for the Role created in Registering external clients in Cloudera on cloud with the following command:
curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/policy/" -d '{"service":"hive_service_name", "policyType": 0, "name": "Iceberg Table Policy", "description": "Policy for SELECT access to a CLIENT_ID", "isEnabled": true, "resources": { "database": { "values": ["[***DATABASE_NAME***]"] }, "table": { "values": ["[***TABLE_NAME***]"] } ,"column": { "values": ["*"] } } , "policyItems": [ { "accesses": [ { "type": "select" } ], "users": [], "groups":[], "roles": ["[***CLIENT_ROLE***]"], "conditions": [] } ] }'
curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://dldanew-vxtt5w-master0.dldanew.svbr-nqvp.int.cldr.work:8443/dldanew-vxtt5w/cdp-share-management/ranger/service/public/v2/api/policy/" -d '{"service":"cm_hive", "policyType": 0, "name": "Hive Table Policy", "description": "Policy for SELECT access to an external user", "isEnabled": true, "resources": { "database": { "values": ["emp_data"] }, "table": { "values": ["employees"] } ,"column": { "values": ["*"] } } , "policyItems": [ { "accesses": [ { "type": "select" } ], "users": [], "groups":[], "roles": ["testrole13"], "conditions": [] } ] }'
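
Because the policy body is long, a quoting or bracket mistake is easy to make when editing it by hand. The following is a minimal sketch that builds the body from a few arguments. The function name policy_payload and its argument order are hypothetical, and the optional description field is left out for brevity; the remaining fields mirror the command above. The sketch prints the body and sends nothing.

```shell
# Print the JSON body for a policy that grants select access to a role.
# Arguments: Ranger service name, policy name, database, table, role.
# Values are inserted as-is, so they must not contain double quotes.
policy_payload() {
  printf '{"service": "%s", "policyType": 0, "name": "%s", "isEnabled": true, ' "$1" "$2"
  printf '"resources": {"database": {"values": ["%s"]}, "table": {"values": ["%s"]}, "column": {"values": ["*"]}}, ' "$3" "$4"
  printf '"policyItems": [{"accesses": [{"type": "select"}], "users": [], "groups": [], "roles": ["%s"], "conditions": []}]}\n' "$5"
}

# Reproduce the values from the example command above.
policy_payload cm_hive "Hive Table Policy" emp_data employees testrole13
```

The output could be passed to curl with -d "$(policy_payload cm_hive "Hive Table Policy" emp_data employees testrole13)". Note that the database, table, and role values are emitted as JSON arrays, matching the example command.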