Creating a Data Share

Learn how resource owners or Data Share administrators can share Iceberg tables in Cloudera by registering external clients in Cloudera on cloud and configuring Ranger policies.

Resource owners or Data Share administrators who want to share their Iceberg tables in Cloudera with external clients must first register the client in the Cloudera on cloud environment. After that, the resource owner needs to configure Ranger policies to allow access for the external client.

Registering external clients in Cloudera

Learn how to register external clients in Cloudera to provision a CLIENT_ID and CLIENT_SECRET.

Ensure that you have the following information before performing the steps:
• Share Admin username and password
• Username and password of the Cloudera Administrator
• Knox hostname
  To get the Knox hostname, go to Cloudera Manager > Knox > Instances, and copy the hostname for the Knox Gateway role.
• Data Lake name
  Go to Management Console > Environments > [***YOUR_ENVIRONMENT_NAME***] > Data Lake Details and note the Data Lake name.
  1. Create the CLIENT_ID and SECRET in Knox by running the following command:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] "https://[***KNOX-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=[***ADDITIONAL INFO***]&md_contact=[***EMAIL_ID***]&md_role=[***ROLE OF THE CLIENT***]&md_type=CLIENT_ID" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
    
    • doAs - Set value to external.user
    • comment - Additional comments about the client
    • md_contact - Client contact metadata, for example, an email address
    • md_role - Role for the CLIENT_ID
    • md_type - Set value to CLIENT_ID
    For example:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] "https://my-datalake-name.int.cldr.work:8443/my-datalake-name/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=carriers&md_contact=client_name@company.com&md_role=UnitedAirlinesRole&md_type=CLIENT_ID" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
    
    The command returns output similar to the following:
    CLIENT_ID: 462babd3-fe5a-4abf-8b47-526897677ad5 SECRET: TkRZeVltRmlaRE10Wm1VMVlTMDBZV0ptTFRoaU5EY3ROVEkyT0RrM05qYzNZV1ExOjpaalV4WWpReFkyWXRObVV6WlMwME4yTm1MVGcyWWpFdE9XSXhZekE0T1RZMU9XTmw=
    For Enterprise Data Lakes, use the following command:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] "https://[***LOAD BALANCER***]/[***DATALAKE-NAME***]/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=[***ADDITIONAL INFO***]&md_contact=[***EMAIL_ID***]&md_role=[***ROLE OF THE CLIENT***]&md_type=CLIENT_ID" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
    For example:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] "https://my-loadbalancer-1745504451479-b5765eaee5b22d08.elb.us-west-2.amazonaws.com/dldamedi-28qgc9/cdp-share-management/knoxtoken/api/v1/token?doAs=external.user&comment=carriers&md_contact=client_name@company.com&md_role=UnitedAirlinesRole&md_type=CLIENT_ID" | jq -r '"CLIENT_ID: \(.token_id) SECRET: \(.passcode)"'
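
Later steps reuse the CLIENT_ID as the Ranger Group name, so it can help to capture both values in shell variables instead of copying them from the terminal. The following is a minimal sketch and not part of the documented procedure: the function name extract_field and the sample response are hypothetical, and the sketch assumes only that the response JSON carries the token_id and passcode fields that the jq filter in the command above reads.

```shell
# Hypothetical helper: read a Knox token response (JSON) on stdin and print
# the value of one top-level field, for example token_id or passcode.
extract_field() {
  jq -r --arg field "$1" '.[$field]'
}

# Usage sketch. The sample response below is made up for illustration; in
# practice, $response would hold the raw output of the curl command
# (without the trailing jq filter).
response='{"token_id": "example-client-id", "passcode": "example-secret"}'
CLIENT_ID=$(printf '%s' "$response" | extract_field token_id)
CLIENT_SECRET=$(printf '%s' "$response" | extract_field passcode)

# Print only the identifier; the secret stays in the variable.
printf 'CLIENT_ID: %s\n' "$CLIENT_ID"
```

Keeping the secret in a variable rather than echoing it avoids leaving it in terminal scrollback.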
  2. Create a Ranger Group that uses the CLIENT_ID as the Group name with the following command:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "[***CLIENT_ID***]", "description": "group representing a share for a CLIENT_ID"}'
    For example:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-datalake.int.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "462babd3-fe5a-4abf-8b47-526897677ad5", "description": "group representing a share for a CLIENT_ID"}'
    For Enterprise Data Lakes, use the following command:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "[***CLIENT_ID***]", "description": "group representing a share for a CLIENT_ID"}'
    For example:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-gateway-hostname.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/xusers/groups/" -d '{"name": "462babd3-fe5a-4abf-8b47-526897677ad5", "description": "group representing a share for a CLIENT_ID"}'
  3. Create a new Role and add the created Group to the Role in Ranger with the following command:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "[***CLIENT_ROLE***]", "description": "[***CLIENT_ROLE DESCRIPTION***]", "groups": [ { "name": "[***CLIENT_ID CREATED***]", "isAdmin": false } ] }'
    
    For example:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-ranger-hostname.int.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "SalesDeptRole", "description": "SalesRole description", "groups": [ { "name": "462babd3-fe5a-4abf-8b47-526897677ad5", "isAdmin": false } ] }'
    For Enterprise Data Lakes, use the following command:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***GATEWAY-HOST-NAME***]/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "[***CLIENT_ROLE***]", "description": "[***CLIENT_ROLE DESCRIPTION***]", "groups": [ { "name": "[***CLIENT_ID CREATED***]", "isAdmin": false } ] }'
    For example:
    curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://my-gateway-hostname.int.cldr.work:8443/my-datalake-name/cdp-share-management/ranger/service/public/v2/api/roles/" -d '{ "name": "TestRole1", "description": "TestRole1 description", "groups": [ { "name": "test-group1", "isAdmin": false } ] }'
    Figure 1. Apache Ranger > Settings > Roles
  4. Optional: Add the Group to an existing Role with the following commands:
    • Get the RoleId for the Role name from Ranger Admin through the API:

      curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X GET "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/name/[***ROLE_NAME***]" | jq -r '"RoleId: \(.id)"'
      
    • Add the Group to the Role using the RoleId:

      curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X PUT "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/roles/[***ROLE_ID***]" -d '{ "name": "[***CLIENT_ROLE***]", "description": "[***CLIENT_ROLE DESCRIPTION***]", "groups": [ { "name": "[***CLIENT_ID CREATED***]", "isAdmin": false } ] }'
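
The PUT request sends a complete Role definition, so a body that lists only the new Group may overwrite Groups already attached to the Role. A cautious approach, sketched here under that assumption, is to take the Role JSON returned by the GET request and append the new Group with jq before sending it back. The function name add_group_to_role and the sample Role are hypothetical; the groups, name, and isAdmin fields are the ones used in the request bodies above.

```shell
# Hypothetical helper: read a Ranger Role definition (JSON) on stdin and print
# it with one more group appended. A group that is already listed is not
# added a second time.
add_group_to_role() {
  jq -c --arg group "$1" '
    if any(.groups[]?; .name == $group)
    then .
    else .groups += [{"name": $group, "isAdmin": false}]
    end'
}

# Usage sketch with a made-up Role definition. In practice the input would be
# the JSON returned by the GET request for the Role, and the output would be
# passed to the PUT request with -d "$updated_role".
current_role='{"id": 7, "name": "SalesDeptRole", "groups": [{"name": "existing-group", "isAdmin": false}]}'
updated_role=$(printf '%s' "$current_role" | add_group_to_role "462babd3-fe5a-4abf-8b47-526897677ad5")
printf '%s\n' "$updated_role"
```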
      

The registration process provisions a CLIENT_ID and CLIENT_SECRET, creates a Ranger Role, and adds the CLIENT_ID as a Group to that Role. Policies are then maintained for the Role to create the data share.

Managing Ranger policies

Learn how to provide authentication capabilities to your external users. Manage and govern your Ranger policies.

The Ranger Administrator must maintain policies for the set of databases and tables for the Ranger role and group.

Figure 2. Apache Ranger > Resource Policies > Hadoop SQL

In Allow Conditions, the SELECT permission must be granted on the databases or tables to provide read-only access.

Figure 3. Apache Ranger > Resource Policies > Hadoop SQL > Allow Conditions
Create a Data Share policy in Ranger for the previously created Role with the following command:
curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://[***RANGER-HOST-NAME***]:8443/[***DATALAKE-NAME***]/cdp-share-management/ranger/service/public/v2/api/policy/" -d '{"service":"hive_service_name", "policyType": 0, "name": "Iceberg Table Policy", "description": "Policy for SELECT access to a CLIENT_ID", "isEnabled": true, "resources": { "database": { "values": ["[***DATABASE_NAME***]"] }, "table": { "values": ["[***TABLE_NAME***]"] } ,"column": { "values": ["*"] } } , "policyItems": [ { "accesses": [ { "type": "select" } ], "users": [], "groups":[], "roles": ["[***CLIENT_ROLE***]"], "conditions": [] } ] }'
For example:
curl -k -u [***CDP_ADMIN_USER***]:[***PASSWORD***] -H "Accept: application/json" -H "Content-Type: application/json" -X POST "https://dldanew-vxtt5w-master0.dldanew.svbr-nqvp.int.cldr.work:8443/dldanew-vxtt5w/cdp-share-management/ranger/service/public/v2/api/policy/" -d '{"service":"cm_hive", "policyType": 0, "name": "Hive Table Policy", "description": "Policy for SELECT access to an external user", "isEnabled": true, "resources": { "database": { "values": ["emp_data"] }, "table": { "values": ["employees"] } ,"column": { "values": ["*"] } } , "policyItems": [ { "accesses": [ { "type": "select" } ], "users": [], "groups":[], "roles": ["testrole13"], "conditions": [] } ] }'
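
Hand-editing the JSON body is error-prone, in particular because the resource values and roles are passed as arrays. As an illustration only, the body can be generated with jq from shell arguments. The function name build_policy_body and its argument order are hypothetical; the field names mirror the example request above.

```shell
# Hypothetical helper: print a Ranger policy body that grants SELECT on one
# table to one Role. Arguments: service name, policy name, database, table, role.
build_policy_body() {
  jq -n -c \
    --arg service "$1" --arg name "$2" \
    --arg database "$3" --arg table "$4" --arg role "$5" '
    {
      service: $service,
      policyType: 0,
      name: $name,
      description: "Policy for SELECT access to an external user",
      isEnabled: true,
      resources: {
        database: {values: [$database]},
        table: {values: [$table]},
        column: {values: ["*"]}
      },
      policyItems: [
        {accesses: [{type: "select"}], users: [], groups: [], roles: [$role], conditions: []}
      ]
    }'
}

# Usage sketch with the values from the example request.
policy_body=$(build_policy_body cm_hive "Hive Table Policy" emp_data employees testrole13)
printf '%s\n' "$policy_body"
```

The generated body can then be passed to the curl command with -d "$policy_body" in place of the inline JSON.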