Learn how to collect the information you need to deploy the Non-CDP ADLS to CDP ADLS ReadyFlow, and meet other prerequisites.
For your data ingest source
- You have the source ADLS file system/container name, path, and storage account name and key.
- You have access to the source ADLS directory.
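One way to keep the source details together before deployment is a small record that also checks that nothing is missing. The following is a minimal sketch: the class name, field names, and sample values are hypothetical and are not the ReadyFlow's own parameter names. The URI layout follows the standard ABFS scheme, `abfs://<file system>@<storage account>.dfs.core.windows.net/<path>`.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AdlsSource:
    """Hypothetical holder for the source details listed above."""

    file_system: str      # ADLS file system / container name
    path: str             # directory to read from
    storage_account: str  # storage account name
    account_key: str      # storage account key (keep out of version control)

    def missing(self) -> list[str]:
        """Return the names of any fields that are empty or whitespace."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def uri(self) -> str:
        """Build the ABFS URI for the source directory."""
        clean_path = self.path.strip("/")
        return (
            f"abfs://{self.file_system}@{self.storage_account}"
            f".dfs.core.windows.net/{clean_path}"
        )


# Placeholder values for illustration only.
source = AdlsSource("landing", "/raw/events/", "examplestore", "not-a-real-key")
print(source.missing())
print(source.uri())
```

Running `missing()` before you start the deployment wizard is a cheap way to catch a forgotten value; the URI is only a convenient way to confirm that the container, account, and path refer to the directory you expect.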
You have enabled DataFlow for an environment.
For information on how to enable DataFlow for an environment, see Enabling DataFlow for an Environment.
You have created a Machine User to use as the CDP Workload User.
- You have given the CDP Workload User the EnvironmentUser role.
- From the Management Console, go to the environment for which DataFlow is enabled.
- From the Actions drop-down, click Manage Access.
- Identify the user you want to use as a Workload User.
- Give that user the EnvironmentUser role.
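The same role assignment can probably be scripted with the CDP CLI instead of the Management Console. The sketch below only builds the command so that you can review it before running it. The subcommand `assign-machine-user-resource-role`, its flags, and the role CRN are assumptions based on the CDP CLI's `iam` command group; the machine user name and environment CRN are placeholders. Verify all of them against `cdp iam help` in your installation.

```python
import shlex

# Assumed CRN of the EnvironmentUser resource role; confirm it in your account
# (for example with `cdp iam list-resource-roles`) before use.
ENVIRONMENT_USER_ROLE_CRN = "crn:altus:iam:us-west-1:altus:resourceRole:EnvironmentUser"


def environment_user_command(machine_user: str, environment_crn: str) -> list:
    """Build (but do not run) the CLI call that grants EnvironmentUser."""
    if not machine_user or not environment_crn:
        raise ValueError("machine_user and environment_crn are both required")
    return [
        "cdp", "iam", "assign-machine-user-resource-role",
        "--machine-user-name", machine_user,
        "--resource-role-crn", ENVIRONMENT_USER_ROLE_CRN,
        "--resource-crn", environment_crn,
    ]


# Placeholder values for illustration only.
command = environment_user_command(
    "readyflow-ingest",
    "crn:cdp:environments:us-west-1:ACCOUNT_ID:environment:ENVIRONMENT_ID",
)
print(shlex.join(command))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` later without shell quoting problems.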
You have synchronized your user to the CDP Public Cloud environment that you enabled for DataFlow.
For information on how to synchronize your user to FreeIPA, see Performing User Sync.
- You have granted your CDP user the DFCatalogAdmin and DFFlowAdmin roles to enable your user to add the ReadyFlow to the Catalog and deploy the flow definition.
- Give a user permission to add the ReadyFlow to the Catalog.
- From the Management Console, click User Management.
- Enter the name of the user or group you wish to authorize in the Search field.
- Select the user or group from the list that displays.
- Click .
- From Update Roles, select DFCatalogAdmin and click Update.
- Give your user or group permission to deploy flow definitions.
- From the Management Console, click Environments to display the Environment List page.
- Select the environment to which you want your user or group to deploy flow definitions.
- Click to display the Environment Access page.
- Enter the name of the user or group you wish to authorize in the Search field.
- Select your user or group and click Update Roles.
- Select DFFlowAdmin from the list of roles.
- Click Update Roles.
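The two grants above differ in scope: DFCatalogAdmin is assigned from User Management, while DFFlowAdmin is assigned on one specific environment. A scripted version would therefore need two different calls. The sketch below builds both without running them. The subcommands `assign-user-role` and `assign-user-resource-role`, their flags, and both role CRNs are assumptions about the CDP CLI, and the user and environment CRNs are placeholders; check them with `cdp iam help` before relying on them.

```python
import shlex

# Assumed role CRNs; confirm them with `cdp iam list-roles` and
# `cdp iam list-resource-roles` in your own account.
CATALOG_ADMIN_ROLE = "crn:altus:iam:us-west-1:altus:role:DFCatalogAdmin"
FLOW_ADMIN_RESOURCE_ROLE = "crn:altus:iam:us-west-1:altus:resourceRole:DFFlowAdmin"


def dataflow_role_commands(user_crn: str, environment_crn: str) -> list:
    """Return the two CLI calls as argument lists, catalog grant first."""
    catalog = [
        "cdp", "iam", "assign-user-role",
        "--user", user_crn,
        "--role", CATALOG_ADMIN_ROLE,
    ]
    # The flow-deployment grant is tied to a single environment.
    flow = [
        "cdp", "iam", "assign-user-resource-role",
        "--user", user_crn,
        "--resource-role-crn", FLOW_ADMIN_RESOURCE_ROLE,
        "--resource-crn", environment_crn,
    ]
    return [catalog, flow]


# Placeholder CRNs for illustration only.
commands = dataflow_role_commands(
    "crn:altus:iam:us-west-1:ACCOUNT_ID:user:USER_ID",
    "crn:cdp:environments:us-west-1:ACCOUNT_ID:environment:ENVIRONMENT_ID",
)
for cmd in commands:
    print(shlex.join(cmd))
```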
For your data ingest target
- You have the destination ADLS file system/container name, path, and storage account name.
- You have performed one of the following to configure access to the destination ADLS:
You have configured access to the ADLS folders with a RAZ-enabled environment.
It is a best practice to enable RAZ to control access to your object store folders. This allows you to use your CDP credentials to access ADLS folders, increases auditability, and makes object store data ingest workflows portable across cloud providers.
- Ensure that Fine-grained access control is enabled for your DataFlow environment.
- From the Ranger UI, navigate to the ADLS repository.
- Create a policy to govern access to the ADLS container and path used in your ingest workflow. For example: adls-to-adls-avro-ingest
- Add the machine user that you have created for your ingest workflow to the policy you just created.
You have configured access to ADLS folders using ID Broker mapping.
If your environment is not RAZ-enabled, you can configure access to ADLS folders using ID Broker mapping.
- Access IDBroker mappings.
- To access IDBroker mappings in your environment, click .
- Choose the IDBroker Mappings tab where you can provide mappings for users or groups and click Edit.
- Add your CDP Workload User and the corresponding Azure role that provides write access to your folder in ADLS to the Current Mappings section by clicking the blue + sign.
- Click Save and Sync.
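Adding a mapping in the steps above amounts to pairing the CDP Workload User with one Azure role while leaving the other entries in Current Mappings untouched. The sketch below models that as a plain list update, assuming one role per user. The dictionary keys, the user names, and the managed identity resource IDs are hypothetical placeholders, not the format the IDBroker UI or any CDP API uses.

```python
def add_mapping(current: list, accessor: str, azure_role: str) -> list:
    """Return a new mapping list in which `accessor` maps to `azure_role`.

    An existing entry for the same accessor is replaced; all other
    entries are kept in their original order.
    """
    kept = [m for m in current if m["accessor"] != accessor]
    return kept + [{"accessor": accessor, "role": azure_role}]


# Placeholder identities for illustration only.
IDENTITY_PREFIX = (
    "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP"
    "/providers/Microsoft.ManagedIdentity/userAssignedIdentities/"
)
current_mappings = [
    {"accessor": "existing-user", "role": IDENTITY_PREFIX + "reader-identity"},
]
updated = add_mapping(
    current_mappings, "readyflow-ingest", IDENTITY_PREFIX + "writer-identity"
)
for mapping in updated:
    print(mapping["accessor"], "->", mapping["role"])
```

The function returns a new list instead of editing `current_mappings` in place, which makes it easy to compare the before and after state prior to clicking Save and Sync.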