Prerequisites
Learn how to collect the information you need to deploy the PostgreSQL CDC to Iceberg [Technical Preview] ReadyFlow, and meet other prerequisites.
For your data ingest source
- You have obtained the PostgreSQL database server hostname and port.
- You have obtained the PostgreSQL schema name and table name. Take note of the table structure, specifically field case sensitivity.
- You have obtained a username and password to access the PostgreSQL table.
- You have performed the PostgreSQL setup tasks required to run Debezium.
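The source-side checks above can be partly scripted. The following is a minimal Python sketch, not part of the ReadyFlow itself. It assumes that the Debezium setup relies on PostgreSQL logical decoding (so `wal_level` must be `logical` and at least one replication slot and WAL sender must be available), and it illustrates why field case sensitivity matters: PostgreSQL folds unquoted identifiers to lowercase and preserves case only for double-quoted ones. The function names `check_replication_settings`, `stored_name`, and `quote_ident` are hypothetical helpers introduced for illustration.

```python
# Query to run on the source database (for example in psql).
# pg_settings is a standard PostgreSQL system view.
SETTINGS_QUERY = (
    "SELECT name, setting FROM pg_settings "
    "WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders');"
)


def check_replication_settings(settings):
    """Return a list of problems found in a {name: setting} mapping.

    Assumption: the connector uses logical decoding, which needs
    wal_level = logical plus a free replication slot and WAL sender.
    """
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical' for logical decoding")
    for name in ("max_replication_slots", "max_wal_senders"):
        if int(settings.get(name, 0)) < 1:
            problems.append(f"{name} must be at least 1")
    return problems


def stored_name(identifier, quoted):
    """Return the name PostgreSQL stores for an identifier used in DDL.

    Unquoted identifiers are folded to lowercase; quoted ones keep their case.
    """
    return identifier if quoted else identifier.lower()


def quote_ident(identifier):
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


if __name__ == "__main__":
    # Example values as they might be returned by SETTINGS_QUERY.
    print(check_replication_settings({"wal_level": "replica"}))
    # A column created as "OrderId" (quoted) is not the same as orderid.
    print(stored_name("OrderId", quoted=False), quote_ident("OrderId"))
```

Feed the rows returned by `SETTINGS_QUERY` into `check_replication_settings` as a dictionary; an empty list means none of the assumed settings looks wrong. When you record table and field names, write them exactly as PostgreSQL stores them.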
For Cloudera DataFlow
- You have enabled Cloudera DataFlow for an environment.
For information on how to enable Cloudera DataFlow for an environment, see Enabling Cloudera DataFlow for an Environment.
- You have created a Machine User to use as the Cloudera Workload User.
- You have given the Cloudera Workload User the EnvironmentUser role.
- From the Management Console, go to the environment for which Cloudera DataFlow is enabled.
- From the Actions drop-down menu, click Manage Access.
- Identify the user you want to use as a Workload User.
- Give that user the EnvironmentUser role.
- You have synchronized your user to the Cloudera Public Cloud environment that you enabled for Cloudera DataFlow.
For information on how to synchronize your user to FreeIPA, see Performing User Sync.
- You have granted your Cloudera user the DFCatalogAdmin and DFFlowAdmin roles to enable your user to add the ReadyFlow to the Catalog and deploy the flow definition.
- Give a user permission to add the ReadyFlow to the Catalog.
- From the Management Console, click User Management.
- Enter the name of the user or group you wish to authorize in the Search field.
- Select the user or group from the list that displays.
- Click .
- From Update Roles, select DFCatalogAdmin and click Update.
- Give your user or group permission to deploy flow definitions.
- From the Management Console, click Environments to display the Environment List page.
- Select the environment to which you want your user or group to deploy flow definitions.
- Open the Environment Access page.
- Enter the name of the user or group you wish to authorize in the Search field.
- Select your user or group and click Update Roles.
- Select DFFlowAdmin from the list of roles.
- Click Update Roles.
- Give your user or group access to the Project where the ReadyFlow will be deployed.
- Go to .
- Select the project where you want to manage access rights and click .
- Start typing the name of the user or group you want to add and select them from the list.
- Select the Resource Roles you want to grant.
- Click Update Roles.
- Click Synchronize Users.
For your data ingest target
- In Cloudera Data Warehouse, you have activated the same environment for which Cloudera DataFlow has been enabled. Activation creates a default database catalog. For more information, see Activating an AWS environment from Cloudera Data Warehouse or Activating Azure environments.
- You have created a Hive Virtual Warehouse referencing the default database catalog. For more information, see Creating your first Virtual Warehouse.
- In your Hive Virtual Warehouse, you have created the Iceberg table that you want to ingest data into. For more information, see Iceberg table creation from Hive.
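As an illustration of the last prerequisite, the following minimal Python sketch builds a Hive `CREATE TABLE ... STORED BY ICEBERG` statement that you could run in the Hive Virtual Warehouse. The helper name `iceberg_create_table` and the example database, table, and column names are hypothetical; the column types must be chosen to match your PostgreSQL source table, and any additional table properties that the ReadyFlow may expect are not covered here.

```python
def iceberg_create_table(database, table, columns):
    """Build a Hive CREATE TABLE statement for an Iceberg table.

    columns is a list of (name, hive_type) pairs. Identifiers are wrapped
    in backticks, which is how Hive quotes identifiers.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ",\n  ".join(
        f"`{name}` {hive_type}" for name, hive_type in columns
    )
    return (
        f"CREATE TABLE `{database}`.`{table}` (\n  {column_list}\n)\n"
        "STORED BY ICEBERG;"
    )


if __name__ == "__main__":
    # Hypothetical target table mirroring a two-column source table.
    print(iceberg_create_table("default", "orders",
                               [("id", "INT"), ("name", "STRING")]))
```

Generating the statement from a column list keeps the target definition next to the source table structure you noted earlier, which makes mismatches in field names easier to spot before deployment.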