Prerequisites
Learn how to collect the information you need to deploy the ListenHTTP filter to Kafka ReadyFlow, and meet other prerequisites.
For your data ingest source
- You have the port to listen on for incoming HTTP events.
For DataFlow
-
You have enabled DataFlow for an environment.
For information on how to enable DataFlow for an environment, see Enabling DataFlow for an Environment.
-
You have created a Machine User to use as the CDP Workload User.
- You have given the CDP Workload User the
EnvironmentUser role.
- From the Management Console, go to the environment for which DataFlow is enabled.
- From the Actions drop down, click Manage Access.
- Identify the user you want to use as a Workload User.
- Give that user EnvironmentUser role.
-
You have synchronized your user to the CDP Public Cloud environment that you enabled for DataFlow.
For information on how to synchronize your user to FreeIPA, see Performing User Sync.
- You have granted your CDP user the
DFCatalogAdmin and DFFlowAdmin roles to
enable your user to add the ReadyFlow to the Catalog and deploy the flow definition.
- Give a user permission to add the ReadyFlow to the
Catalog.
- From the Management Console, click User Management.
- Enter the name of the user or group you wish to authorize in the Search field.
- Select the user or group from the list that displays.
- Click .
- From Update Roles, select DFCatalogAdmin and click Update.
- Give your user or group permission to deploy flow definitions.
- From the Management Console, click Environments to display the Environment List page.
- Select the environment to which you want your user or group to deploy flow definitions.
- Click Environment Access page. to display the
- Enter the name of your user or group you wish to authorize in the Search field.
- Select your user or group and click Update Roles.
- Select DFFlowAdmin from the list of roles.
- Click Update Roles.
- Give your user or group access to the Project where the ReadyFlow will be
deployed.
- Go to .
- Select the project where you want to manage access rights and click .
- Start typing the name of the user or group you want to add and select them from the list.
- Select the Resource Roles you want to grant.
- Click Update Roles.
- Click Synchronize Users.
- Give a user permission to add the ReadyFlow to the
Catalog.
For your data ingest target
-
You have created a Streams Messaging cluster in CDP Public Cloud to host your Schema Registry.
For information on how to create a Streams Messaging cluster, see Setting up your Streams Messaging Cluster.
-
You have created at least one Kafka topic.
- Navigate to Management Console > Environments and select your environment.
- Select your Streams Messaging cluster.
- Click on the Streams Messaging Manager icon.
- Navigate to the Topics page.
- Click Add New and provide the following information:
- Topic name
- Number of partitions
- Level of availability
- Cleanup policy
- Click Save.
-
You have created a schema for your data and have uploaded it to the Schema Registry in the Streams Messaging cluster.
For information on how to create a new schema, see Creating a new schema. For example:{ "type":"record", "name":"SensorReading", "namespace":"com.cloudera.example", "doc":"This is a sample sensor reading", "fields":[ { "name":"sensor_id", "doc":"Sensor identification number.", "type":"int" }, { "name":"sensor_ts", "doc":"Timestamp of the collected readings.", "type":"long" }, { "name":"sensor_0", "doc":"Reading #0.", "type":"int" }, { "name":"sensor_1", "doc":"Reading #1.", "type":"int" }, { "name":"sensor_2", "doc":"Reading #2.", "type":"int" }, { "name":"sensor_3", "doc":"Reading #3.", "type":"int" } ] }
-
You have the Schema Registry Host Name.
- From the Management Console, go to Data Hub Clusters and select the Streams Messaging cluster you are using.
- Navigate to the Hardware tab to locate the Master Node FQDN. Schema Registry is always running on the Master node, so copy the Master node FQDN.
-
You have the Kafka broker end points.
- From the Management Console, click Data Hub Clusters.
- Select the Streams Messaging cluster from which you want to ingest data.
- Click the Hardware tab.
- Note the Kafka Broker FQDNs for each node in your cluster.
- Construct your Kafka Broker Endpoints by using the FQDN and Port
number 9093 separated by a colon. Separate endpoints by a comma. For example:
broker1.fqdn:9093,broker2.fqdn:9093,broker3.fqdn:9093
Kafka broker FQDNs are listed under the Core_broker section.
-
You have the Kafka Consumer Group ID.
This ID is defined by the user. Pick an ID and then create a Ranger policy for it. Use the ID when deploying the flow in DataFlow.
-
You have assigned the CDP Workload User policies to access the consumer group ID and topic.
- Navigate to Management Console > Environments, and select the environment where you have created your cluster.
- Select Ranger. You are redirected to the Ranger Service Manager page.
- Select your Streams Messaging cluster under the Kafka folder.
- Create a policy to enable your Workload User to access the Kafka source topic.
- On the Create Policy page, give the policy a name, select topic from the drop-down list, add the user, and assign the Consume permission.
- Create another policy to give your Workload User access to the consumer group ID.
- On the Create Policy page, give the policy a name, select consumergroup from the drop-down list, add the user, and assign the Consume permission.
-
You have assigned the CDP Workload User read-access to the schema.
- Navigate to Management Console > Environments, and select the environment where you have created your cluster.
- Select Ranger. You are redirected to the Ranger Service Manager page.
- Select your Streams Messaging cluster under the Schema Registry folder.
- Click Add New Policy.
- On the Create Policy page, give the policy a name, specify the schema details, add the user, and assign the Read permission.