Build the data flow
Learn how to create NiFi data flows quickly, with a wide variety of processors and flow options to choose from.
You can use the PublishKafkaRecord_2_0 processor to build your Kafka ingest data flow. Regardless of the type of flow you are building, the basic steps are the same: open NiFi, add your processors to the canvas, and connect the processors to create the flow.
You will be logged into NiFi automatically with your CDP credentials.
- To access the NiFi service in your Flow Management Data Hub cluster, navigate to .
- Click the tile representing the Flow Management Data Hub cluster you want to work with.
- Click NiFi in the Services section of the cluster overview page to access the NiFi UI.
- Add the GenerateFlowFile processor for data input. This processor creates FlowFiles with random data or custom content.
You will configure the GenerateFlowFile processor to define how to create the sample data in Configure the processor for your data source.
- Drag and drop the processor icon into the canvas. This displays a dialog that allows you to choose the processor you want to add.
- Select the GenerateFlowFile processor from the list.
- Click Add or double-click the required processor type to add it to the canvas.
- Add the PublishKafkaRecord_2_0 processor for data output. You will configure the PublishKafkaRecord_2_0 processor in Configure the processor for your data target.
- Connect the two processors to create a flow.
- Drag the connection icon from the first processor, and drop it on the second processor. A Create Connection dialog appears with two tabs: Details and Settings.
- Configure the connection. You can set the connection's name, the FlowFile expiration period, back pressure thresholds, the load balance strategy, and prioritization.
- Click Add to close the dialog box and add the connection to your data flow.
- Optionally, you can add success and failure funnels to your data flow, which help you see where FlowFiles are being routed when your flow is running. Connect PublishKafkaRecord_2_0 to these funnels with success and failure connections. If you want to know more about working with funnels, see Adding Components to the Canvas in Building an Apache NiFi DataFlow.
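The steps above can also be scripted against the NiFi REST API instead of performed in the UI. The sketch below only builds the JSON request bodies for adding a processor (`POST /nifi-api/process-groups/{pgId}/processors`) and connecting two processors (`POST /nifi-api/process-groups/{pgId}/connections`); it does not contact a server, and the IDs (`gen-id`, `pub-id`, `root-pg-id`) are placeholders you would obtain from your own cluster.

```python
import json

def processor_payload(proc_type, x, y):
    """Request body for adding a processor to a process group
    (POST /nifi-api/process-groups/{pgId}/processors)."""
    return {
        "revision": {"version": 0},
        "component": {
            "type": proc_type,
            "position": {"x": x, "y": y},
        },
    }

def connection_payload(source_id, dest_id, group_id, relationships):
    """Request body for connecting two processors
    (POST /nifi-api/process-groups/{pgId}/connections)."""
    return {
        "revision": {"version": 0},
        "component": {
            "source": {"id": source_id, "groupId": group_id, "type": "PROCESSOR"},
            "destination": {"id": dest_id, "groupId": group_id, "type": "PROCESSOR"},
            "selectedRelationships": relationships,
        },
    }

# The two processors from the flow above, placed side by side on the canvas.
generate = processor_payload(
    "org.apache.nifi.processors.standard.GenerateFlowFile", 0.0, 0.0)
publish = processor_payload(
    "org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_2_0", 400.0, 0.0)

# Connect GenerateFlowFile to PublishKafkaRecord_2_0 on the success relationship.
conn = connection_payload("gen-id", "pub-id", "root-pg-id", ["success"])
print(json.dumps(conn, indent=2))
```

You would POST each payload with an HTTP client of your choice, reading the generated component IDs out of the responses before building the connection body.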