Building your dataflow

This topic shows you how to set up the elements of the NiFi dataflow that moves data into Ozone.

Building the dataflow involves opening the NiFi UI, adding processors to your NiFi canvas, and connecting the processors. The basic flow described here uses the GenerateFlowFile processor to create sample data and the PutHDFS processor to ingest that data.

Before you begin, make sure that you have reviewed and met the prerequisites.

  1. Launch NiFi from your CDP Public Cloud or CDP Private Cloud Base cluster.
  2. Add the NiFi processors to your canvas.
    1. Select the Processor icon from the Cloudera Flow Management Actions pane, and drag a processor to the canvas.
    2. Use the Add Processor filter box to search for the processor you want to add, and then click Add.
    3. Add the following processors to the canvas:
      • GenerateFlowFile
      • PutHDFS
  3. Connect the two processors to create your basic dataflow.
    1. Click the Connection icon in the first processor, and drag it to the second processor.

      A Create Connection dialog displays. It has two tabs, Details and Settings, where you can configure the connection's name, flow file expiration period, back pressure thresholds, load balance strategy, and prioritization.

    2. Click Add to close the dialog box and add the connection to your flow.

      Optionally, you can add success and failure funnels to your dataflow, which help you see where flow files are routed when your dataflow is running.
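
If you prefer to script the flow instead of using the canvas, the steps above can be approximated with the NiFi REST API. The following is a minimal sketch, not a definitive implementation: it only builds the JSON request bodies for creating the two processors and the connection between them, and it sends nothing. The field names reflect the processor and connection entities as commonly documented for the NiFi REST API and should be checked against the REST API reference for your NiFi version. The helper names, the connection name, the canvas positions, and all IDs are hypothetical placeholders.

```python
import json

# Fully qualified class names of the two processors used in this flow.
GENERATE_FLOWFILE = "org.apache.nifi.processors.standard.GenerateFlowFile"
PUT_HDFS = "org.apache.nifi.processors.hadoop.PutHDFS"


def processor_entity(processor_type, name, x, y):
    """Build a request body for POST /process-groups/{id}/processors."""
    return {
        "revision": {"version": 0},  # a new component starts at revision 0
        "component": {
            "type": processor_type,
            "name": name,
            "position": {"x": x, "y": y},  # canvas coordinates (arbitrary here)
        },
    }


def connection_entity(group_id, source_id, destination_id,
                      relationships=("success",)):
    """Build a request body for POST /process-groups/{id}/connections."""
    return {
        "revision": {"version": 0},
        "component": {
            "name": "generate-to-put",  # hypothetical connection name
            "source": {"id": source_id, "groupId": group_id,
                       "type": "PROCESSOR"},
            "destination": {"id": destination_id, "groupId": group_id,
                            "type": "PROCESSOR"},
            "selectedRelationships": list(relationships),
            # The values below mirror the options on the Settings tab.
            "flowFileExpiration": "0 sec",  # 0 sec means never expire
            "backPressureObjectThreshold": 10000,
            "backPressureDataSizeThreshold": "1 GB",
            "loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
            "prioritizers": [],
        },
    }


if __name__ == "__main__":
    # Placeholder IDs; in practice NiFi returns the real IDs when the
    # process group is looked up and the processors are created.
    group_id = "GROUP-ID-PLACEHOLDER"
    bodies = [
        processor_entity(GENERATE_FLOWFILE, "GenerateFlowFile", 0, 0),
        processor_entity(PUT_HDFS, "PutHDFS", 400, 0),
        connection_entity(group_id, "SOURCE-ID-PLACEHOLDER",
                          "DESTINATION-ID-PLACEHOLDER"),
    ]
    for body in bodies:
        print(json.dumps(body, indent=2))
```

Keeping the payload builders separate from any HTTP client lets you review the generated JSON before posting it to your cluster with whatever authentication your environment requires.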

Once you have finished building the dataflow, move on to the following steps:

  • Configure your source processor.
  • Configure your target processor.