Building your dataflow

Learn how to set up a NiFi dataflow that moves data to Ozone. This involves adding processors and other dataflow elements to the NiFi canvas, configuring them, and connecting them to create the dataflow.

You must have reviewed and met the prerequisites.

  1. Launch NiFi from your CDP Public Cloud or CDP Private Cloud Base cluster.
  2. Add the NiFi processors to your canvas.
    1. Select the Processor icon from the Cloudera Flow Management Actions pane, and drag a processor to the canvas.
    2. Use the Add Processor filter box to search for the processor you want to add, and then click Add.
    3. Add the following processors to the canvas:
      • GenerateFlowFile as your data source
      • PutHDFS or PutCDPObjectStore as your data ingest tool
  3. Connect the two processors to create your basic dataflow.
    1. Click the Connection icon in the first processor, and drag it to the second processor.

      The Create Connection dialog appears. It has two tabs, Details and Settings, where you can configure the connection's name, flow file expiration period, back pressure thresholds, load balance strategy, and prioritization.

    2. Click Add to close the dialog box and add the connection to your flow.
    3. Optional: Add success and failure funnels to your dataflow to see where flow files are routed while the dataflow is running.

Using PutHDFS

Using PutCDPObjectStore
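
If you prefer to script the canvas steps above instead of using the UI, the same flow can be sketched as NiFi REST API request bodies. The following Python example is a minimal illustration, not part of the documented procedure. It assumes the standard Apache NiFi REST endpoints for creating processors and connections in a process group, and it uses PutHDFS as the target processor. The helper names, the placeholder IDs, and the canvas positions are hypothetical, and the connection settings shown are illustrative values that you can change. The example only builds the JSON payloads; sending them, including authentication against your cluster, is left out.

```python
import json

# Fully qualified processor types as shipped with Apache NiFi.
GENERATE_FLOWFILE = "org.apache.nifi.processors.standard.GenerateFlowFile"
PUT_HDFS = "org.apache.nifi.processors.hadoop.PutHDFS"


def processor_payload(processor_type, x, y):
    """Build the body for POST /nifi-api/process-groups/{group_id}/processors."""
    return {
        "revision": {"version": 0},  # a new component starts at revision 0
        "component": {
            "type": processor_type,
            "position": {"x": x, "y": y},  # where the processor lands on the canvas
        },
    }


def connection_payload(group_id, source_id, destination_id, relationships=("success",)):
    """Build the body for POST /nifi-api/process-groups/{group_id}/connections."""

    def endpoint(processor_id):
        return {"id": processor_id, "groupId": group_id, "type": "PROCESSOR"}

    return {
        "revision": {"version": 0},
        "component": {
            "source": endpoint(source_id),
            "destination": endpoint(destination_id),
            "selectedRelationships": list(relationships),
            # Counterparts of the Settings tab in the Create Connection dialog.
            # These values are illustrative; adjust them for your flow.
            "flowFileExpiration": "0 sec",
            "backPressureObjectThreshold": 10000,
            "backPressureDataSizeThreshold": "1 GB",
            "loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
        },
    }


if __name__ == "__main__":
    # Placeholder IDs: in practice they come from the responses of the
    # processor-creation requests and from your process group.
    source = processor_payload(GENERATE_FLOWFILE, 0.0, 0.0)
    target = processor_payload(PUT_HDFS, 0.0, 300.0)
    link = connection_payload("group-id", "source-id", "target-id")
    print(json.dumps([source, target, link], indent=2))
```

Building the payloads separately from sending them lets you review or version the flow definition before applying it to a cluster.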

Once you have finished building the dataflow, move on to the following steps:

  • Configure your source processor.
  • Configure your target processor.