Stream Data Using NiFi

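NiFi provides a drag-and-drop user interface for building streaming data flows and works with most types of data sources. The following steps create a flow in which a TailFile processor reads new lines from a data source file and a PutKafka processor publishes them to a Kafka topic for Metron.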

  1. Open the NiFi user interface canvas.
  2. Drag the processor icon to your workspace.
    NiFi displays the Add Processor dialog box.
  3. Select the TailFile type of processor and click Add.
    NiFi displays a new TailFile processor.
  4. Right-click the processor icon and select Configure to display the Configure Processor dialog box.
    1. In the Settings tab, change the name to Ingest $DATASOURCE Events.
    2. In the Properties tab, enter the path to the data source file in the Value column for the File(s) to Tail property.


  5. Click Apply to save your changes and dismiss the Configure Processor dialog box.
  6. Add another processor by dragging the processor icon to your workspace.
  7. Select the PutKafka type of processor and click Add.
  8. Right-click the processor and select Configure.
  9. In the Settings tab, change the name to Stream to Metron and then select the relationship check boxes for failure and success.


  10. In the Properties tab, set the following three properties:
    Property        Value
    Known Brokers   $KAFKA_HOST:6667
    Topic Name      $DATAPROCESSOR
    Client Name     nifi-$DATAPROCESSOR



  11. Click Apply to save your changes and dismiss the Configure Processor dialog box.
  12. Create a connection by dragging the arrow from the Ingest $DATASOURCE Events processor to the Stream to Metron processor.
    NiFi displays the Configure Connection dialog box.


  13. In the Details tab, select the failure check box under For Relationships.
  14. Click Apply to accept the default settings for the connection.
  15. Press Shift and draw a box around both processors to select the entire flow.
  16. In the Operate panel, click the arrow icon.
    All of the processor icons turn into green arrows.


  17. Generate some data using the new data processor client.
  18. Look at the Storm UI for the parser topology and confirm that tuples are coming in.
  19. After about five minutes, you should see a new index called $DATAPROCESSOR_index* in either the Solr Admin UI or the Elastic Admin UI.
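
The property values in step 10 and the data generation in step 17 can be sketched in Python. This is a minimal illustration, not part of NiFi or Metron: the function names, the host name, the data processor name, the file path, and the format of the test lines are all hypothetical. A real parser topology expects lines in the actual format of your data source, so substitute representative log lines before relying on this.

```python
from datetime import datetime, timezone
from pathlib import Path


def putkafka_properties(kafka_host: str, data_processor: str, port: int = 6667) -> dict:
    """Build the three PutKafka property values listed in step 10.

    kafka_host stands in for $KAFKA_HOST and data_processor for $DATAPROCESSOR.
    """
    return {
        "Known Brokers": f"{kafka_host}:{port}",
        "Topic Name": data_processor,
        "Client Name": f"nifi-{data_processor}",
    }


def append_test_events(tailed_file: Path, count: int = 5) -> list:
    """Append `count` lines to the file named in the File(s) to Tail property.

    The line format here is an assumption made for illustration only.
    Returns the lines that were written, without trailing newlines.
    """
    lines = [
        f"{datetime.now(timezone.utc).isoformat()} test-event {i}"
        for i in range(count)
    ]
    # Append mode keeps existing content intact, so the file only grows,
    # which is the pattern a tailing processor is designed to follow.
    with tailed_file.open("a", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
    return lines


if __name__ == "__main__":
    # Hypothetical values; replace them with your own host and processor name.
    for name, value in putkafka_properties("kafka1.example.com", "squid").items():
        print(f"{name}: {value}")
```

To generate data for step 17, call append_test_events with the same path that you entered for the File(s) to Tail property, then check the Storm UI as described in step 18.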