Create a NiFi Flow to Stream Events to CCP

You can use NiFi to create a data flow that captures events from Squid and pushes them into Cloudera Cybersecurity Platform (CCP). For this task, you will create two processors: a TailFile processor that ingests Squid events and a PutKafka processor that streams the events to Metron. When the PutKafka processor first publishes data, Kafka creates a topic for the Squid data source. You'll then connect the two processors, generate some data in the Squid log, and watch the data flow from one processor to the other.

  1. Drag the first icon on the toolbar (the processor icon) to your workspace.
    NiFi displays the Add Processor dialog box.

  2. Select the TailFile type of processor and click Add.
    NiFi displays a new TailFile processor.

  3. Right-click the processor icon and select Configure to display the Configure Processor dialog box.
    1. In the Settings tab, change the name to Ingest Squid Events.

    2. In the Properties tab, enter the path to the squid access.log file in the Value column for the File(s) to Tail property.

  4. Click Apply to save your changes and dismiss the Configure Processor dialog box.
  5. Add another processor by dragging the Processor icon to the main window.
  6. Select the PutKafka type of processor and click Add.
  7. Right-click the processor and select Configure.
  8. In the Settings tab, change the name to Stream to Metron, and under Automatically Terminate Relationships, select the Failure and Success check boxes.

  9. In the Properties tab, set the following three properties:
    • Known Brokers: $KAFKA_HOST:6667
    • Topic Name: squid
    • Client Name: nifi-squid

  10. Click Apply to save your changes and dismiss the Configure Processor dialog box.
  11. Create a connection by dragging the arrow from the Ingest Squid Events processor to the Stream to Metron processor.
    NiFi displays a Create Connection dialog box.

  12. Click Add to accept the default settings for the connection.
  13. Press the Shift key and draw a box around both processors to select the entire flow.

  14. Click the Start button in the Operate panel.

  15. Generate some data using the client for the new data source.
    1. Use ssh to access the host for the new data source.
    2. With Squid started, look at the different log files that get created:
      sudo su -
      cd /var/log/squid
      ls
      The file you want for Squid is access.log, but another data source might use a different name.
    3. Generate entries for the log so you can see the format of the entries.
      squidclient -h ""
      You will see the following data in the access.log file.
      1481143984.330   1111 TCP_MISS/301 714 GET - DIRECT/ text/html
    4. Using the Squid log entries, you can determine that the format of the log entries is:
      timestamp | time elapsed | remotehost | code/status | bytes | method | URL | rfc931 | peerstatus/peerhost | type
      In NiFi, you should see metrics on the processors indicating that data is being pushed into Metron.
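The field order above can be checked against a log line with a quick awk sketch. This is not part of the tutorial, and the sample line below is hypothetical: the host, URL, and peer values are placeholders, since those fields vary per request.

```shell
# Hypothetical access.log line; host, URL, and peer values are placeholders.
line='1481143984.330   1111 10.0.0.1 TCP_MISS/301 714 GET http://example.com/ - DIRECT/93.184.216.34 text/html'

# awk splits on whitespace, so the fields line up with the format above:
# $1=timestamp $2=elapsed $3=remotehost $4=code/status $5=bytes $6=method $7=URL
echo "$line" | awk '{printf "timestamp=%s elapsed=%s remotehost=%s code/status=%s bytes=%s method=%s url=%s\n", $1, $2, $3, $4, $5, $6, $7}'
```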
  16. Look at the Storm UI for the parser topology; you should see tuples coming in.
    1. Navigate to Ambari UI.
    2. From the Quick Links pull-down menu in the upper center of the window, select Storm UI.
  17. Before leaving this section, run the following commands to fill the access.log with data. You'll need this data when you enrich the telemetry data.
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
    squidclient -h ""
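Once the log is populated, you can optionally confirm that events are reaching the squid Kafka topic. The sketch below is not part of the tutorial: the install path is an assumption for an HDP-style cluster, and $KAFKA_HOST is the same broker host used in the PutKafka properties, so adjust both for your environment.

```shell
# Assumed HDP-style Kafka install path; adjust for your cluster.
KAFKA_BIN=/usr/hdp/current/kafka-broker/bin

if [ -x "$KAFKA_BIN/kafka-console-consumer.sh" ]; then
  # Read a few events from the "squid" topic created by the PutKafka processor.
  "$KAFKA_BIN/kafka-console-consumer.sh" \
    --bootstrap-server "$KAFKA_HOST:6667" \
    --topic squid --from-beginning --max-messages 5
fi
```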

For more information about creating a NiFi data flow, see the NiFi documentation.