Tutorial: MiNiFi to Cloudera DataFlow flow deployment

This tutorial walks you through creating an inbound connection endpoint in Cloudera DataFlow. A flow deployment uses this endpoint to receive data from one or more MiNiFi agents managed by Edge Flow Manager (EFM).

  1. In a development NiFi environment, create a Controller Service of type StandardRestrictedSSLContextService at the root canvas level and name it Inbound SSL Context Service.
    1. In the Operate palette, click Configuration > Controller Services > Create a new controller service.
    2. Filter for ssl, select StandardRestrictedSSLContextService, then click Add.
    3. Click Configure.
    4. On the Settings tab change the Name to Inbound SSL Context Service, then click Apply.

    You do not need to configure this Controller Service any further; it acts as a placeholder, and Cloudera DataFlow creates it with a managed SSL context when it deploys the flow.

  2. Create a Process Group on the root canvas to hold your flow definition and give it a name.
    This tutorial uses the name ListenHTTP Flow.
  3. Enter the process group.
  4. Inside the Process Group, add a listen processor.
    This tutorial uses ListenHTTP.
  5. Configure the listen processor:
    Base Path
    This tutorial uses the default contentListener.
    Listening Port
    Define a value that is valid for your use case. This tutorial uses port 9000.
    SSL Context Service
    Select Inbound SSL Context Service.
    Client Authentication
    Select REQUIRED.
    Click Apply.
  6. Connect the ListenHTTP processor to a downstream processor of your choice.
    This tutorial uses LogAttribute, where all relationships terminate.
  7. From the root canvas, right-click the Process Group and select Download flow definition > Without external controller services.
  8. Upload the flow definition JSON to the Flow Catalog of your Cloudera DataFlow deployment.
  9. Deploy the flow.
    1. At the NiFi Configuration step of the Deployment wizard, select Inbound Connections > Allow NiFi to Receive Data to enable inbound connections.
      Accept the automatically created endpoint hostname and automatically discovered port by clicking Next.
    2. At Parameters, click Next.
    3. At Sizing & Scaling, select the Extra Small NiFi Node Size, then click Next.
    4. Add a KPI on the ListenHTTP processor to monitor how many bytes it is receiving, by clicking Add new KPI.
      Make the following settings:
      KPI Scope
      Processor
      Processor Name
      ListenHTTP
      Metric to Track
      Bytes Received
    5. Review the information provided and click Deploy.

    Soon after the flow deployment starts, the client certificate and private key required for sending data to the NiFi flow become available, even while the deployment is still being created.

  10. Collect the information required to configure your MiNiFi flow.
    1. Once the deployment has been created successfully, select it in the Deployments view and click Manage Deployment.

    2. In the Deployment Settings section, navigate to the NiFi Configuration tab to find information about the associated inbound connection endpoint.

    3. Copy the endpoint hostname and port and download the certificate and private key.
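Before designing the MiNiFi flow, it can help to confirm that the downloaded credentials and the endpoint work together. The following is a minimal sketch, not part of the product: the function names are hypothetical, the port and base path are the values used in this tutorial, and it assumes that both downloaded files are PEM-encoded and that curl is installed on the host.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the downloaded credentials and the inbound endpoint.
# All function names here are illustrative, not part of any Cloudera tool.

# Build the URL served by ListenHTTP.
# Defaults match this tutorial: port 9000, base path contentListener.
endpoint_url() {
  local host="$1" port="${2:-9000}" base_path="${3:-contentListener}"
  printf 'https://%s:%s/%s\n' "$host" "$port" "$base_path"
}

# Succeed if the file is readable and contains a PEM header line.
# Assumption: the certificate and key are PEM-encoded.
is_pem() {
  [ -r "$1" ] && grep -q -e '-----BEGIN ' "$1"
}

# POST a short text message using the client certificate and key,
# then print the HTTP status code returned by the endpoint.
smoke_test() {
  local host="$1" cert="$2" key="$3"
  curl --silent --show-error --output /dev/null --write-out '%{http_code}\n' \
    --cert "$cert" --key "$key" \
    --header 'Content-Type: text/plain' \
    --data 'Hello from curl' \
    "$(endpoint_url "$host")"
}
```

For example, after checking both files with is_pem, run smoke_test with the endpoint hostname you copied, followed by the paths to client-certificate-encoded.cer and client-private-key-encoded. A 2xx status code suggests that the endpoint accepted the client certificate; a TLS error from curl usually points to a wrong file or hostname.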

  11. Start designing your MiNiFi flow in EFM.

    To design a flow for your MiNiFi C++ agent class:

    1. Copy the downloaded client-private-key-encoded key and client-certificate-encoded.cer certificate files to the host with the running MiNiFi C++ agent, so they are accessible by filepath from the agent.

    2. Create a Service of type SSL Context Service with the following configuration:
      Service Name
      Specify a name for this service. This tutorial uses Client SSL Context Service.
      CA Certificate
      Leave it empty. As Cloudera DataFlow uses Let's Encrypt as a Certificate Authority, the certificate will be accepted automatically, without additional configuration.
      Client Certificate
      [***/PATH/TO/***]client-certificate-encoded.cer

      For example, /opt/minifi/minifi-test/client-certs/client-certificate-encoded.cer.

      Passphrase
      Set no value.
      Private Key
      [***/PATH/TO/***]client-private-key-encoded

      For example, /opt/minifi/minifi-test/client-certs/client-private-key-encoded.

      Use System Cert Store
      Keep the default False value.
    3. Click Apply.
    4. Create an InvokeHTTP processor named Send to CDF with the following configuration:
      Automatically Terminated Relationships
      Select all relationships.
      Content-type
      Depends on your flow file data type. This tutorial uses text/plain.
      HTTP Method
      POST
      Remote URL
      https://[***ENDPOINT HOSTNAME COPIED FROM CLOUDERA DATAFLOW FLOW DEPLOYMENT MANAGER***]:9000/contentListener

      For example, https://my-flow.inbound.my-dfx.c94x5i9m.xcu2-8y8z.mycompany.test:9000/contentListener

      SSL Context Service
      Client SSL Context Service
      Leave all other settings with their default values.

    To design a flow for your MiNiFi Java agent class:

    1. Convert the downloaded client-private-key-encoded key and client-certificate-encoded.cer certificate files to a JKS Keystore:
      1. Create a PKCS12 keystore:

        openssl pkcs12 -export -in client-certificate-encoded.cer -inkey client-private-key-encoded -out client-keystore.p12

      2. Convert the PKCS12 keystore to a JKS keystore:

        keytool -importkeystore -srckeystore client-keystore.p12 -srcstoretype pkcs12 -destkeystore client-keystore.jks

    2. Copy the resulting client-keystore.jks file to the host with the running MiNiFi Java agent, so it is accessible by file path from the agent.
    3. Obtain the CA root certificate and add it to the client-truststore.jks truststore by running the following commands:
      wget https://letsencrypt.org/certs/isrgrootx1.pem
      keytool -import -file isrgrootx1.pem -alias isrgrootx1 -keystore client-truststore.jks

      MiNiFi Java requires you to specify an explicit truststore for inbound connections. Remember the password you used for creating client-truststore.jks, as you will need it when you configure the SSL Context Service.
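If you need to repeat the keystore and truststore steps on several hosts, the commands above can be wrapped in a script that does not prompt for passwords. The following is a sketch under stated assumptions: STORE_PASS and TRUST_PASS are hypothetical environment variable names that you export beforehand, the function names are illustrative, the downloaded files are in the current directory, and the private key has no passphrase. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: build client-keystore.jks and client-truststore.jks without prompts.
# STORE_PASS and TRUST_PASS are assumed to be exported by the caller;
# these names and the function names are illustrative only.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# PEM certificate and key -> PKCS12 keystore -> JKS keystore.
build_keystore() {
  run openssl pkcs12 -export \
    -in client-certificate-encoded.cer -inkey client-private-key-encoded \
    -out client-keystore.p12 -passout env:STORE_PASS
  run keytool -importkeystore -noprompt \
    -srckeystore client-keystore.p12 -srcstoretype pkcs12 \
    -srcstorepass:env STORE_PASS \
    -destkeystore client-keystore.jks -deststorepass:env STORE_PASS
}

# Download the CA root certificate and import it into the truststore.
build_truststore() {
  run wget https://letsencrypt.org/certs/isrgrootx1.pem
  run keytool -import -noprompt -file isrgrootx1.pem -alias isrgrootx1 \
    -keystore client-truststore.jks -storepass:env TRUST_PASS
}
```

A typical run exports both passwords, then calls build_keystore followed by build_truststore. Afterwards, keytool -list -keystore client-keystore.jks shows whether the key entry was imported. Reading passwords from environment variables keeps them out of the command line and shell history, but you still need to supply the same values in the SSL Context Service configuration.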

    4. Create a Service of type Restricted SSL Context Service with the following configuration:
      Service Name
      Specify a name for this service. This tutorial uses Client SSL Context Service.
      Keystore Filename
      [***/PATH/TO/***]client-keystore.jks
      Keystore Password
      [***THE PASSWORD YOU PROVIDED WHEN CREATING THE JKS STORE***]
      Key Password
      [***THE PASSWORD YOU PROVIDED WHEN CREATING THE JKS STORE***]
      Keystore Type
      JKS
      Truststore Filename
      [***/PATH/TO/***]client-truststore.jks
      Truststore Type
      JKS
      Truststore Password
      [***THE PASSWORD YOU PROVIDED WHEN CREATING THE CLIENT TRUSTSTORE***]
    5. Click Apply.
    6. Create an InvokeHTTP processor named Send to CDF with the following configuration:
      Automatically Terminated Relationships
      Select all relationships.
      Content-type
      Depends on your flow file data type. This tutorial uses text/plain.
      HTTP Method
      POST
      Remote URL
      https://[***ENDPOINT HOSTNAME COPIED FROM CLOUDERA DATAFLOW FLOW DEPLOYMENT MANAGER***]:9000/contentListener

      For example, https://my-flow.inbound.my-dfx.c94x5i9m.xcu2-8y8z.mycompany.test:9000/contentListener

      SSL Context Service
      Client SSL Context Service
      Leave all other settings with their default values.
  12. Build the rest of your data flow to read data and send it to your Cloudera DataFlow flow deployment using InvokeHTTP. As a simple example, this tutorial uses the GenerateFlowFile processor with the following settings:
    Run Schedule
    Set to 10000 ms (10 seconds).
    Custom Text
    The message you type here will be sent to the ListenHTTP Flow you have created, with the frequency specified by Run Schedule. For example, Hello DFX! This is MiNiFi.
    Data Format
    Set to Text.
    Unique FlowFiles
    Set to false.
  13. Connect the GenerateFlowFile processor to the InvokeHTTP processor.
  14. Click Actions > Publish... to publish the flow and start it on your MiNiFi agent.
  15. Select your flow deployment in the Cloudera DataFlow Dashboard and click KPIs.

    Observe that your Cloudera DataFlow flow deployment is now receiving data from MiNiFi.