Configure the processor for your data target

You can set up a data flow to move data to many locations. This example assumes that you are moving data to Apache Solr in a Data Discovery and Exploration cluster in CDP Public Cloud, using the PutSolrRecord processor. If you are moving your data from a location other than Kafka, see Getting Started with Apache NiFi for general information about how to build a data flow and about other data consumption processor options. You can also check out the other ingest use cases in the Cloudera Data Flow for Data Hub library.

Before you begin, create a machine user in CDP User Management and synchronize this user to your CDP environment.
  1. Launch the Configure Processor window by right-clicking the PutSolrRecord processor and selecting Configure.
    You can see a configuration dialog with the following tabs: Settings, Scheduling, Properties, Comments.
  2. Configure the processor according to the behavior you expect in your data flow.
    1. Click the Properties tab.
    2. Configure PutSolrRecord with the required values.
      The following table includes a description and example values for the properties required for this Kafka to Solr ingest data flow. For a complete list of PutSolrRecord options, see the processor documentation in Getting Started with Apache NiFi.
      | Property | Description | Value to set for ingest to Solr data flow |
      | --- | --- | --- |
      | Solr Type | The type of Solr instance. It can be Cloud or Standard. | Cloud |
      | Solr Location | The Solr URL for a Solr Type of Standard (for example, http://localhost:8984/solr/gettingstarted), or the ZooKeeper hosts for a Solr Type of Cloud (for example, localhost:9983). You can find this value on the dashboard of the Solr web UI as the zkHost parameter value. | In this example: zookeeper1.cloudera.site:2181,zookeeper2.cloudera.site:2181,zookeeper3.cloudera.site:2181/solr-dde |
      | Collection | The name of the Solr collection. | In this example: solr-nifi-demo |
      | Solr Update Path | The path in Solr to post the flowfile records to. | /update |
      | Kerberos Principal | The CDP user name you are using to perform this workflow. | The CDP user name you created and synced with your CDP environment in Meet the prerequisites. |
      | Kerberos Password | The password for the CDP user you are using to perform this workflow. | In this example: Password1! |
      | RecordReader | The controller service for reading records from incoming flowfiles. | AvroReader |
      | SSL Context Service | The controller service used to obtain an SSL context. This property must be set when communicating with Solr over HTTPS. | Reference the default SSL context service that has been created for you. In this example: Default NiFi SSL Context Service |

    Make sure that you set all required properties; you cannot start the processor until every mandatory property is configured.

  3. When you have finished configuring the options you need, save the changes by clicking Apply.
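The same properties can also be set programmatically through the NiFi REST API, which updates a processor with a PUT request to /nifi-api/processors/{id}. The sketch below only builds the request body; the NiFi URL and processor ID are hypothetical placeholders, and the property keys are assumed to mirror the display names shown in the table above, so verify them against your own NiFi instance before use.

```python
import json

# Hypothetical placeholders -- replace with values from your NiFi instance.
NIFI_URL = "https://nifi.example.cloudera.site:8443/nifi-api"
PROCESSOR_ID = "0123-example-processor-id"

def build_update_payload(revision_version: int, properties: dict) -> dict:
    """Build the JSON body for PUT /nifi-api/processors/{id}.

    NiFi requires the current revision version for optimistic locking;
    fetch it first with GET /nifi-api/processors/{id}.
    """
    return {
        "revision": {"version": revision_version},
        "component": {
            "id": PROCESSOR_ID,
            "config": {"properties": properties},
        },
    }

# Property keys assumed to match the display names in the table above.
payload = build_update_payload(0, {
    "Solr Type": "Cloud",
    "Solr Location": "zookeeper1.cloudera.site:2181,"
                     "zookeeper2.cloudera.site:2181,"
                     "zookeeper3.cloudera.site:2181/solr-dde",
    "Collection": "solr-nifi-demo",
    "Solr Update Path": "/update",
})
print(json.dumps(payload, indent=2))
```

Send the payload with any HTTP client that can authenticate against your NiFi endpoint; the UI-based steps above remain the supported path described in this example.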
Your data flow is ready to ingest data into Solr. Start the data flow.
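Once records start flowing, a quick way to spot-check the ingest is to run a select query against the collection. This sketch only constructs the query URL; the Solr host name is a hypothetical placeholder, and on a Data Discovery and Exploration cluster the actual request must be authenticated (for example, with Kerberos/SPNEGO) by your HTTP client.

```python
from urllib.parse import urlencode

def build_select_url(base_url: str, collection: str,
                     query: str = "*:*", rows: int = 5) -> str:
    """Build a Solr select URL to spot-check ingested records."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/{collection}/select?{params}"

# Hypothetical Solr endpoint -- substitute your cluster's Solr URL.
url = build_select_url("https://solr.example.cloudera.site:8985/solr",
                       "solr-nifi-demo")
print(url)
```

If the flow is working, the query response should report a growing numFound count for the solr-nifi-demo collection.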