Configure the processor for your data source

Learn how to configure a data source processor for the Kafka ingest data flow.

You can set up a data flow to move data into Apache Kafka from many different locations. This example assumes that you are using sample data generated by the GenerateFlowFile processor. If you are moving data from a specific source, see the Apache NiFi Getting Started guide for information on how to build a data flow and on other data ingest processor options.

  1. Launch the Configure Processor window by right-clicking the GenerateFlowFile processor and selecting Configure. A configuration dialog with the following tabs is displayed: Settings, Scheduling, Properties, and Comments.
  2. Configure the processor according to the behavior you expect in your data flow.
    The GenerateFlowFile processor can create many FlowFiles very quickly. Setting the run schedule to a reasonable value is important so that the flow does not overwhelm the system.
  3. When you have finished configuring the options you need, save the changes by clicking the Apply button.

    Make sure that you set all required properties, because you cannot start the processor until all mandatory properties have been configured.

The following settings and properties are used in this example:

Table 1. GenerateFlowFile processor scheduling

Scheduling: Run Schedule
Description: Run schedule dictates how often the processor should be scheduled to run. The valid values for this field depend on the selected Scheduling Strategy.
Example value for ingest data flow: 500 ms
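To get a feel for what a given run schedule means for load, the following minimal Python sketch estimates an upper bound on the generation rate. It assumes a timer-driven scheduling strategy, a single concurrent task, and a batch size of 1 unless stated otherwise; the function name and parameters are hypothetical and are not part of NiFi.

```python
def flowfiles_per_second(run_schedule_ms: int, batch_size: int = 1) -> float:
    """Rough upper bound on FlowFiles generated per second.

    Assumes the processor is triggered once every run_schedule_ms
    milliseconds and emits batch_size FlowFiles per trigger.
    """
    if run_schedule_ms <= 0:
        raise ValueError("run_schedule_ms must be positive")
    return batch_size * 1000 / run_schedule_ms


# With the example run schedule of 500 ms and one FlowFile per run:
print(flowfiles_per_second(500))  # 2.0
```

Under these assumptions, the example value of 500 ms produces at most about two FlowFiles per second, which is a gentle load for a test flow.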

Table 2. GenerateFlowFile processor properties

Property: Custom text
Description: If Data Format is text and Unique FlowFiles is false, you can provide custom text to be used as the content of the generated FlowFiles. The expression statement in the example value generates a random ID between 1 and 10,000 for each row, each paired with a sample last name.
Example value for ingest data flow:

customer_id, customer_name
${random():mod(10000):plus(1)}, Smith
${random():mod(10000):plus(1)}, Johnson
${random():mod(10000):plus(1)}, Steward
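To preview what the generated FlowFile content might look like, the following Python sketch emulates the example value outside of NiFi. It is only an approximation: NiFi evaluates the expression itself, and the sketch assumes that random() yields a non-negative whole number, so that mod(10000) followed by plus(1) falls between 1 and 10,000. The function and constant names are hypothetical.

```python
import random

# Last names taken from the example value above.
SAMPLE_NAMES = ["Smith", "Johnson", "Steward"]


def random_customer_id(rng: random.Random) -> int:
    """Emulate ${random():mod(10000):plus(1)} (assumed non-negative random value)."""
    return rng.randrange(2**63) % 10000 + 1


def render_custom_text(rng=None) -> str:
    """Build one FlowFile body: a header line plus one row per sample name."""
    if rng is None:
        rng = random.Random()
    lines = ["customer_id, customer_name"]
    for name in SAMPLE_NAMES:
        lines.append(f"{random_customer_id(rng)}, {name}")
    return "\n".join(lines)


print(render_custom_text())
```

Each call prints a header followed by three rows. The IDs differ from run to run, while the names stay in the order given in the example value.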