You can set up a dataflow to push data into Snowflake database tables from many different locations. Start by configuring the processor for your data source: launch the Configure Processor window and specify the necessary configurations.

Configure the GenerateFlowFile processor to create random data for this example dataflow. GenerateFlowFile is useful when you are testing or creating proof-of-concept dataflows. When you have confirmed that this dataflow meets your business use case, you can replace it with a processor that gets data from your actual data source.

See Related Information for full details on this Apache NiFi Processor.

  • You must have built the dataflow.
  • You must have configured your Controller Services.
  1. Launch the Configure Processor window by right-clicking the GenerateFlowFile processor and selecting Configure. A configuration dialog box is displayed with the following tabs: Settings, Scheduling, Properties, and Comments.
  2. Configure the processor according to the behavior you expect in your dataflow.
    See the Example section below for the recommended configuration to satisfy this example use case.
  3. Save the changes by clicking Apply.
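
For readers who prefer to see the configuration as data, the following Python sketch builds a request body in the shape that the NiFi REST API uses for updating a processor (a PUT to /nifi-api/processors/{id}). It is an illustration under stated assumptions, not part of this procedure: the processor ID is a placeholder, the property keys are assumed and may differ between NiFi versions, and no request is actually sent. The values are the ones used in this example dataflow.

```python
import json


def build_processor_update(processor_id, revision_version, run_schedule, custom_text):
    """Build the JSON body for a processor update request (no network call is made)."""
    return {
        # NiFi uses the revision for optimistic locking; the version must match the server's.
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {
                "schedulingPeriod": run_schedule,
                "properties": {
                    # Assumed property keys; verify them against your NiFi version.
                    "Data Format": "Text",
                    "Unique FlowFiles": "false",
                    "generate-ff-custom-text": custom_text,
                },
            },
        },
    }


payload = build_processor_update(
    processor_id="00000000-0000-0000-0000-000000000000",  # hypothetical placeholder ID
    revision_version=0,
    run_schedule="60 s",
    custom_text="100,foo1, blablabla\n101, foo2, blabla\n102, foo3, bla",
)
print(json.dumps(payload, indent=2))
```

Building the body separately from sending it keeps the sketch testable offline; how you authenticate and send the request depends on your deployment.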

The following settings and properties are used in this example:

Table 1. GenerateFlowFile processor scheduling

Scheduling: Run Schedule
Description: Run schedule dictates how often the processor should be scheduled to run. The valid values for this field depend on the selected scheduling strategy.
Example value for ingest data flow: 60 s
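
To make the effect of the run schedule concrete, the following Python sketch converts a time period such as 60 s into seconds and estimates how often the processor would be triggered per hour. It assumes a timer-driven scheduling strategy. The helper names are hypothetical, and the parser handles only a few unit spellings; it is not NiFi's own parser.

```python
import re

# Seconds per unit for a small, illustrative subset of time-unit spellings.
UNIT_SECONDS = {
    "s": 1, "sec": 1, "secs": 1, "second": 1, "seconds": 1,
    "m": 60, "min": 60, "mins": 60, "minute": 60, "minutes": 60,
    "h": 3600, "hr": 3600, "hrs": 3600, "hour": 3600, "hours": 3600,
}


def period_seconds(run_schedule):
    """Convert a value such as '60 s' into a number of seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", run_schedule)
    if match is None:
        raise ValueError(f"Unrecognized run schedule: {run_schedule!r}")
    amount, unit = match.groups()
    try:
        return float(amount) * UNIT_SECONDS[unit.lower()]
    except KeyError:
        raise ValueError(f"Unsupported time unit: {unit!r}") from None


def runs_per_hour(run_schedule):
    """Estimate how many times per hour the processor is triggered."""
    return 3600 / period_seconds(run_schedule)


print(period_seconds("60 s"))
print(runs_per_hour("60 s"))
```

With the example value of 60 s, the processor is triggered roughly once a minute, which keeps the volume of test data small while you verify the dataflow.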

Table 2. GenerateFlowFile processor properties

Property: Custom text
Description: If the value of Data Format is text and Unique FlowFiles is set to false, you can provide custom text to be used as the content of the generated FlowFiles. The example value is static text: three comma-separated records, each with a numeric ID followed by two text fields.
Example value for ingest data flow:
100,foo1, blablabla
101, foo2, blabla
102, foo3, bla
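
To check which records the example custom text yields before it reaches Snowflake, the following Python sketch parses it as comma-separated values. The column names id, name, and note are hypothetical, because the source does not name the columns of the target table. The text is copied as shown above, including its inconsistent spacing after commas, which the parser trims.

```python
import csv
import io

# Example custom text, copied as shown above (spacing after commas is inconsistent).
CUSTOM_TEXT = "100,foo1, blablabla\n101, foo2, blabla\n102, foo3, bla\n"

# Hypothetical column names; the source does not name the target table's columns.
COLUMNS = ("id", "name", "note")


def parse_custom_text(text):
    """Parse the custom text into a list of dicts, trimming stray whitespace."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        values = [field.strip() for field in row]
        records.append(dict(zip(COLUMNS, values)))
    return records


records = parse_custom_text(CUSTOM_TEXT)
for record in records:
    print(record)
```

Every generated FlowFile carries the same three records, because Unique FlowFiles is set to false and the custom text is static.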

