Joins together Records from two different FlowFiles where one FlowFile, the 'original' contains arbitrary records and the second FlowFile, the 'enrichment' contains additional data that should be used to enrich the first. See Additional Details for more information on how to configure this processor and the different use cases that it aims to accomplish.
fork, join, enrichment, record, sql, wrap, recordpath, merge, combine, streams
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.
Display Name | API Name | Default Value | Allowable Values | Description |
---|---|---|---|---|
Original Record Reader | Original Record Reader | Controller Service API: RecordReaderFactory Implementations: EBCDICRecordReader XMLReader AvroReader CiscoEmblemSyslogMessageReader ReaderLookup JsonPathReader JASN1Reader GrokReader JsonTreeReader WindowsEventLogReader YamlTreeReader IPFIXReader CEFReader SyslogReader ExcelReader ParquetReader ScriptedReader CSVReader Syslog5424Reader | The Record Reader for reading the 'original' FlowFile | |
Enrichment Record Reader | Enrichment Record Reader | Controller Service API: RecordReaderFactory Implementations: EBCDICRecordReader XMLReader AvroReader CiscoEmblemSyslogMessageReader ReaderLookup JsonPathReader JASN1Reader GrokReader JsonTreeReader WindowsEventLogReader YamlTreeReader IPFIXReader CEFReader SyslogReader ExcelReader ParquetReader ScriptedReader CSVReader Syslog5424Reader | The Record Reader for reading the 'enrichment' FlowFile | |
Record Writer | Record Writer | Controller Service API: RecordSetWriterFactory Implementations: RecordSetWriterLookup ScriptedRecordSetWriter JsonRecordSetWriter AvroRecordSetWriter CSVRecordSetWriter ParquetRecordSetWriter XMLRecordSetWriter FreeFormTextRecordSetWriter | The Record Writer to use for writing the results. If the Record Writer is configured to inherit the schema from the Record, the schema that it will inherit will be the result of merging both the 'original' record schema and the 'enrichment' record schema. | |
Join Strategy | Join Strategy | Wrapper |
| Specifies how to join the two FlowFiles into a single FlowFile |
SQL | SQL | SELECT original.*, enrichment.* FROM original LEFT OUTER JOIN enrichment ON original.id = enrichment.id | The SQL SELECT statement to evaluate. Expression Language may be provided, but doing so may result in poorer performance. Because this Processor is dealing with two FlowFiles at a time, it's also important to understand how attributes will be referenced. If both FlowFiles have an attribute with the same name but different values, the Expression Language will resolve to the value provided by the 'enrichment' FlowFile. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) This Property is only considered if the [Join Strategy] Property has a value of "SQL". | |
Default Decimal Precision | dbf-default-precision | 10 | When a DECIMAL/NUMBER value is written as a 'decimal' Avro logical type, a specific 'precision' denoting number of available digits is required. Generally, precision is defined by column data type definition or database engines default. However undefined precision (0) can be returned from some database engines. 'Default Decimal Precision' is used when writing those undefined precision numbers. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) This Property is only considered if the [Join Strategy] Property has a value of "SQL". | |
Default Decimal Scale | dbf-default-scale | 0 | When a DECIMAL/NUMBER value is written as a 'decimal' Avro logical type, a specific 'scale' denoting number of available decimal digits is required. Generally, scale is defined by column data type definition or database engines default. However when undefined precision (0) is returned, scale can also be uncertain with some database engines. 'Default Decimal Scale' is used when writing those undefined numbers. If a value has more decimals than specified scale, then the value will be rounded-up, e.g. 1.53 becomes 2 with scale 0, and 1.5 with scale 1. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) This Property is only considered if the [Join Strategy] Property has a value of "SQL". | |
Insertion Record Path | Insertion Record Path | / | Specifies where in the 'original' Record the 'enrichment' Record's fields should be inserted. Note that if the RecordPath does not point to any existing field in the original Record, the enrichment will not be inserted. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) This Property is only considered if the [Join Strategy] Property has a value of "Insert Enrichment Fields". | |
Maximum number of Bins | Maximum number of Bins | 10000 | Specifies the maximum number of bins that can be held in memory at any one time | |
Timeout | Timeout | 10 min | Specifies the maximum amount of time to wait for the second FlowFile once the first arrives at the processor, after which point the first FlowFile will be routed to the 'timeout' relationship. |
Name | Description |
---|---|
timeout | If one of the incoming FlowFiles (i.e., the 'original' FlowFile or the 'enrichment' FlowFile) arrives to this Processor but the other does not arrive within the configured Timeout period, the FlowFile that did arrive is routed to this relationship. |
joined | The resultant FlowFile with Records joined together from both the original and enrichment FlowFiles will be routed to this relationship |
failure | If both the 'original' and 'enrichment' FlowFiles arrive at the processor but there was a failure in joining the records, both of those FlowFiles will be routed to this relationship. |
original | Both of the incoming FlowFiles ('original' and 'enrichment') will be routed to this Relationship. I.e., this is the 'original' version of both of these FlowFiles. |
Name | Description |
---|---|
mime.type | Sets the mime.type attribute to the MIME Type specified by the Record Writer |
record.count | The number of records in the FlowFile |
Resource | Description |
---|---|
MEMORY | This Processor will load into heap all FlowFiles that are on its incoming queues. While it loads the FlowFiles themselves, and not their content, the FlowFile attributes can be very memory intensive. Additionally, if the Join Strategy is set to SQL, the SQL engine may require buffering the entire contents of the enrichment FlowFile for each concurrent task. See Processor's Additional Details for more details and for steps on how to mitigate these concerns. |