Route Based on Content (One-to-Many)
A Processor that may route a single FlowFile to many Relationships differs slightly from the
Processor described above for Routing Data Based on Content. This Processor typically has
Relationships that are dynamically defined by the user, as well as an unmatched Relationship.
In order for the user to be able to define additional Properties, the
getSupportedDynamicPropertyDescriptor method must be overridden. This method returns a
PropertyDescriptor with the supplied name and an applicable Validator to ensure that the
user-specified Matching Criteria are valid.
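As an illustration of the kind of check such a Validator might perform, the following is a
minimal plain-Java sketch. It assumes the Matching Criteria are regular expressions, which
the text above does not require, and it does not use the NiFi Validator interface; the class
RegexCriteriaValidator and its validate method are hypothetical stand-ins.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical stand-in for a Validator; not part of the NiFi API.
class RegexCriteriaValidator {

    // Returns an explanation when the value is not a valid regular expression,
    // or an empty Optional when the value is acceptable.
    static Optional<String> validate(String propertyName, String value) {
        try {
            Pattern.compile(value);
            return Optional.empty();
        } catch (PatternSyntaxException e) {
            return Optional.of("'" + propertyName
                    + "' is not a valid regular expression: " + e.getDescription());
        }
    }
}
```

Rejecting an invalid expression at configuration time means the problem is reported when the
user enters the value rather than when data is first processed.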
In this Processor, the Set of Relationships that is returned by the getRelationships method
is a member variable that is marked volatile. This Set is initially constructed with a single
Relationship named unmatched. The onPropertyModified method is overridden so that when a
Property is added or removed, a Relationship with the same name is created or removed
accordingly. If the Processor has Properties that are not user-defined, it is important to
check whether the specified Property is user-defined. This can be achieved by calling the
isDynamic method of the PropertyDescriptor that is passed to onPropertyModified. If the
Property is dynamic, a new Set of Relationships is created, and the previous Set of
Relationships is copied into it. The Relationship named after the Property is then either
added to the new Set or removed from it, depending on whether the Property was added to the
Processor or removed from it (Property removal is detected by checking whether the third
argument to onPropertyModified is null). The member variable holding the Set of
Relationships is then updated to point to this new Set.
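The copy-and-replace update described above can be sketched without the framework as follows.
This is an illustration under stated assumptions, not NiFi code: Relationships are modeled as
plain Strings, the dynamic flag stands in for the result of isDynamic, and the class
DynamicRelationships is hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Framework-free sketch: relationship names are plain Strings standing in
// for Relationship objects. This class is hypothetical.
class DynamicRelationships {
    static final String UNMATCHED = "unmatched";

    // Replaced wholesale on every change and never mutated in place, so a
    // reader always sees a complete, consistent Set.
    private volatile Set<String> relationships = Collections.singleton(UNMATCHED);

    Set<String> getRelationships() {
        return relationships;
    }

    // Mirrors the shape of onPropertyModified: a null newValue means the
    // property was removed.
    void onPropertyModified(String propertyName, boolean dynamic,
                            String oldValue, String newValue) {
        if (!dynamic) {
            return; // properties that are not user-defined do not affect routing
        }
        Set<String> updated = new HashSet<>(relationships); // copy the previous Set
        if (newValue == null) {
            updated.remove(propertyName);
        } else {
            updated.add(propertyName);
        }
        relationships = Collections.unmodifiableSet(updated); // publish the new Set
    }
}
```

Because the Set is never modified after it is published, a thread that obtained the old Set
from getRelationships can keep iterating over it safely while a property change is applied.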
If the Properties that specify routing criteria require processing, such as compiling a
Regular Expression, this processing is done in a method annotated with @OnScheduled, if
possible. The result is then stored in a member variable that is marked as volatile. This
member variable is generally of type Map, where the key is of type Relationship and the
value's type is defined by the result of processing the property value.
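A minimal sketch of this pre-processing step, assuming the routing criteria are regular
expressions: the onScheduled method below stands in for a method annotated with @OnScheduled,
String keys stand in for Relationship objects, and the class CompiledCriteria is hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Framework-free sketch; String keys stand in for Relationship objects.
class CompiledCriteria {

    // Holds the processed form of each routing property.
    private volatile Map<String, Pattern> patterns = Collections.emptyMap();

    // Stand-in for a method annotated with @OnScheduled: compile every
    // expression once, before any data is processed.
    void onScheduled(Map<String, String> dynamicProperties) {
        Map<String, Pattern> compiled = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : dynamicProperties.entrySet()) {
            compiled.put(entry.getKey(), Pattern.compile(entry.getValue()));
        }
        patterns = Collections.unmodifiableMap(compiled);
    }

    Map<String, Pattern> getPatterns() {
        return patterns;
    }
}
```

Compiling once per scheduling, rather than once per FlowFile, keeps the per-FlowFile work
limited to evaluating the already-compiled expressions.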
The onTrigger method obtains a FlowFile via the get method of ProcessSession. If no FlowFile
is available, it returns immediately. Otherwise, a Set of type Relationship is created. The
method reads the contents of the FlowFile via the ProcessSession's read method, evaluating
each of the Match Criteria as the data is streamed. For each criterion that matches, the
Relationship associated with that criterion is added to the Set of Relationships.
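The evaluation step can be sketched as follows. This is an illustration only: it assumes the
criteria are regular expressions applied line by line to UTF-8 text, which is one possible
interpretation of "as the data is streamed"; the InputStream stands in for the stream that
the session's read method provides, and the class ContentMatcher is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Framework-free sketch; String names stand in for Relationship objects.
class ContentMatcher {

    // Reads the content once and collects the name of every criterion that
    // matches at least one line.
    static Set<String> findMatches(InputStream content, Map<String, Pattern> criteria) {
        Set<String> matched = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(content, StandardCharsets.UTF_8))) {
            String line;
            // Stop early once every criterion has already matched.
            while (matched.size() < criteria.size() && (line = reader.readLine()) != null) {
                for (Map.Entry<String, Pattern> entry : criteria.entrySet()) {
                    if (!matched.contains(entry.getKey())
                            && entry.getValue().matcher(line).find()) {
                        matched.add(entry.getKey());
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matched;
    }
}
```

Evaluating all criteria in a single pass avoids reading the content once per Relationship.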
After reading the contents of the FlowFile, the method checks whether the Set of
Relationships is empty. If so, the original FlowFile has an attribute added to it to
indicate the Relationship to which it was routed and is routed to the unmatched
Relationship. This is logged, a Provenance ROUTE event is emitted, and the method returns.
If the Set contains exactly one Relationship, the original FlowFile has an attribute added
to it to indicate the Relationship to which it was routed and is routed to the Relationship
specified by the entry in the Set. This is logged, a Provenance ROUTE event is emitted for
the FlowFile, and the method returns.
In the event that the Set contains more than one Relationship, the Processor creates a
clone of the FlowFile for each Relationship except the first. This is done via the clone
method of the ProcessSession. There is no need to report a CLONE Provenance Event, as the
framework will handle this for you. The original FlowFile and each clone are routed to their
appropriate Relationship, each with an attribute indicating the name of that Relationship. A
Provenance ROUTE event is emitted for each FlowFile. This is logged, and the method returns.
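The three cases described above (no match, exactly one match, and several matches) reduce to
a single rule: the original FlowFile goes to the first matched Relationship, or to unmatched
if there is none, and every further Relationship receives a clone. The sketch below only
plans the transfers; the actual cloning, attribute updates, logging, and Provenance events
are left to the framework calls named in the text. The classes RoutingPlan and Transfer are
hypothetical, and Strings stand in for Relationship objects.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Framework-free sketch of the routing decision only.
class RoutingPlan {
    static final String UNMATCHED = "unmatched";

    // One planned transfer: which copy of the FlowFile goes where.
    static final class Transfer {
        final boolean clone;       // false for the original FlowFile
        final String relationship; // also the value of the routing attribute

        Transfer(boolean clone, String relationship) {
            this.clone = clone;
            this.relationship = relationship;
        }
    }

    static List<Transfer> plan(Set<String> matched) {
        List<Transfer> transfers = new ArrayList<>();
        if (matched.isEmpty()) {
            // Nothing matched: the original goes to unmatched.
            transfers.add(new Transfer(false, UNMATCHED));
            return transfers;
        }
        Iterator<String> it = matched.iterator();
        // The original FlowFile is used for the first Relationship.
        transfers.add(new Transfer(false, it.next()));
        // Every remaining Relationship gets a clone.
        while (it.hasNext()) {
            transfers.add(new Transfer(true, it.next()));
        }
        return transfers;
    }
}
```

Reusing the original FlowFile for the first Relationship means that N matches require only
N - 1 clones.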
This Processor is annotated with the @SideEffectFree and @SupportsBatching annotations from
the org.apache.nifi.annotation.behavior package.