Route Based on Content (One-to-Many)
If a Processor may route a single FlowFile to many Relationships, it differs slightly from
the above-described Processor for Routing Data Based on Content. This Processor typically
has Relationships that are dynamically defined by the user as well as an unmatched
relationship.
In order for the user to be able to define additional Properties, the
getSupportedDynamicPropertyDescriptor method must be overridden. This method returns a
PropertyDescriptor with the supplied name and an applicable Validator to ensure that the
user-specified Matching Criteria are valid.
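
The validation step can be illustrated with a minimal sketch, assuming the Matching
Criteria are regular expressions. The class MatchCriteriaValidatorSketch and its validate
method are hypothetical stand-ins written against the JDK only; they are not the NiFi
Validator API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical stand-in for the check a Validator on a dynamic property might perform.
// Assumption: the user-specified Matching Criteria are regular expressions.
final class MatchCriteriaValidatorSketch {

    /** Returns null when the value is usable, otherwise a short explanation. */
    static String validate(final String propertyName, final String value) {
        if (value == null || value.isEmpty()) {
            return "'" + propertyName + "' must not be empty";
        }
        try {
            Pattern.compile(value); // throws if the expression is malformed
            return null;
        } catch (final PatternSyntaxException e) {
            return "'" + propertyName + "' is not a valid regular expression: "
                    + e.getDescription();
        }
    }

    public static void main(final String[] args) {
        System.out.println(validate("errors", "ERROR.*"));
        System.out.println(validate("broken", "(unclosed"));
    }
}
```

Rejecting invalid criteria at configuration time means the Processor never has to handle a
malformed expression while it is running.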
In this Processor, the Set of Relationships that is returned by the getRelationships method
is a member variable that is marked volatile. This Set is initially constructed with a
single Relationship named unmatched. The onPropertyModified method is overridden so that
when a Property is added or removed, a new Relationship is created with the same name. If
the Processor has Properties that are not user-defined, it is important to check whether
the specified Property is user-defined. This can be achieved by calling the isDynamic
method of the PropertyDescriptor that is passed to this method. If the Property is dynamic,
a new Set of Relationships is created and the previous Set of Relationships is copied into
it. The newly created Relationship is then either added to or removed from this new Set,
depending on whether a Property was added to the Processor or removed from it (removal is
detected by checking whether the third argument to this method is null). The member
variable holding the Set of Relationships is then updated to point to this new Set.
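
The copy-and-replace update described above can be sketched as follows. To keep the
example self-contained, Relationships are modeled as plain Strings and the isDynamic check
is passed in as a boolean; the class name DynamicRelationshipsSketch and this simplified
onPropertyModified signature are assumptions made for illustration, not the NiFi API.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Stand-in sketch: Strings take the place of NiFi Relationship objects.
final class DynamicRelationshipsSketch {
    static final String UNMATCHED = "unmatched";

    // volatile so that any thread calling getRelationships() sees the latest Set
    private volatile Set<String> relationships = Collections.singleton(UNMATCHED);

    Set<String> getRelationships() {
        return relationships;
    }

    // Mirrors the shape of onPropertyModified(descriptor, oldValue, newValue):
    // a null newValue (the third value argument) means the property was removed.
    void onPropertyModified(final String propertyName, final boolean dynamic,
                            final String oldValue, final String newValue) {
        if (!dynamic) {
            return; // properties that are not user-defined do not map to relationships
        }
        final Set<String> updated = new HashSet<>(relationships); // copy the previous Set
        if (newValue == null) {
            updated.remove(propertyName);
        } else {
            updated.add(propertyName);
        }
        relationships = Collections.unmodifiableSet(updated); // point to the new Set
    }

    public static void main(final String[] args) {
        final DynamicRelationshipsSketch sketch = new DynamicRelationshipsSketch();
        sketch.onPropertyModified("errors", true, null, "ERROR.*");
        System.out.println(sketch.getRelationships());
    }
}
```

Because the Set is never modified after it is published, readers need no locking; they
see either the old Set or the new one in full.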
If the Properties that specify routing criteria require processing, such as compiling
a Regular Expression, this processing is done in a method annotated with @OnScheduled, if
possible. The result is then stored in a member variable that is marked as volatile. This
member variable is generally of type Map, where the key is of type Relationship and the
value's type is defined by the result of processing the property value.
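
As an illustration of this pre-processing step, the sketch below compiles each routing
criterion once and publishes the result through a volatile Map. It assumes the criteria
are regular expressions and uses String keys in place of Relationship objects; the class
CompiledCriteriaSketch and its plain onScheduled(Map) method are hypothetical. In an
actual Processor the method would carry the @OnScheduled annotation and read the values
from the ProcessContext.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Stand-in sketch: String keys take the place of NiFi Relationship objects.
final class CompiledCriteriaSketch {

    // volatile so that threads running onTrigger see the Map built at scheduling time
    private volatile Map<String, Pattern> compiledCriteria = Collections.emptyMap();

    // Stand-in for a method annotated with @OnScheduled.
    // Keys are property (and relationship) names, values are the raw property values.
    void onScheduled(final Map<String, String> dynamicPropertyValues) {
        final Map<String, Pattern> compiled = new HashMap<>();
        for (final Map.Entry<String, String> entry : dynamicPropertyValues.entrySet()) {
            compiled.put(entry.getKey(), Pattern.compile(entry.getValue()));
        }
        compiledCriteria = Collections.unmodifiableMap(compiled);
    }

    Map<String, Pattern> getCompiledCriteria() {
        return compiledCriteria;
    }

    public static void main(final String[] args) {
        final Map<String, String> raw = new HashMap<>();
        raw.put("errors", "ERROR.*");
        final CompiledCriteriaSketch sketch = new CompiledCriteriaSketch();
        sketch.onScheduled(raw);
        System.out.println(sketch.getCompiledCriteria().keySet());
    }
}
```

Compiling once per scheduling, rather than once per FlowFile, keeps the cost of parsing
the expressions out of the onTrigger method.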
The onTrigger method obtains a FlowFile via the get method of ProcessSession. If no
FlowFile is available, it returns immediately. Otherwise, a Set of type Relationship is
created. The method reads the contents of the FlowFile via the ProcessSession's read
method, evaluating each of the Match Criteria as the data is streamed. For each criterion
that matches, the Relationship associated with that criterion is added to the Set of
Relationships.
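
The evaluation loop might look like the following minimal sketch. It assumes
line-oriented UTF-8 text and regular-expression criteria, and it uses String names in
place of Relationship objects; StreamingMatchSketch and findMatches are hypothetical
names. In a Processor, the InputStream would be the one supplied by the ProcessSession's
read method.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Stand-in sketch: evaluates every criterion against the content as it is streamed.
final class StreamingMatchSketch {

    static Set<String> findMatches(final InputStream content,
                                   final Map<String, Pattern> criteria) {
        final Set<String> matched = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(content, StandardCharsets.UTF_8))) {
            String line;
            // Stop early once every criterion has matched at least once.
            while (matched.size() < criteria.size() && (line = reader.readLine()) != null) {
                for (final Map.Entry<String, Pattern> entry : criteria.entrySet()) {
                    if (entry.getValue().matcher(line).find()) {
                        matched.add(entry.getKey());
                    }
                }
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return matched;
    }
}
```

Reading line by line keeps memory use bounded by the longest line rather than by the
size of the whole FlowFile.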
After reading the contents of the FlowFile, the method checks if the Set of
Relationships is empty. If so, the original FlowFile has an attribute added to it to
indicate the Relationship to which it was routed and is routed to the unmatched
relationship. This is logged, a Provenance ROUTE event is emitted, and the method returns.
If the size of the Set is equal to 1, the original FlowFile has an attribute added to it to
indicate the Relationship to which it was routed and is routed to the Relationship
specified by the entry in the Set. This is logged, a Provenance ROUTE event is emitted for
the FlowFile, and the method returns.
In the event that the Set contains more than 1 Relationship, the Processor creates a
clone of the FlowFile for each Relationship, except for the first. This is done via the
clone method of the ProcessSession. There is no need to report a CLONE Provenance Event,
as the framework will handle this for you. The original FlowFile and each clone are routed
to their appropriate Relationship with an attribute indicating the name of the
Relationship. A Provenance ROUTE event is emitted for each FlowFile. This is logged, and
the method returns.
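
The three cases (no match, one match, several matches) can be summarized in a small
sketch that only computes which FlowFile goes where. The class RoutingPlanSketch, the
plan method, and the labels "original" and "clone-N" are hypothetical; in a Processor,
each clone would come from the ProcessSession's clone method, and each entry would be
followed by adding the attribute, emitting the ROUTE event, and transferring the FlowFile.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Stand-in sketch: decides the destination of the original FlowFile and of each clone.
final class RoutingPlanSketch {
    static final String UNMATCHED = "unmatched";

    // Returns a map from a FlowFile label to the name of the relationship it is routed to.
    static Map<String, String> plan(final Set<String> matchedRelationships) {
        final Map<String, String> routes = new LinkedHashMap<>();
        if (matchedRelationships.isEmpty()) {
            routes.put("original", UNMATCHED); // nothing matched
            return routes;
        }
        final Iterator<String> it = matchedRelationships.iterator();
        routes.put("original", it.next()); // the first match reuses the original FlowFile
        int cloneIndex = 1;
        while (it.hasNext()) {
            routes.put("clone-" + cloneIndex++, it.next()); // one clone per further match
        }
        return routes;
    }

    public static void main(final String[] args) {
        final Set<String> matched = new LinkedHashSet<>();
        matched.add("errors");
        matched.add("warnings");
        System.out.println(plan(matched));
    }
}
```

Reusing the original FlowFile for the first Relationship avoids creating one clone more
than is needed.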
This Processor is annotated with the @SideEffectFree and @SupportsBatching annotations
from the org.apache.nifi.annotation.behavior package.