ModifyHTMLElement

Description:

Modifies the value of an existing HTML element. The element to modify is located using CSS selector syntax. The incoming HTML is first converted into an HTML Document Object Model (DOM) so that elements can be selected in the same way that CSS selectors are used to apply styles to HTML. The resulting DOM is then queried using the user-defined CSS selector string to find the element to modify. If a matching element is found, its value is updated in the DOM using the value specified in the "Modified Value" property. All DOM elements that match the CSS selector are updated. Once all matching elements have been updated, the DOM is rendered back to HTML and the result replaces the FlowFile content. A more thorough reference for the CSS selector syntax can be found at "http://jsoup.org/apidocs/org/jsoup/select/Selector.html".
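
The parse, select, modify, and render steps described above can be sketched with the jsoup library that the selector reference points to. This is a minimal illustration, not the processor's actual source: the class name ModifyHtmlSketch, the OutputType enum, the Result holder, and the mapping of each output type to a jsoup setter are assumptions made for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ModifyHtmlSketch {

    /** Hypothetical stand-in for the "Output Type" property values. */
    public enum OutputType { HTML, TEXT, ATTRIBUTE }

    /** Hypothetical result holder: rendered HTML plus the number of elements changed. */
    public static final class Result {
        public final String html;
        public final int numElementsModified;

        Result(String html, int numElementsModified) {
            this.html = html;
            this.numElementsModified = numElementsModified;
        }
    }

    public static Result modify(String html, String cssSelector, OutputType type,
                                String modifiedValue, String attributeName) {
        // 1. Convert the incoming HTML into a DOM.
        Document doc = Jsoup.parse(html);

        // 2. Query the DOM with the CSS selector.
        Elements matches = doc.select(cssSelector);

        // 3. Update every matching element (assumed mapping of output type to setter).
        for (Element element : matches) {
            switch (type) {
                case HTML:
                    element.html(modifiedValue);                // replace inner HTML
                    break;
                case TEXT:
                    element.text(modifiedValue);                // replace text content
                    break;
                case ATTRIBUTE:
                    // attributeName must be non-null here; jsoup rejects a null key
                    element.attr(attributeName, modifiedValue); // set one attribute
                    break;
            }
        }

        // 4. Render the DOM back to HTML and report how many elements matched.
        return new Result(doc.html(), matches.size());
    }

    public static void main(String[] args) {
        String input = "<html><body><p class=\"note\">old</p><p>keep</p></body></html>";
        Result result = modify(input, "p.note", OutputType.TEXT, "new", null);
        System.out.println(result.numElementsModified);
        System.out.println(result.html);
    }
}
```

A selector that matches nothing leaves the document unchanged and yields a count of zero, which corresponds to the "element not found" case listed under Relationships.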

Tags:

modify, html, dom, css, element

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name | API Name | Default Value | Allowable Values | Description
CSS Selector | CSS Selector | | | CSS selector syntax string used to select the desired HTML element(s). Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
HTML Character Encoding | HTML Character Encoding | UTF-8 | | Character encoding of the input HTML
Output Type | Output Type | HTML | HTML, Text, Attribute | Controls whether the HTML element is output as HTML, Text, or Attribute
Modified Value | Modified Value | | | Value to update the found HTML elements with. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Attribute Name | Attribute Name | | | When modifying the value of an element attribute, this value is used as the key to determine which attribute on the selected element will be modified with the new value. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Relationships:

Name | Description
element not found | Element could not be found in the HTML document. The original HTML input will remain in the FlowFile content unchanged. Relationship 'original' will not be invoked in this scenario.
success | Successfully parsed HTML element
original | The original HTML input
invalid html | The input HTML syntax is invalid
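
The routing implied by the relationship descriptions above can be sketched as a small decision function. This is an illustration inferred from those descriptions, not the processor's source: the class name RoutingSketch, the Relationship enum, and the route method are hypothetical, and the sketch assumes that a successful modification sends the updated content to 'success' and the unmodified input to 'original'.

```java
import java.util.EnumSet;
import java.util.Set;

public class RoutingSketch {

    /** Hypothetical enum mirroring the four relationship names. */
    public enum Relationship { SUCCESS, ORIGINAL, ELEMENT_NOT_FOUND, INVALID_HTML }

    /**
     * Decide which relationships receive a FlowFile.
     *
     * @param htmlParsed      whether the input HTML could be parsed
     * @param matchedElements number of elements the CSS selector matched
     */
    public static Set<Relationship> route(boolean htmlParsed, int matchedElements) {
        if (!htmlParsed) {
            // Invalid input: nothing is modified.
            return EnumSet.of(Relationship.INVALID_HTML);
        }
        if (matchedElements == 0) {
            // No match: the input passes through unchanged and 'original' is not used.
            return EnumSet.of(Relationship.ELEMENT_NOT_FOUND);
        }
        // Assumed: modified content goes to 'success', untouched input to 'original'.
        return EnumSet.of(Relationship.SUCCESS, Relationship.ORIGINAL);
    }

    public static void main(String[] args) {
        System.out.println(route(true, 2));
        System.out.println(route(true, 0));
        System.out.println(route(false, 0));
    }
}
```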

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
NumElementsModified | Total number of HTML element modifications made

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.

See Also:

GetHTMLElement, PutHTMLElement