ExtractDocumentText

Description:

Extracts the text content of supported binary document formats using Apache Tika.

Tags:

extract, document, text

Properties:

This component has no required or optional properties.

Relationships:

Name       Description
extracted  Success for extracted text FlowFiles
failure    Content extraction failed
original   Success for original input FlowFiles
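
The routing implied by these three relationships can be sketched in plain Java. This is an illustrative model only, not the processor's source: the class name ExtractRoutingSketch, the route method, and the fakeExtract stand-in (used here in place of Apache Tika) are all hypothetical. It also assumes that a successful extraction emits both the extracted text and the untouched input, while a failed extraction sends only the input to failure.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ExtractRoutingSketch {

    /** Hypothetical stand-in for a real extractor such as Apache Tika. */
    static String fakeExtract(byte[] content) {
        if (content.length == 0) {
            throw new IllegalArgumentException("nothing to extract");
        }
        return new String(content, StandardCharsets.UTF_8).trim();
    }

    /**
     * Maps relationship name to the content routed there.
     * Assumption: success yields "extracted" plus "original"; failure yields only "failure".
     */
    static Map<String, byte[]> route(byte[] input, Function<byte[], String> extractor) {
        Map<String, byte[]> routed = new LinkedHashMap<>();
        try {
            String text = extractor.apply(input);
            routed.put("extracted", text.getBytes(StandardCharsets.UTF_8));
            routed.put("original", input);
        } catch (RuntimeException e) {
            // Extraction failed: the input goes to "failure" unchanged.
            routed.put("failure", input);
        }
        return routed;
    }

    public static void main(String[] args) {
        byte[] doc = "  hello  ".getBytes(StandardCharsets.UTF_8);
        System.out.println(route(doc, ExtractRoutingSketch::fakeExtract).keySet());
        // prints [extracted, original]
        System.out.println(route(new byte[0], ExtractRoutingSketch::fakeExtract).keySet());
        // prints [failure]
    }
}
```

In a flow, this means a downstream processor that needs only the text connects to extracted, while original can be auto-terminated or kept for archiving.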

Reads Attributes:

None specified.

Writes Attributes:

None specified.

State management:

This component does not store state.

Restricted:

This component is not restricted.

System Resource Considerations:

None specified.