PartitionCsv

Description:

Partitions a CSV file using the partition_csv function of unstructured.io. Properties are forwarded to partition_csv as parameters. The output is a JSON document in the format output by partition_csv.

Tags:

ai, artificial intelligence, ml, machine learning, text, LLM, partition, csv, partition_csv

Properties:

In the table below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name	API Name	Default Value	Description
Include Metadata	Include Metadata	true	Whether to include metadata in the output. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Metadata Filename	Metadata Filename		If present, will be included in the metadata as filename. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Metadata Last Modified	Metadata Last Modified		Date-time to include in the metadata as last_modified. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Languages	Languages		Comma-separated list of 3-letter language codes to be used as metadata.languages. If unset, the language is detected via langdetect. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Include Header	Include Header	false	Whether to interpret the first row of the input as a table header. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Infer Table Structure	Infer Table Structure	true	If true, adds a text_as_html field to the metadata of extracted tables. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
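
Example:

The following is a minimal sketch of how the properties above might be translated into keyword arguments for partition_csv. It is not the processor's actual implementation. The helper names parse_bool and build_partition_kwargs are hypothetical, and the keyword names (include_metadata, metadata_filename, metadata_last_modified, languages, include_header, infer_table_structure) are assumed from the property descriptions, so they should be checked against the installed version of unstructured.

```python
def parse_bool(value):
    """Interpret a property value such as "true" or "false" as a boolean."""
    return value.strip().lower() == "true"


def build_partition_kwargs(props):
    """Map property display names to keyword arguments (hypothetical helper).

    `props` is a dict of property name -> string value, as it might look
    after Expression Language evaluation. Defaults mirror the table above.
    """
    kwargs = {
        "include_metadata": parse_bool(props.get("Include Metadata", "true")),
        "include_header": parse_bool(props.get("Include Header", "false")),
        "infer_table_structure": parse_bool(
            props.get("Infer Table Structure", "true")
        ),
    }

    # Optional properties are forwarded only when they are set.
    if props.get("Metadata Filename"):
        kwargs["metadata_filename"] = props["Metadata Filename"]
    if props.get("Metadata Last Modified"):
        kwargs["metadata_last_modified"] = props["Metadata Last Modified"]

    # Languages is a comma-separated list of 3-letter codes; when it is
    # left out, language detection is left to the library.
    codes = [c.strip() for c in props.get("Languages", "").split(",")]
    codes = [c for c in codes if c]
    if codes:
        kwargs["languages"] = codes

    return kwargs


if __name__ == "__main__":
    example = {
        "Include Header": "true",
        "Languages": "eng, deu",
        "Metadata Filename": "data.csv",
    }
    print(build_partition_kwargs(example))
    # Assumed usage with the library installed:
    #   from unstructured.partition.csv import partition_csv
    #   elements = partition_csv(file=csv_stream, **build_partition_kwargs(example))
```

Leaving unset optional properties out of the keyword arguments, rather than passing empty strings, lets the library's own defaults apply.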