SFTP Source properties reference

Review the following reference for a comprehensive list of the connector properties that are specific to the SFTP Source connector.

The properties listed in this reference must be added to the connector configuration with the following prefix:
parameter.[***CONNECTOR NAME***] Parameters:
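
As a minimal sketch of how the prefix combines with the property names in this reference, the following Python helper builds prefixed configuration keys. The connector name "SFTP Source" and all property values are placeholders chosen for illustration, and the helper name with_prefix is hypothetical.

```python
import json


def with_prefix(connector_name, properties):
    """Return a copy of `properties` whose keys carry the documented prefix."""
    prefix = f"parameter.{connector_name} Parameters:"
    return {prefix + name: value for name, value in properties.items()}


# "SFTP Source" and the values below are placeholders, not values from this reference.
config = with_prefix("SFTP Source", {
    "Hostname": "sftp.example.com",
    "Port": "22",
    "Remote Path": "/data/incoming",
})
print(json.dumps(config, indent=2))
```

The printed JSON shows the key shape only. A complete connector configuration also contains the Kafka Connect and Stateless NiFi Source properties mentioned in this reference.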

In addition to the properties listed here, this connector also accepts certain properties of the Kafka Connect framework as well as the properties of the NiFi Stateless Source connector. When creating a new connector using the SMM UI, all valid properties are presented in the default configuration template. You can view the configuration template to get a full list of valid properties. In addition, for more information regarding the accepted properties not listed here, you can review the Apache Kafka documentation and the Stateless NiFi Source properties reference.

CSV Character Set

Description
The character set used to read the input CSV files.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
UTF-8
Accepted Values
Required
true

CSV Escape Character

Description
The escape character used in the input CSV files to escape other special characters.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
\
Accepted Values
Required
true

CSV Quote Character

Description
The quote character used in the input CSV files.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
"
Accepted Values
Required
true

CSV Record Separator

Description
The record separator used in the input CSV files.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
\n
Accepted Values
Required
true

CSV Treat First Line as Header

Description
Specifies whether the first line in the input file is handled as a header.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
false
Accepted Values
true, false
Required
true

CSV Trim Fields

Description
Specifies whether whitespace characters are removed from the beginning and the end of fields.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
true
Accepted Values
true, false
Required
true

CSV Value Separator

Description
The value separator used in the input CSV files.

This property is ignored if the input is not a CSV file or if record processing is not enabled.

Default Value
,
Accepted Values
Required
true
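
To illustrate how the CSV properties interact, the following sketch maps them onto Python's csv module. It is not the connector's parser. The function name parse_csv and the sample data are hypothetical, and the character set and record separator are not modeled because csv.reader works on already decoded text and recognizes line endings on its own.

```python
import csv
import io


def parse_csv(text, value_separator=",", quote_char='"', escape_char="\\",
              first_line_is_header=False, trim_fields=True):
    """Parse CSV text into (header, rows); header is None when the first line is data."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=value_separator,   # CSV Value Separator
        quotechar=quote_char,        # CSV Quote Character
        escapechar=escape_char,      # CSV Escape Character
    )
    # CSV Trim Fields: strip leading and trailing whitespace from every field.
    rows = [[field.strip() if trim_fields else field for field in row]
            for row in reader]
    # CSV Treat First Line as Header: split the first row off as the header.
    if first_line_is_header and rows:
        return rows[0], rows[1:]
    return None, rows


sample = 'id,name\n1,"Smith, Jane"\n2,  Bob  \n'
header, rows = parse_csv(sample, first_line_is_header=True)
```

The keyword defaults mirror the default values listed in this reference. With the header option left at false, the first line is returned as an ordinary data row.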

Completion Strategy

Description
Specifies what to do with the original file on the server once it has been fetched.

If the Completion Strategy fails, a warning is logged but the data is still transferred.

Default Value
None
Accepted Values
None, Move File, Delete File
Required
true

Date Format

Description
Specifies the format used for parsing date fields in the input data.

This property is only used if Input Data Format is set to CSV or JSON.

Default Value
yyyy-MM-dd
Accepted Values
Required
true

Enable Record Processing

Description
Enables or disables record processing.

If set to true, the Input Data Format is considered and the file gets parsed into records. In this case the Records Per Kafka Message property defines how many records are written into one Kafka message.

If set to false, the entire file gets forwarded to Kafka as one message.

Default Value
true
Accepted Values
true, false
Required
true
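
The effect of this property on the number of Kafka messages can be sketched as follows. The function name to_kafka_messages is hypothetical, the way records are serialized into a message is not modeled, and emitting a final, smaller message for leftover records is an assumption.

```python
def to_kafka_messages(file_bytes, records, enable_record_processing=True,
                      records_per_message=1):
    """Return the list of message payloads produced for one fetched file."""
    if not enable_record_processing:
        # The entire file is forwarded as a single message.
        return [file_bytes]
    # Group the parsed records into chunks of `records_per_message`.
    # Assumption: leftover records form a final, smaller message.
    return [records[i:i + records_per_message]
            for i in range(0, len(records), records_per_message)]


messages = to_kafka_messages(b"raw file", ["r1", "r2", "r3"],
                             records_per_message=2)
```

In this sketch, three records with two records per message yield two messages, while disabling record processing always yields exactly one message per file.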

File Filter Regex

Description
The Java regular expression to use for filtering filenames. Only files whose names match the regular expression are fetched.
Default Value
.*
Accepted Values
Required
true
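
A minimal sketch of filename filtering follows. It assumes that the entire filename must match the expression, and it uses Python's re module, whose syntax differs from Java regular expressions in some details. The simple pattern shown here behaves the same in both. The function name and filenames are hypothetical.

```python
import re


def select_files(filenames, file_filter_regex=".*"):
    """Keep only the filenames that match the regular expression in full."""
    pattern = re.compile(file_filter_regex)
    return [name for name in filenames if pattern.fullmatch(name)]


names = ["orders-2024.csv", "orders-2024.csv.tmp", "readme.txt"]
selected = select_files(names, r"orders-\d{4}\.csv")
```

Under the full-match assumption, the temporary file orders-2024.csv.tmp is excluded because the pattern does not cover the .tmp suffix. The default value of .* selects every file.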

Follow Symlink

Description
If set to true, symbolic links to files are read and symbolic links to subdirectories are traversed. Otherwise, symbolic links to files are not read and symbolic links to subdirectories are not traversed.
Default Value
false
Accepted Values
true, false
Required
true

Grok Expression

Description
Specifies the format of a line in Grok format. This allows the connector to understand how to parse each line in the input file. If a line in the file does not match this pattern, the line is handled according to what is set in the Grok No Match Behavior property.

A valid Grok expression must be specified using this property even if Grok format is not used.

Default Value
%{GREEDYDATA:message}
Accepted Values
Required
true

Grok No Match Behavior

Description
Specifies how to handle lines that do not match the pattern set in the Grok Expression property.

If set to append-to-previous-message, non-matching lines are appended to the last field of the previous message.

If set to skip-line, non-matching lines are skipped.

If set to raw-line, non-matching lines are only added to the _raw field.

Default Value
append-to-previous-message
Accepted Values
append-to-previous-message, skip-line, raw-line
Required
true
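
The three behaviors can be sketched as follows. A Python regular expression stands in for the Grok expression, so this is an illustration rather than the connector's implementation. The pattern, the field names level and message, the newline used when appending, and the handling of a non-matching first line are all assumptions.

```python
import re

# Hypothetical stand-in for a Grok expression: a log level followed by a message.
LINE_PATTERN = re.compile(r"(?P<level>INFO|WARN|ERROR) (?P<message>.*)")
LAST_FIELD = "message"  # the last field of the stand-in pattern


def parse_lines(lines, no_match_behavior="append-to-previous-message"):
    """Turn lines into records, handling non-matching lines as configured."""
    records = []
    for line in lines:
        match = LINE_PATTERN.fullmatch(line)
        if match:
            records.append(match.groupdict())
        elif no_match_behavior == "append-to-previous-message":
            # Assumption: a non-matching line with no previous record is dropped.
            if records:
                records[-1][LAST_FIELD] += "\n" + line
        elif no_match_behavior == "raw-line":
            # The line is kept, but only in the _raw field.
            records.append({"_raw": line})
        elif no_match_behavior != "skip-line":
            raise ValueError(f"unknown behavior: {no_match_behavior}")
    return records


log = ["ERROR boom", "  at frame 1", "INFO ok"]
appended = parse_lines(log)
```

With the default behavior, the indented line becomes part of the message field of the preceding ERROR record, which is convenient for multi-line entries such as stack traces.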

Host Key File

Description
The fully-qualified filename of the host key file.

If supplied, this file is used as the host key.

If a host key is not supplied, but Strict Host Key Checking is set to true, the known_hosts and known_hosts2 files from the ~/.ssh directory are used.

If a host key is not supplied and Strict Host Key Checking is set to false, no host key file is used.

This parameter must either contain the fully-qualified name of a file, or be completely removed from the configuration JSON.

Default Value
Accepted Values
Required
false
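
The selection rules above can be summarized in a short sketch. The function name host_key_sources is hypothetical, and the function only reports which files would be consulted. It does not open them.

```python
def host_key_sources(host_key_file=None, strict_host_key_checking=False):
    """Return the host key files that are used for the given settings."""
    if host_key_file:
        # A supplied host key file always takes precedence.
        return [host_key_file]
    if strict_host_key_checking:
        # Fall back to the known hosts files in the ~/.ssh directory.
        return ["~/.ssh/known_hosts", "~/.ssh/known_hosts2"]
    # No host key file and no strict checking: no host key file is used.
    return []


sources = host_key_sources(strict_host_key_checking=True)
```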

Hostname

Description
The hostname or IP address of the remote system.
Default Value
localhost
Accepted Values
Required
true

Ignored Dotted Files

Description
Specifies whether to ignore files whose names begin with a dot (".").
Default Value
true
Accepted Values
true, false
Required
true

Input Data Format

Description
The format in which the input file contains record-oriented data.

If Enable Record Processing is set to false, this setting is ignored.

Default Value
JSON
Accepted Values
JSON, CSV, GROK
Required
true

Kerberos Keytab for Schema Registry

Description
The fully-qualified filename of the Kerberos keytab associated with the principal for accessing Schema Registry.
Default Value
The location of the default keytab, which is empty and can only be used for unsecured connections.
Accepted Values
Required
true

Kerberos Principal for Schema Registry

Description
The Kerberos principal used for authenticating to Schema Registry.
Default Value
default
Accepted Values
Required
true

Move Destination Directory

Description
The fully-qualified name of the directory on the remote server to move the original file to once it is ingested. This property is ignored unless the Completion Strategy property is set to Move File. The specified directory must already exist on the remote system.

This parameter must either contain the fully-qualified name of a directory, or be completely removed from the configuration JSON.

Default Value
Accepted Values
Required
false

Password

Description
The password to use when connecting to the SFTP server.

If the server does not require a password, this property must be completely removed from the configuration JSON.

Default Value
Accepted Values
Required
false

Path Filter Regex

Description
The Java regular expression to use for filtering paths.

If Search Recursively is set to true, only subdirectories whose path matches the given regular expression are scanned.

If Search Recursively is set to false, this property is ignored.

Default Value
.*
Accepted Values
Required
true
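
The interaction between this property and Search Recursively can be sketched as follows. The sketch assumes that the directory set in Remote Path is always scanned and that the expression must match the full subdirectory path. Neither detail is stated in this reference. The function name and the paths are hypothetical, and Python's re module stands in for Java regular expressions.

```python
import re


def directories_to_scan(remote_path, subdirectories, search_recursively=False,
                        path_filter_regex=".*"):
    """Return the directories that are scanned for files."""
    if not search_recursively:
        # The path filter is ignored and subdirectories are not traversed.
        return [remote_path]
    pattern = re.compile(path_filter_regex)
    # Assumption: the expression is matched against the full subdirectory path.
    return [remote_path] + [d for d in subdirectories if pattern.fullmatch(d)]


subdirs = ["/data/in", "/data/in/2024", "/data/archive"]
scanned = directories_to_scan("/data", subdirs, True, r"/data/in(/.*)?")
```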

Port

Description
The port that the remote system is listening on for file transfers.
Default Value
22
Accepted Values
Required
true

Private Key File

Description
The fully-qualified filename of a private key file.

If no private key is used, this property must be completely removed from the configuration JSON.

Default Value
Accepted Values
Required
false

Private Key Password

Description
The password used to access the private key.

If no private key is used, this property must be completely removed from the configuration JSON.

Default Value
Accepted Values
Required
false
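
Because unused optional properties must be removed rather than left empty, a configuration can be cleaned before it is submitted. The following is a minimal sketch with a hypothetical helper name. Property names are shown without the configuration prefix for brevity.

```python
# Optional properties that this reference says must be removed when unused.
OPTIONAL_PROPERTIES = (
    "Host Key File",
    "Move Destination Directory",
    "Password",
    "Private Key File",
    "Private Key Password",
)


def drop_unset(properties):
    """Return a copy of `properties` without empty optional entries."""
    return {
        name: value
        for name, value in properties.items()
        if not (name in OPTIONAL_PROPERTIES and value in (None, ""))
    }


cleaned = drop_unset({
    "Hostname": "sftp.example.com",   # placeholder value
    "Password": "",                   # unused, so the key is removed
    "Private Key File": "/etc/keys/id_rsa",
    "Private Key Password": None,     # unused, so the key is removed
})
```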

Record Per Kafka Message

Description
Specifies how many records to write into each Kafka message.

If Enable Record Processing is set to false, this setting is ignored.

Default Value
1
Accepted Values
Required
true

Remote Path

Description
The path on the remote system from which to pull files.
Default Value
.
Accepted Values
Required
true

Schema Access Strategy

Description
Specifies the strategy used for determining the schema of the input records if the Enable Record Processing property is set to true.

The value you set here depends on the input data format.

If set to Schema Registry, the schema is read from Schema Registry.

This setting works with all input data formats.

If set to Infer Schema, the schema is inferred based on the input file. This setting can only be used if your input data format is either JSON or CSV.

If set to Field Names From Grok Expression, the schema is determined using the field names in the Grok Expression property. This setting can only be used if your input data format is GROK.

Additionally, if record processing is enabled (Enable Record Processing is set to true), this property also affects the contents of the output file.

If record processing is enabled and the access strategy is Schema Registry, schemas are not embedded in the output file.

If record processing is enabled and the access strategy is either Infer Schema or Field Names From Grok Expression, schemas are embedded in the output file.

Default Value
Schema Registry
Accepted Values
Schema Registry, Infer Schema, Field Names From Grok Expression
Required
true
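
The compatibility rules and the schema embedding rule can be expressed as a small check. This sketch assumes that Field Names From Grok Expression applies only to the GROK input data format and that record processing is enabled. The function name is hypothetical.

```python
# Input data formats that each schema access strategy can be used with.
ALLOWED_FORMATS = {
    "Schema Registry": {"JSON", "CSV", "GROK"},
    "Infer Schema": {"JSON", "CSV"},
    "Field Names From Grok Expression": {"GROK"},  # assumption: GROK only
}


def embeds_schema(input_data_format, schema_access_strategy):
    """Validate the combination and report whether the output embeds the schema."""
    allowed = ALLOWED_FORMATS[schema_access_strategy]
    if input_data_format not in allowed:
        raise ValueError(
            f"{schema_access_strategy} cannot be used with {input_data_format}"
        )
    # Schemas are embedded for every strategy except Schema Registry.
    return schema_access_strategy != "Schema Registry"


csv_inferred = embeds_schema("CSV", "Infer Schema")
```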

Schema Branch

Description
The name of the branch to use when looking up the schema in Schema Registry.

Schema Branch and Schema Version cannot be specified at the same time. If one is specified, the other needs to be removed from the configuration.

If Schema Registry is not used, this property must be completely removed from the configuration.

Default Value
Accepted Values
Required
false

Schema Name

Description
The schema name to look up in Schema Registry.

If the Schema Access Strategy property is set to Schema Registry, this property must contain a valid schema name.

If Schema Registry is not used, this property must be completely removed from the configuration JSON.

Default Value
Accepted Values
Required
false

Schema Registry URL

Description
The URL of the Schema Registry server.

If Schema Registry is not used, use the default value.

Default Value
http://localhost:7788/api/v1
Accepted Values
Required
true

Schema Version

Description
The version of the schema to look up in Schema Registry. If Schema Registry is used and a schema version is not specified, the latest version of the schema is retrieved.

Schema Branch and Schema Version cannot be specified at the same time. If one is specified, the other needs to be removed from the configuration.

If Schema Registry is not used, this property must be completely removed from the configuration.

Default Value
Accepted Values
Required
false
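
The constraints on Schema Name, Schema Branch, and Schema Version can be checked before a configuration is submitted. The following is a minimal sketch that treats Schema Registry as used whenever Schema Access Strategy is set to Schema Registry, which is an assumption. The function name is hypothetical and property names are shown without the configuration prefix.

```python
def check_schema_properties(properties):
    """Return a list of problems found in the schema-related properties."""
    problems = []
    # Schema Registry is the documented default access strategy.
    strategy = properties.get("Schema Access Strategy", "Schema Registry")
    uses_registry = strategy == "Schema Registry"

    if "Schema Branch" in properties and "Schema Version" in properties:
        problems.append("Schema Branch and Schema Version are both specified")
    if uses_registry and not properties.get("Schema Name"):
        problems.append("Schema Name must contain a valid schema name")
    if not uses_registry:
        for name in ("Schema Name", "Schema Branch", "Schema Version"):
            if name in properties:
                problems.append(f"{name} must be removed")
    return problems


problems = check_schema_properties({
    "Schema Name": "orders",   # placeholder value
    "Schema Branch": "MASTER", # placeholder value
    "Schema Version": "3",     # conflicts with Schema Branch
})
```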

Search Recursively

Description
Specifies whether to pull files from arbitrarily nested subdirectories. Subdirectories are not traversed if set to false.
Default Value
false
Accepted Values
true, false
Required
true

Strict Host Key Checking

Description
Specifies whether strict enforcement of host keys is applied.
Default Value
false
Accepted Values
true, false
Required
true

Time Format

Description
Specifies the format used for parsing time fields in the input data. This property is only used if Input Data Format is set to CSV or JSON.
Default Value
HH:mm:ss
Accepted Values
Required
true

Timestamp Format

Description
Specifies the format used for parsing timestamp fields in the input data. This property is only used if Input Data Format is set to CSV or JSON.
Default Value
yyyy-MM-dd HH:mm:ss.SSS
Accepted Values
Required
true
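
The default values of Date Format, Time Format, and Timestamp Format appear to use Java-style date pattern letters, which is an assumption based on their shape. To show what input each default accepts, the following sketch pairs each default pattern with a hand-translated Python strptime format. The mapping covers only these three defaults and is not a general converter.

```python
from datetime import datetime

# Hand-translated strptime equivalents of the three documented default patterns.
STRPTIME_EQUIVALENTS = {
    "yyyy-MM-dd": "%Y-%m-%d",                           # Date Format
    "HH:mm:ss": "%H:%M:%S",                             # Time Format
    "yyyy-MM-dd HH:mm:ss.SSS": "%Y-%m-%d %H:%M:%S.%f",  # Timestamp Format
}


def parse_with_default(pattern, text):
    """Parse `text` using the strptime equivalent of a default pattern."""
    return datetime.strptime(text, STRPTIME_EQUIVALENTS[pattern])


timestamp = parse_with_default("yyyy-MM-dd HH:mm:ss.SSS",
                               "2024-03-05 13:45:10.250")
```

A value such as 2024-03-05 13:45:10.250 matches the default timestamp pattern, while a value with a T separator or a time zone suffix would need a different pattern.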

Truststore Filename for Schema Registry

Description
The fully-qualified filename of a truststore. This truststore is used to establish a secure connection with Schema Registry using TLS.
Default Value
The location of the default truststore, which is empty and can only be used for unsecured connections.
Accepted Values
Required
true

Truststore Password for Schema Registry

Description
The password used to access the contents of the truststore configured in the Truststore Filename for Schema Registry property.
Default Value
password
Accepted Values
Required
true

Truststore Type for Schema Registry

Description
The type of the truststore configured in the Truststore Filename for Schema Registry property.
Default Value
Accepted Values
BCFKS, PKCS12, JKS
Required
true

Username

Description
The username to use when connecting to the SFTP server.
Default Value
Accepted Values
Required
true