SFTP Source properties reference
Review the following reference for a comprehensive list of the connector properties that are specific to the SFTP Source connector.
parameter.[***CONNECTOR NAME***] Parameters:
In addition to the properties listed here, this connector also accepts certain properties of the Kafka Connect framework as well as the properties of the NiFi Stateless Source connector. When creating a new connector using the SMM UI, all valid properties are presented in the default configuration template. You can view the configuration template to get a full list of valid properties. In addition, for more information regarding the accepted properties not listed here, you can review the Apache Kafka documentation and the Stateless NiFi Source properties reference.
CSV Character Set
- Description
- The character set used to read the input CSV files.
This property is ignored if the input is not a CSV file or if record processing is not enabled.
- Default Value
- UTF-8
- Accepted Values
- Required
- true
CSV Escape Character
- Description
- The escape character used in the input CSV files to escape other special characters.
This property is ignored if the input is not a CSV file or if record processing is not enabled.
- Default Value
- \
- Accepted Values
- Required
- true
CSV Quote Character
- Description
- The quote character used in the input CSV files.
This property is ignored if the input is not a CSV file or if record processing is not enabled.
- Default Value
- "
- Accepted Values
- Required
- true
CSV Record Separator
- Description
- The record separator used in the input CSV files.
This property is ignored if the input is not a CSV file or if record processing is not enabled.
- Default Value
- \n
- Accepted Values
- Required
- true
CSV Treat First Line as Header
- Description
- Specifies whether the first line in the input file is handled as a header.
Ignored if the input is not a CSV file or record processing is not enabled.
- Default Value
- false
- Accepted Values
- true, false
- Required
- true
CSV Trim Fields
- Description
- Specifies whether whitespace characters are removed from the beginning and the end of fields.
Ignored if the input is not a CSV file or record processing is not enabled.
- Default Value
- true
- Accepted Values
- true, false
- Required
- true
CSV Value Separator
- Description
- The value separator used in the input CSV files.
Ignored if the input is not a CSV file or record processing is not enabled.
- Default Value
- ,
- Accepted Values
- Required
- true
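The way the CSV properties above work together can be sketched as follows. This is a minimal illustration that uses Python's standard `csv` module as a stand-in; the connector's own CSV parser is not shown in this reference, and the function name `parse_csv` and its parameter names are hypothetical. The CSV Record Separator property is not modeled here because Python's reader detects line endings on its own.

```python
import csv
import io


def parse_csv(raw_bytes, charset="utf-8", value_separator=",", quote_char='"',
              escape_char="\\", first_line_is_header=False, trim_fields=True):
    """Parse CSV bytes into rows; parameters mirror the CSV properties (hypothetical names)."""
    # CSV Character Set: decode the raw file content first.
    text = raw_bytes.decode(charset)
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=value_separator,   # CSV Value Separator
        quotechar=quote_char,        # CSV Quote Character
        escapechar=escape_char,      # CSV Escape Character
    )
    rows = [row for row in reader if row]  # drop empty lines
    if trim_fields:
        # CSV Trim Fields: strip leading and trailing whitespace from each field.
        rows = [[field.strip() for field in row] for row in rows]
    if first_line_is_header and rows:
        # CSV Treat First Line as Header: use the first row as field names.
        header, data = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in data]
    return rows
```

With the defaults, the header line is returned as an ordinary row; with `first_line_is_header=True`, each remaining row becomes a mapping keyed by the header fields.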
Completion Strategy
- Description
- Specifies what to do with the original file on the server once it has been fetched.
If the Completion Strategy fails, a warning is logged but the data is still transferred.
- Default Value
- None
- Accepted Values
- None, Move File, Delete File
- Required
- true
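The failure handling described for Completion Strategy can be sketched as follows. This is an illustration only: `fetch` and `complete` are hypothetical callables standing in for the file transfer and for the move or delete action, and the logger name is made up.

```python
import logging

logger = logging.getLogger("sftp_source_sketch")  # hypothetical logger name


def fetch_then_complete(fetch, complete):
    """Fetch a file, then run the completion action; a failed completion only logs a warning."""
    data = fetch()
    try:
        complete()  # for example, move or delete the original file
    except Exception as exc:
        # The data has already been fetched, so it is still returned.
        logger.warning("Completion strategy failed: %s", exc)
    return data
```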
Date Format
- Description
- Specifies the format used for parsing date fields in the input data.
This property is only used if Input Data Format is set to CSV or JSON.
- Default Value
- yyyy-MM-dd
- Accepted Values
- Required
- true
Enable Record Processing
- Description
- Enables or disables record processing.
If set to true, the Input Data Format is considered and the file is parsed into records. In this case, the Records Per Kafka Message property defines how many records are written into one Kafka message.
If set to false, the entire file is forwarded to Kafka as one message.
- Default Value
- true
- Accepted Values
- true, false
- Required
- true
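The relationship between record processing and the number of Kafka messages can be sketched as follows. This is a minimal sketch under stated assumptions: the function `build_messages` and its parameters are hypothetical, records are represented as plain Python values, and a final partial group is assumed to be sent as a smaller message.

```python
def build_messages(records, raw_file, enable_record_processing=True, records_per_message=1):
    """Return the list of message payloads that would be produced for one file."""
    if not enable_record_processing:
        # Record processing disabled: the entire file becomes one message.
        return [raw_file]
    if records_per_message < 1:
        raise ValueError("records_per_message must be at least 1")
    # Record processing enabled: group the parsed records into fixed-size batches.
    return [
        records[i:i + records_per_message]
        for i in range(0, len(records), records_per_message)
    ]
```

For a file with five records and two records per message, this yields three messages, the last one holding a single record.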
File Filter Regex
- Description
- The Java regular expression to use for filtering filenames. Only files whose names match the regular expression are fetched.
- Default Value
- .*
- Accepted Values
- Required
- true
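The effect of File Filter Regex can be sketched as follows. Python's `re` module is used here only as a stand-in for Java regular expressions, and the sketch assumes that the entire filename must match the expression; the function name is hypothetical.

```python
import re


def filter_filenames(filenames, file_filter_regex=".*"):
    """Keep only the filenames that fully match the given regular expression."""
    pattern = re.compile(file_filter_regex)
    return [name for name in filenames if pattern.fullmatch(name)]
```

For example, the expression `.*\.csv` keeps `a.csv` but rejects `a.csv.bak`, while the default `.*` keeps every file.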
Follow Symlink
- Description
- If set to true, both symbolically linked files and nested symbolically linked subdirectories are pulled. Otherwise, symbolically linked files are not read and symbolically linked subdirectories are not traversed.
- Default Value
- false
- Accepted Values
- true, false
- Required
- true
Grok Expression
- Description
- Specifies the format of a line in Grok format. This allows the connector to understand how to parse each line in the input file. If a line in the file does not match this pattern, the line is handled according to what is set in the Grok No Match Behavior property.
A valid Grok expression must be specified using this property even if Grok format is not used.
- Default Value
- %{GREEDYDATA:message}
- Accepted Values
- Required
- true
Grok No Match Behavior
- Description
- Specifies how to handle lines that do not match the pattern set in the Grok Expression property.
If set to append-to-previous-message, non-matching lines are appended to the last field of the previous message.
If set to skip-line, non-matching lines are skipped.
If set to raw-line, non-matching lines are only added to the _raw field.
- Default Value
- append-to-previous-message
- Accepted Values
- append-to-previous-message, skip-line, raw-line
- Required
- true
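The three Grok No Match Behavior settings can be compared with a small sketch. A plain Python regular expression stands in for a compiled Grok expression, and the pattern, field names, and function name are hypothetical. Joining appended lines with a newline is an assumption, as is representing a raw-line entry as a record that contains only the `_raw` field.

```python
import re

# Stand-in for a Grok expression: "<LEVEL> <message>" (hypothetical pattern).
LINE_PATTERN = re.compile(r"(?P<level>INFO|WARN|ERROR) (?P<message>.*)")


def parse_lines(lines, no_match_behavior="append-to-previous-message"):
    """Parse lines into records, handling non-matching lines per the chosen behavior."""
    records = []
    for line in lines:
        match = LINE_PATTERN.fullmatch(line)
        if match:
            records.append(match.groupdict())
        elif no_match_behavior == "append-to-previous-message":
            if records:
                # Append to the last field ("message") of the previous record.
                records[-1]["message"] += "\n" + line
        elif no_match_behavior == "skip-line":
            continue  # drop the line entirely
        elif no_match_behavior == "raw-line":
            records.append({"_raw": line})  # keep the line only in the _raw field
        else:
            raise ValueError(f"unknown behavior: {no_match_behavior}")
    return records
```

With a stack trace line following an error line, append-to-previous-message keeps the trace attached to the error record, which is often what is wanted for multi-line log entries.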
Host Key File
- Description
- The fully-qualified filename of the host key file.
If supplied, this file is used as the host key.
If a host key is not supplied, but Strict Host Key Checking is set to true, the known_hosts and known_hosts2 files from the ~/.ssh directory are used.
If a host key is not supplied and Strict Host Key Checking is set to false, no host key file is used.
This parameter must either contain the fully-qualified name of a file, or be completely removed from the configuration JSON.
- Default Value
- Accepted Values
- Required
- false
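The selection rules described for Host Key File and Strict Host Key Checking can be summarized in a short sketch. The function name and parameters are hypothetical, and the sketch only models which files are consulted, not the key verification itself.

```python
def host_key_sources(host_key_file=None, strict_host_key_checking=False):
    """Return the host key files that would be consulted for the given settings."""
    if host_key_file:
        # An explicitly supplied host key file takes precedence.
        return [host_key_file]
    if strict_host_key_checking:
        # No file supplied, strict checking on: fall back to the ~/.ssh defaults.
        return ["~/.ssh/known_hosts", "~/.ssh/known_hosts2"]
    # No file supplied, strict checking off: no host key file is used.
    return []
```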
Hostname
- Description
- The hostname or IP address of the remote system.
- Default Value
- localhost
- Accepted Values
- Required
- true
Ignored Dotted Files
- Description
- Specifies whether to ignore files whose names begin with a dot (".").
- Default Value
- true
- Accepted Values
- true, false
- Required
- true
Input Data Format
- Description
- The format in which the input file contains record-oriented data.
If Enable Record Processing is set to false, this setting is ignored.
- Default Value
- JSON
- Accepted Values
- JSON, CSV, GROK
- Required
- true
Kerberos Keytab for Schema Registry
- Description
- The fully-qualified filename of the Kerberos keytab associated with the principal for accessing Schema Registry.
- Default Value
- The location of the default keytab which is empty and can only be used for unsecure connections.
- Accepted Values
- Required
- true
Kerberos Principal for Schema Registry
- Description
- The Kerberos principal used for authenticating to Schema Registry.
- Default Value
- default
- Accepted Values
- Required
- true
Move Destination Directory
- Description
- The fully-qualified name of the directory on the remote server to move the original file to once it is ingested. This property is ignored unless the Completion Strategy property is set to Move File. The specified directory must already exist on the remote system.
This parameter must either contain the fully-qualified name of a directory, or be completely removed from the configuration JSON.
- Default Value
- Accepted Values
- Required
- false
Password
- Description
- The password to use when connecting to the SFTP server.
If the server does not require a password, this property must be completely removed from the configuration JSON.
- Default Value
- Accepted Values
- Required
- false
Path Filter Regex
- Description
- The Java regular expression to use for filtering paths.
If Search Recursively is set to true, only subdirectories whose path matches the given regular expression are scanned.
If Search Recursively is set to false, this property is ignored.
- Default Value
- .*
- Accepted Values
- Required
- true
Port
- Description
- The port that the remote system is listening on for file transfers.
- Default Value
- 22
- Accepted Values
- Required
- true
Private Key File
- Description
- The fully-qualified filename of a private key file.
If no private key is used, this property must be completely removed from the configuration JSON.
- Default Value
- Accepted Values
- Required
- false
Private Key Password
- Description
- The password used to access the private key.
If no private key is used, this property must be completely removed from the configuration JSON.
- Default Value
- Accepted Values
- Required
- false
Record Per Kafka Message
- Description
- Specifies how many records to write into each Kafka message.
If Enable Record Processing is set to false, this setting is ignored.
- Default Value
- 1
- Accepted Values
- Required
- true
Remote Path
- Description
- The path on the remote system from which to pull files.
- Default Value
- .
- Accepted Values
- Required
- true
Schema Access Strategy
- Description
- Specifies the strategy used for determining the schema of the input records if the Enable Record Processing property is set to true.
The value you set here depends on the input data format.
If set to Schema Registry, the schema is read from Schema Registry. This setting works with all input data formats.
If set to Infer Schema, the schema is inferred based on the input file. This setting can only be used if your input data format is either JSON or CSV.
If set to Field Names From Grok Expression, the schema is determined using the field names in the Grok Expression property. This setting can only be used if your input data format is GROK.
Additionally, if record processing is enabled (Enable Record Processing is set to true), this property also affects the contents of the output file.
If record processing is enabled and the access strategy is Schema Registry, schemas are not embedded in the output file.
If record processing is enabled and the access strategy is either Infer Schema or Field Names From Grok Expression, schemas are embedded in the output file.
- Default Value
- Schema Registry
- Accepted Values
- Schema Registry, Infer Schema, Field Names From Grok Expression
- Required
- true
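The compatibility rules between Schema Access Strategy and Input Data Format can be captured in a lookup table. This is a sketch only: the function name is hypothetical, and pairing Field Names From Grok Expression with the GROK format is an assumption based on the property names.

```python
# Input data formats each strategy can be combined with.
COMPATIBLE_FORMATS = {
    "Schema Registry": {"JSON", "CSV", "GROK"},
    "Infer Schema": {"JSON", "CSV"},
    "Field Names From Grok Expression": {"GROK"},  # assumed pairing
}

# Whether the schema is embedded in the output when record processing is enabled.
EMBEDS_SCHEMA = {
    "Schema Registry": False,
    "Infer Schema": True,
    "Field Names From Grok Expression": True,
}


def schema_is_embedded(strategy, input_data_format):
    """Validate the combination and report whether the output embeds the schema."""
    if strategy not in COMPATIBLE_FORMATS:
        raise ValueError(f"unknown strategy: {strategy}")
    if input_data_format not in COMPATIBLE_FORMATS[strategy]:
        raise ValueError(f"{strategy} cannot be used with {input_data_format}")
    return EMBEDS_SCHEMA[strategy]
```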
Schema Branch
- Description
- The name of the branch to use when looking up the schema in Schema Registry.
Schema Branch and Schema Version cannot be specified at the same time. If one is specified, the other needs to be removed from the configuration. If Schema Registry is not used, this property must be completely removed from the configuration.
- Default Value
- Accepted Values
- Required
- false
Schema Name
- Description
- The schema name to look up in Schema Registry.
If the Schema Access Strategy property is set to Schema Registry, this property must contain a valid schema name.
If Schema Registry is not used, this property must be completely removed from the configuration JSON.
- Default Value
- Accepted Values
- Required
- false
Schema Registry URL
- Description
- The URL of the Schema Registry server.
If Schema Registry is not used, use the default value.
- Default Value
- http://localhost:7788/api/v1
- Accepted Values
- Required
- true
Schema Version
- Description
- The version of the schema to look up in Schema Registry. If Schema Registry is used and a schema version is not specified, the latest version of the schema is retrieved.
Schema Branch and Schema Version cannot be specified at the same time. If one is specified, the other needs to be removed from the configuration. If Schema Registry is not used, this property must be completely removed from the configuration.
- Default Value
- Accepted Values
- Required
- true
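The constraints on Schema Name, Schema Branch, and Schema Version can be checked before deploying a configuration. The sketch below is an assumption-laden illustration: it represents the configuration as a Python dictionary keyed by the property display names, which is not necessarily how keys appear in the actual configuration JSON, and the function name is hypothetical.

```python
def validate_schema_lookup(config):
    """Return a list of problems found in the schema lookup settings."""
    errors = []
    strategy = config.get("Schema Access Strategy", "Schema Registry")
    uses_registry = strategy == "Schema Registry"

    # Schema Branch and Schema Version are mutually exclusive.
    if "Schema Branch" in config and "Schema Version" in config:
        errors.append("Schema Branch and Schema Version cannot both be set")

    if uses_registry:
        # A schema name is needed to look anything up in Schema Registry.
        if not config.get("Schema Name"):
            errors.append("Schema Name must contain a valid schema name")
    else:
        # Without Schema Registry, the lookup properties must be removed.
        for key in ("Schema Name", "Schema Branch", "Schema Version"):
            if key in config:
                errors.append(f"{key} must be removed when Schema Registry is not used")
    return errors
```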
Search Recursively
- Description
- Specifies whether to pull files from arbitrarily nested subdirectories. Subdirectories are not traversed if set to false.
- Default Value
- false
- Accepted Values
- true, false
- Required
- true
Strict Host Key Checking
- Description
- Specifies whether strict enforcement of host keys is applied.
- Default Value
- false
- Accepted Values
- true, false
- Required
- true
Time Format
- Description
- Specifies the format used for parsing time fields in the input data. This property is only used if Input Data Format is set to CSV or JSON.
- Default Value
- HH:mm:ss
- Accepted Values
- Required
- true
Timestamp Format
- Description
- Specifies the format used for parsing timestamp fields in the input data. This property is only used if Input Data Format is set to CSV or JSON.
- Default Value
- yyyy-MM-dd HH:mm:ss.SSS
- Accepted Values
- Required
- true
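The default values of the Date Format, Time Format, and Timestamp Format properties appear to use Java-style date pattern letters; this reference does not say so explicitly, so treat that as an assumption. The sketch below shows what input each default would accept by mapping it to a rough Python `strptime` equivalent. The mapping table and the function name are hypothetical and cover only the three defaults.

```python
from datetime import datetime

# Rough strptime equivalents of the default patterns (assumed mapping).
PATTERN_TO_STRPTIME = {
    "yyyy-MM-dd": "%Y-%m-%d",
    "HH:mm:ss": "%H:%M:%S",
    "yyyy-MM-dd HH:mm:ss.SSS": "%Y-%m-%d %H:%M:%S.%f",
}


def parse_with_pattern(value, pattern):
    """Parse a field value using the strptime equivalent of one of the default patterns."""
    return datetime.strptime(value, PATTERN_TO_STRPTIME[pattern])
```

A value such as `2024-01-15 10:30:45.123` matches the default timestamp pattern, while `15/01/2024` does not match the default date pattern and would need a different Date Format setting.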
Truststore Filename for Schema Registry
- Description
- The fully-qualified filename of a truststore. This truststore is used to establish a secure connection with Schema Registry using TLS.
- Default Value
- The location of the default truststore which is empty and can only be used for unsecure connections.
- Accepted Values
- Required
- true
Truststore Password for Schema Registry
- Description
- The password used to access the contents of the truststore configured in the Truststore Filename for Schema Registry property.
- Default Value
- password
- Accepted Values
- Required
- true
Truststore Type for Schema Registry
- Description
- The type of the truststore configured in the Truststore Filename for Schema Registry property.
- Default Value
- Accepted Values
- BCFKS, PKCS12, JKS
- Required
- true
Username
- Description
- The username to use when connecting to the SFTP server.
- Default Value
- Accepted Values
- Required
- true