This property expects a comma-separated list of file resources.
Supports Expression Language: true (will be evaluated using variable registry only) |
Kerberos Credentials Service | kerberos-credentials-service | | Controller Service API: KerberosCredentialsService Implementation: KeytabCredentialsService | Specifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberos |
Kerberos User Service | kerberos-user-service | | Controller Service API: KerberosUserService Implementations: KerberosTicketCacheUserService KerberosKeytabUserService KerberosPasswordUserService | Specifies the Kerberos User Controller Service that should be used for authenticating with Kerberos |
Kerberos Principal | Kerberos Principal | | | Kerberos principal to authenticate as. Requires nifi.kerberos.krb5.file to be set in your nifi.properties. Supports Expression Language: true (will be evaluated using variable registry only) |
Kerberos Keytab | Kerberos Keytab | | | Kerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.properties.
This property requires exactly one file to be provided.
Supports Expression Language: true (will be evaluated using variable registry only) |
Kerberos Password | Kerberos Password | | | Kerberos password associated with the principal. Sensitive Property: true |
Kerberos Relogin Period | Kerberos Relogin Period | 4 hours | | Period of time which should pass before attempting a Kerberos relogin.
This property has been deprecated, and has no effect on processing. Relogins now occur automatically. Supports Expression Language: true (will be evaluated using variable registry only) |
Additional Classpath Resources | Additional Classpath Resources | | | A comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.
This property expects a comma-separated list of resources. Each of the resources may be of any of the following types: directory, file.
|
Record Reader | record-reader | | Controller Service API: RecordReaderFactory Implementations: EBCDICRecordReader JsonTreeReader GrokReader ReaderLookup IPFIXReader WindowsEventLogReader ParquetReader CSVReader Syslog5424Reader JASN1Reader ExcelReader CiscoEmblemSyslogMessageReader ScriptedReader ProtobufReader JsonPathReader XMLReader CEFReader SyslogReader AvroReader YamlTreeReader | The service for reading records from incoming flow files. |
Directory | Directory | | | The parent directory to which files should be written. Will be created if it doesn't exist. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Compression Type | compression-type | NONE | | The type of compression for the file being written. |
Overwrite Files | overwrite | false | | Whether or not to overwrite existing files in the same directory with the same name. When set to false, flow files will be routed to failure when a file exists in the same directory with the same name. |
Permissions umask | permissions-umask | | | A umask represented as an octal number which determines the permissions of files written to HDFS. This overrides the Hadoop configuration property dfs.umaskmode. |
Remote Group | remote-group | | | Changes the group of the HDFS file to this value after it is written. This only works if NiFi is running as a user that has the HDFS superuser privilege to change the group. |
Remote Owner | remote-owner | | | Changes the owner of the HDFS file to this value after it is written. This only works if NiFi is running as a user that has the HDFS superuser privilege to change the owner. |
ORC Configuration Resources | putorc-config-resources | | | A file or comma-separated list of files which contain the ORC configuration (e.g., hive-site.xml). Without this, Hadoop will search the classpath for a 'hive-site.xml' file or will revert to a default configuration. Please see the ORC documentation for more details.
This property expects a comma-separated list of file resources.
|
Stripe Size | putorc-stripe-size | 64 MB | | The size of the memory buffer (in bytes) for writing stripes to an ORC file |
Buffer Size | putorc-buffer-size | 10 KB | | The maximum size of the memory buffers (in bytes) used for compressing and storing a stripe in memory. This is a hint to the ORC writer, which may choose to use a smaller buffer size based on stripe size and number of columns for efficient stripe writing and memory utilization. |
Hive Table Name | putorc-hive-table-name | | | An optional table name to insert into the hive.ddl attribute. The generated DDL can be used by a PutClouderaHiveQL processor (presumably after a PutHDFS processor) to create a table backed by the converted ORC file. If this property is not provided, the full name (including namespace) of the incoming Avro record will be normalized and used as the table name. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Normalize Field Names for Hive | putorc-hive-field-names | true | | Whether to normalize field names for Hive (e.g., force lowercase). If the ORC file is going to be part of a Hive table, this property should be set to true. To preserve the original field names from the schema, this property should be set to false. |
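
The effect of the Permissions umask property can be illustrated with a little octal arithmetic. The sketch below is not the processor's code; it assumes the conventional base mode of 0666 for newly created files (0777 for directories), and the function name `apply_umask` is hypothetical.

```python
def apply_umask(umask_octal: str, base_mode: int = 0o666) -> str:
    """Return the permission bits, as an octal string, left after masking.

    base_mode is an assumption: 0o666 is the usual starting mode for new
    files and 0o777 for new directories.
    """
    umask = int(umask_octal, 8)  # parse the property value as octal
    if not 0 <= umask <= 0o777:
        raise ValueError(f"umask out of range: {umask_octal}")
    # Every bit set in the umask is cleared from the base mode.
    return format(base_mode & ~umask & 0o777, "03o")


print(apply_umask("022"))                   # 644
print(apply_umask("077"))                   # 600
print(apply_umask("027", base_mode=0o777))  # 750
```

Under these assumptions, a umask of 022 would leave written files readable by everyone but writable only by the owner.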
Relationships:
Name | Description |
---|---|
retry | Flow Files that could not be processed due to issues that can be retried are transferred to this relationship |
success | Flow Files that have been successfully processed are transferred to this relationship |
failure | Flow Files that could not be processed due to issues that cannot be retried are transferred to this relationship |
Reads Attributes:
Name | Description |
---|---|
filename | The name of the file to write comes from the value of this attribute. |
Writes Attributes:
Name | Description |
---|---|
filename | The name of the file is stored in this attribute. |
absolute.hdfs.path | The absolute path to the file is stored in this attribute. |
hadoop.file.url | The Hadoop URL for the file is stored in this attribute. |
record.count | The number of records written to the ORC file |
hive.ddl | Creates a partial Hive DDL statement for creating an external table in Hive from the destination folder. This can be used in ReplaceText for setting the content to the DDL. To make it valid DDL, add "LOCATION '<path_to_orc_file_in_hdfs>'", where the path is the directory that contains this ORC file on HDFS. For example, this processor can send flow files downstream to ReplaceText to set the content to this DDL (plus the LOCATION clause as described), then to a PutHiveQL processor to create the table if it doesn't exist. |
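
The step of completing the hive.ddl value with a LOCATION clause can be sketched as follows. This is only an illustration of the string that needs to be produced, not how ReplaceText performs the substitution. The sample DDL text, the table name `users`, and the directory `/data/orc/users` are assumptions made for the example, and `add_location` is a hypothetical helper.

```python
def add_location(partial_ddl: str, hdfs_directory: str) -> str:
    """Append a LOCATION clause to a partial CREATE EXTERNAL TABLE statement.

    hdfs_directory must be the directory that contains the ORC file,
    not the path of the file itself.
    """
    if "'" in hdfs_directory:
        # Keep the sketch simple: refuse paths that would need escaping.
        raise ValueError("directory must not contain single quotes")
    return f"{partial_ddl.rstrip()} LOCATION '{hdfs_directory}'"


# Assumed shape of a hive.ddl value; the real attribute may differ.
partial = "CREATE EXTERNAL TABLE IF NOT EXISTS users (id INT, name STRING) STORED AS ORC"
print(add_location(partial, "/data/orc/users"))
```

In a flow, the equivalent result would typically be built with Expression Language from the hive.ddl and absolute.hdfs.path attributes rather than with external code.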
State management:
This component does not store state.
Restricted:
Required Permission | Explanation |
---|---|
write distributed filesystem | Provides operator the ability to write to any file that NiFi has access to in HDFS or the local filesystem. |
Input requirement:
This component requires an incoming relationship.
System Resource Considerations:
None specified.