This property expects a comma-separated list of file resources.
Supports Expression Language: true (will be evaluated using Environment variables only) |
Kerberos Credentials Service | kerberos-credentials-service | | Controller Service API: KerberosCredentialsService Implementation: KeytabCredentialsService | Specifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberos |
Kerberos User Service | kerberos-user-service | | Controller Service API: KerberosUserService Implementations: KerberosTicketCacheUserService KerberosPasswordUserService KerberosKeytabUserService | Specifies the Kerberos User Controller Service that should be used for authenticating with Kerberos |
Kerberos Principal | Kerberos Principal | | | Kerberos principal to authenticate as. Requires nifi.kerberos.krb5.file to be set in your nifi.properties. Supports Expression Language: true (will be evaluated using Environment variables only) |
Kerberos Keytab | Kerberos Keytab | | | Kerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.properties.
This property requires exactly one file to be provided.
Supports Expression Language: true (will be evaluated using Environment variables only) |
Kerberos Password | Kerberos Password | | | Kerberos password associated with the principal. Sensitive Property: true |
Kerberos Relogin Period | Kerberos Relogin Period | 4 hours | | Period of time that should pass before attempting a Kerberos relogin.
This property has been deprecated, and has no effect on processing. Relogins now occur automatically. Supports Expression Language: true (will be evaluated using Environment variables only) |
Additional Classpath Resources | Additional Classpath Resources | | | A comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.
This property expects a comma-separated list of resources. Each of the resources may be of any of the following types: directory, file.
|
Full path | gethdfsfileinfo-full-path | | | A directory to start listing from, or a file's full path. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables) |
Recurse Subdirectories | gethdfsfileinfo-recurse-subdirs | true | | Indicates whether to list files from subdirectories of the HDFS directory |
Directory Filter | gethdfsfileinfo-dir-filter | | | Regex. Only directories whose names match the given regular expression will be picked up. If not provided, no filter is applied (performance considerations). Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables) |
File Filter | gethdfsfileinfo-file-filter | | | Regex. Only files whose names match the given regular expression will be picked up. If not provided, no filter is applied (performance considerations). Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables) |
Exclude Files | gethdfsfileinfo-file-exclude-filter | | | Regex. Files whose names match the given regular expression will not be picked up. If not provided, no filter is applied (performance considerations). Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables) |
Ignore Dotted Directories | gethdfsfileinfo-ignore-dotted-dirs | true | | If true, directories whose names begin with a dot (".") will be ignored |
Ignore Dotted Files | gethdfsfileinfo-ignore-dotted-files | true | | If true, files whose names begin with a dot (".") will be ignored |
Group Results | gethdfsfileinfo-group | All | - All
- Parent Directory
- None
| Groups HDFS objects |
Batch Size | gethdfsfileinfo-batch-size | | | Number of records to put into an output flowfile when 'Destination' is set to 'Content' and 'Group Results' is set to 'None' |
Destination | gethdfsfileinfo-destination | Content | - Attributes
- Content
| Sets the destination for the results. When set to 'Content', the attributes of the flowfile won't be used for storing results. |
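
The way the Directory Filter, File Filter, Exclude Files, and the two "Ignore Dotted" properties combine can be sketched as below. This is an illustration only, not the processor's code: the function name `is_selected`, the evaluation order, and the use of a full-name match are assumptions, and Python's `re` syntax differs from Java's regex syntax in some details.

```python
import re


def is_selected(name, is_dir, dir_filter=None, file_filter=None,
                exclude_filter=None, ignore_dotted_dirs=True,
                ignore_dotted_files=True):
    """Return True if an object name passes the configured filters.

    Hypothetical helper: a filter left as None means "no filter applied",
    mirroring the "If not provided" wording in the property table.
    """
    if is_dir:
        if ignore_dotted_dirs and name.startswith("."):
            return False
        # Only the Directory Filter applies to directories.
        return dir_filter is None or re.fullmatch(dir_filter, name) is not None

    if ignore_dotted_files and name.startswith("."):
        return False
    # File Filter is an include rule: the name must match it.
    if file_filter is not None and re.fullmatch(file_filter, name) is None:
        return False
    # Exclude Files is a reject rule: a match drops the file.
    if exclude_filter is not None and re.fullmatch(exclude_filter, name) is not None:
        return False
    return True


if __name__ == "__main__":
    for candidate in ["data.csv", "data.tmp", ".hidden.csv"]:
        print(candidate, is_selected(candidate, is_dir=False,
                                     file_filter=r".*\.(csv|tmp)",
                                     exclude_filter=r".*\.tmp"))
```

Leaving all three regex properties empty skips the matching step entirely, which is what the "performance considerations" note in the table refers to.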
Relationships:
Name | Description |
---|
success | All successfully generated FlowFiles are transferred to this relationship |
not found | If no objects are found, the original FlowFile is transferred to this relationship |
failure | All failed attempts to access HDFS will be routed to this relationship |
original | Original FlowFiles are transferred to this relationship |
Reads Attributes:
None specified.
Writes Attributes:
Name | Description |
---|
hdfs.objectName | The name of the file/dir found on HDFS. |
hdfs.path | The path is set to the absolute path of the object's parent directory on HDFS. For example, if an object is a directory 'foo', under directory '/bar' then 'hdfs.objectName' will have value 'foo', and 'hdfs.path' will be '/bar' |
hdfs.type | The type of an object. Possible values: directory, file, link |
hdfs.owner | The user that owns the object in HDFS |
hdfs.group | The group that owns the object in HDFS |
hdfs.lastModified | The timestamp of when the object in HDFS was last modified, as milliseconds since midnight Jan 1, 1970 UTC |
hdfs.length | For files: the number of bytes in the file in HDFS. For directories: the storage space consumed by the directory. |
hdfs.count.files | For type='directory', the total count of files under this directory. Not populated for other types of HDFS objects. |
hdfs.count.dirs | For type='directory', the total count of directories under this directory (including itself). Not populated for other types of HDFS objects. |
hdfs.replication | The number of HDFS replicas for the file |
hdfs.permissions | The permissions for the object in HDFS. This is formatted as 3 characters for the owner, 3 for the group, and 3 for other users. For example rw-rw-r-- |
hdfs.status | A comma-separated list of file/directory paths that couldn't be listed or accessed. Not set if no errors occurred. |
hdfs.full.tree | When 'Destination' is set to 'Attributes', will be populated with the full tree of the HDFS directory in JSON format. WARNING: When a scan finds thousands or millions of objects, huge attribute values could impact the flow file repository and GC/heap usage. Use the 'Content' destination for such cases. |
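
As a rough illustration of how a downstream consumer might interpret a few of these attributes, the sketch below rebuilds an object's full path, converts `hdfs.lastModified` to a UTC timestamp, and turns the nine-character `hdfs.permissions` string into a numeric mode. The function name `describe` and the sample values are hypothetical, and the permission parsing is deliberately simplified: it assumes only `r`, `w`, `x`, and `-` appear and does not handle special bits such as sticky.

```python
from datetime import datetime, timedelta, timezone
import posixpath

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def describe(attrs):
    """Interpret selected hdfs.* attributes from a dict of strings."""
    # hdfs.path is the parent directory; hdfs.objectName is the leaf name.
    full_path = posixpath.join(attrs["hdfs.path"], attrs["hdfs.objectName"])

    # hdfs.lastModified is milliseconds since midnight Jan 1, 1970 UTC.
    modified = EPOCH + timedelta(milliseconds=int(attrs["hdfs.lastModified"]))

    # Each non-dash character contributes one set bit, owner bits first.
    mode = 0
    for ch in attrs["hdfs.permissions"]:
        mode = (mode << 1) | int(ch != "-")

    return full_path, modified, mode


if __name__ == "__main__":
    sample = {
        "hdfs.objectName": "foo",
        "hdfs.path": "/bar",
        "hdfs.lastModified": "86400000",
        "hdfs.permissions": "rw-rw-r--",
    }
    path, modified, mode = describe(sample)
    print(path, modified.isoformat(), oct(mode))
```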
State management:
This component does not store state.
Restricted:
This component is not restricted.
Input requirement:
This component allows an incoming relationship.
System Resource Considerations:
None specified.
See Also:
ListHDFS, GetHDFS, FetchHDFS, PutHDFS