FetchCDPObjectStore

Description:

Retrieves a file from an object store. The content of the incoming FlowFile is replaced by the content of the file in the object store. The file in the object store is left unchanged.

Tags:

hadoop, HCFS, HDFS, get, ingest, fetch, source, filesystem, CDP, GCP, GCS, Google, S3, AWS, ADLS, Azure

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name | API Name | Default Value | Allowable Values | Description
Storage Location | cdp-storage-location | | | Use this property to set the storage location in use. Example: 's3a://myBucket/myDirectory'. If the property is not specified, the processor uses the value set in /etc/hadoop/conf/core-site.xml. Supports Expression Language: true (will be evaluated using variable registry only)
Filename | filename | ${path}/${filename} | | The name of the file to retrieve. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
Kerberos Credentials Service | kerberos-credentials-service | | Controller Service API: KerberosCredentialsService; Implementation: KeytabCredentialsService | Specifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberos
CDP Username | Kerberos Principal | | | CDP user name. The recommendation is to create a dedicated Machine User in the CDP User Management UI. Supports Expression Language: true (will be evaluated using variable registry only)
CDP Password | Kerberos Password | | | Workload password associated with your CDP user. You can set it in the CDP User Management UI. If you don't want to use a workload password, you can use the Kerberos Credentials Service property instead. Sensitive Property: true
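
The default value of the Filename property, ${path}/${filename}, is resolved against the attributes of the incoming FlowFile. The sketch below illustrates that substitution in Python. It is a simplified stand-in rather than NiFi's Expression Language engine: it handles only plain ${attribute} references, and the resolve function, the attribute values, and the choice to treat a missing attribute as an empty string are assumptions made for illustration.

```python
import re

# Simplified stand-in for NiFi Expression Language: handles only plain
# ${attribute} references, not functions such as ${filename:toUpper()}.
_ATTR_REF = re.compile(r"\$\{([^}:]+)\}")


def resolve(expression, attributes):
    """Replace each ${name} with the matching attribute value.

    A missing attribute becomes an empty string (an assumption of this sketch).
    """
    return _ATTR_REF.sub(lambda m: attributes.get(m.group(1), ""), expression)


# Hypothetical FlowFile attributes, e.g. as written by a listing processor.
flowfile_attributes = {"path": "landing/2024", "filename": "orders.csv"}

print(resolve("${path}/${filename}", flowfile_attributes))
# prints: landing/2024/orders.csv
```

With these hypothetical attributes, the processor would be asked to retrieve landing/2024/orders.csv.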

Dynamic Properties:

Supports Sensitive Dynamic Properties: No

Dynamic Properties allow the user to specify both the name and value of a property.

Name | Value | Description
A Hadoop client configuration name | The value to set it to | Sets the Hadoop client configuration with the given name, overwriting the value if it is already set. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)
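
The "set, and overwrite if already set" behavior can be pictured as a dictionary merge in which dynamic properties take precedence. The following is a minimal Python sketch of that precedence rule only. The effective_configuration function and the example values are hypothetical; the configuration names are ordinary Hadoop client keys chosen for illustration.

```python
def effective_configuration(base, dynamic_properties):
    """Return a new dict in which dynamic properties override base entries."""
    merged = dict(base)                # copy so the base config is not mutated
    merged.update(dynamic_properties)  # dynamic properties win on conflict
    return merged


# Hypothetical values standing in for what a core-site.xml might provide.
base = {
    "fs.defaultFS": "s3a://myBucket",
    "fs.s3a.connection.maximum": "15",
}

# Hypothetical dynamic properties configured on the processor.
dynamic = {
    "fs.s3a.connection.maximum": "100",  # overwrites the existing value
    "fs.s3a.path.style.access": "true",  # newly set
}

config = effective_configuration(base, dynamic)
print(config["fs.s3a.connection.maximum"])
# prints: 100
```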

Relationships:

Name | Description
success | FlowFiles will be routed to this relationship once they have been updated with the content of the file from the object store.
comms.failure | FlowFiles will be routed to this relationship if the content of the file from the object store cannot be retrieved due to a communications failure. This generally indicates that the Fetch should be tried again.
failure | FlowFiles will be routed to this relationship if the content of the file from the object store cannot be retrieved and trying again will likely not be helpful. This would occur, for instance, if the file is not found or if there is a permissions issue.
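
The distinction between the two failure relationships is whether a retry is likely to help. The Python sketch below illustrates that decision. The mapping from Python exception types to relationships is an assumption made for illustration; the processor itself is a Java component, and its internal classification is not described here.

```python
def route(error):
    """Pick a relationship name for the outcome of a fetch attempt.

    The exception-to-relationship mapping is illustrative only.
    """
    if error is None:
        return "success"
    # Missing file or permissions issue: retrying is unlikely to help.
    if isinstance(error, (FileNotFoundError, PermissionError)):
        return "failure"
    # Communications problems: the fetch should generally be tried again.
    if isinstance(error, (ConnectionError, TimeoutError)):
        return "comms.failure"
    # Anything unrecognized is treated as non-retryable in this sketch.
    return "failure"


def should_retry(relationship):
    """Only comms.failure indicates that a retry is worthwhile."""
    return relationship == "comms.failure"


print(route(TimeoutError("read timed out")))
# prints: comms.failure
```

In a flow, this corresponds to looping the comms.failure relationship back to the processor while sending failure to an error-handling path.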

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
hdfs.failure.reason | When a FlowFile is routed to 'failure', this attribute is added indicating why the file could not be fetched from HDFS.
hadoop.file.url | The Hadoop URL of the file is stored in this attribute.
objectstore.failure.reason | When a FlowFile is routed to 'failure', this attribute is added indicating why the file could not be fetched.

State management:

This component does not store state.

Restricted:

Required Permission | Explanation
read distributed filesystem | Gives the operator the ability to retrieve any file that NiFi has access to in the object store or the local filesystem.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.

See Also:

DeleteCDPObjectStore, ListCDPObjectStore, PutCDPObjectStore