PutCDPObjectStore 2.3.0.4.10.0.0-147

Bundle
com.cloudera | nifi-cdf-objectstore-nar
Description
Write FlowFile data to an object store.
Tags
ADLS, AWS, Azure, CDP, GCP, GCS, Google, HCFS, HDFS, S3, copy, filesystem, hadoop, put
Input Requirement
REQUIRED
Supports Sensitive Dynamic Properties
false
  • Additional Details for PutCDPObjectStore 2.3.0.4.10.0.0-147

    PutCDPObjectStore

    Description

    PutCDPObjectStore provides the capability to upload files. In most respects it behaves identically to its HDFS counterpart, PutHDFS. For those details, refer to the description of PutHDFS.

    CDP Object Store processors

    PutCDPObjectStore is part of the CDP Object Store processor family. This has a number of consequences, listed below.

    Object Store access

    This processor is designed to ease interactions with the object store associated with the NiFi cluster. In CDP Private Cloud, it can be used to interact with HDFS and/or Ozone. In CDP Public Cloud, it can be used to interact with the object store of the underlying cloud provider (S3 for AWS, ADLS for Azure, GCS for Google Cloud, etc.), but not across cloud providers. If the cluster is configured with RAZ, the processor interacts with RAZ to check the Ranger policies when accessing resources in the object store. If RAZ is not enabled, the IDBroker mappings can be used to map CDP users to cloud accounts and policies.

    Configuration file

    This processor needs a configuration that contains the connection details for the object store. This should be a Hadoop-style XML file, occasionally with additional parameters specific to the kind of object store and the authentication method. Unless specified otherwise, the processor looks for the CDP default configuration file, /etc/hadoop/conf/core-site.xml.
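
    To illustrate what a Hadoop-style XML file looks like, the following minimal Python sketch reads name/value pairs from such a document. It only demonstrates the file format and is not the processor's implementation; the sample content and the bucket name are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the value is a placeholder, not a real bucket.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3a://example-bucket/data</value>
  </property>
</configuration>
"""


def parse_hadoop_conf(xml_text):
    """Return the name/value pairs of a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing <value> element is treated as an empty string.
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


conf = parse_hadoop_conf(SAMPLE_CORE_SITE)
print(conf["fs.defaultFS"])  # s3a://example-bucket/data
```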

    This configuration contains information specific to the object store provider (for example, Amazon AWS) which, combined with the underlying Hadoop library, makes it possible to connect to different kinds of stores, authenticate with Kerberos, and authorize with Ranger. In most cases, using this default configuration is recommended.

    Users may override the default location by adding a dynamic parameter named “cdp.configuration.resources”. Multiple configuration files can be given as a comma-separated list. Note, however, that for the additional features provided by the underlying Hadoop library to continue to work, a number of additional configuration parameters are needed.
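
    The override rule can be sketched as follows. This is a simplified illustration in Python, not the processor's code: the function name is hypothetical, the example paths are placeholders, and trimming whitespace around list entries is an assumption.

```python
# Default location named in the documentation above.
DEFAULT_RESOURCE = "/etc/hadoop/conf/core-site.xml"


def resolve_config_resources(dynamic_properties):
    """Return the configuration files to load (hypothetical helper).

    Falls back to the CDP default when "cdp.configuration.resources"
    is absent or empty.
    """
    raw = dynamic_properties.get("cdp.configuration.resources", "")
    # Assumption: entries are trimmed and empty entries are skipped.
    paths = [p.strip() for p in raw.split(",") if p.strip()]
    return paths or [DEFAULT_RESOURCE]


# Placeholder paths, for illustration only.
overridden = resolve_config_resources(
    {"cdp.configuration.resources": "/opt/conf/core-site.xml, /opt/conf/extra-site.xml"}
)
print(overridden)
print(resolve_config_resources({}))
```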

    Storage Location

    If the Storage Location property is not set, the default storage location is used. The default value is defined by the “fs.defaultFS” property of the object store configuration. If the default CDP configuration is used, this is the Data Lake’s object storage. If Storage Location is set, the value of “fs.defaultFS” is ignored. In that case, it is important to adjust the authentication and authorization settings accordingly.
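
    The precedence between Storage Location and “fs.defaultFS” can be sketched as below. This is an illustration under stated assumptions, not the processor's implementation: the function name and the URIs are hypothetical, and treating a blank value as "not set" is an assumption.

```python
def effective_storage_location(storage_location, conf):
    """Pick the storage location the way the text describes (hypothetical helper).

    An explicitly set Storage Location wins; otherwise the value of
    fs.defaultFS from the object store configuration is used.
    """
    # Assumption: a blank or whitespace-only value counts as "not set".
    if storage_location and storage_location.strip():
        return storage_location.strip()
    return conf.get("fs.defaultFS")


# Placeholder URIs, for illustration only.
conf = {"fs.defaultFS": "s3a://example-datalake"}
print(effective_storage_location(None, conf))                  # s3a://example-datalake
print(effective_storage_location("s3a://other-bucket", conf))  # s3a://other-bucket
```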

    Dynamic parameters

    This processor supports dynamic parameters. All dynamic parameters, except the protected ones, are passed to the object storage configuration. They are added as additional configuration parameters or, where a parameter already exists, overwrite it. This makes it possible to fine-tune the connection without changing the configuration file. The protected parameters are “fs.defaultFS” and “cdp.configuration.resources”.
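
    The merge behaviour described above can be sketched as follows. This is a simplified Python model rather than the processor's code: the function name is hypothetical and the property values are placeholders chosen for illustration.

```python
# The two protected parameter names listed in the documentation.
PROTECTED = frozenset({"fs.defaultFS", "cdp.configuration.resources"})


def apply_dynamic_parameters(file_conf, dynamic_properties):
    """Overlay dynamic parameters on the file-based configuration (hypothetical helper).

    Protected parameters are never passed through; every other dynamic
    parameter is added, or overwrites an existing entry of the same name.
    """
    merged = dict(file_conf)  # leave the caller's dictionary untouched
    for name, value in dynamic_properties.items():
        if name in PROTECTED:
            continue
        merged[name] = value
    return merged


# Placeholder values, for illustration only.
file_conf = {"fs.defaultFS": "s3a://example-datalake", "fs.s3a.connection.maximum": "15"}
dynamic = {"fs.s3a.connection.maximum": "50", "fs.defaultFS": "s3a://ignored"}
merged = apply_dynamic_parameters(file_conf, dynamic)
print(merged["fs.s3a.connection.maximum"])  # 50
print(merged["fs.defaultFS"])               # s3a://example-datalake
```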

    Authentication

    This processor supports Kerberos authentication, either through a Kerberos Credential Service or by explicitly providing a CDP Username and CDP Password. Both authenticate against the cluster’s Kerberos service.

Properties
Dynamic Properties
Restrictions
Required Permission Explanation
write distributed filesystem Provides the operator with the ability to delete any file that NiFi has access to in HDFS or the local filesystem.
Relationships
Name Description
success Files that have been successfully written to the object store are transferred to this relationship
failure Files that could not be written to the object store for some reason are transferred to this relationship
Reads Attributes
Name Description
filename The name of the file written to the object store comes from the value of this attribute.
Writes Attributes
Name Description
filename The name of the file written to HDFS is stored in this attribute.
absolute.hdfs.path The absolute path to the file on HDFS is stored in this attribute.
hadoop.file.url The Hadoop URL for the file is stored in this attribute.
target.dir.created The result (true/false) indicates whether the folder was created by the processor.
filename The name of the file written to object store is stored in this attribute.
See Also