Configure each object store processor

Learn how you can configure the object store processors for your ingest data flow.

Configure the processor according to the behavior you expect in your data flow. Make sure that you set all required properties, as you cannot start the processor until all mandatory properties have been configured. Property values can be parameterized.
  1. Launch the Configure Processor window by right-clicking the ListCDPObjectStore processor and selecting Configure.
    This gives you a configuration dialog with the following tabs: Settings, Scheduling, Properties, Comments.
  2. Configure the ListCDPObjectStore processor properties and click Apply.
    Figure 1. ListCDPObjectStore processor properties
    Table 1. ListCDPObjectStore processor properties
    | Property | Description | Example value for ingest data flow |
    |---|---|---|
    | Storage Location | The input bucket that contains the data that you want to ingest. You can create a parameter for the input bucket. | s3a://my-input-bucket-nifi |
    | Directory | Path to the data files in the source bucket. | /${now():toNumber():minus(86400000):format("yyyy-MM-dd", "GMT")} |
    | CDP Username | Your CDP account user name. You can create a parameter for the CDP account credentials. | srv_nifi-logs-ingest |
    | CDP Password | Your CDP account password. You can create a parameter for the CDP account credentials. | |
    | Recurse Subdirectories | Set to true if you want to include subdirectories in the data ingest operation. | true |
  3. Configure the FetchCDPObjectStore processor properties and click Apply.
    Figure 2. FetchCDPObjectStore processor properties
    Table 2. FetchCDPObjectStore processor properties
    | Property | Description | Example value for ingest data flow |
    |---|---|---|
    | Storage Location | The input bucket that contains the data that you want to ingest. You can create a parameter for the input bucket and provide your workload account credentials to access the bucket. | s3a://my-input-bucket-nifi |
    | Filename | The fully qualified name of the file to fetch from the source bucket. You do not need to set this property, because the default value uses the attributes of the flow files that the ListCDPObjectStore processor creates. | ${path}/${filename} |
    | CDP Username | Your CDP account user name. You can create a parameter for the CDP account credentials. | srv_nifi-logs-ingest |
    | CDP Password | Your CDP account password. You can create a parameter for the CDP account credentials. | |

  4. Configure the PutCDPObjectStore processor properties and click Apply.
    Figure 3. PutCDPObjectStore processor properties
    Table 3. PutCDPObjectStore processor properties
    | Property | Description | Example value for ingest data flow |
    |---|---|---|
    | Directory | The directory to which files should be written. | / |
    | Storage Location | The target bucket. You can create a parameter for the target bucket. | s3a://my-output-bucket-nifi |
    | Conflict Resolution Strategy | Indicates what should happen when a file with the same name already exists in the target directory. You can enter one of the following values: fail, ignore, replace. | |
    | CDP Username | Your CDP account user name. You can create a parameter for the CDP account credentials. | srv_nifi-logs-ingest |
    | CDP Password | Your CDP account password. You can create a parameter for the CDP account credentials. | |

  5. Configure the DeleteCDPObjectStore processor properties and click Apply.
    Figure 4. DeleteCDPObjectStore processor properties
    Table 4. DeleteCDPObjectStore processor properties
    | Property | Description | Example value for ingest data flow |
    |---|---|---|
    | Storage Location | The source bucket that contains the data that you have ingested and now want to delete. You can create a parameter for the input bucket. | s3a://my-input-bucket-nifi |
    | Path | Path to the data files in the source bucket that you want to delete. | ${path}/${filename}.gz |
    | CDP Username | Your CDP account user name. You can create a parameter for the CDP account credentials. | srv_nifi-logs-ingest |
    | CDP Password | Your CDP account password. You can create a parameter for the CDP account credentials. | |
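
The example values for Directory (Table 1), Filename (Table 2), and Path (Table 4) rely on NiFi Expression Language. As a rough illustration of what they resolve to, the following Python sketch computes the same date-based directory (the current time minus 86400000 milliseconds, formatted as a GMT date) and substitutes flow file attributes into a path template. The function names `yesterday_directory` and `resolve` and the sample attribute values are hypothetical and not part of NiFi, and the regular expression only handles plain `${name}` attribute references, not the full Expression Language.

```python
import re
from datetime import datetime, timedelta, timezone


def yesterday_directory(now=None):
    """Approximate the Directory example: now, minus one day, as a GMT date.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    # 86400000 milliseconds is exactly 24 hours.
    shifted = now - timedelta(milliseconds=86_400_000)
    return "/" + shifted.astimezone(timezone.utc).strftime("%Y-%m-%d")


def resolve(template, attributes):
    """Replace each ${name} reference with the matching flow file attribute.

    Simplified stand-in for attribute substitution; raises KeyError if an
    attribute referenced in the template is missing.
    """
    return re.sub(r"\$\{(\w+)\}", lambda m: attributes[m.group(1)], template)


if __name__ == "__main__":
    # Hypothetical flow file attributes, as a listing step might set them.
    attrs = {"path": "logs/2024-02-29", "filename": "app.log"}
    print(yesterday_directory())
    print(resolve("${path}/${filename}", attrs))
    print(resolve("${path}/${filename}.gz", attrs))
```

Because the date is computed each time the expression is evaluated, the listing targets the previous day's directory without any manual change to the property value.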

Create policies in Ranger to enable the CDP user to access the source and target buckets.
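
The three Conflict Resolution Strategy values listed for the PutCDPObjectStore processor can be pictured with a small simulation. This is a minimal sketch that uses an in-memory dictionary in place of the target bucket. The function `put_file` and the outcome labels it returns are hypothetical, and the behavior shown for each value (reject, skip, overwrite) is an assumption based on the property description, not a statement of how the processor is implemented.

```python
VALID_STRATEGIES = ("fail", "ignore", "replace")


def put_file(target, name, content, strategy):
    """Write `content` under `name` in `target`, honoring the strategy.

    `target` is a plain dict standing in for the target directory.
    Returns a hypothetical outcome label describing what happened.
    """
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")

    exists = name in target
    if exists and strategy == "fail":
        return "rejected"  # assumed: the existing file is left untouched
    if exists and strategy == "ignore":
        return "skipped"   # assumed: nothing is written, no error raised

    # Either there is no conflict, or the strategy is "replace".
    target[name] = content
    return "replaced" if exists else "written"


if __name__ == "__main__":
    bucket = {"app.log": "old"}
    for strategy in VALID_STRATEGIES:
        outcome = put_file(bucket, "app.log", "new", strategy)
        print(strategy, outcome, bucket["app.log"])
```

Which value fits depends on whether re-ingesting a file with the same name should be treated as an error, as a harmless duplicate, or as an update.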