Flow testing principles

As with any data flow that you build, it is important to test it before deployment. Data flows to be used by AWS Lambda require an Input Port and most often will also contain at least one Output Port. This makes it quite easy to test these data flows in NiFi.

  1. Go to the Parent Process Group within Apache NiFi and add a GenerateFlowFile processor.
  2. Set the Custom Text property to any value that you want to feed into your data flow.
    For example, to simulate an event indicating that data was added to an S3 bucket, you would set the Custom Text property to the following:
    {
      "Records": [
        {
          "awsRegion": "us-east-1",
          "s3": {
            "bucket": {
              "name": "my-nifi-logs",
              "arn": "arn:aws:s3:::example-bucket"
            },
            "object": {
              "key": "nifi-app_2021-10-12_18.36.log.gz",
              "size": 1024,
            }
          }
        }
      ]
    }
    
  3. Connect the GenerateFlowFile processor to the Input Port of the Process Group that you will run in Lambda.
    The Run Once feature of NiFi makes it easy to create this FlowFile and send it to the Process Group.
  4. Start the Process Group and ensure that the data processes as expected.
    If not, you can fix the data flow and trigger the GenerateFlowFile processor again, until you have properly handled the data.
When ready, you can download the Flow Definition from NiFi and upload it to the Catalog in Cloudera DataFlow.