An important step in building a data flow that you can run outside of the NiFi instance where it was built is parameterization. NiFi allows you to define Processor and
Controller Service properties at runtime, rather than at build time, by using Parameter Contexts.
The Lambda function handler allows you to supply parameter values in two ways: using environment
variables or using AWS Secrets Manager.
Environment variables
Any parameter can be specified using the environment variables of the AWS Lambda
function. When configuring the Lambda function, simply add an environment variable whose
name matches the name of a parameter in your Parameter Context.
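The lookup is one-to-one: an environment variable named exactly like a parameter supplies that parameter's value. A minimal sketch of this behavior, using an ordinary dictionary in place of the Lambda environment (the parameter names `DB_Username` and `DB_Password` are purely illustrative, not part of any real flow):

```python
import os

def resolve_parameter(name, environment=None):
    """Return the value for a Parameter Context parameter when an
    environment variable with exactly the same name is defined,
    otherwise None."""
    env = environment if environment is not None else os.environ
    return env.get(name)

# Illustrative variables, as they might be set on the Lambda function
env = {"DB_Username": "nifi", "DB_Password": "s3cret"}
print(resolve_parameter("DB_Username", env))  # matching variable supplies the value
print(resolve_parameter("Unset_Param", env))  # no matching variable -> None
```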
AWS Secrets Manager
A more secure mechanism for storing parameters is AWS Secrets Manager.
1. Open AWS Secrets Manager and click Store a new secret.
2. For the secret type, select Other type of secret.
3. Provide your secret information, such as credentials and connection details, as one or more key/value pairs. Each key provided to Secrets Manager maps to a parameter with the same name in the data flow's Parameter Context.
If the name of the secret matches the name of the Parameter Context in the data flow, no further configuration is needed: the Lambda function automatically configures the data flow to pull the specified secret's values to represent the Parameter Context.
4. When you are ready, click Next.
5. Provide a name for your secret. You can also configure other options, such as tags, resource permissions, and a rotation schedule.
6. On the Review page, review your secret details and click Store.
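The steps above amount to storing a JSON object of key/value pairs under a secret name. As a hedged sketch, the equivalent request could be built programmatically as follows; the secret name and parameter keys are made up for illustration, and the commented line shows where the actual boto3 `create_secret` call would be submitted:

```python
import json

def build_create_secret_request(secret_name, parameters):
    """Build the keyword arguments for the Secrets Manager
    create_secret API: the key/value pairs are stored as a
    JSON-encoded SecretString."""
    return {
        "Name": secret_name,
        "SecretString": json.dumps(parameters),
    }

# Hypothetical keys matching a Parameter Context's parameter names
request = build_create_secret_request(
    "AWS_CONTEXT",
    {"DB_Username": "nifi", "DB_Password": "s3cret"},
)
# With boto3, this request would be submitted as:
#   boto3.client("secretsmanager").create_secret(**request)
print(request["Name"])
```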
Example:
You developed a data flow with a Parameter Context named
AWS_CONTEXT, and you want to deploy this data flow three times. The
first time, it should use the values from the "Context_1" secret. For the
second deployment, it should use values from the "Context_2" secret. For
the third deployment, it should use values from the "Context_3"
secret.
To accomplish this, you can specify an environment variable named
PARAM_CONTEXT_AWS_CONTEXT. In this case, the name of the environment
variable is PARAM_CONTEXT_ followed by the name of the Parameter
Context. The value of the environment variable will be the name of the secret in the AWS
Secrets Manager.
For the first deployment you would use the environment variable
PARAM_CONTEXT_AWS_CONTEXT with a value of
"Context_1". For the second deployment, you would use an environment
variable named PARAM_CONTEXT_AWS_CONTEXT with a value of
"Context_2" and so on.
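The naming convention above can be sketched as a simple lookup: prepend PARAM_CONTEXT_ to the Parameter Context name and read that environment variable. A dictionary stands in for the Lambda environment here:

```python
def secret_for_context(context_name, environment):
    """Return the secret name mapped to a Parameter Context via a
    PARAM_CONTEXT_<context name> environment variable, or None if
    no such variable is set."""
    return environment.get("PARAM_CONTEXT_" + context_name)

# First deployment's configuration from the example above
env = {"PARAM_CONTEXT_AWS_CONTEXT": "Context_1"}
print(secret_for_context("AWS_CONTEXT", env))      # -> Context_1
print(secret_for_context("CLOUDERA_CONTEXT", env)) # unmapped -> None
```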
If your data flow contains multiple Parameter Contexts, you can also map each
of them to different secrets. For example, if you have a Parameter Context named
AWS_CONTEXT and another one named CLOUDERA_CONTEXT,
and you wanted to map those to secrets named "Context_1" and
"Cldr_Context" respectively, you could do so by adding two
environment variables: PARAM_CONTEXT_AWS_CONTEXT =
"Context_1" and PARAM_CONTEXT_CLOUDERA_CONTEXT =
"Cldr_Context".
Additionally, you can have all Parameter Contexts whose names do
not map to any secret in AWS Secrets Manager fall back to a default secret by setting an
environment variable named DEFAULT_PARAM_CONTEXT. The value of this
environment variable should be the name of the secret to use.
You can also specify that multiple Secrets map to the same Parameter Context by
providing a comma-separated list of Secret names. For example, if you want both the
"Kafka" and "Common" Secrets to contribute parameters to a Parameter Context named
"ALL_KAFKA", you would set PARAM_CONTEXT_ALL_KAFKA = "Kafka,
Common".
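Taken together, the resolution rules above (explicit mapping, comma-separated secret lists, and the DEFAULT_PARAM_CONTEXT fallback) can be sketched as one merge function. The secret store here is a plain dictionary standing in for Secrets Manager, and the secret contents are invented for illustration:

```python
def resolve_context_parameters(context_name, environment, fetch_secret):
    """Merge the key/value pairs of every secret mapped to the given
    Parameter Context. Falls back to DEFAULT_PARAM_CONTEXT when no
    PARAM_CONTEXT_<name> variable is set; returns {} if neither exists."""
    names = environment.get("PARAM_CONTEXT_" + context_name)
    if names is None:
        names = environment.get("DEFAULT_PARAM_CONTEXT")
    if names is None:
        return {}
    parameters = {}
    for secret_name in names.split(","):
        # secrets later in the list override earlier ones on key clashes
        parameters.update(fetch_secret(secret_name.strip()))
    return parameters

# Stand-in for Secrets Manager: secret name -> key/value pairs
secrets = {
    "Kafka": {"Brokers": "broker:9092"},
    "Common": {"Region": "us-east-1"},
}
env = {"PARAM_CONTEXT_ALL_KAFKA": "Kafka, Common"}
print(resolve_context_parameters("ALL_KAFKA", env, secrets.get))
```

Both secrets contribute parameters to the ALL_KAFKA context, and an unmapped context would pick up whatever secret DEFAULT_PARAM_CONTEXT names instead.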
The same Secret may also provide parameters to multiple different Parameter Contexts.
Therefore, a configuration similar to the following is possible: