General configuration

You must configure the function to specify the data flow to run and add any necessary runtime configuration using environment variables.

  1. Click Configuration under Settings on the left of your Function App screen.
  2. Select the Application Settings tab.

    Application settings are exposed as Environment Variables inside the function. Azure pre-populates some of these, which are used by the Function framework itself. You have to add the remaining Application Settings.

  3. Click New application setting at the top to add the first setting, which is required for all Cloudera DataFlow Functions deployments.
  4. As a name, enter WEBSITE_RUN_FROM_PACKAGE.
    The value is different depending on the Operating System:
    • For Windows, enter 1. This tells the Function App to use the Zip package you uploaded earlier to the File share.
    • For Linux, enter the full URL of the NaafAzureFunctions.zip file uploaded to the Container.
      1. To get the URL, navigate to the Storage account and click Containers.
      2. Click the container you created before.
      3. Click the NaafAzureFunctions.zip file and click the Copy icon next to the URL.
      4. Back in the Function app, paste the value as the value of the WEBSITE_RUN_FROM_PACKAGE setting.
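The operating-system rule for WEBSITE_RUN_FROM_PACKAGE can be sketched in a few lines of Python. This is only an illustration, not part of Cloudera DataFlow Functions or Azure: the function name run_from_package_value and the example URL in the comment are hypothetical, and the check that the Linux value is an absolute URL is an assumption added for safety.

```python
from typing import Optional
from urllib.parse import urlparse


def run_from_package_value(os_name: str, package_url: Optional[str] = None) -> str:
    """Return the value to enter for WEBSITE_RUN_FROM_PACKAGE.

    Hypothetical helper: Windows uses the literal "1", Linux uses the
    URL copied from the Storage account Container.
    """
    os_key = os_name.strip().lower()
    if os_key == "windows":
        # Tells the Function App to use the Zip package on the File share.
        return "1"
    if os_key == "linux":
        parsed = urlparse(package_url or "")
        # Assumption: the copied value must be an absolute URL, for example
        # https://<storage-account>.blob.core.windows.net/<container>/NaafAzureFunctions.zip
        if not parsed.scheme or not parsed.netloc:
            raise ValueError("Linux requires the full URL of NaafAzureFunctions.zip")
        return package_url
    raise ValueError(f"Unsupported operating system: {os_name!r}")
```

Calling run_from_package_value("Windows") returns "1", while the Linux branch returns the URL unchanged once it has been checked.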

    Additionally, the following Application Settings are supported.

    Variable Name | Description | Required | Default Value
    FUNCTION_NAME | The unique function name for this deployment. It is recommended to simply use the name of the Function App. | true | --
    FLOW_CRN | The Cloudera Resource Name (CRN) for the data flow that is to be run. The data flow must be stored in the Cloudera DataFlow Catalog. This CRN should indicate the specific version of the data flow and as such will end with some suffix such as /v.1. For more information, see Retrieving data flow CRN. | true | --
    DF_PRIVATE_KEY | The Private Key for accessing the Cloudera DataFlow service. The Private Key and Access Key are used to authenticate with the Cloudera DataFlow service and they must provide the necessary authorizations to access the specified data flow. For more information, see Provisioning Access Key ID and Private Key. | true | --
    DF_ACCESS_KEY | The Access Key for accessing the Cloudera DataFlow service. The Private Key and Access Key are used to authenticate with the Cloudera DataFlow service and they must provide the necessary authorizations to access the specified data flow. For more information, see Provisioning Access Key ID and Private Key. | true | --
    INPUT_PORT | The name of the Input Port to use. If the specified data flow has more than one Input Port at the root group level, this environment variable must be specified, indicating the name of the Input Port to queue up the Cloud Function event. If there is only one Input Port, this variable is not required. If it is specified, it must exactly match the name of the Input Port. | false | --
    OUTPUT_PORT | The name of the Output Port to retrieve the results from. If no Output Port exists, the variable does not need to be specified and no data will be returned. If at least one Output Port exists in the data flow, this variable can be used to determine the name of the Output Port whose data will be sent along as the output of the function. For more information on how the appropriate Output Port is determined, see Output ports. | false | --
    FAILURE_PORTS | A comma-separated list of Output Ports that exist at the root group level of the data flow. If any FlowFile is sent to one of these Output Ports, the function invocation is considered a failure. For more information, see Output ports. | false | --
    DF_SERVICE_URL | The Base URL for the Cloudera DataFlow service. | false | https://api.us-west-1.cdp.cloudera.com/
    NEXUS_URL | The Base URL for a Nexus Repository for downloading any NiFi Archives (NARs) needed for running the data flow. | false | https://repository.cloudera.com/artifactory/cloudera-repos/
    CONTENT_REPO | The contents of the FlowFiles can be stored either in memory, on the JVM heap, or on disk. If this environment variable is specified, it specifies the path to a directory where the content should be stored. If not specified, the content is held in memory. | false | --
    WORKING_DIR | The working directory, where NAR files will be expanded. | false | /tmp/extensions for Linux and C:\local\Temp for Windows
    EXTENSIONS_DOWNLOAD_DIR | The directory to which missing extensions / NiFi Archives (NARs) will be downloaded. For more information, see Providing custom extensions / NARs. | false | /tmp/extensions for Linux and C:\local\Temp for Windows
    EXTENSIONS_DIR_* | Specifies read-only directories that may contain custom extensions / NiFi Archives (NARs). For more information, see Providing custom extensions / NARs. | false | --
    STORAGE_ENDPOINT | A Storage account endpoint to look for custom extensions / NiFi Archives (NARs) and resources. For more information, see Cloud storage. | false | --
    STORAGE_CONTAINER | A Storage account container name in which to look for custom extensions / NiFi Archives (NARs) and resources. For more information, see Cloud storage. | false | --
    STORAGE_EXTENSIONS_DIRECTORY | The directory in the Storage account Blob container to look in for custom extensions / NiFi Archives (NARs). For more information, see Cloud storage. | false | extensions
    STORAGE_RESOURCES_DIRECTORY | The directory in the Storage account Blob container to look in for custom resources. For more information, see Cloud storage. | false | resources
    COSMOSDB_ENDPOINT | The Cosmos DB account URI for the Cosmos DB state provider. It is required if you wish to preserve state in any stateful processors. For more information, see Data flow state. | false | --
    COSMOSDB_STATE_DATABASE | The Cosmos DB database name for the Cosmos DB state provider. For more information, see Data flow state. | false | nifi_state
    COSMOSDB_STATE_CONTAINER | The Cosmos DB container name for the Cosmos DB state provider. For more information, see Data flow state. | false | nifi_state
    WEBSITE_RUN_FROM_PACKAGE | Specifies that the Function App should use the NaaF Zip package as its binary distribution. | true | 1
    KRB5_FILE | Specifies the filename of the krb5.conf file. This is necessary only if connecting to a Kerberos-protected endpoint. For more information, see Configuring Kerberos. | false | /etc/krb5.conf
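Because these settings reach the function as environment variables, a small pre-deployment check can catch a missing required value early. The following Python sketch is one possible way to do that; it is not the logic used by Cloudera DataFlow Functions. The function name resolve_settings is hypothetical, the required names and default values are copied from the table above, and settings whose defaults depend on the operating system are deliberately left out.

```python
from typing import Dict, Mapping

# Settings marked as required in the table above.
REQUIRED = (
    "FUNCTION_NAME",
    "FLOW_CRN",
    "DF_PRIVATE_KEY",
    "DF_ACCESS_KEY",
    "WEBSITE_RUN_FROM_PACKAGE",
)

# Operating-system-independent defaults listed in the table above.
DEFAULTS = {
    "DF_SERVICE_URL": "https://api.us-west-1.cdp.cloudera.com/",
    "NEXUS_URL": "https://repository.cloudera.com/artifactory/cloudera-repos/",
    "STORAGE_EXTENSIONS_DIRECTORY": "extensions",
    "STORAGE_RESOURCES_DIRECTORY": "resources",
    "COSMOSDB_STATE_DATABASE": "nifi_state",
    "COSMOSDB_STATE_CONTAINER": "nifi_state",
    "KRB5_FILE": "/etc/krb5.conf",
}


def resolve_settings(env: Mapping[str, str]) -> Dict[str, str]:
    """Check required settings and fill in documented defaults (hypothetical helper)."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required application settings: " + ", ".join(missing))
    resolved = dict(DEFAULTS)
    for name in (*REQUIRED, *DEFAULTS):
        if env.get(name):
            # An explicitly configured value overrides the documented default.
            resolved[name] = env[name]
    return resolved
```

Passing os.environ (after importing os) checks the current process, and passing a plain dictionary lets you check a planned configuration before entering it in the portal.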

    The following environment variables apply only to HTTP triggers:

    Variable Name | Description | Required | Default Value
    HEADER_ATTRIBUTE_PATTERN | A regular expression capturing all FlowFile attributes from the Output Port that should be returned as HTTP headers, if the trigger type is HTTP. For all other trigger types, this variable is ignored. | false | --
    HTTP_STATUS_CODE_ATTRIBUTE | The name of a FlowFile attribute that sets the HTTP status code of the response when the data flow succeeds, if the trigger type is HTTP. For all other trigger types, this variable is ignored. For more information, see the section on HTTP Functions. | false | --
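To illustrate how these two variables relate to the HTTP response, here is a rough Python sketch. It is not the actual implementation: the function name build_http_response is hypothetical, and matching the whole attribute name against the pattern, falling back to status 200, and using Python's re module in place of the regular expression engine of the real runtime are all assumptions made for the example.

```python
import re
from typing import Dict, Optional, Tuple


def build_http_response(
    attributes: Dict[str, str],
    header_pattern: Optional[str] = None,
    status_attribute: Optional[str] = None,
    default_status: int = 200,
) -> Tuple[int, Dict[str, str]]:
    """Derive (status code, headers) from FlowFile attributes (hypothetical helper)."""
    headers: Dict[str, str] = {}
    if header_pattern:
        compiled = re.compile(header_pattern)
        # Assumption: an attribute becomes a header only if its whole name matches.
        headers = {
            name: value
            for name, value in attributes.items()
            if compiled.fullmatch(name)
        }
    status = default_status
    if status_attribute and status_attribute in attributes:
        # The attribute value carries the status code as text, for example "201".
        status = int(attributes[status_attribute])
    return status, headers
```

For example, with HEADER_ATTRIBUTE_PATTERN set to a pattern such as x-.* and HTTP_STATUS_CODE_ATTRIBUTE set to an attribute name of your choosing, only the attributes whose names match the pattern would be copied into the headers under these assumptions.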