Deploying a flow definition

Deploy a flow definition to run NiFi flows as flow deployments in CDF. To do this, launch the Deployment wizard and specify your environment, parameters, sizing, and KPIs.

About this task

The CDF Catalog is where you manage the flow definition lifecycle, from initial import, to versioning, to deploying a flow definition.

Before you begin

Steps

  1. In DataFlow, from the left navigation pane, click Catalog.

    Flow definitions available for you to deploy are displayed, one definition per row.

  2. Launch the Deployment wizard.
    1. Click the row to display the flow definition details and versions.
    2. Click a row representing a flow definition version to display flow definition version details and the Deploy button.
    3. Click Deploy to launch the Deployment wizard.
  3. From the Deployment wizard, select the environment to which you want to deploy this version of your flow definition.
  4. From Overview, perform the following preliminary tasks:
    • Give your flow deployment a unique name. You can use this name to distinguish between different versions of a flow definition, flow definitions deployed to different environments, and similar.

  5. From NiFi Configuration, specify the following NiFi configuration information:
    • Pick the NiFi Runtime Version for your flow deployment. Cloudera recommends that you always use the latest available version, if possible.

    • Specify whether you want to use Inbound Connections that allow your flow deployment receiving data from an external data source. If so, specify the endpoint host name and listening port(s) where your flow deployment listens to incoming data.

      See Create an inbound connection endpoint for complete information on endpoint configuration options.

    • Specify whether you want to use NiFi Archives (NAR) to deploy custom NiFi processors or controller services. If so, specify the CDP Workload Username and password, and cloud storage location you used when preparing to deploy custom processors.

    • Specify whether you want the flow deployment to auto-start once deployed.

  6. In Parameters, specify parameter values like connection strings, usernames and similar, and upload files like truststores, jars, and similar.
  7. Specify your Sizing & Scaling configurations.
    NiFi node sizing
    • You can adjust the size of your cluster from Extra Small to Large.
    Number of NiFi nodes
    • You can set whether you want to automatically scale your cluster according to flow deployment capacity requirements. When you enable auto-scaling, the minimum NiFi nodes are used for initial size and the workload scales up or down depending on resource demands.
    • You can set the number of nodes from 1 to 32.
  8. From KPIs, you may choose to identify key performance indicators (KPIs), the metrics to track those KPIs, and when and how to receive alerts about the KPI metrics tracking.

    See Working with KPIs for complete information about the KPIs available to you and how to monitor them.

  9. Review a summary of the information provided and make any necessary edits by clicking Previous. When you are finished, complete your flow deployment by clicking Deploy.

Results

Once you click Deploy, you are redirected to the Alerts tab for the deployment where you can track its progress.

Before you begin

  • You have installed CDP CLI.
  • Run cdp df list-services to get the service-crn value.
  • Run cdp df list-flows to get the flow-version-crn value.

Steps

  1. To deploy a flow, enter:
    cdp df create-deployment
              --service-crn [***service-crn-value***]          
              --flow-version-crn [***flow-version-crn-value***]
              --deployment-name [***flow-deployment-name***]          
              [--cluster-size-name [***cluster-size***]]
              [--static-node-count [***value***]]
              [--auto-scaling-enabled | --no-auto-scaling-enabled]
              [--auto-scale-min-nodes [***number-min-nodes***]]
              [--auto-scale-max-nodes [***number-max-nodes***]]
              [--cfm-nifi-version [***nifi-version***]]
              [--auto-start-flow | --no-auto-start-flow]
              [--parameter-groups [***json-file-locatione***]]
              [--kpis [***json-file-locatione***]]  

    Where:

    • --service-crn –- Specifies the service-crn value you obtained when completing the prerequisites.
    • --flow-version-crn – Specifies the flow-version-crn value you obtained when completing the prerequisites.
    • --deployment-name – Specifies a unique name for your flow deployment.
    • [--cluster-size-name ] – Specifies the cluster size. Valid values are:
      • EXTRA_SMALL
      • SMALL
      • MEDIUM
      • LARGE

      The default is EXTRA_SMALL.

    • [--static-node-count] – Specifies the number of NiFi nodes when autoscaling is not enabled. You can select between 1 and 32 nodes. The default value is 1.
    • [--auto-scaling-enabled | --no-auto-scaling-enabled] – Specifies whether you want to enable autoscaling. The default is to disable autoscaling.
      • [--auto-scale-min-nodes] – Specifies the minimum nodes when you have autoscaling enabled. If you have autoscaling enabled, this parameter is required.
      • [--auto-scale-max-nodes] – Specifies the maximum nodes when autoscaling is enabled. If you have autoscaling enabled, this parameter is required.
    • [--cfm-nifi-version] – Specifies the NiFi runtime version. The default is the latest version.
    • [--auto-start-flow | --no-auto-start-flow] – Specifies whether you want to automatically start your flow once it has been deployed. The default is to enable the automatic start.
    • [--parameter-groups] – Specifies the location of the parameter group JSON file you, if you are using one for this flow deployment.
    • [--kpis] – Specifies the location of the KPIs JSON file, if you are providing KPIs for this flow.

Example parameter group file

The JSON file you develop for parameter group will be different depending on your flow objectives and requirements. This is an example of the parameter group file format:

[
  {
    "name": "kafka-filter-to-kafka",
    "parameters": [
      {
        "name": "CDP Workload User",
        "assetReferences": [],
        "value": "srv_nifi-machine-ingest"
      },
      {
        "name": "CDP Workload User Password",
        "assetReferences": [],
        "value": "<<CDP_MISSING_SENSITIVE_VALUE>>"
      },
      {
        "name": "CSV Delimiter",
        "assetReferences": [],
        "value": ","
      },
      {
        "name": "Data Input Format",
        "assetReferences": [],
        "value": "CSV"
      },
      {
        "name": "Data Output Format",
        "assetReferences": [],
        "value": "JSON"
      },
      {
        "name": "Filter Rule",
        "assetReferences": [],
        "value": "SELECT * FROM FLOWFILE"
      },
      {
        "name": "Kafka Broker Endpoint",
        "assetReferences": [],
        "value": "streams-messaging-broker0.pm-sandb.a465-9q4k.cloudera.site:9093"
      },
      {
        "name": "Kafka Consumer Group ID",
        "assetReferences": [],
        "value": "cdf"
      },
      {
        "name": "Kafka Destination Topic",
        "assetReferences": [],
        "value": "MachineDataJSON"
      },
      {
        "name": "Kafka Producer ID",
        "assetReferences": [],
        "value": "cdf"
      },
      {
        "name": "Kafka Source Topic",
        "assetReferences": [],
        "value": "MachineDataCSV"
      },
      {
        "name": "Schema Name",
        "assetReferences": [],
        "value": "SensorReading"
      },
      {
        "name": "Schema Registry Hostname",
        "assetReferences": [],
        "value": "streams-messaging-master0.pm-sandb.a465-9q4k.cloudera.site"
      }
    ]
  }
]

Example KPI file

The JSON file you develop for KPIs will be different depending on your flow objectives and requirements. This is an example of the KPI file format:

[
  {
    "metricId": "rateBytesReceived",
    "alert": {
      "thresholdLessThan": {
        "unitId": "kilobytesPerSecond",
        "value": 150
      },
      "frequencyTolerance": {
        "unit": {
          "id": "MINUTES"
        },
        "value": 1
      }
    }
  },
  {
    "metricId": "processorAmountBytesSent",
    "alert": {},
    "componentId": "a7f7df1c-a32d-3c25-9b09-e1d1036dcc04;a33a1b48-005b-32dd-bb88-b63230bb8525"
  }
]

Result

Successfully deploying a flow results in output similar to:


{
  "crn": "deployment-crn"
}