Deployments

On the Deployments tab you can view and manage all deployments within a workspace.

Viewing data flow in NiFi

You can go to the NiFi cluster where your flow is deployed and view or edit the data flow.

When you access the NiFi cluster, the ability to view or edit the flow is based on your Cloudera DataFlow authorizations. The DFFlowUser role has read-only privileges. The DFFlowAdmin role has full privileges.

You must have deployed a data flow in Cloudera DataFlow.
  1. Select the Deployment that you want to manage.
  2. Click Options > View in NiFi.

    The UI for the NiFi cluster where your flow is deployed opens.

  3. View your data flow or edit it based on your NiFi privileges.

    If you edit the flow in NiFi and want the changes to exist in a new deployment, perform the following steps:

    1. Download the flow as a flow definition.
      For more information, see Downloading a flow definition from NiFi.
    2. Import the flow definition (as a new flow definition or as a new version of an existing flow definition).
      For more information, see Importing a flow definition to Cloudera DataFlow.
    3. Deploy the flow definition.
      For more information, see Deploy a flow.

Starting a flow

You can start a stopped flow in a Cloudera DataFlow deployment.

Starting a flow deployment starts all processors of a Cloudera DataFlow deployment.

  • You must have a stopped flow deployment in Cloudera DataFlow.
  • You must have DFFlowAdmin permission.
  1. Select the Deployment that you want to manage.
  2. Click Options > Start flow.
    The Start [Deployment Name] pop-up appears.
  3. Confirm your choice by clicking Start Flow.

Stopping a flow

Stopping the flow of a Cloudera DataFlow deployment temporarily pauses the NiFi flow.

Stopping a flow results in the following:

  • All processors are stopped and no data processing happens within the NiFi flow.
  • KPI alerts are stopped. Your KPI alerts are activated again when the flow is restarted.
  • Any active KPI alerts are resolved.
  • All underlying cloud resources remain allocated for the Cloudera DataFlow deployment.
  • You can modify deployment configuration while the flow is stopped.
  • Stopped flows are still billable. However, if auto-scaling is enabled for the flow, some cost reduction may occur.
You must have deployed a flow definition in Cloudera DataFlow.
  1. Select the Deployment that you want to manage.
  2. Click Options > Stop flow.
    The Stop [Deployment Name] pop-up appears.
  3. Click Stop Flow to stop the flow deployment.

Changing flow version

Learn how to change the flow definition version of a running flow deployment. Using the ‘change flow version’ capability eliminates the need to terminate and re-create deployments when you want to deploy a new version of your flow definition.

  • You must have DFFlowAdmin permission.
  • There is at least one more version for the same flow definition present in the catalog.
  • The state of the flow deployment is either Good Health or Stopped.
  • You have read the applicable restrictions and version change strategies.
Restrictions

The following restrictions apply to flow version changes:

  • Changing inbound connections is not supported.

  • Changing custom resource (custom NARs and custom Python resources) configurations is not supported.

  • While you can add, change or remove assets when moving to a new version, you cannot introduce assets (text files, binaries, JARs, or similar) if the currently deployed version does not have any.
  • Components whose state, provenance, or other repository data must be kept between flow versions must keep their flow JSON IDs, because NiFi identifies components by these IDs. A component's ID changes if you move it to a different process group, or if you delete it and then re-add it to the same process group. NiFi perceives a component with a changed ID as a new component: during the flow version change, the original component is deleted together with its state, and a new, identical component is created. In an extreme case, changing to an identical flow version in which only the component IDs differ deletes the entire NiFi flow and recreates an identical one, with all history and data lost.
  • Remapping Process Group and Parameter Context assignments is not supported, because the original assignment is not removed. For example, you have Process Group 1 (PG1) with Parameter Context 1 (PC1) assigned and Process Group 2 (PG2) with Parameter Context 2 (PC2) assigned. If you initiate a flow version change where the parameter contexts are flipped, resulting in a PG1-PC2 and PG2-PC1 assignment, NiFi does not remap the PG to PC assignments.
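The component ID restriction above can be checked before a version change by comparing the IDs in two flow definition files. The following is a minimal sketch, not part of Cloudera DataFlow. It assumes that each flow definition has been loaded into a Python dictionary representing the root process group, and that the keys processors, processGroups, identifier, and name match the layout of your exported flow JSON. The function names are hypothetical, and only processors are compared.

```python
from typing import Dict, Iterator, List, Tuple


def iter_processors(group: dict, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (identifier, location) for every processor in a process group tree.

    Assumption: each process group dict has optional "processors" and
    "processGroups" lists, and each processor has an "identifier" key.
    """
    group_path = f"{path}/{group.get('name', '')}"
    for processor in group.get("processors", []):
        yield processor["identifier"], f"{group_path}/{processor.get('name', '')}"
    for child in group.get("processGroups", []):
        yield from iter_processors(child, group_path)


def compare_component_ids(old_root: dict, new_root: dict) -> Dict[str, List[str]]:
    """Classify processor IDs as kept, removed, or added between two versions."""
    old = dict(iter_processors(old_root))
    new = dict(iter_processors(new_root))
    return {
        # Present in both versions: state can be carried over.
        "kept": sorted(old.keys() & new.keys()),
        # Only in the old version: deleted together with its state.
        "removed": sorted(old.keys() - new.keys()),
        # Only in the new version: created as a new component.
        "added": sorted(new.keys() - old.keys()),
    }
```

A processor that appears under "removed" in one process group and under "added" in another with the same name is a likely sign that it was moved or re-created, so its state would be lost during the version change.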

Version change strategies

Depending on the type of your flow, select the most appropriate flow version change strategy.

Stop & Process Data

This strategy prioritizes data consistency by stopping source processors and waiting until data is processed before stopping all other components. Once all components have stopped, the flow version is changed and components are started.

Use this strategy when your sources are durable and can handle your source processor being stopped. This generally works well when your source processors are pulling data from sources like Kafka or other messaging queues, databases or file systems.

If the queued data is not processed within the set time, the version change fails. You can then retry the operation with a longer timeout, or you can cancel.

Only Restart Affected Components

This strategy prioritizes uptime by identifying and stopping only components that have changed while keeping all others running, replacing and then starting affected components.

Use this strategy when you want to prioritize uptime of unchanged components or you have made only minor processor configuration changes.

This works well for deployments with inbound connections and will keep your source processors running if they have not changed compared to the previous version.

Stop & Empty Queues

This strategy forces a version change by stopping all components, emptying all queues, changing flow version, and then starting all components.

Use this strategy only when you want to force a flow version change without keeping any processors running or attempting to process queued data.

All processors will be stopped and all queues will be emptied as part of this strategy.
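The guidance for the three strategies can be summarized as a small decision helper. This is only an illustrative sketch: the function name, its parameters, and the order in which the conditions are checked are assumptions, not product behavior. The strategy names are the ones shown in the Change Flow Version dialog.

```python
from typing import Optional


def suggest_strategy(
    force: bool,
    durable_sources: bool,
    has_inbound_connections: bool,
    minor_changes_only: bool,
) -> Optional[str]:
    """Suggest a flow version change strategy from a few yes/no answers.

    Assumption: the precedence below (force first, then uptime, then data
    consistency) is one reasonable reading of the guidance, not a rule.
    """
    if force:
        # All queues are emptied, so queued data is lost.
        return "Stop & Empty Queues"
    if has_inbound_connections or minor_changes_only:
        # Unchanged components, including source processors, keep running.
        return "Only Restart Affected Components"
    if durable_sources:
        # Sources such as Kafka or databases tolerate a stopped source processor.
        return "Stop & Process Data"
    # No clear recommendation; review the strategy descriptions manually.
    return None
```

For example, a flow that pulls from Kafka, has no inbound connections, and contains larger changes would be directed to Stop & Process Data.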

  1. Select the Deployment that you want to manage.
  2. Click Options > Change Flow Version.
    The Change Flow Version modal window opens. It shows the list of available flow versions. The current version is grayed out.
  3. Select the flow version you want to change to and click Continue.
  4. Review a summary of the configuration changes caused by the version change and make any necessary edits from the left pane.
  5. Select a flow version change strategy.

    The available options are:

    • Stop & Process Data - If you select this strategy, you can set the maximum wait time for data to be processed and queues to be emptied before the request times out. The default value is 15 minutes.

    • Only Restart Affected Components
    • Stop & Empty Queues - If you select this strategy, you must accept potential data loss by selecting I understand and choose to proceed with the configuration as is.

  6. Click Change Flow Version.

After you click Change Flow Version, you are redirected to the Alerts tab in the Flow Details view where you can track how the version change progresses.

Downloading NiFi application log

You can download the NiFi application log from the CDF Deployment Manager to use it for troubleshooting.

This feature allows you to download the NiFi application log that is currently being written. Because the log file is rotated and the old file is archived once the file size reaches 10 MB, 10 MB is the theoretical maximum size of the log you can download using this method. For information on downloading archived log files, see Diagnostic bundle collection.
You need DFFlowAdmin permission to perform this action.
  1. Select the Deployment that you want to manage.
  2. Click Options > Download NiFi Log.
    The current NiFi application log is downloaded to your computer in tar.gz format.
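Once the archive is on your computer, you can inspect it without unpacking it manually. The following is a minimal sketch using the Python standard library. The function name is hypothetical, and the sketch assumes that the first regular file in the archive is the log you want to read, because the file names inside the archive are not specified here.

```python
import tarfile
from typing import List


def tail_lines_from_archive(archive_path: str, max_lines: int = 50) -> List[str]:
    """Return the last max_lines lines of the first regular file in a tar.gz archive."""
    if max_lines <= 0:
        return []
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # Skip directories and other non-file entries.
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            # Replace undecodable bytes instead of failing on them.
            text = extracted.read().decode("utf-8", errors="replace")
            return text.splitlines()[-max_lines:]
    return []
```

Because the downloaded log is at most about 10 MB, reading the whole file into memory as done here is acceptable.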

Suspending a deployment

Suspending a Cloudera DataFlow deployment terminates cloud resources belonging to a NiFi flow, while maintaining flow persistence.

Suspending a Cloudera DataFlow deployment results in the following:

  • The NiFi flow stops processing data and all underlying cloud resources are terminated. Any unprocessed data in the flow is stored in memory and its processing resumes when you resume the deployment.

  • Flow persistence is maintained while a deployment is suspended.
  • You cannot modify deployment configuration while the deployment is suspended.
  • Suspended deployments are not billable, resulting in reduced costs.
You must have deployed a flow definition in Cloudera DataFlow.
  1. Select the Deployment that you want to manage.
  2. Click Options > Suspend Deployment.
    The Suspend [Deployment Name] modal opens.
  3. Optional: Select the Finish data processing option and set a maximum wait time in minutes for data to be processed and queues to be emptied before the request times out.
    This option stops source processors first and waits for queued data to be processed before the flow is suspended. Set a wait time between 5 and 60 minutes using the slider.
  4. Click Suspend to suspend the Cloudera DataFlow deployment.

Resuming a deployment

You can resume a suspended Cloudera DataFlow deployment.

Resuming a Cloudera DataFlow deployment reallocates the underlying cloud resources and returns a deployment to the state it was in before being suspended.

You must have a suspended flow deployment in Cloudera DataFlow.
  1. Select the Deployment that you want to manage.
  2. Click Options > Resume Deployment.
    The Resume [Deployment Name] modal opens.
  3. Click Resume Deployment to resume the flow deployment, reallocating cloud resources.

Export deployment configuration

You can export a deployment configuration to create additional deployments with a similar configuration in the same or a different environment.

  • Exported configurations may be edited, and you can also modify them after the importing step during flow deployment.
  • One deployment can have only one exported configuration. Performing a new export overwrites the existing one.
  • Exported deployment configurations are available for every user who can start a new deployment in a given environment, even if the exported deployment was originally created under a specific Project.
  1. Select the Deployment that you want to manage.
  2. Click Options > Export Configuration.
    The Export Deployment Configuration modal opens.
  3. Optional: Add comments to the exported configuration.
  4. Confirm your choice by clicking Export in the modal.

The configuration is exported to the {LOG location}/cdf-deployment-backup directory. {LOG location} is configured during the creation of the associated Cloudera Environment. If you want to reuse the exported configuration in a different environment, you can either configure that to use the same {LOG location}, or you can copy the exported .tar.gz and JSON files to the {LOG location}/cdf-deployment-backup directory of the target environment.

You can reuse the exported configuration during deployment of the same flow definition to recreate a flow with similar configuration.
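If you copy an exported configuration to a different environment, both the source and the target paths follow the {LOG location}/cdf-deployment-backup pattern described above. The following sketch only builds the source and target paths; it does not perform the copy, which depends on your storage tooling. The function names, bucket names, and file names are hypothetical, and the sketch assumes that the JSON files use a .json extension.

```python
from typing import Iterable, List, Tuple

BACKUP_SUBDIR = "cdf-deployment-backup"


def backup_dir(log_location: str) -> str:
    """Return the backup directory under a given {LOG location}."""
    return f"{log_location.rstrip('/')}/{BACKUP_SUBDIR}"


def plan_copy(
    source_log_location: str,
    target_log_location: str,
    file_names: Iterable[str],
) -> List[Tuple[str, str]]:
    """Pair each exported .tar.gz or .json file with its path in the target environment."""
    source_dir = backup_dir(source_log_location)
    target_dir = backup_dir(target_log_location)
    return [
        (f"{source_dir}/{name}", f"{target_dir}/{name}")
        for name in file_names
        # Only the exported archive and JSON files need to be copied.
        if name.lower().endswith((".tar.gz", ".json"))
    ]
```

For example, plan_copy("s3a://source-bucket/logs", "s3a://target-bucket/logs", ["my-flow.tar.gz", "my-flow.json"]) returns two pairs of paths, one for each file, that you can pass to the copy tool of your cloud provider.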

Terminating a deployment

You can terminate a deployment to remove it from Cloudera DataFlow.

If you terminate a deployment, you delete the associated NiFi resources and your flow no longer remains active. The associated flow definition remains in the catalog and is available to be deployed again in a new deployment.

You must have deployed a flow definition in Cloudera DataFlow.
  1. Select the Deployment that you want to manage.
  2. Click Options > Terminate Deployment.
    The Terminate [Deployment Name] modal opens.
  3. Optional: Select the Delete assigned endpoint hostname option to delete the endpoint hostname assigned to the deployment. If you do not select this option, you can reassign existing, unassigned endpoints during flow deployment.
  4. Enter the name of the deployment to confirm and click Terminate.