Cloudera Flow Management Migration Tool overview

The Cloudera Flow Management Migration Tool enables the semi-automatic migration of flows from Cloudera Flow Management 2.1.7.1000 powered by NiFi 1 to Cloudera Flow Management 4.0.0 powered by NiFi 2, using predefined transformation rules and logic. It supports a clear, step-by-step approach for migrating your data flows.

When using the Cloudera Flow Management Migration Tool, you can issue various commands that enable partial or complete migration of the incoming data flow. These commands are customizable through specific arguments to run either individual steps or the full migration process. The Migration Tool stops running after the command is completed and can be run again with the same or different commands, as needed. This modular design allows you to customize the migration workflow according to your needs.

Running Migration Tool commands does not modify the source data flow or the associated NiFi instances. Repeating the same command on the same input produces functionally identical results, with two exceptions: generated unique IDs may vary across runs, and the order of components within the serialized flow.json file may differ. As a result, file-based comparisons may show variations even though the functional outcome is the same.
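Because IDs and component order can differ between runs, a plain file diff can report changes that are not functional. The following is a minimal sketch of an order-insensitive comparison. It assumes that IDs are stored under the keys identifier and instanceIdentifier, which may not match your flow.json, and it does not handle fields that reference IDs under other key names.

```python
import json

# Assumed names of ID-bearing fields; adjust to what your flow.json contains.
ID_KEYS = {"identifier", "instanceIdentifier"}

def normalize(node):
    """Copy a parsed JSON value, dropping ID fields and sorting every list."""
    if isinstance(node, dict):
        return {k: normalize(v) for k, v in node.items() if k not in ID_KEYS}
    if isinstance(node, list):
        items = [normalize(v) for v in node]
        # Sort by canonical JSON text so that element order no longer matters.
        return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
    return node

def same_flow(a, b):
    """Compare two parsed flows while ignoring IDs and list order."""
    return normalize(a) == normalize(b)
```

Load both files with json.load and pass the results to same_flow. Note that this treats every list as unordered, so it would also hide a reordering that does matter.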

Different Migration Tool commands apply specific transformation logic to the input data flow. Each command must be configured appropriately to achieve the desired outcome. For comprehensive details on each command’s functionality, as well as the arguments and parameters available for configuration, see the Cloudera Flow Management Migration Tool Command Reference.

The Migration Tool works with flow.json as the input source. From this source, the tool can perform the following actions:

  • Transforming variables to parameter contexts: It translates variables into parameter contexts for improved organization and compatibility with NiFi 2.
  • Converting templates: It extracts and converts templates into separate flow_definition.json files.
  • Updating components: It updates components to align with NiFi 2 requirements wherever possible, potentially applying broader modifications when needed.
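To illustrate the first action in the list above, the sketch below turns a set of variables into a parameter context and rewrites simple ${name} references into the NiFi parameter syntax #{name}. This is not the Migration Tool's actual logic. The function name, the context name, and the dictionary layout are hypothetical, and only exact ${name} references to known variables are rewritten.

```python
import re

# Matches a bare reference such as ${db.host}, but not ${name:function()}.
SIMPLE_REF = re.compile(r"\$\{([^${}:]+)\}")

def variables_to_parameters(variables, properties, context_name="migrated-context"):
    """Build a parameter context and rewrite references to known variables."""
    context = {
        "name": context_name,  # hypothetical name, for illustration only
        "parameters": [{"name": n, "value": v} for n, v in sorted(variables.items())],
    }

    def rewrite(match):
        name = match.group(1)
        # Leave references to anything that is not a known variable untouched.
        return "#{" + name + "}" if name in variables else match.group(0)

    rewritten = {
        key: SIMPLE_REF.sub(rewrite, value) if isinstance(value, str) else value
        for key, value in properties.items()
    }
    return context, rewritten
```

A reference such as ${filename}, which most likely points to a flow file attribute, is left unchanged because filename is not in the variable set.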

The Migration Tool generates various output files. These output artifacts are saved in the directory specified by the --outputDirectory argument. Results are organized into subdirectories (v1 for Stage 1 and v2 for Stage 2).
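A quick way to review what a run produced is to list the files under each stage subdirectory. The sketch below assumes only the v1 and v2 subdirectory names described above. The function name is hypothetical, and no assumption is made about which files the tool writes.

```python
from pathlib import Path

def list_artifacts(output_dir):
    """Return the files found under the v1 and v2 subdirectories."""
    root = Path(output_dir)
    result = {}
    for stage in ("v1", "v2"):
        stage_dir = root / stage
        if stage_dir.is_dir():
            # Paths are reported relative to the stage directory.
            result[stage] = sorted(
                p.relative_to(stage_dir).as_posix()
                for p in stage_dir.rglob("*")
                if p.is_file()
            )
        else:
            result[stage] = []  # the stage has not been run yet
    return result
```

Pass the same path that you gave to --outputDirectory. An empty list for v2 means that no Stage 2 output exists yet.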

Key features

The Cloudera Flow Management Migration Tool provides several features to support the migration of flows in smaller, manageable chunks, enhancing control and validation.

Reusability

Since the Migration Tool does not modify the input files and produces deterministic results (except for unique identifiers of newly generated components), you can rerun the migration on the same input multiple times. This can be useful if adjustments are needed before or after migration and the results need to be compared.

Activity log

The Activity Log is an important tool for supervising changes in the flow. It lists every modification and gives the reason (change-info) for each one.

Stages

The Migration Tool splits the migration into stages so that you can handle manual migration steps while still using the source NiFi version for most of the process. This way, you can apply a large number of expected changes without encountering version-related differences, which makes validating and tracking modifications easier.

Separable commands

Flow migration steps such as template migration can be run separately, allowing you to work on smaller, more manageable parts of the flow.

Process groups

Migration can be scoped to specific process groups by setting the Process Group ID argument. This limits transformations to the specified group and its children, enabling targeted migrations.
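To see which groups a scoped migration would cover, you can walk the nested group structure of the flow. The sketch below assumes that each group stores its ID under identifier and its child groups under processGroups. Verify these key names against your flow.json. The function names are hypothetical.

```python
def find_group(group, group_id):
    """Depth-first search for the group with the given ID."""
    if group.get("identifier") == group_id:
        return group
    for child in group.get("processGroups", []):
        found = find_group(child, group_id)
        if found is not None:
            return found
    return None

def groups_in_scope(root, group_id):
    """Return the IDs of the given group and all of its descendants."""
    start = find_group(root, group_id)
    if start is None:
        return []
    ids, stack = [], [start]
    while stack:
        group = stack.pop()
        ids.append(group["identifier"])
        # Push children in reverse so they are visited in document order.
        stack.extend(reversed(group.get("processGroups", [])))
    return ids
```

An empty result means that the ID was not found anywhere under the root group.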

Iterative migration (Loopback)

The main output of the Migration Tool, typically a flow.json file, can be used as input for subsequent migrations. This iterative approach helps identify and address additional issues after manual adjustments.

Important considerations

Business logic validation

The Cloudera Flow Management Migration Tool ensures that the migration follows the predefined ruleset, but since the NiFi 2 feature set differs from NiFi 1, some business behaviors may change and require human validation. This post-migration validation is essential to ensure the final flow meets your business requirements.

Generating an initial report

Although the Migration Tool lacks an explicit "preview" feature, performing a full migration generates artifacts that can provide an initial overview of compatibility issues. Review the Activity Log entries for an early assessment. The number of change entries in the Activity Log relative to the flow size, together with the count of manual-change-requests, is a reliable indicator of how much change to expect.

Migration stages

Migration is performed in two sequential stages (referred to as “Stage 1” and “Stage 2”), ensuring a structured transition from source to target compatibility. A complete migration workflow consists of a Stage 1 migration of the incoming flow, followed by a Stage 2 migration of the Stage 1 result. The final output of Stage 2 is the completed migration result: a fully migrated, NiFi 2 compatible flow.

Stage 1: Source compatibility
  • Processes the incoming NiFi flow (input flow.json) to create a partially updated flow as the output, compatible with NiFi 1.

  • Whether you run the Stage 1 commands for templates, variables, and components individually or use the aggregate command for all steps, the resulting transformations remain compatible with the source version.

  • Generates a list of manual changes needed for further adjustments.

  • Must be rerun after you apply manual changes, until the flow is stable (that is, no further modifications are required).
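The Stage 1 cycle of running the tool, applying manual changes, and running it again can be sketched as a loop. Both callables below are hypothetical stand-ins: run_stage1 represents invoking the Migration Tool and collecting its output and manual change requests, and apply_manual_changes represents your own edits. Neither is part of the tool.

```python
def migrate_until_stable(flow, run_stage1, apply_manual_changes, max_rounds=10):
    """Repeat Stage 1 until it changes nothing and requests nothing."""
    for round_no in range(1, max_rounds + 1):
        result, manual_requests = run_stage1(flow)
        # Stable: the output equals the input and no manual work is requested.
        if result == flow and not manual_requests:
            return result, round_no
        # Feed the adjusted output back in as the next input.
        flow = apply_manual_changes(result, manual_requests)
    raise RuntimeError(f"flow not stable after {max_rounds} rounds")
```

In practice, generated IDs can differ between runs, so the equality check would need to ignore them. The max_rounds limit is an arbitrary safeguard against a loop that never settles.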

Stage 2: Target compatibility
  • Takes the output of Stage 1 and processes it as input to produce a flow compatible with NiFi 2.

  • Applies transformations that go beyond the source NiFi version's capabilities to meet NiFi 2 standards.

  • Ensures that the final output is a fully migrated and compliant data flow.

Both migration stages consist of multiple steps that can be executed individually or as a whole. For best results, Cloudera recommends completing all steps within a stage before proceeding to the next one to ensure consistency and avoid potential issues.

The same migration steps are present in both stages, but they involve different transformations and serve different purposes. For example:

  • In Stage 1, component migration applies changes compatible with the source version.

  • In Stage 2, component migration applies changes required by the target version.

Limitations and manual steps

Some aspects of the migration are not handled automatically by the Migration Tool and require manual intervention.

Deprecated components

Certain deprecated component types lack replacements in the target NiFi version or cannot be algorithmically replaced. These components are not migrated automatically when using the Migration Tool and need to be updated manually. In most cases, the Migration Tool provides information on the expected manual changes.

Custom components

Custom components, such as processors not included in the Cloudera Flow Management version in use, are not supported. The definitions of such components, including attributes and bundle versions, are preserved without modification. You must manually update these components.

Ghost components in templates

Components saved as ghost components within templates are not migrated during the template migration process.

Variables, dynamic properties, and flow file attributes

Variables, dynamic properties, and flow file attributes that share the same name are not migrated automatically. You must manually resolve these naming conflicts and update the affected elements to ensure compatibility.
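Before migrating, it can help to find such overlaps yourself. The sketch below assumes that you have already collected the three sets of names from your flow. How to collect them is not shown, and the function name is hypothetical.

```python
from collections import defaultdict

def find_name_conflicts(variables, dynamic_properties, attributes):
    """Map each name used by more than one kind of element to those kinds."""
    seen = defaultdict(set)
    sources = (
        ("variable", variables),
        ("dynamic property", dynamic_properties),
        ("flow file attribute", attributes),
    )
    for kind, names in sources:
        for name in names:
            seen[name].add(kind)
    # Keep only names that appear in at least two of the three sets.
    return {name: sorted(kinds) for name, kinds in seen.items() if len(kinds) > 1}
```

Each name in the result needs a manual decision, for example renaming the variable before it becomes a parameter.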