Migration best practices

Migrating complex data flows requires careful planning, validation, and a structured strategy. Although the Cloudera Flow Management Migration Tool automates many aspects of the process, manual oversight and adjustments remain essential for accurate results, because some NiFi 2 requirements cannot be fully addressed through automation alone.

To achieve a successful migration, Cloudera recommends a systematic approach combining the tool’s features with manual intervention and careful validation.

Follow an iterative refinement process in Stage 1!

Stage 1 migration automatically converts the flow, applying transformations while maintaining compatibility with NiFi 1, and generates a list of required manual changes. Edit your flow manually to address the requested changes, then use the updated flow as input for the next iteration of Stage 1. Repeat this process until the flow is stable, meaning that conversion introduces no automatic changes (there are no change entries in the Activity Log) and the Activity Log contains no manual change requests that require attention in Stage 1.

At this point, Stage 1 can be considered fully completed with all issues resolved, allowing you to proceed to Stage 2.
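The iteration described above is a fixed-point loop: rerun Stage 1 until the Activity Log is empty of both automatic changes and Stage 1 manual requests. The following Python is a minimal sketch under stated assumptions, not the Migration Tool's API. `run_stage1` and `apply_manual_fixes` are hypothetical callables standing in for a Stage 1 run of the tool and for your own manual edits, and `ActivityLog` is a simplified, hypothetical model of the real log.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class ActivityLog:
    # Hypothetical model of the Activity Log. Only the two kinds of
    # entries relevant to the Stage 1 stability check are represented.
    automatic_changes: List[str] = field(default_factory=list)
    # Manual change requests that require attention in Stage 1.
    manual_requests: List[str] = field(default_factory=list)

    def is_stable(self) -> bool:
        # Stable: no automatic changes and no Stage 1 manual requests.
        return not self.automatic_changes and not self.manual_requests


def iterate_stage1(
    flow: Any,
    run_stage1: Callable[[Any], Tuple[Any, ActivityLog]],
    apply_manual_fixes: Callable[[Any, List[str]], Any],
    max_iterations: int = 10,
) -> Tuple[Any, int]:
    """Repeat Stage 1 until the flow is stable.

    Returns the stable flow and the number of Stage 1 runs it took.
    Both callables are placeholders (assumptions) for the real tool
    invocation and for the manual editing step.
    """
    for iteration in range(1, max_iterations + 1):
        converted, log = run_stage1(flow)
        if log.is_stable():
            return converted, iteration
        # Feed the manually corrected flow into the next iteration.
        flow = apply_manual_fixes(converted, log.manual_requests)
    raise RuntimeError(
        f"Flow did not stabilize after {max_iterations} Stage 1 runs"
    )
```

The `max_iterations` guard is a design choice of this sketch: it stops the loop if a manual edit keeps reintroducing changes, so the problem surfaces instead of cycling indefinitely.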

Partition the migration process for better control!

Migrating complex data flows requires a more structured approach. Cloudera advises against migrating the entire flow in a single Stage 1 migration. Instead, partition the migration and focus on one part of the flow at a time, which simplifies troubleshooting and applying manual changes. You can use the following methods:

  • Run the migration in smaller steps, starting with templates, followed by variables, and then components.
  • Run the migration by process group.

Both approaches allow you to focus on smaller, more manageable sections of the flow. For larger flows, you can also combine these two methods. The migration logic you choose should be tailored to your specific needs and flow structure. Keep in mind that multiple levels of partitioning may require additional coordination, so it is important to select a level of granularity that provides more benefit than added complexity.
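Combining the two methods amounts to building an ordered work list: for each process group, run the templates, variables, and components steps in turn. The sketch below only plans that order and does not call the Migration Tool. It assumes the flow is available as a nested dictionary whose process groups carry `name` and `processGroups` keys, which is a hypothetical structure chosen for illustration, and it partitions at a single level (the top-level process groups).

```python
from typing import Dict, List, Tuple

# Step order taken from the text: templates, then variables, then components.
STEPS = ("templates", "variables", "components")


def migration_plan(root_group: Dict) -> List[Tuple[str, str]]:
    """Return (process group name, step) pairs in the order to migrate them.

    Assumption: `root_group` is a dict with a "processGroups" list whose
    items each have a "name" key. Only top-level groups are partitioned;
    deeper levels would add coordination overhead.
    """
    plan: List[Tuple[str, str]] = []
    for group in root_group.get("processGroups", []):
        for step in STEPS:
            plan.append((group["name"], step))
    return plan


if __name__ == "__main__":
    # Hypothetical flow with two top-level process groups.
    flow = {
        "name": "root",
        "processGroups": [{"name": "ingest"}, {"name": "enrich"}],
    }
    for group_name, step in migration_plan(flow):
        print(f"{group_name}: {step}")
```

Keeping the plan as plain data makes it easy to record which items are done and to adjust the granularity later without changing how individual runs are performed.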

Validate after every iteration!

Load the modified flow into a NiFi instance matching the version of the stage you are on. Confirm functionality and resolve any manual inspection requests from the Activity Log.

Always review the Activity Log!

Use the Activity Log to understand the rationale behind each change and to identify which manual adjustments the flow still needs.

Run a full migration after each stage to confirm completeness!

As part of your final validation, run a full migration after each stage. This ensures that no part of the flow was overlooked during partitioning.

  • At the end of Stage 1, perform a full migration using only the Stage 1 restriction to verify completeness.
  • At the end of Stage 2, run a final full migration without restrictions to ensure the entire flow migration is complete.
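As a complement to the full migration runs listed above (not a replacement for them), simple bookkeeping can show which process groups were never covered by a partitioned run. The sketch below is an illustration under assumptions: the flow is modeled as a nested dictionary with hypothetical `identifier` and `processGroups` keys, and `migrated_ids` is a record you maintain yourself of the groups already migrated.

```python
from typing import Dict, Iterable, Set


def all_group_ids(group: Dict) -> Set[str]:
    """Collect the identifier of a process group and all its descendants.

    Assumption: each group is a dict with an "identifier" key and an
    optional "processGroups" list of child groups.
    """
    ids = {group["identifier"]}
    for child in group.get("processGroups", []):
        ids |= all_group_ids(child)
    return ids


def overlooked_groups(root_group: Dict, migrated_ids: Iterable[str]) -> Set[str]:
    """Return identifiers of groups that no partitioned run has covered."""
    return all_group_ids(root_group) - set(migrated_ids)
```

An empty result only says that every group was visited at least once; the full migration run remains the actual completeness check.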