Switching flow persistence providers

This tutorial walks you through how to switch from DatabaseFlowPersistenceProvider to GitFlowPersistenceProvider with the help of NiFi CLI.

  1. Use SSH to connect to the host running your NiFi Registry instance, so that you can run NiFi CLI commands there.
  2. Create the target directory where you will save the flows that you are exporting from NiFi Registry.
    For example: sudo mkdir /tmp/flowbkp
  3. Export the flows stored in NiFi Registry and save them in the directory you have created using the following command:
    sudo -u nifiregistry **[/path/to/cli.sh]** registry export-all-flows --outputDirectory "**[path/to/directory]**"

    For example:

    sudo -u nifiregistry /opt/cloudera/parcels/CFM/TOOLKIT/bin/cli.sh registry export-all-flows --outputDirectory "/tmp/flowbkp"
    Where:
    • sudo -u nifiregistry executes the subsequent command as the user nifiregistry.
    • /opt/cloudera/parcels/CFM/TOOLKIT/bin/cli.sh specifies the full path to the NiFi Toolkit CLI script.
    • registry export-all-flows instructs the CLI to export all flows and the associated resources from the NiFi Registry.
    • --outputDirectory "/tmp/flowbkp" specifies the output directory where the exported flows will be saved.
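Before stopping the service, it can be worth confirming that the export actually wrote files. The following is a minimal sketch and not part of NiFi CLI: count_exported_files is a hypothetical helper name, and it makes no assumption about the file names or layout that export-all-flows produces. It only counts regular files under the directory.

```shell
# Hypothetical helper: count regular files under an export directory.
# It does not assume any particular file layout written by the CLI.
count_exported_files() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # tr strips the padding that some wc implementations add.
  find "$dir" -type f | wc -l | tr -d ' '
}

# Example usage with the directory from the example above:
#   count_exported_files /tmp/flowbkp
```

A result of 0 suggests that the export did not run as expected and is worth investigating before you continue.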
  4. Stop NiFi Registry in Cloudera Manager.
    This ensures a clean transition and prevents conflicts or data inconsistencies while you switch persistence providers.
    1. Click the Clusters tab in the left-hand navigation.
    2. Select the NiFi Registry service in the list of services to access the service dashboard page.
    3. Stop the service by opening the Actions drop-down menu next to the service name and clicking Stop.
  5. Back up the existing database.
    It is crucial to create a backup of your existing NiFi Registry database before making any changes. This ensures that you have a copy of the data in case any issues occur while you switch persistence providers.
    1. Click the Configuration tab of the NiFi Registry service dashboard page in Cloudera Manager.
    2. Locate the configuration property named nifi.registry.flow.storage.directory.
      This property specifies the directory where the Flow Storage files are stored for the FileSystemFlowPersistenceProvider. It is only required if Enable Flow File Provider is enabled. Otherwise, both flow content and metadata are stored in the database.
    3. Create a directory as a backup location, where you want to store the Flow Storage files.
    4. Set the flow storage directory to the backup location you created. To do this, use the Providers: Default Flow Persistence File Provider Property - Flow Storage Directory property.
      You can search for ‘providers’ to find this property more easily. Make sure to specify the full absolute path to the new location.
    5. Use file system commands or tools to copy the contents of the original Flow Storage directory to the new backup directory, following your usual backup procedure.
      Copy all the files and directories from the original location to the new location, and preserve the directory structure.
    6. Delete all data from the original database location.
      NiFi Registry should be completely empty, including the buckets.
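The copy of the Flow Storage files described in this step can be sketched as follows. This is an illustration under stated assumptions, not a prescribed backup procedure: backup_flow_storage is a hypothetical helper name, both paths are placeholders that you supply, and the comparison at the end assumes that the backup directory starts out empty.

```shell
# Hypothetical helper: copy a flow storage directory to a backup location,
# preserving structure, permissions, and timestamps, then compare both trees.
backup_flow_storage() {
  src="$1"
  dest="$2"
  [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # "src/." copies the contents of src, including hidden files, into dest.
  cp -a "$src/." "$dest/" || return 1
  # diff -r exits non-zero if any file differs or exists on one side only.
  diff -r "$src" "$dest" >/dev/null
}

# Example usage (placeholder paths):
#   backup_flow_storage "[path/to/flow_storage]" "[path/to/backup]"
```

Checking the exit status before deleting anything from the original location reduces the risk of removing data that was not copied completely.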
  6. Configure the NiFi Registry flow persistence provider properties in Cloudera Manager.
    1. Click the Configuration tab of the NiFi Registry service dashboard page in Cloudera Manager.
      This tab displays the configuration settings for NiFi Registry.
    2. Disable the database provider by clearing the Providers: Enable Database Flow Persistence Provider property.
      If the Flow File Provider was enabled, disable it as well by clearing the Enable Flow File Provider property.
    3. Enable the Git-based provider by selecting the Providers: Enable the git-based flow persistence provider property.
    4. Configure the GitFlowPersistenceProvider.
      1. Set the path to your cloned Git repository using the Providers: File system path for a directory where flow contents files are persisted to property.

        This is the file system path of the directory where flow content files are persisted. The directory must exist when NiFi Registry starts, and it must be initialized as a Git repository.

      2. Define where you want to push the flow changes using the Providers: Name of the remote repository to push to property.

        When a new flow snapshot is created, files are updated in the specified Git directory. The GitFlowPersistenceProvider then creates a commit to the local repository. If Remote to push is defined, it also pushes to the specified remote repository.

      3. Provide a Git user name using the Providers: Git username for the remote repository property.

        NiFi Registry uses this user name to interact with the repository. It is used to make push requests to the remote repository when Remote to push is enabled and the remote repository is accessed over HTTP. If SSH is used, user authentication is done with SSH keys.

      4. Provide a Git password using the Providers: The password for the Remote Access User property.

    5. Save the configuration changes.
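Because the flow contents directory must exist and be initialized as a Git repository before NiFi Registry starts, the preparation can be sketched as below. This is a sketch under assumptions: prepare_flow_repo is a hypothetical helper name, and the directory, remote name, and remote URL are values you supply. If you already cloned the repository, the directory and remote exist and the helper changes nothing.

```shell
# Hypothetical helper: make sure a directory exists, is a Git repository,
# and has the named remote configured. Safe to run more than once.
prepare_flow_repo() {
  repo_dir="$1"
  remote_name="$2"
  remote_url="$3"
  mkdir -p "$repo_dir" || return 1
  # Initialize only if the directory is not already a repository root.
  if [ ! -d "$repo_dir/.git" ]; then
    git -C "$repo_dir" init --quiet || return 1
  fi
  # Add the remote only if it is not configured yet.
  if ! git -C "$repo_dir" remote get-url "$remote_name" >/dev/null 2>&1; then
    git -C "$repo_dir" remote add "$remote_name" "$remote_url" || return 1
  fi
}

# Example usage (placeholder values):
#   prepare_flow_repo "[path/to/git/directory]" origin "[remote URL]"
```

The remote name passed here should match the value you set in the Providers: Name of the remote repository to push to property.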
  7. Start NiFi Registry to apply the new configuration.
    1. From Cloudera Manager, click the Clusters tab in the left-hand navigation.
    2. Select NiFi Registry.
    3. Start the service by opening the Actions drop-down menu next to the service name and clicking Start.
  8. Ensure the user has full access to the Git repository and that the Git working tree is clean.
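One way to check that the working tree is clean is to look at the output of git status. The helper below is a minimal sketch: is_worktree_clean is a hypothetical name, and the repository path is whatever you configured as the flow contents directory.

```shell
# Hypothetical helper: succeed only if the repository has no modified,
# staged, or untracked files.
is_worktree_clean() {
  repo_dir="$1"
  # --porcelain prints one line per changed or untracked path;
  # empty output means the working tree is clean.
  changes=$(git -C "$repo_dir" status --porcelain) || return 2
  [ -z "$changes" ]
}

# Example usage (placeholder path):
#   is_worktree_clean "[path/to/git/directory]" && echo "clean"
```

Running the check as the same user that runs NiFi Registry also gives an early hint about permission problems on the repository.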
  9. Import all flows from the directory where you saved your exported flows using the following command:
    sudo -u nifiregistry **[/path/to/cli.sh]** registry import-all-flows --input "**[path/to/directory]**" --skipExisting

    For example:

    sudo -u nifiregistry /opt/cloudera/parcels/CFM/TOOLKIT/bin/cli.sh registry import-all-flows --input "/tmp/flowbkp" --skipExisting
    Where:
    • sudo -u nifiregistry executes the subsequent command as the user nifiregistry.
    • /opt/cloudera/parcels/CFM/TOOLKIT/bin/cli.sh specifies the full path to the NiFi Toolkit CLI script.
    • registry import-all-flows instructs the CLI to import all flows and the associated resources to the NiFi Registry.
    • --input "/tmp/flowbkp" specifies the input directory from which the flows will be imported.
    • --skipExisting indicates that any existing flows with the same UUIDs in the Registry should be skipped during the import process. It helps to avoid overwriting existing flows.
  10. Verify and test:
    Ensure that NiFi Registry is functioning as expected with the new flow persistence provider. Test the functionality, including creating and versioning flows, accessing metadata, and verifying data integrity.
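Since the GitFlowPersistenceProvider creates a commit in the local repository when a flow snapshot is created, the imported flows should show up in the Git history. The following sketch lists the most recent commits. show_recent_flow_commits is a hypothetical helper name, and the commit messages you see depend on your environment.

```shell
# Hypothetical helper: print the most recent commits of a repository,
# or fail with a message if the repository has no commits yet.
show_recent_flow_commits() {
  repo_dir="$1"
  limit="${2:-5}"
  # rev-parse --verify fails when HEAD does not point to a commit.
  if ! git -C "$repo_dir" rev-parse --verify --quiet HEAD >/dev/null 2>&1; then
    echo "no commits found in $repo_dir" >&2
    return 1
  fi
  git -C "$repo_dir" log --oneline -n "$limit"
}

# Example usage (placeholder path):
#   show_recent_flow_commits "[path/to/git/directory]" 10
```

If no commits appear after the import, the provider configuration and the repository permissions are reasonable places to start troubleshooting.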
  11. Monitor and troubleshoot:
    Monitor the NiFi Registry instance closely after the migration to ensure there are no unexpected issues. If any problems arise, check the documentation and troubleshooting resources provided by NiFi Registry and the chosen flow persistence provider.