What's new in Flow Management with NiFi 2 [Technical Preview]

Learn about the new features of Flow Management using NiFi 2.0 in Cloudera DataFlow for Data Hub 7.2.18.

Flow Management Data Hub in CDP Public Cloud 7.2.18 is compatible with both NiFi 1 and NiFi 2. This section provides details about Flow Management Data Hub based on Apache NiFi 2.0. Two new Data Hub templates are introduced in Technical Preview for deploying Flow Management clusters using NiFi 2.0.

Currently there is no upgrade path from Flow Management clusters based on NiFi 1 to clusters with NiFi 2. The only approach is to start new clusters using the new templates and migrate your existing flows to these new clusters. However, these new clusters are not production ready and should not be used for critical workloads.

NiFi 2 introduces a significant number of changes compared to NiFi 1, including breaking changes. See Behavioral changes for more information. Additionally, you can expect further breaking changes in upcoming releases, particularly where components are removed entirely in favor of better and more efficient alternatives.

Here are the most important new features and other significant improvements of this release:

Python API - Technical Preview

NiFi 2 provides a first-class Python API that allows users to develop NiFi processors in Python. For more information, see the NiFi Python Developer’s Guide.
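
As a rough illustration of what such a processor can look like, the following minimal sketch follows the FlowFileTransform pattern described in the NiFi Python Developer’s Guide. The processor name UppercaseContent, its description and tags, and the uppercased attribute are hypothetical. The classes in the except branch are stand-ins added only so that the file can be exercised outside NiFi; inside NiFi, the nifiapi package is supplied by the framework.

```python
# Sketch of a NiFi 2 Python processor that upper-cases FlowFile text content.
# Inside NiFi, `nifiapi` is provided by the framework. The fallback classes
# below are hypothetical stand-ins for trying the logic outside NiFi.
try:
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
except ImportError:
    class FlowFileTransform:  # stand-in, not the real base class
        pass

    class FlowFileTransformResult:  # stand-in, not the real result type
        def __init__(self, relationship, attributes=None, contents=None):
            self.relationship = relationship
            self.attributes = attributes
            self.contents = contents


class UppercaseContent(FlowFileTransform):  # hypothetical processor name
    class Java:
        # Tells NiFi which Java-side interface this Python class implements.
        implements = ['org.apache.nifi.python.processor.FlowFileTransform']

    class ProcessorDetails:
        version = '0.0.1-SNAPSHOT'
        description = 'Converts FlowFile text content to upper case.'
        tags = ['example', 'text']

    def __init__(self, **kwargs):
        pass

    def transform(self, context, flowfile):
        # Read the incoming content, assuming it is UTF-8 encoded text.
        text = flowfile.getContentsAsBytes().decode('utf-8')
        # Route to "success" with the new content and one extra attribute.
        return FlowFileTransformResult(
            relationship='success',
            contents=text.upper(),
            attributes={'uppercased': 'true'},
        )
```

How the processor file is deployed to a Flow Management cluster is not covered here; refer to the NiFi Python Developer’s Guide for the extension directory layout and dependency handling.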

Stateless Engine at Process Group level - Technical Preview

It is now possible to configure a Process Group to use the Stateless Engine for executing flows. This is particularly useful for transactional use cases such as change data capture (CDC), or for scenarios where a message broker is the source and the goal is to achieve exactly-once semantics. For more information, see the Apache NiFi User Guide.

Rules Engine - Technical Preview

This feature allows NiFi administrators to define a set of rules that enforce best practices in NiFi flow design. The version of the feature available in this release is not complete and is still under development. More details will be available once the feature is finalized and released.

New NiFi components

The following list contains the components that are available in Flow Management 7.2.18 clusters using NiFi 2 but not in Flow Management 7.2.18 clusters using NiFi 1. A large number of components were also added between 7.2.17 and 7.2.18. For a comprehensive list, see What's new in Flow Management with NiFi 1.

  • Processors
    • ParseDocument & ChunkDocument

      These processors handle documents and extract text content to help with Generative AI use cases.

    • ConsumePLC & PutPLC & FetchPLC

      These processors, based on the Apache PLC4X project, interact with systems over common IoT protocols such as Siemens S7, Modbus, CANopen, Allen Bradley ETH, OPC UA, and more.

    • ConsumeSlack

    • CopyAzureBlobStorage_v12

    • ListenSlack

    • PromptChatGPT

    • PutChroma & QueryChroma

      These processors interact with the ChromaDB vector database and compute embeddings while pushing documents.

    • PutMongoBulkOperations

    • PutPinecone & QueryPinecone

      These processors interact with the Pinecone vector database and compute embeddings while pushing documents.

    • QueryAzureDataExplorer

    • RenameRecordField

  • Controller services
    • ApicurioSchemaRegistry

    • ConfluentEncodedSchemaReferenceReader

    • ConfluentEncodedSchemaReferenceWriter

    • GCSFileResourceService

    • GenericPLC4XConnectionPool

    • ProxyPLC4XConnectionPool

    • SlackRecordSink

    • StandardKustoQueryService

    • StandardPLC4XConnectionPool

    • TinkerpopClientService

  • Parameter providers

    • OnePasswordParameterProvider

  • Rules

    • DisallowComponentType