What's new in Flow Management with NiFi 1

Learn about the new features of Flow Management using NiFi 1.25 in Cloudera DataFlow for Data Hub 7.2.18.

Flow Management Data Hub in CDP Public Cloud 7.2.18 is compatible with both NiFi 1 and NiFi 2. This section provides details about Flow Management Data Hub based on Apache NiFi 1.25.0. Below are the most important new features, improvements, and fixes included in this release:

Rebase on NiFi 1.25
The upgrade offers access to the newest Apache NiFi features and enhancements on the 1.x branch. Running NiFi 1.25 is a prerequisite for transitioning to NiFi 2.x later on.
New components
  • Processors
    • CalculateParquetOffsets and CalculateParquetRowGroupOffsets

      These processors can be used in combination with ConvertRecord and Parquet Reader to significantly reduce the amount of time required to convert very large Parquet files into another format.

    • CaptureChangeDebeziumDB2, CaptureChangeDebeziumMySQL, CaptureChangeDebeziumOracle, CaptureChangeDebeziumPostgreSQL, CaptureChangeDebeziumSQLServer

      These processors, currently in Technical Preview, leverage the Debezium project to ingest Change Data Capture events from external databases.

    • DecryptContentAge and EncryptContentAge

      These are a new generation of processors for data encryption and decryption. For more information, see Modernizing Streaming Encryption with age in Apache NiFi.

    • ListenNetFlow

      This processor enables NiFi to receive NetFlow data from network equipment. For more information, see Collecting NetFlow Records with Cloudera DataFlow.

    • ListenOTLP

      This processor allows NiFi to act as a destination for OTLP agents, receiving OpenTelemetry data from external applications. For more information, see Building OpenTelemetry Collection in Apache NiFi with Netty.

    • PutAzureQueueStorage_v12 and GetAzureQueueStorage_v12

      These processors use the latest library for interacting with Azure Queue Storage. It is highly recommended to switch to these new processors.

    • PutClouderaHiveQL, PutClouderaHiveStreaming, PutClouderaORC, SelectClouderaHiveQL, UpdateClouderaHiveTable, TriggerClouderaHiveMetaStoreEvent

      This is a set of Cloudera-exclusive components for interacting with Hive-based services in the Cloudera Data Platform. It is highly recommended to switch to these components, as they leverage Hive 4 features that are not yet released in the Apache project.

    • PutIcebergCDC

      This processor, currently in Technical Preview, can be used in combination with the new Debezium processors to implement CDC pipelines with Iceberg tables as the destination.

    • ExtractRecordSchema

      This processor should be used as a replacement for the InferAvroSchema processor, which will be deprecated in NiFi 2.

    • PutJiraIssue

    • PutZendeskTicket

    • QueryIoTDBRecord

    • RemoveRecordField

    • ConsumeElasticsearch

    • FilterAttribute

    • PackageFlowFile

    • PublishSlack

  • Controller services
    • ActiveMQJMSConnectionFactoryProvider

      This controller service allows you to interact with ActiveMQ without the need to deploy the JMS client on all of the NiFi nodes.

    • ADLSCredentialsControllerServiceLookup

    • AmazonGlueSchemaRegistry

    • AzureServiceBusJMSConnectionFactoryProvider

      This controller service allows you to interact with Azure Service Bus without the need to deploy the required dependencies on all of the NiFi nodes.

    • AzureStorageCredentialsControllerServiceLookup_v12

    • ClouderaHiveConnectionPool

      This controller service allows you to interact with Hive without the need to deploy the required dependencies on all of the NiFi nodes.

    • CMLLookupService

      This controller service can be used to enrich the data going through NiFi by calling exposed Machine Learning models running in Cloudera Machine Learning.

    • DatabaseTableSchemaRegistry

      This controller service enables you to retrieve the schema associated with a table from an external database. This allows you to validate the data going through NiFi against that schema before pushing the data into this table.

    • EBCDICRecordReader

      This controller service allows you to read and convert Mainframe data into another structured format like JSON, Avro, and so on. For more information, see here.

    • ExcelReader

    • ImpalaConnectionPool

      This controller service allows you to interact with Impala without the need to deploy the required dependencies on all NiFi nodes.

    • JiraRecordSink

    • RabbitMQJMSConnectionFactoryProvider

      This controller service allows you to interact with RabbitMQ without the need to deploy the JMS client on all NiFi nodes.

    • RedshiftConnectionPool

      This controller service allows you to interact with Redshift over JDBC without the need to deploy the JDBC driver on all of the NiFi nodes.

    • SimpleRedisDistributedMapCacheClientService

    • StandardFileResourceService

    • StandardJsonSchemaRegistry

    • YamlTreeReader

    • ZendeskRecordSink

  • Parameter provider
    • CyberArkConjurParameterProvider

      This parameter provider allows you to retrieve the value associated with your parameters from an external CyberArk Conjur instance.
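
As a rough illustration of the idea behind the CalculateParquetOffsets processor listed above, the sketch below splits a large record set into (offset, count) ranges that could each be converted independently. It is a conceptual sketch only: the function name partition_offsets and its parameters are hypothetical and are not part of NiFi, and the actual processors derive their offsets from the Parquet file itself.

```python
from typing import List, Tuple


def partition_offsets(total_records: int, records_per_split: int) -> List[Tuple[int, int]]:
    """Split total_records into consecutive (offset, count) ranges.

    Hypothetical helper for illustration; not a NiFi API.
    """
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    if records_per_split <= 0:
        raise ValueError("records_per_split must be positive")
    # Each range starts at a multiple of records_per_split; the last range
    # holds whatever remains.
    return [
        (offset, min(records_per_split, total_records - offset))
        for offset in range(0, total_records, records_per_split)
    ]


if __name__ == "__main__":
    # Ten records in splits of four: the final range holds the two leftover records.
    for offset, count in partition_offsets(10, 4):
        print(f"offset={offset} count={count}")
```

Because each range can be read on its own, downstream conversion work can be spread across several smaller units instead of one pass over the whole file, which is the general reason this kind of partitioning can reduce conversion time.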