What's new in Flow Management with NiFi 1

Learn about the new features of Flow Management using NiFi 1.25 in Cloudera DataFlow for Data Hub 7.2.18.

Flow Management Data Hub in CDP Public Cloud 7.2.18 is compatible with both NiFi 1 and NiFi 2. This section provides details about Flow Management Data Hub based on Apache NiFi 1.25.0. Below are the most important new features, improvements, and fixes included in this release:

Rebase on NiFi 1.25

The upgrade gives you access to the newest Apache NiFi features and enhancements on the 1.x branch. Running NiFi 1.25 is a prerequisite for transitioning to NiFi 2.x later on.

New components

Processors
- CalculateParquetOffsets and CalculateParquetRowGroupOffsets
  
  These processors can be used in combination with ConvertRecord and Parquet Reader to significantly reduce the amount of time required to convert very large Parquet files into another format.
- CaptureChangeDebeziumDB2, CaptureChangeDebeziumMySQL, CaptureChangeDebeziumOracle, CaptureChangeDebeziumPostgreSQL, CaptureChangeDebeziumSQLServer
  
  These processors, currently in Technical Preview, leverage the Debezium project to ingest Change Data Capture (CDC) events from external databases.
- DecryptContentAge and EncryptContentAge
  
  These are a new generation of processors for data encryption and decryption. For more information, see Modernizing Streaming Encryption with age in Apache NiFi.
- ListenNetFlow
  
  This processor enables NiFi to receive NetFlow data from network equipment. For more information, see Collecting NetFlow Records with Cloudera DataFlow.
- ListenOTLP
  
  This processor allows NiFi to act as a destination for OTLP agents, so that it can receive OpenTelemetry data from external applications. For more information, see Building OpenTelemetry Collection in Apache NiFi with Netty.
- PutAzureQueueStorage_v12 and GetAzureQueueStorage_v12
  
  These processors use the latest library for interacting with Azure Queue Storage. It is highly recommended to switch to these new processors.
- PutClouderaHiveQL, PutClouderaHiveStreaming, PutClouderaORC, SelectClouderaHiveQL, UpdateClouderaHiveTable, TriggerClouderaHiveMetaStoreEvent
  
  This is a set of Cloudera-exclusive components for interacting with Hive-based components in the Cloudera Data Platform. It is highly recommended to switch to these components because they leverage Hive 4 features that are not yet released in the Apache project.
- PutIcebergCDC
  
  This processor, currently in Technical Preview, can be used in combination with the new Debezium processors to implement CDC pipelines with Iceberg tables as the destination.
- ExtractRecordSchema
  
  Use this processor as a replacement for the InferAvroSchema processor, which will be deprecated in NiFi 2.
- PutJiraIssue
- PutZendeskTicket
- QueryIoTDBRecord
- RemoveRecordField
- ConsumeElasticsearch
- FilterAttribute
- PackageFlowFile
- PublishSlack
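
As an illustration of why offset-based processors such as CalculateParquetOffsets can speed up the conversion of very large Parquet files, the following minimal Python sketch divides a record count into independent (offset, count) ranges that could be converted in parallel. It shows only the arithmetic of the idea. The function name `split_record_ranges` and its parameters are hypothetical and are not taken from the processors' actual properties or implementation.

```python
def split_record_ranges(total_records: int, records_per_split: int) -> list[tuple[int, int]]:
    """Divide `total_records` into consecutive (offset, count) ranges.

    Hypothetical helper for illustration only; it does not read Parquet
    files and is not the NiFi processor's implementation.
    """
    if total_records < 0:
        raise ValueError("total_records must not be negative")
    if records_per_split <= 0:
        raise ValueError("records_per_split must be positive")

    ranges = []
    offset = 0
    while offset < total_records:
        # The last range may be shorter than records_per_split.
        count = min(records_per_split, total_records - offset)
        ranges.append((offset, count))
        offset += count
    return ranges


# Each range could be handed to a separate conversion task.
for offset, count in split_record_ranges(10, 4):
    print(f"convert {count} records starting at record {offset}")
```

Because each range is self-describing, the ranges can be processed independently and in any order, which is what makes parallel conversion possible.
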
Controller services
- ActiveMQJMSConnectionFactoryProvider
  
  This controller service allows you to interact with ActiveMQ without the need to deploy the JMS client on all of the NiFi nodes.
- ADLSCredentialsControllerServiceLookup
- AmazonGlueSchemaRegistry
- AzureServiceBusJMSConnectionFactoryProvider
  
  This controller service allows you to interact with Azure Service Bus without the need to deploy the required dependencies on all of the NiFi nodes.
- AzureStorageCredentialsControllerServiceLookup_v12
- ClouderaHiveConnectionPool
  
  This controller service allows you to interact with Hive without the need to deploy the required dependencies on all of the NiFi nodes.
- CMLLookupService
  
  This controller service can be used to enrich the data going through NiFi by calling exposed Machine Learning models running in Cloudera Machine Learning.
- DatabaseTableSchemaRegistry
  
  This controller service enables you to retrieve the schema associated with a table from an external database. This allows you to validate the data going through NiFi against that schema before pushing the data into this table.
- EBCDICRecordReader
  
  This controller service allows you to read Mainframe data and convert it into another structured format such as JSON or Avro. For more information, see here.
- ExcelReader
- ImpalaConnectionPool
  
  This controller service allows you to interact with Impala without the need to deploy the required dependencies on all NiFi nodes.
- JiraRecordSink
- RabbitMQJMSConnectionFactoryProvider
  
  This controller service allows you to interact with RabbitMQ without the need to deploy the JMS client on all NiFi nodes.
- RedshiftConnectionPool
  
  This controller service allows you to interact with Redshift over JDBC without the need to deploy the JDBC driver on all of the NiFi nodes.
- SimpleRedisDistributedMapCacheClientService
- StandardFileResourceService
- StandardJsonSchemaRegistry
- YamlTreeReader
- ZendeskRecordSink
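
The pattern described for DatabaseTableSchemaRegistry above (retrieve a table's schema from a database, then validate records against it before writing) can be sketched outside NiFi as follows. This is a minimal sketch that assumes an in-memory SQLite database from the Python standard library as a stand-in for the external database. The `orders` table, the helper names `table_schema` and `validate`, and the type mapping are all hypothetical and do not reflect how the controller service is implemented.

```python
import sqlite3

# Assumed mapping from SQLite column types to accepted Python types.
PY_TYPES = {"INTEGER": int, "TEXT": str, "REAL": (int, float)}


def table_schema(conn: sqlite3.Connection, table: str) -> dict:
    """Return {column name: (declared type, is NOT NULL)} for a table."""
    # pragma_table_info yields (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute("SELECT * FROM pragma_table_info(?)", (table,)).fetchall()
    return {row[1]: (row[2].upper(), bool(row[3])) for row in rows}


def validate(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record is valid."""
    errors = [f"unknown field: {field}" for field in record if field not in schema]
    for name, (col_type, required) in schema.items():
        value = record.get(name)
        if value is None:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        expected = PY_TYPES.get(col_type)
        if expected is not None and not isinstance(value, expected):
            errors.append(f"wrong type for {name}: expected {col_type}")
    return errors


# Hypothetical destination table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER NOT NULL, customer TEXT NOT NULL, total REAL)")
schema = table_schema(conn, "orders")

record = {"id": 1, "customer": "acme", "total": 9.5}
if not validate(record, schema):
    # Only records that pass validation are written to the table.
    conn.execute("INSERT INTO orders VALUES (:id, :customer, :total)", record)
```

Validating before the insert lets invalid records be routed elsewhere instead of failing at write time.
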
Parameter provider
- CyberArkConjurParameterProvider
  
  This parameter provider allows you to retrieve the value associated with your parameters from an external CyberArk Conjur instance.
