What's new in Flow Management with NiFi 1
Learn about the new features of Flow Management using NiFi 1.25 in Cloudera DataFlow for Data Hub 7.2.18.
Flow Management Data Hub in CDP Public Cloud 7.2.18 is compatible with both NiFi 1 and NiFi 2. This section provides details about Flow Management Data Hub based on Apache NiFi 1.25.0. Below are the most important new features, improvements, and fixes included in this release:
- Rebase on NiFi 1.25
- The upgrade offers access to the newest Apache NiFi features and enhancements on the 1.x branch. Running NiFi 1.25 is a prerequisite for transitioning to NiFi 2.x later on.
- New components
-
-
Processors
-
CalculateParquetOffsets and CalculateParquetRowGroupOffsets
These processors can be used in combination with ConvertRecord and Parquet Reader to significantly reduce the amount of time required to convert very large Parquet files into another format.
-
CaptureChangeDebeziumDB2, CaptureChangeDebeziumMySQL, CaptureChangeDebeziumOracle, CaptureChangeDebeziumPostgreSQL, CaptureChangeDebeziumSQLServer
These processors, currently in Technical Preview, leverage the Debezium project to ingest Change Data Capture events from external databases.
-
DecryptContentAge and EncryptContentAge
These are a new generation of processors for data encryption and decryption. For more information, see Modernizing Streaming Encryption with age in Apache NiFi.
-
ListenNetFlow
This processor enables NiFi to receive NetFlow data from network equipment. For more information, see Collecting NetFlow Records with Cloudera DataFlow.
-
ListenOTLP
This processor allows NiFi to act as a destination for OTLP agents, receiving OpenTelemetry data from external applications. For more information, see Building OpenTelemetry Collection in Apache NiFi with Netty.
-
PutAzureQueueStorage_v12 and GetAzureQueueStorage_v12
These processors use the latest library for interacting with Azure Queue Storage. It is highly recommended to switch to these new processors.
-
PutClouderaHiveQL, PutClouderaHiveStreaming, PutClouderaORC, SelectClouderaHiveQL, UpdateClouderaHiveTable, TriggerClouderaHiveMetaStoreEvent
This is a set of Cloudera-exclusive components for interacting with Hive-based components in the Cloudera Data Platform. It is highly recommended to switch to these components as they leverage features that are part of Hive 4, not yet released in the Apache project.
-
PutIcebergCDC
This processor, currently in Technical Preview, can be used in combination with the new Debezium processors to implement CDC pipelines with Iceberg tables as the destination.
-
ExtractRecordSchema
This processor should be used as a replacement for the InferAvroSchema processor, which will be deprecated in NiFi 2.
-
PutJiraIssue
-
PutZendeskTicket
-
QueryIoTDBRecord
-
RemoveRecordField
-
ConsumeElasticsearch
-
FilterAttribute
-
PackageFlowFile
-
PublishSlack
-
-
Controller services
-
ActiveMQJMSConnectionFactoryProvider
This controller service allows you to interact with ActiveMQ without the need to deploy the JMS client on all of the NiFi nodes.
-
ADLSCredentialsControllerServiceLookup
-
AmazonGlueSchemaRegistry
-
AzureServiceBusJMSConnectionFactoryProvider
This controller service allows you to interact with Azure Service Bus without the need to deploy the required dependencies on all of the NiFi nodes.
-
AzureStorageCredentialsControllerServiceLookup_v12
-
ClouderaHiveConnectionPool
This controller service allows you to interact with Hive without the need to deploy the required dependencies on all of the NiFi nodes.
-
CMLLookupService
This controller service can be used to enrich the data going through NiFi by calling exposed Machine Learning models running in Cloudera Machine Learning.
-
DatabaseTableSchemaRegistry
This controller service enables you to retrieve the schema associated with a table from an external database. This allows you to validate the data going through NiFi against that schema before pushing the data into this table.
-
EBCDICRecordReader
This controller service allows you to read and convert Mainframe data into another structured format like JSON, Avro, and so on. For more information, see here.
-
ExcelReader
-
ImpalaConnectionPool
This controller service allows you to interact with Impala without the need to deploy the required dependencies on all NiFi nodes.
-
JiraRecordSink
-
RabbitMQJMSConnectionFactoryProvider
This controller service allows you to interact with RabbitMQ without the need to deploy the JMS client on all NiFi nodes.
-
RedshiftConnectionPool
This controller service allows you to interact with Redshift over JDBC without the need to deploy the JDBC driver on all of the NiFi nodes.
-
SimpleRedisDistributedMapCacheClientService
-
StandardFileResourceService
-
StandardJsonSchemaRegistry
-
YamlTreeReader
-
ZendeskRecordSink
-
-
Parameter provider
-
CyberArkConjurParameterProvider
This parameter provider allows you to retrieve the value associated with your parameters from an external CyberArk Conjur instance.
-
-
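The idea behind DatabaseTableSchemaRegistry described above (read a table's schema from the database, then validate records against it before writing) can be illustrated outside NiFi with a minimal Python sketch. This is a conceptual illustration only, not NiFi code or the controller service's implementation: it assumes an in-memory SQLite database, and the `orders` table and the helper names `table_schema` and `validate` are hypothetical.

```python
import sqlite3

# Mapping from SQLite declared column types to Python types (illustrative subset).
PY_TYPES = {"INTEGER": int, "TEXT": str, "REAL": float}


def table_schema(conn, table):
    """Return {column_name: declared_type}, read from the database itself."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[2].upper() for row in rows}


def validate(record, schema):
    """Return a list of problems; an empty list means the record matches the schema."""
    errors = []
    for column, declared in schema.items():
        if column not in record:
            errors.append(f"missing field: {column}")
        elif not isinstance(record[column], PY_TYPES.get(declared, object)):
            errors.append(f"wrong type for {column}: expected {declared}")
    for column in record:
        if column not in schema:
            errors.append(f"unexpected field: {column}")
    return errors


# Hypothetical destination table used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
schema = table_schema(conn, "orders")

good = {"id": 1, "customer": "acme", "amount": 9.5}
bad = {"id": "1", "customer": "acme"}

print(validate(good, schema))
print(validate(bad, schema))
```

In a NiFi flow, the equivalent check would be configured through record readers and validation processors that reference the schema registry, so that only records matching the table's schema reach the database.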