Behavioral Changes in Flow Management

Review the list of Flow Management behavioral changes in Cloudera DataFlow for Data Hub 7.2.18.

Flow Management with NiFi 1

There are no behavioral changes for Flow Management clusters based on NiFi 1.25 in Cloudera DataFlow for Data Hub 7.2.18.

Flow Management with NiFi 2

NiFi 2.0 is a new major release of Apache NiFi that introduces significant changes and enhancements, including some breaking changes for Flow Management clusters based on NiFi 2.0. Familiarize yourself with the following points before migrating your existing flows.

To migrate a data flow, export the process group as a JSON file from your NiFi 1.x cluster and import that JSON file into your NiFi 2.0 cluster. Tooling to help with upgrades and automatically manage the breaking changes will be provided in the next release.

Java 21

Java 21 is the minimum Java version required with NiFi 2.0. This version is automatically installed and configured on new Data Hub clusters using NiFi 2.0.

Templates and XML flow definitions

The concept of templates in NiFi has been deprecated. Instead, manage flow versioning using the DataFlow Catalog and/or the NiFi Registry. It is highly recommended to handle any existing templates in your NiFi 1.x clusters by:
  • versioning the templates into the desired registry (DataFlow Catalog, NiFi Registry)
  • deleting the templates from NiFi process groups

Additionally, flow.xml.gz no longer exists. Only flow.json.gz can be used in NiFi clusters to define the flows on the canvas.

Custom components / NARs

A custom NAR built for NiFi 1 will very likely fail to load in NiFi 2. If your NiFi setup includes custom components or NARs, you must update your dependencies to align with NiFi 2, make any necessary adjustments, and rebuild your NARs using Java 21.

Variables are removed in favor of parameters

Variables and the variable registry have been removed from NiFi. Only Parameter Contexts and parameters should be used going forward. In future releases, tools will be provided to help convert variables to parameters. In the meantime, perform this conversion manually when migrating flows to NiFi 2. Any remaining variables are ignored when the flow definition is loaded.
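As a starting point for the manual conversion, the following is a minimal sketch that lists the variables defined in an exported flow definition. It assumes the exported JSON has a root `flowContents` object and that each process group carries a `variables` map and a `processGroups` list; verify these field names against your own export. The helper names (`collect_variables`, `rewrite_simple_references`) are hypothetical and are not part of any NiFi tooling.

```python
import json


def collect_variables(group, path=""):
    """Recursively gather variables per process group.

    Assumes each group dict may hold a "variables" map and a
    "processGroups" list (field names assumed from a NiFi 1.x export).
    """
    name = group.get("name", "<unnamed>")
    here = f"{path}/{name}" if path else name
    found = {}
    variables = group.get("variables") or {}
    if variables:
        found[here] = dict(variables)
    for child in group.get("processGroups") or []:
        found.update(collect_variables(child, here))
    return found


def rewrite_simple_references(text, names):
    """Rewrite exact ${name} references to the #{name} parameter syntax.

    Only plain references are handled. Expressions that call functions,
    or names that may also be FlowFile attributes, need manual review.
    """
    for name in names:
        text = text.replace("${" + name + "}", "#{" + name + "}")
    return text


if __name__ == "__main__":
    # Hypothetical file name for an exported process group.
    with open("exported_flow.json", encoding="utf-8") as handle:
        flow = json.load(handle)
    for group_path, variables in collect_variables(flow["flowContents"]).items():
        print(group_path, variables)
```

The resulting list can be used to create the matching parameters in a Parameter Context by hand before importing the flow into NiFi 2.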

Event driven thread pool no longer exists

The event driven thread pool has been removed, leaving only the timer driven thread pool available. Any components previously configured with the event driven scheduling strategy should be switched to the timer driven scheduling strategy.
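The switch described above could be applied to an exported flow definition before import. The following is a minimal sketch, assuming that processors in the exported JSON carry a `schedulingStrategy` field with the values `EVENT_DRIVEN` and `TIMER_DRIVEN`, and that process groups nest under `processGroups`. The function name `switch_event_driven` is hypothetical.

```python
def switch_event_driven(group):
    """Switch EVENT_DRIVEN processors to TIMER_DRIVEN in place.

    Returns the names of the processors that were changed so that their
    run schedule can be reviewed afterwards. Field names are assumed
    from a NiFi 1.x flow definition export.
    """
    changed = []
    for processor in group.get("processors") or []:
        if processor.get("schedulingStrategy") == "EVENT_DRIVEN":
            processor["schedulingStrategy"] = "TIMER_DRIVEN"
            changed.append(processor.get("name", processor.get("identifier", "?")))
    for child in group.get("processGroups") or []:
        changed.extend(switch_event_driven(child))
    return changed
```

The sketch leaves the run schedule untouched, so review it for each changed processor after the import.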

Removed languages in scripted components

In NiFi 2.0, support for certain languages in scripted components has been removed. The affected languages are: ECMAScript, Lua, Ruby, and Python. It is recommended to switch to Groovy or to leverage the new Python API feature for developing processors. The following components are impacted:

  • org.apache.nifi.processors.script.ExecuteScript
  • org.apache.nifi.processors.script.InvokeScriptedProcessor
  • org.apache.nifi.processors.script.ScriptedFilterRecord
  • org.apache.nifi.processors.script.ScriptedPartitionRecord
  • org.apache.nifi.processors.script.ScriptedTransformRecord
  • org.apache.nifi.processors.script.ScriptedValidateRecord
  • org.apache.nifi.lookup.script.ScriptedLookupService
  • org.apache.nifi.record.script.ScriptedReader
  • org.apache.nifi.record.script.ScriptedRecordSetWriter
  • org.apache.nifi.record.sink.script.ScriptedRecordSink
  • org.apache.nifi.lookup.script.SimpleScriptedLookupService
  • org.apache.nifi.reporting.script.ScriptedReportingTask
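To find out which of the components listed above use a removed language in a given flow, an exported flow definition can be scanned. The following is a minimal sketch under several assumptions: that components expose their class name in a `type` field and their configuration in a `properties` map, and that the language is stored under a property named `Script Engine` or `Script Language`. Check these names against your own export. Reporting tasks are not part of a process group export, so they are not covered here.

```python
REMOVED_ENGINES = {"ecmascript", "lua", "ruby", "python"}

SCRIPTED_TYPES = {
    "org.apache.nifi.processors.script.ExecuteScript",
    "org.apache.nifi.processors.script.InvokeScriptedProcessor",
    "org.apache.nifi.processors.script.ScriptedFilterRecord",
    "org.apache.nifi.processors.script.ScriptedPartitionRecord",
    "org.apache.nifi.processors.script.ScriptedTransformRecord",
    "org.apache.nifi.processors.script.ScriptedValidateRecord",
    "org.apache.nifi.lookup.script.ScriptedLookupService",
    "org.apache.nifi.record.script.ScriptedReader",
    "org.apache.nifi.record.script.ScriptedRecordSetWriter",
    "org.apache.nifi.record.sink.script.ScriptedRecordSink",
    "org.apache.nifi.lookup.script.SimpleScriptedLookupService",
}

# Assumed property names that hold the scripting language.
ENGINE_PROPERTIES = ("Script Engine", "Script Language")


def find_affected_scripts(group, path=""):
    """Return (group path, component name, engine) for affected components."""
    name = group.get("name", "<unnamed>")
    here = f"{path}/{name}" if path else name
    hits = []
    components = (group.get("processors") or []) + (group.get("controllerServices") or [])
    for component in components:
        if component.get("type") not in SCRIPTED_TYPES:
            continue
        properties = component.get("properties") or {}
        for key in ENGINE_PROPERTIES:
            engine = str(properties.get(key) or "").lower()
            if engine in REMOVED_ENGINES:
                hits.append((here, component.get("name", "?"), engine))
                break
    for child in group.get("processGroups") or []:
        hits.extend(find_affected_scripts(child, here))
    return hits
```

Each hit is a candidate for a rewrite in Groovy or for a processor built with the new Python API.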

Removed components and replacement options

The following list contains the components that were removed between clusters based on NiFi 1.25 and clusters based on NiFi 2.0, along with the recommended alternatives where available.

  • Processors
    • Base64EncodeContent => EncodeContent
    • CompareFuzzyHash => no replacement
    • ConsumeEWS => no replacement
    • ConsumeKafka_1_0 => ConsumeKafka_2_6
    • ConsumeKafka_2_0 => ConsumeKafka_2_6
    • ConsumeKafkaRecord_1_0 => ConsumeKafkaRecord_2_6
    • ConsumeKafkaRecord_2_0 => ConsumeKafkaRecord_2_6
    • ConvertAvroSchema => ConvertRecord
    • ConvertAvroToORC => no replacement
    • ConvertCSVToAvro => ConvertRecord
    • ConvertExcelToCSVProcessor => ConvertRecord with ExcelReader
    • ConvertJSONToAvro => ConvertRecord
    • CryptographicHashAttribute => UpdateAttribute
    • DeleteAzureBlobStorage => DeleteAzureBlobStorage_v12
    • DeleteRethinkDB => no replacement
    • EncryptContent => EncryptContentAge or EncryptContentPGP
    • ExecuteInfluxDBQuery => use Influx Data NARs for NiFi
    • ExtractCCDAAttributes => no replacement
    • FetchAzureBlobStorage => FetchAzureBlobStorage_v12
    • FetchElasticsearchHttp => GetElasticsearch
    • FuzzyHashContent => no replacement
    • GetAzureQueueStorage => GetAzureQueueStorage_v12
    • GetHTMLElement => no replacement
    • GetHTTP => InvokeHTTP
    • GetIgniteCache => no replacement
    • GetJMSQueue => ConsumeJMS
    • GetJMSTopic => ConsumeJMS
    • GetRethinkDB => no replacement
    • GetTCP => no replacement
    • GetTwitter => ConsumeTwitter
    • HashAttribute => CryptographicHashAttribute
    • HashContent => CryptographicHashContent
    • InferAvroSchema => ExtractRecordSchema
    • ListAzureBlobStorage => ListAzureBlobStorage_v12
    • ModifyHTMLElement => no replacement
    • PostHTTP => InvokeHTTP
    • PostSlack => PublishSlack
    • PublishKafka_1_0 => PublishKafka_2_6
    • PublishKafka_2_0 => PublishKafka_2_6
    • PublishKafkaRecord_1_0 => PublishKafkaRecord_2_6
    • PublishKafkaRecord_2_0 => PublishKafkaRecord_2_6
    • PutAzureBlobStorage => PutAzureBlobStorage_v12
    • PutAzureQueueStorage => PutAzureQueueStorage_v12
    • PutBigQueryBatch => PutBigQuery
    • PutBigQueryStreaming => PutBigQuery
    • PutElasticsearchHttp => PutElasticsearchJson
    • PutElasticsearchHttpRecord => PutElasticsearchRecord
    • PutHiveQL => PutClouderaHiveQL
    • PutHiveStreaming => PutClouderaHiveStreaming
    • PutHTMLElement => no replacement
    • PutIgniteCache => no replacement
    • PutInfluxDB => use Influx Data NARs for NiFi
    • PutJMS => PublishJMS
    • PutRethinkDB => no replacement
    • PutRiemann => no replacement
    • PutSlack => PublishSlack
    • QueryElasticsearchHttp => PaginatedJsonQueryElasticsearch
    • ScrollElasticsearchHttp => SearchElasticsearch
    • SelectHiveQL => SelectClouderaHiveQL
    • SpringContextProcessor => no replacement
    • StoreInKiteDataset => no replacement
    • UpdateHiveTable => UpdateClouderaHiveTable
  • Controller services
    • ActionHandlerLookup => no replacement
    • AlertHandler => no replacement
    • AzureStorageCredentialsControllerService => AzureStorageCredentialsControllerService_v12
    • AzureStorageCredentialsControllerServiceLookup => AzureStorageCredentialsControllerServiceLookup_v12
    • AzureStorageEmulatorCredentialsControllerService => no replacement
    • EasyRulesEngineProvider => no replacement
    • EasyRulesEngineService => no replacement
    • ExpressionHandler => no replacement
    • GraphiteMetricReporterService => no replacement
    • GremlinClientService => no replacement
    • HBase_1_1_2_ClientMapCacheService => HBase_2_ClientMapCacheService
    • HBase_1_1_2_ClientService => HBase_2_ClientService
    • HBase_1_1_2_ListLookupService => no replacement
    • HBase_1_1_2_RecordLookupService => HBase_2_RecordLookupService
    • HiveConnectionPool => ClouderaHiveConnectionPool
    • HortonworksSchemaRegistry => ClouderaSchemaRegistry
    • KafkaRecordSink_1_0 => KafkaRecordSink_2_6
    • KafkaRecordSink_2_0 => KafkaRecordSink_2_6
    • LogHandler => no replacement
    • OAuth2TokenProviderImpl => StandardOauth2AccessTokenProvider
    • OpenCypherClientService => no replacement
    • RecordSinkHandler => no replacement
    • ScriptedActionHandler => no replacement
    • ScriptedRulesEngine => no replacement
  • Reporting tasks
    • AmbariReportingTask => no replacement
    • MetricsEventReportingTask => no replacement
    • MetricsReportingTask => no replacement
  • Components with new coordinates
    • InvokeGRPC => moved into nifi-cdf-grpc-nar
    • ListenGRPC => moved into nifi-cdf-grpc-nar
    • KerberosKeytabUserService => moved into nifi-kerberos-user-service-nar
    • KerberosPasswordUserService => moved into nifi-kerberos-user-service-nar
    • KerberosTicketCacheUserService => moved into nifi-kerberos-user-service-nar
    Tooling will be provided in upcoming releases to automatically handle these changes. Currently, two options are available:
    • Manually edit the flow.json.gz file to update the coordinates of the impacted components.
    • Make the changes after the flow is imported into NiFi 2.0 by replacing the ghost components with the new implementations for each instance of the components listed above.
  • Pulsar components
    All Pulsar components have been temporarily removed in this release. They will be reintroduced in upcoming releases. In the meantime, you can download the NARs from a public Maven repository and deploy them as custom NARs.
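For the components with new coordinates, the manual edit of flow.json.gz could be scripted along the following lines. This is a minimal sketch, assuming that each component in the flow JSON carries a `type` field with its class name and a `bundle` object with an `artifact` field. Only the artifact names come from the list above. The bundle group and version may also need updating, which this sketch does not attempt. Work on a backup copy of the file while NiFi is stopped.

```python
import gzip
import json

# Artifact names taken from the list of components with new coordinates.
NEW_ARTIFACTS = {
    "InvokeGRPC": "nifi-cdf-grpc-nar",
    "ListenGRPC": "nifi-cdf-grpc-nar",
    "KerberosKeytabUserService": "nifi-kerberos-user-service-nar",
    "KerberosPasswordUserService": "nifi-kerberos-user-service-nar",
    "KerberosTicketCacheUserService": "nifi-kerberos-user-service-nar",
}


def update_coordinates(node):
    """Walk a parsed flow and update bundle artifacts in place.

    Returns the number of components changed. The "type" and "bundle"
    field names are assumptions about the flow JSON layout.
    """
    changed = 0
    if isinstance(node, dict):
        type_name = node.get("type")
        bundle = node.get("bundle")
        if isinstance(type_name, str) and isinstance(bundle, dict):
            target = NEW_ARTIFACTS.get(type_name.rsplit(".", 1)[-1])
            if target and bundle.get("artifact") != target:
                bundle["artifact"] = target
                changed += 1
        for value in node.values():
            changed += update_coordinates(value)
    elif isinstance(node, list):
        for item in node:
            changed += update_coordinates(item)
    return changed


def rewrite_flow(source_path, target_path):
    """Read a gzipped flow JSON, update coordinates, write a new file."""
    with gzip.open(source_path, "rt", encoding="utf-8") as handle:
        flow = json.load(handle)
    changed = update_coordinates(flow)
    with gzip.open(target_path, "wt", encoding="utf-8") as handle:
        json.dump(flow, handle)
    return changed
```

A call such as `rewrite_flow("flow.json.gz", "flow.updated.json.gz")` writes the result to a separate file, so the original stays available for comparison.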