Breaking changes in NiFi 2
Learn about the breaking changes Apache NiFi 2 will bring to Cloudera Flow Management.
Flow design-level breaking changes
- Moved components
In NiFi 2, some components have been relocated within the Apache NiFi repository, resulting in changes to the bundle coordinates for the associated NAR files.
Unfortunately, no pre-upgrade actions can fully prevent these breaking changes; you will need to update the flow.json.gz file with the new coordinates. Cloudera’s upcoming NiFi Migration Tooling is designed to automate as many of these changes as possible to ease the upgrade process. Some changes will still require manual handling, so it is highly recommended to run the pre-upgrade check script to identify potential issues and impacted components before proceeding with the upgrade.
JoltTransformJSON processor
NiFi 1.x: … "type": "org.apache.nifi.processors.standard.JoltTransformJSON", "bundle": { "group": "org.apache.nifi", "artifact": "nifi-standard-nar", "version": "1.27.0" } …
NiFi 2: … "type": "org.apache.nifi.processors.jolt.JoltTransformJSON", "bundle": { "group": "org.apache.nifi", "artifact": "nifi-jolt-nar", "version": "2.0.0" } …
JoltTransformRecord processor
NiFi 1.x: … "type": "org.apache.nifi.processors.jolt.record.JoltTransformRecord", "bundle": { "group": "org.apache.nifi", "artifact": "nifi-jolt-record-nar", "version": "1.27.0" } …
NiFi 2: … "type": "org.apache.nifi.processors.jolt.JoltTransformRecord", "bundle": { "group": "org.apache.nifi", "artifact": "nifi-jolt-nar", "version": "2.0.0" } …
FileParameterProvider renamed to KubernetesSecretParameterProvider
NiFi 1.x: … "type": "org.apache.nifi.parameter.FileParameterProvider", "bundle": { "group": "org.apache.nifi", "artifact": "nifi-standard-nar", "version": "1.27.0" } …
NiFi 2: … "type": "org.apache.nifi.parameter.KubernetesSecretParameterProvider", "bundle": { "group": "org.apache.nifi", "artifact": "nifi-standard-nar", "version": "2.0.0" } …
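The kind of edit shown above can be scripted. Below is a minimal Python sketch that applies these coordinate changes to a flow definition; it assumes a standard flow.json.gz layout and covers only the three relocations listed in this section. The file names and target version are placeholders, and the Cloudera NiFi Migration Tooling remains the recommended way to apply these changes.

```python
import gzip
import json

# Old component type -> (new type, new NAR artifact) for the moves shown above.
MOVES = {
    "org.apache.nifi.processors.standard.JoltTransformJSON":
        ("org.apache.nifi.processors.jolt.JoltTransformJSON", "nifi-jolt-nar"),
    "org.apache.nifi.processors.jolt.record.JoltTransformRecord":
        ("org.apache.nifi.processors.jolt.JoltTransformRecord", "nifi-jolt-nar"),
    "org.apache.nifi.parameter.FileParameterProvider":
        ("org.apache.nifi.parameter.KubernetesSecretParameterProvider", "nifi-standard-nar"),
}
NEW_VERSION = "2.0.0"  # adjust to the actual target NiFi 2 version

def rewrite(node):
    """Recursively walk the flow definition and patch bundle coordinates."""
    if isinstance(node, dict):
        move = MOVES.get(node.get("type"))
        if move and isinstance(node.get("bundle"), dict):
            node["type"] = move[0]
            node["bundle"]["artifact"] = move[1]
            node["bundle"]["version"] = NEW_VERSION
        for value in node.values():
            rewrite(value)
    elif isinstance(node, list):
        for item in node:
            rewrite(item)

with gzip.open("flow.json.gz", "rt", encoding="utf-8") as f:
    flow = json.load(f)

rewrite(flow)

# Write to a new file so the original remains as a backup.
with gzip.open("flow.updated.json.gz", "wt", encoding="utf-8") as f:
    json.dump(flow, f)
```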
- Removed key components
- Kafka processors:
All Kafka processors in Apache NiFi have been removed and replaced by new components that use a controller service-based approach. This is a significant breaking change, as it does not allow for a non-breaking upgrade. To ease the transition, Cloudera has preserved the Kafka 2.6 processors without altering their bundle coordinates. However, adjustments to the Kerberos configuration will still be necessary (see the details below). This approach gives you time to transition to the new Kafka components while already on NiFi 2.
- Hive components:
All Hive-related components in Apache NiFi have been removed. Cloudera has introduced specific components downstream to align with the Hive version distributed as part of CDP. You will need to update the bundle coordinates in the flow.json.gz file to migrate to these new components. Both the old and new components are available in the latest NiFi 1.x releases, so you are advised to switch to the new components while still on NiFi 1.x.
- Components with Kerberos configuration changes
Kerberos authentication in NiFi requires presenting a Kerberos credential which can be in one of the following forms:
- Principal + Keytab: The keytab, stored on disk, contains the client’s secret key. This credential is used by the application to authenticate and obtain the Ticket Granting Ticket (TGT).
- Principal + Password: The client’s secret key is derived from a password, so no keytab is stored on disk. Otherwise, this method functions similarly to the keytab-based approach.
- Principal + Ticket cache: The TGT must be acquired externally and stored in a ticket cache, which the application uses. The application itself is unaware of the keytab or password and is not responsible for handling TGT acquisition.
Historically, NiFi supported the following Kerberos configuration options:
- Kerberos Principal + Kerberos Keytab: Component-level properties supporting the keytab-based credential type.
- Kerberos Credentials Service: A property referencing the KerberosCredentialsService controller service interface, with the following implementation: KeytabCredentialsService for the keytab-based credential type.
- Kerberos Principal + Kerberos Password: Component-level properties supporting password-based credentials.
- Kerberos User Service: A property referencing the KerberosUserService controller service interface, with implementations for all credential types: KerberosKeytabUserService for keytab-based credentials, KerberosPasswordUserService for password-based credentials, and KerberosTicketCacheUserService for ticket cache-based credentials.
In NiFi 2, only the Kerberos User Service is retained because it can accommodate all credential types (keytab, password, ticket cache) with a single property on the component. The other configuration options have been removed in NiFi 2. For more information, see NIFI-13510.
Affected components in Cloudera Flow Management 2.1.7:
The table below lists the affected components and their legacy Kerberos properties. It also indicates whether these components are planned to be available in Cloudera Flow Management 4.x and provides additional comments on their migration.
The legacy Kerberos properties referenced in the table are the four configuration options described above: (1) Kerberos Principal + Kerberos Keytab, (2) Kerberos Credentials Service, (3) Kerberos Principal + Kerberos Password, (4) Kerberos User Service.

| Component Group | Component Type | Component Artifact | Component Name | Legacy Kerberos properties | Future availability in CFM 4 | Comment |
| --- | --- | --- | --- | --- | --- | --- |
| | org.apache.nifi.accumulo.controllerservices.AccumuloService | nifi-accumulo-services-nar | AccumuloService | x x x | NO | |
| | org.apache.nifi.dbcp.DBCPConnectionPool | nifi-dbcp-service-nar | DBCPConnectionPool | x x x | YES | |
| | org.apache.nifi.dbcp.HadoopDBCPConnectionPool | nifi-hadoop-dbcp-service-nar | HadoopDBCPConnectionPool | x x x x | YES | |
| | org.apache.nifi.schemaregistry.hortonworks.HortonworksSchemaRegistry | nifi-hwx-schema-registry-nar | HortonworksSchemaRegistry | x x | NO | Replaced by ClouderaSchemaRegistry |
| | org.apache.nifi.controller.kudu.KuduLookupService | nifi-kudu-nar | KuduLookupService | x | YES | |
| | org.apache.nifi.controller.livy.LivySessionController | nifi-livy-nar | LivySessionController | x | YES | |
| | org.apache.nifi.processors.kudu.PutKudu | nifi-kudu-nar | PutKudu | x x x | YES | |
| | org.apache.nifi.atlas.reporting.ReportLineageToAtlas | nifi-atlas-nar | ReportLineageToAtlas | x x | YES | |
| CDP Object Store processors | org.apache.nifi.processors.hadoop.ListCDPObjectStore | nifi-cdf-objectstore-nar | ListCDPObjectStore | x | YES | |
| | org.apache.nifi.processors.hadoop.FetchCDPObjectStore | nifi-cdf-objectstore-nar | FetchCDPObjectStore | | YES | |
| | org.apache.nifi.processors.hadoop.PutCDPObjectStore | nifi-cdf-objectstore-nar | PutCDPObjectStore | | YES | |
| | org.apache.nifi.processors.hadoop.DeleteCDPObjectStore | nifi-cdf-objectstore-nar | DeleteCDPObjectStore | | YES | |
| Hadoop processors | org.apache.nifi.processors.hadoop.ListHDFS | nifi-hadoop-nar | ListHDFS | x x x x | YES | |
| | org.apache.nifi.processors.hadoop.FetchHDFS | nifi-hadoop-nar | FetchHDFS | | YES | |
| | org.apache.nifi.processors.hadoop.PutHDFS | nifi-hadoop-nar | PutHDFS | | YES | |
| | org.apache.nifi.processors.hadoop.DeleteHDFS | nifi-hadoop-nar | DeleteHDFS | | YES | |
| | org.apache.nifi.processors.hadoop.GetHDFS | nifi-hadoop-nar | GetHDFS | | YES | |
| | org.apache.nifi.processors.hadoop.MoveHDFS | nifi-hadoop-nar | MoveHDFS | | YES | |
| | org.apache.nifi.processors.hadoop.inotify.GetHDFSEvents | nifi-hadoop-nar | GetHDFSEvents | | YES | |
| | org.apache.nifi.processors.hadoop.GetHDFSFileInfo | nifi-hadoop-nar | GetHDFSFileInfo | | YES | |
| | org.apache.nifi.processors.hadoop.GetHDFSSequenceFile | nifi-hadoop-nar | GetHDFSSequenceFile | | YES | |
| | org.apache.nifi.processors.hadoop.CreateHadoopSequenceFile | nifi-hadoop-nar | CreateHadoopSequenceFile | | YES | |
| | org.apache.nifi.processors.parquet.FetchParquet | nifi-parquet-nar | FetchParquet | | YES | |
| | org.apache.nifi.processors.parquet.PutParquet | nifi-parquet-nar | PutParquet | | YES | |
| | org.apache.nifi.processors.orc.PutORC | nifi-hive3-nar | PutORC | | NO | Replaced by PutClouderaORC |
| HBase services | org.apache.nifi.hbase.HBase_1_1_2_ClientService | nifi-hbase_1_1_2-client-service-nar | HBase_1_1_2_ClientService | x x x x | NO | HBase_2_ClientService should be used instead |
| | org.apache.nifi.hbase.HBase_2_ClientService | nifi-hbase_2-client-service-nar | HBase_2_ClientService | | YES | |
| Hive components | org.apache.nifi.dbcp.hive.HiveConnectionPool | nifi-hive-nar | HiveConnectionPool | x x x | NO | Replaced by ClouderaHiveConnectionPool |
| | org.apache.nifi.dbcp.hive.Hive3ConnectionPool | nifi-hive3-nar | Hive3ConnectionPool | | NO | Replaced by ClouderaHiveConnectionPool |
| | org.apache.nifi.processors.hive.PutHiveStreaming | nifi-hive-nar | PutHiveStreaming | | NO | Replaced by PutClouderaHiveStreaming |
| | org.apache.nifi.processors.hive.PutHive3Streaming | nifi-hive3-nar | PutHive3Streaming | x x | NO | Replaced by PutClouderaHiveStreaming |
| Kafka_1_0 processors | org.apache.nifi.processors.kafka.pubsub.ConsumeKafka_1_0 | nifi-kafka-1-0-nar | ConsumeKafka_1_0 | x | NO | ConsumeKafka_2_6 should be used instead |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafka_1_0 | nifi-kafka-1-0-nar | PublishKafka_1_0 | | NO | PublishKafka_2_6 should be used instead |
| | org.apache.nifi.processors.kafka.pubsub.ConsumeKafkaRecord_1_0 | nifi-kafka-1-0-nar | ConsumeKafkaRecord_1_0 | x x | NO | ConsumeKafkaRecord_2_6 should be used instead |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_1_0 | nifi-kafka-1-0-nar | PublishKafkaRecord_1_0 | | NO | PublishKafkaRecord_2_6 should be used instead |
| Kafka_2_0 processors | org.apache.nifi.processors.kafka.pubsub.ConsumeKafka_2_0 | nifi-kafka-2-0-nar | ConsumeKafka_2_0 | x x | NO | ConsumeKafka_2_6 should be used instead |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafka_2_0 | nifi-kafka-2-0-nar | PublishKafka_2_0 | | NO | PublishKafka_2_6 should be used instead |
| | org.apache.nifi.processors.kafka.pubsub.ConsumeKafkaRecord_2_0 | nifi-kafka-2-0-nar | ConsumeKafkaRecord_2_0 | | NO | ConsumeKafkaRecord_2_6 should be used instead |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_2_0 | nifi-kafka-2-0-nar | PublishKafkaRecord_2_0 | | NO | PublishKafkaRecord_2_6 should be used instead |
| Kafka_2_6 processors | org.apache.nifi.processors.kafka.pubsub.ConsumeKafka_2_6 | nifi-kafka-2-6-nar | ConsumeKafka_2_6 | x x x | YES | |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafka_2_6 | nifi-kafka-2-6-nar | PublishKafka_2_6 | | YES | |
| | org.apache.nifi.processors.kafka.pubsub.ConsumeKafkaRecord_2_6 | nifi-kafka-2-6-nar | ConsumeKafkaRecord_2_6 | | YES | |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_2_6 | nifi-kafka-2-6-nar | PublishKafkaRecord_2_6 | | YES | |
| Kafka2CDP processors | org.apache.nifi.processors.kafka.pubsub.ConsumeKafka2CDP | nifi-cdf-kafka-2-nar | ConsumeKafka2CDP | x x x | YES | |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafka2CDP | nifi-cdf-kafka-2-nar | PublishKafka2CDP | | YES | |
| | org.apache.nifi.processors.kafka.pubsub.ConsumeKafkaRecord2CDP | nifi-cdf-kafka-2-nar | ConsumeKafkaRecord2CDP | | YES | |
| | org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord2CDP | nifi-cdf-kafka-2-nar | PublishKafkaRecord2CDP | | YES | |
| Solr processors | org.apache.nifi.processors.solr.GetSolr | nifi-solr-nar | GetSolr | x x x | YES | |
| | org.apache.nifi.processors.solr.QuerySolr | nifi-solr-nar | QuerySolr | | YES | |
| | org.apache.nifi.processors.solr.PutSolrContentStream | nifi-solr-nar | PutSolrContentStream | | YES | |
| | org.apache.nifi.processors.solr.PutSolrRecord | nifi-solr-nar | PutSolrRecord | | YES | |
- Flow migration for Kerberos configuration changes
To ensure compatibility with NiFi 2, you will need to migrate your flow by creating a new KerberosUserService controller service based on the old controller service or component-level properties. This process involves transitioning from the legacy configuration to the new service and clearing the outdated properties. The steps for each legacy option follow; a scripted variant using the NiFi REST API is sketched after the lists.
Migration steps for Kerberos Principal + Kerberos Keytab:
- Create service: Set up a KerberosKeytabUserService using the existing Kerberos Principal and Kerberos Keytab component-level properties.
- Update reference: Link the new service to the component's Kerberos User Service property.
- Remove legacy properties: Clear the old property values from Kerberos Principal and Kerberos Keytab.
Migration steps for Kerberos Credentials Service:
- Create service: Set up a KerberosKeytabUserService based on the properties from the KeytabCredentialsService.
- Update reference: Point the component's Kerberos User Service property to the new service.
- Remove legacy service: Clear the old Kerberos Credentials Service reference property and delete the outdated service.
Migration steps for Kerberos Principal + Kerberos Password:
- Create service: Set up a KerberosPasswordUserService using the Kerberos Principal and Kerberos Password component-level properties.
- Update reference: Link the new service to the component’s Kerberos User Service property.
- Remove legacy properties: Clear the old property values from Kerberos Principal and Kerberos Password.
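For bulk migrations, the same three steps can be scripted against the NiFi REST API. The following Python sketch is illustrative only and assumes a keytab-based setup: the URL, IDs, principal, and keytab path are placeholders, authentication is omitted, and the exact Kerberos property names vary between components, so check each component's documentation before relying on them.

```python
import requests

NIFI = "https://nifi.example.com:8443/nifi-api"  # placeholder URL
PG_ID = "process-group-id"                       # group that will own the service
PROC_ID = "processor-id"                         # component being migrated

session = requests.Session()
session.verify = "/path/to/ca.pem"  # plus whatever authentication your cluster needs

# 1. Create a KerberosKeytabUserService from the old component-level values.
resp = session.post(f"{NIFI}/process-groups/{PG_ID}/controller-services", json={
    "revision": {"version": 0},
    "component": {
        "type": "org.apache.nifi.kerberos.KerberosKeytabUserService",
        "name": "MigratedKeytabUserService",
        "properties": {
            "Kerberos Principal": "nifi@EXAMPLE.COM",
            "Kerberos Keytab": "/etc/security/keytabs/nifi.keytab",
        },
    },
})
service = resp.json()
service_id = service["id"]

# 2. Enable the new service.
session.put(f"{NIFI}/controller-services/{service_id}/run-status", json={
    "revision": service["revision"],
    "state": "ENABLED",
})

# 3. Point the component at the new service and clear the legacy properties.
proc = session.get(f"{NIFI}/processors/{PROC_ID}").json()
session.put(f"{NIFI}/processors/{PROC_ID}", json={
    "revision": proc["revision"],
    "component": {
        "id": PROC_ID,
        "config": {"properties": {
            "kerberos-user-service": service_id,  # property name varies per component
            "Kerberos Principal": None,           # null clears a property value
            "Kerberos Keytab": None,
        }},
    },
})
```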
- Components requiring initial code-level changes
For certain components, the Kerberos User Service property is not yet available. These components will require an initial code-level update before migration.
- Kafka_1_0 processors
- Kafka_2_0 processors
- CDPObjectStore processors
- ReportLineageToAtlas
- KuduLookupService
- LivySessionController
- Migration to Cloudera-specific components
As part of the migration, some components will transition to new, Cloudera-specific types. The Kerberos property migration should be performed as part of this transition.
- HortonworksSchemaRegistry => ClouderaSchemaRegistry
- Hive[3]ConnectionPool => ClouderaHiveConnectionPool
- PutHive[3]Streaming => PutClouderaHiveStreaming
- Scripted components
In NiFi 2, support for certain languages in scripted components has been removed. The affected languages are ECMAScript, Lua, Ruby, and Python. Cloudera recommends switching to Groovy or leveraging the new Python API for developing processors (see the example after the list below). The following components are impacted by this change:
- org.apache.nifi.processors.script.ExecuteScript
- org.apache.nifi.processors.script.InvokeScriptedProcessor
- org.apache.nifi.processors.script.ScriptedFilterRecord
- org.apache.nifi.processors.script.ScriptedPartitionRecord
- org.apache.nifi.processors.script.ScriptedTransformRecord
- org.apache.nifi.processors.script.ScriptedValidateRecord
- org.apache.nifi.lookup.script.ScriptedLookupService
- org.apache.nifi.record.script.ScriptedReader
- org.apache.nifi.record.script.ScriptedRecordSetWriter
- org.apache.nifi.record.sink.script.ScriptedRecordSink
- org.apache.nifi.lookup.script.SimpleScriptedLookupService
- org.apache.nifi.reporting.script.ScriptedReportingTask
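For flows that used Python through these scripted components, the NiFi 2 Python API is the forward path. Below is a minimal FlowFileTransform processor sketch following the pattern from the NiFi Python developer documentation. The nifiapi module is provided by the NiFi framework at runtime, so this code only runs once deployed into a NiFi 2 instance's Python extensions directory; the class name and behavior here are purely illustrative.

```python
from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult

class ReverseContent(FlowFileTransform):
    """Illustrative NiFi 2 Python processor that reverses FlowFile content."""

    class Java:
        implements = ["org.apache.nifi.python.processor.FlowFileTransform"]

    class ProcessorDetails:
        version = "1.0.0"
        description = "Reverses the bytes of the incoming FlowFile content."
        tags = ["example", "python"]

    def __init__(self, **kwargs):
        super().__init__()

    def transform(self, context, flowfile):
        # Read the incoming content and route the reversed bytes to success.
        contents = flowfile.getContentsAsBytes()
        return FlowFileTransformResult(relationship="success",
                                       contents=contents[::-1])
```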
- Custom components
If your NiFi environment includes custom components or NARs developed for NiFi 1.x, they are unlikely to be compatible with NiFi 2. To ensure compatibility, you must update your dependencies to align with NiFi 2 and rebuild your NARs using Java 21. This update is essential for a successful transition to the new version.
Flow controller-level breaking changes
- Transition from variables to parameters
Variables and the variable registry are removed in NiFi 2 due to their inherent limitations, such as requiring expression language support to reference a variable and the inability to store sensitive values. You can use parameter contexts instead, which have been significantly enhanced over recent years. For example, the addition of the Parameter Context Provider allows for sourcing parameter values from external stores (like HashiCorp Vault or cloud provider vaults).
This change is one of the most impactful in NiFi 2 and will require rework of existing data flows. However, it also presents an opportunity to optimize the organization of parameters, allowing you to split them into multiple parameter contexts and use inheritance when sharing parameters across different use cases.
Cloudera will provide automated tools within the NiFi Migration Tooling to assist with transitioning from variables to parameters.
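As a starting point for that rework, you can inventory Expression Language references in your flow definition before the upgrade. The following rough Python sketch lists candidate names; note that the ${...} syntax is also used for FlowFile attributes and EL functions, so the output needs manual review, and the file name is a placeholder.

```python
import gzip
import re

# NiFi Expression Language references look like ${name}; variables share this
# syntax with FlowFile attributes and EL functions, so review the output.
# Parameters, by contrast, are referenced as #{Parameter Name}.
EL_REFERENCE = re.compile(r"\$\{([^}]+)\}")

with gzip.open("flow.json.gz", "rt", encoding="utf-8") as f:
    flow_text = f.read()

for name in sorted(set(EL_REFERENCE.findall(flow_text))):
    print(name)
```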
- Removal of XML templates
The concept of XML templates is being phased out in NiFi 2. Historically, these templates were stored in memory as well as in the flow definition files (flow.xml.gz and flow.json.gz). This caused significant issues for some NiFi users, especially those managing numerous large templates with thousands of components. Removing templates from NiFi will enhance stability and reduce memory usage.
If you use templates in your NiFi 1.x clusters, you should export your existing templates as JSON definitions or version them into a NiFi Registry instance to prepare for this change. Using NiFi Registry is the recommended best practice for version control, sharing, and reusing flow definitions.
If your template is a process group:
- Drag and drop the template onto the canvas.
- Right-click it, and choose to export it as a flow definition (JSON file) or start version control in your NiFi Registry, if you have one configured.
If your template is not a process group (just a flow with components):
- Drag and drop a process group onto the canvas.
- Go into that process group and drag and drop your template within that process group.
- Go back to the parent process group containing your template and export it as a flow definition or start version control on it.
Cloudera will provide automated tools within the NiFi Migration Tooling to help manage the migration of templates.
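Before the upgrade, you may also want to back up every template in one pass using the NiFi 1.x REST API. A minimal Python sketch follows, with a placeholder URL and authentication omitted; the downloaded files are raw template XML intended only as a backup, and converting templates to JSON flow definitions still follows the steps above.

```python
import requests

NIFI = "https://nifi.example.com:8443/nifi-api"  # placeholder URL

session = requests.Session()
session.verify = "/path/to/ca.pem"  # plus whatever authentication your cluster needs

# List all templates known to this NiFi 1.x instance.
templates = session.get(f"{NIFI}/flow/templates").json()["templates"]

for entry in templates:
    template_id = entry["id"]
    name = entry["template"]["name"]
    # Download the XML definition of each template and save it locally.
    xml = session.get(f"{NIFI}/templates/{template_id}/download").content
    with open(f"{name}.xml", "wb") as f:
        f.write(xml)
    print(f"Backed up template: {name}")
```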
- Discontinuation of event driven thread pool
The Event Driven Scheduling Strategy, which was available for some processors in previous versions of NiFi, is being removed in NiFi 2. This feature was experimental and did not demonstrate significant performance improvements.
If you are using this scheduling strategy, you will need to update your components to use the Timer Driven scheduling strategy instead. You can identify components using the Event Driven strategy by searching for “event” in the NiFi search bar.
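You can also scan the flow definition file directly. The small Python sketch below assumes a flow.json.gz layout in which processors carry a schedulingStrategy field with the value EVENT_DRIVEN; the file name is a placeholder.

```python
import gzip
import json

def find_event_driven(node, path=""):
    """Print the path of every component scheduled with the Event Driven strategy."""
    if isinstance(node, dict):
        name = node.get("name", "?")
        if node.get("schedulingStrategy") == "EVENT_DRIVEN":
            print(f"{path}/{name}")
        for value in node.values():
            find_event_driven(value, f"{path}/{name}")
    elif isinstance(node, list):
        for item in node:
            find_event_driven(item, path)

with gzip.open("flow.json.gz", "rt", encoding="utf-8") as f:
    find_event_driven(json.load(f))
```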
NiFi framework breaking changes in NiFi 2
- Java 21 compatibility
- Java 21 is the minimum Java version required by NiFi 2. While most NiFi 1.x versions may work with Java 21, running NiFi 1.x on Java 21 for extended periods in preparation for the upgrade is not officially supported. The transition to Java 21 should happen as part of the upgrade to NiFi 2, so ensure that your environment is ready for Java 21 before initiating the upgrade.
- Transition from flow.xml.gz to flow.json.gz
- In NiFi 2, flow.xml.gz, which has long been the cornerstone of flow configuration storage, is completely phased out and replaced by flow.json.gz. While the two files have coexisted in many NiFi 1.x releases, NiFi 2 exclusively uses the JSON-based flow representation. Before upgrading, back up your flow.json.gz file; if any updates are needed during the upgrade, work only with this JSON file.
- Repository encryption removal
The ability to encrypt repositories has been removed in NiFi 2.
- NiFi Toolkit changes
The NiFi Toolkit will also undergo changes in NiFi 2, with some features being removed.
- Dynamic parameter retrieval with parameter providers
In NiFi 2, the values of parameters retrieved by parameter providers are no longer stored in the flow.json.gz file. Instead, these values are fetched on demand or during NiFi's startup and are only held in memory. This change enhances security and reduces the size of the flow definition file, but it also means that parameter values must be available from their source each time NiFi starts or the parameters are accessed. Ensure that the external sources for these parameters are reliable and accessible to avoid disruptions during NiFi operations. For more information, see NIFI-13560.
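Because parameter values are fetched at startup, a preflight check on the external store can save a failed restart. As an example, for a HashiCorp Vault-backed provider, the minimal sketch below calls Vault's /v1/sys/health endpoint, which returns HTTP 200 when the server is initialized, unsealed, and active; the address is a placeholder.

```python
import requests

VAULT_ADDR = "https://vault.example.com:8200"  # placeholder address

# Vault health check: 200 means active; other codes (429, 501, 503, ...)
# indicate standby, uninitialized, or sealed states.
resp = requests.get(f"{VAULT_ADDR}/v1/sys/health", timeout=5)
if resp.status_code == 200:
    print("Vault is ready; parameter providers can fetch values at startup.")
else:
    print(f"Vault not ready: HTTP {resp.status_code}")
```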