Upgrade Overview
This section describes the changes introduced in NiFi 1.0.0. Because NiFi 1.0.0 is a major release, you should review this section carefully and ensure that you understand the impact that these changes may have on your existing dataflows.
Minimum JRE/JDK Support
Java 8 is now the minimum JRE/JDK supported.
Kerberos System Properties
SPNEGO and service principals for Kerberos are now established via separate system properties.
New SPNEGO properties:
nifi.kerberos.spnego.principal
nifi.kerberos.spnego.keytab.location
nifi.kerberos.spnego.authentication.expiration
New service properties:
nifi.kerberos.service.principal
nifi.kerberos.service.keytab.location
Removed properties:
nifi.kerberos.keytab.location
nifi.kerberos.authentication.expiration
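As a rough aid when reviewing an existing configuration, the following Python sketch reports which of the removed Kerberos properties still appear in the text of a properties file. The function name and the sample values are hypothetical, and the parsing assumes simple key=value lines with # comments; it is not part of NiFi.

```python
# Properties removed in NiFi 1.0.0, taken from the list above.
REMOVED_KERBEROS_PROPERTIES = {
    "nifi.kerberos.keytab.location",
    "nifi.kerberos.authentication.expiration",
}


def find_removed_kerberos_properties(properties_text):
    """Return the removed Kerberos keys that still appear in the given text."""
    found = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        # Assumes simple key=value lines (hypothetical helper, not a NiFi API).
        key = stripped.split("=", 1)[0].strip()
        if key in REMOVED_KERBEROS_PROPERTIES:
            found.append(key)
    return found
```

Any key this returns would need to be replaced by the corresponding SPNEGO or service property listed above.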
DBCPConnectionPool Service
The Database Driver Jar Url property has been replaced by the Database Driver Location(s) property, which accepts a comma-separated list of URLs or local files and folders containing the driver JAR.
Existing processors that reference this service will be invalid until you have configured the new Database Driver Location(s) property.
MonitorDiskUsage Reporting Task
This standard reporting task has been simplified to let you specify a logical name, a directory, and a threshold to monitor. Previously it was tightly coupled to the internal flow file and content repositories in a manner that did not align to the pluggable nature of those repositories.
The new approach gives the user total control over what they want MonitorDiskUsage to monitor.
Connection/Relationship Default Back Pressure Settings
In previous versions, no back pressure settings were supplied by default. In NiFi 1.0.0, new connections are created with default back pressure thresholds of 10,000 flow files and 1 GB of data.
Multi-Tenant Authorization Model
The Authority Provider model has been replaced by a Multi-Tenant Authorization model. Access privileges are now defined by policies that can be applied system-wide or to individual components. Details can be found in Multi-tenant Authorization in Hortonworks DataFlow Administration.
The system properties nifi.authority.provider.configuration.file and nifi.security.user.authority.provider have been replaced by nifi.authorizer.configuration.file and nifi.security.user.authorizer, respectively. Details on configuration can be found in Authorizer Configuration in Hortonworks DataFlow Administration.
You can convert NiFi 0.7.0 authorized users and roles to the new authorization model. An existing authorized-users.xml file can be referenced in the authorizers.xml "Legacy Authorized Users File" property to automatically generate users and authorizations. Details on configuration can be found in Authorizers.xml Setup in Hortonworks DataFlow Administration.
HTTP(S) Site-to-Site
HTTP(S) protocol is now supported in Site-to-Site as an underlying transport protocol.
HTTP(S) protocol is enabled by default (nifi.remote.input.http.enabled=true). Configuration details can be found in Site-to-Site Properties in Hortonworks DataFlow Administration.
Of note:
With both socket and HTTP protocols supported, nifi.remote.input.socket.host has been renamed to nifi.remote.input.host.
nifi.remote.input.secure is now set to false by default.
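The property rename noted above can be applied to an existing properties file with a small script. The following Python sketch is one way to do it, assuming simple key=value lines; the function name is hypothetical and the value is carried over unchanged.

```python
OLD_KEY = "nifi.remote.input.socket.host"
NEW_KEY = "nifi.remote.input.host"


def rename_site_to_site_host(properties_text):
    """Return the properties text with the old host key renamed."""
    migrated = []
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        # Only rewrite an exact key match; comments and other keys pass through.
        if sep and key.strip() == OLD_KEY:
            migrated.append(NEW_KEY + "=" + value)
        else:
            migrated.append(line)
    # Note: lines are rejoined with "\n", so a trailing newline is not preserved.
    return "\n".join(migrated)
```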
Zero-Master Clustering
The master/slave clustering model has been replaced by a Zero-Master Clustering paradigm. Each node in a NiFi cluster performs the same tasks on the data, but each operates on a different set of data. A DataFlow manager can now interact with the NiFi cluster through the UI of any node.
ZooKeeper elects a single node as the Cluster Coordinator and also handles failover. All cluster nodes report heartbeat and status information to the Cluster Coordinator, which is responsible for disconnecting and connecting nodes. Additionally, every cluster has one Primary Node, also elected by ZooKeeper.
Configuration details can be found in the Clustering Configuration, the Cluster Common Properties, the Cluster Node Properties, and the ZooKeeper Properties sections of Hortonworks DataFlow Administration.
Of note for your upgrade:
NiFi Cluster Manager (NCM) configuration and properties are no longer relevant and have been removed.
The following properties should be set on each node:
nifi.web.http.port=<node port>
nifi.cluster.is.node=true
nifi.cluster.node.address=<fully qualified hostname of the node>
nifi.cluster.node.protocol.port=<node protocol port>
nifi.state.management.embedded.zookeeper.start=true
nifi.state.management.provider.cluster=zk-provider
nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
nifi.zookeeper.connect.string=<A comma-separated list of host:port pairs to connect to ZooKeeper. For example, my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2183>
Coordinated dataflow selection across cluster nodes. During startup, a cluster coordinator is selected at random, and manages the distribution of the dataflow across all nodes. You should set the following two properties to ensure that the cluster coordinator and other nodes have time to select the correct dataflow:
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=<number of NiFi nodes in the cluster>
Embedded ZooKeeper setup
The zookeeper.properties file needs to be populated with a list of each node's embedded ZooKeeper server. The servers are specified in the form of server.1, server.2, to server.n. Each of these servers is configured as <hostname>:<quorum port>[:<leader election port>]. For example, server.1=nifi-node1-hostname:2888:3888.
The zookeeper.properties file has a property named dataDir, which is set to ./state/zookeeper by default. For each node, create a file named myid and place it in this directory. The contents of this file should be the index of the server as specified by the server.<number>. Configuration details can be found in Embedded ZooKeeper Server in Hortonworks DataFlow Administration.
State Management – In the state-management.xml file, set the “Connect String” property to the same list of ZooKeeper host:port pairs used for the nifi.zookeeper.connect.string property value.
Secure Clustered Environment – The identities for each node must be specified in the authorizers.xml file. The authorization policies required for the nodes to communicate will then be created during startup. Details on configuration can be found in Authorizers.xml Setup in Hortonworks DataFlow Administration.
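For the embedded ZooKeeper setup described above, the server.N entries and each node's myid value follow mechanically from an ordered list of node hostnames. The Python sketch below illustrates that relationship. The function names are hypothetical, and the default ports 2888 and 3888 are taken from the server.1 example above; adjust them to your environment.

```python
def zookeeper_server_entries(hostnames, quorum_port=2888, election_port=3888):
    """Build the server.N lines for zookeeper.properties, numbering from 1."""
    return [
        "server.{}={}:{}:{}".format(index, host, quorum_port, election_port)
        for index, host in enumerate(hostnames, start=1)
    ]


def myid_contents(hostnames, this_host):
    """Return the text for this node's myid file: its server.N index."""
    # The index must match the node's position in the server.N list.
    return str(hostnames.index(this_host) + 1)
```

The same ordered hostname list should be used on every node so that each myid value matches its server.N entry.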
Secured ZooKeeper
The username and password mechanism to provide ZooKeeper authentication is no longer supported. As a result, the “Username” and “Password” properties in the state-management.xml file have been removed.
The “Access Control” property in the state-management.xml file is now set to “Open” by default. It should be changed to “CreatorOnly” when ZooKeeper is secured via Kerberos.
QueryDatabaseTable Processor
The ‘SQL Pre-processing Strategy’ property has been replaced by the ‘Database Type’ property. This property sets the type of database for generating database-specific code. Property values include ‘Generic’ (default) and ‘Oracle’ (for custom SQL clauses).
TailFile Processor
TailFile originally stored state in a local file. State management was added in 0.5.0, along with support for reading in the local state and moving it into the state manager. In 1.0.0, this auto-migration from the old state mechanism has been removed.
If you are upgrading from a pre-0.5.0 version of NiFi, first upgrade to version 0.5.0 or later (but earlier than 1.0.0), and then upgrade to 1.0.0, so that existing TailFile processors do not lose state.
LDAP referral strategy ‘IGNORE’ bug fix
Errors occurred if the Referral Strategy was set to ‘IGNORE’, because a typo in the code caused it to accept ‘INGORE’ instead. The login-identity-providers.xml file should now be configured with the intended value of ‘IGNORE’ for the Referral Strategy property.
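If an existing login-identity-providers.xml was edited to use the misspelled value as a workaround, a quick check before upgrading may help. The Python sketch below assumes the file stores settings as property elements with a name attribute and no XML namespace; that structure and the function name are assumptions, so verify them against your own file.

```python
import xml.etree.ElementTree as ET


def has_misspelled_referral_strategy(xml_text):
    """Return True if any 'Referral Strategy' property is set to 'INGORE'."""
    root = ET.fromstring(xml_text)
    # Assumed layout: <property name="Referral Strategy">VALUE</property>
    for prop in root.iter("property"):
        if prop.get("name") == "Referral Strategy":
            if (prop.text or "").strip() == "INGORE":
                return True
    return False
```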