Upgrade

Upgrade Overview

This section describes the changes made in NiFi 1.0.0. Because NiFi 1.0.0 is a major release, review this section carefully and make sure that you understand the impact these changes may have on your existing dataflows.

Minimum JRE/JDK Support

Java 8 is now the minimum JRE/JDK supported.

Kerberos System Properties

SPNEGO and service principals for Kerberos are now established via separate system properties.

New SPNEGO properties:

  • nifi.kerberos.spnego.principal

  • nifi.kerberos.spnego.keytab.location

  • nifi.kerberos.spnego.authentication.expiration

New service properties:

  • nifi.kerberos.service.principal

  • nifi.kerberos.service.keytab.location

Removed properties:

  • nifi.kerberos.keytab.location

  • nifi.kerberos.authentication.expiration
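Before upgrading, it can help to check whether an existing nifi.properties file still sets either of the removed properties. The following is a minimal sketch of such a check, not an official migration tool; the function name and the sample file contents are hypothetical, and it only detects the removed keys rather than guessing which new property each value belongs in.

```python
# Property names removed in NiFi 1.0.0, as listed above.
REMOVED_KERBEROS_PROPERTIES = (
    "nifi.kerberos.keytab.location",
    "nifi.kerberos.authentication.expiration",
)


def find_removed_kerberos_properties(properties_text):
    """Return (line_number, key) pairs for removed keys that are still set."""
    hits = []
    for line_number, raw_line in enumerate(properties_text.splitlines(), start=1):
        line = raw_line.strip()
        # Skip blank lines and comments; only 'key=value' lines are inspected.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key in REMOVED_KERBEROS_PROPERTIES:
            hits.append((line_number, key))
    return hits


# Hypothetical excerpt from a pre-1.0.0 nifi.properties file.
sample = """\
# Kerberos settings
nifi.kerberos.service.principal=nifi/host.example.com@EXAMPLE.COM
nifi.kerberos.keytab.location=/etc/security/keytabs/nifi.keytab
nifi.kerberos.authentication.expiration=12 hours
"""

for line_number, key in find_removed_kerberos_properties(sample):
    print(f"line {line_number}: {key} was removed in NiFi 1.0.0")
```

Each reported key should be replaced by hand with the appropriate SPNEGO or service property from the lists above.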

DBCPConnectionPool Service

The Database Driver Jar Url property has been replaced by the Database Driver Location(s) property, which accepts a comma-separated list of URLs or local files and folders containing the driver JAR.

Existing processors that reference this service will be invalid until you have configured the new Database Driver Location(s) property.

MonitorDiskUsage Reporting Task

This standard reporting task has been simplified to let you specify a logical name, a directory, and a threshold to monitor. Previously it was tightly coupled to the internal flow file and content repositories in a manner that did not align with the pluggable nature of those repositories.

The new approach gives the user total control over what they want MonitorDiskUsage to monitor.

Connection/Relationship Default Back Pressure Settings

In previous versions, no back pressure settings were supplied by default. In NiFi 1.0.0, each new connection is created with default back pressure thresholds of 10,000 flow files and 1 GB of data.
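As a rough illustration of how the two defaults interact, the sketch below assumes that back pressure is applied as soon as either threshold is reached, and that 1 GB means 2^30 bytes. Neither detail is stated in this section, so verify both against your NiFi version. The function and constant names are hypothetical.

```python
# Defaults applied to new connections in NiFi 1.0.0, per the text above.
DEFAULT_OBJECT_THRESHOLD = 10_000         # flow files
DEFAULT_SIZE_THRESHOLD_BYTES = 1024 ** 3  # 1 GB, assumed here to mean 2**30 bytes


def back_pressure_engaged(queued_count, queued_bytes,
                          object_threshold=DEFAULT_OBJECT_THRESHOLD,
                          size_threshold_bytes=DEFAULT_SIZE_THRESHOLD_BYTES):
    """Assumed rule: back pressure applies once either threshold is reached."""
    return queued_count >= object_threshold or queued_bytes >= size_threshold_bytes


# A queue holding many small flow files reaches the count threshold first.
print(back_pressure_engaged(queued_count=10_000, queued_bytes=5_000_000))
# A queue holding a few very large flow files reaches the size threshold first.
print(back_pressure_engaged(queued_count=3, queued_bytes=2 * 1024 ** 3))
```

Flows that were built without back pressure in mind may behave differently once new connections carry these defaults, so upstream processors that previously ran unthrottled are worth reviewing.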

Multi-Tenant Authorization Model

  • The Authority Provider model has been replaced by a Multi-tenant Authorization model. Access privileges are now defined by policies that can be applied system-wide or to individual components. Details can be found in Multi-tenant Authorization in Hortonworks DataFlow Administration.

  • The system properties nifi.authority.provider.configuration.file and nifi.security.user.authority.provider have been replaced by nifi.authorizer.configuration.file and nifi.security.user.authorizer, respectively. Details on configuration can be found in Authorizer Configuration in Hortonworks DataFlow Administration.

  • You can convert NiFi 0.7.0 authorized users and roles to the new authorization model. An existing authorized-users.xml file can be referenced in the authorizers.xml "Legacy Authorized Users File" property to automatically generate users and authorizations. Details on configuration can be found in Authorizers.xml Setup in Hortonworks DataFlow Administration.
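The two property renames described above can be sketched as a simple key rewrite over the text of nifi.properties. This is an illustrative helper with hypothetical names, not a supported migration path. It carries the old values over unchanged, and those values refer to the old Authority Provider model, so they will most likely need to be edited by hand to point at the new authorizers configuration.

```python
# Old key -> new key, as described in the list above.
RENAMED_PROPERTIES = {
    "nifi.authority.provider.configuration.file": "nifi.authorizer.configuration.file",
    "nifi.security.user.authority.provider": "nifi.security.user.authorizer",
}


def rename_authorizer_properties(properties_text):
    """Rewrite old keys to their 1.0.0 names; values are carried over unchanged."""
    out = []
    for raw_line in properties_text.splitlines():
        stripped = raw_line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key, value = stripped.split("=", 1)
            new_key = RENAMED_PROPERTIES.get(key.strip())
            if new_key is not None:
                out.append(f"{new_key}={value}")
                continue
        # Comments, blank lines, and unrelated properties pass through untouched.
        out.append(raw_line)
    return "\n".join(out)


# Hypothetical excerpt from a pre-1.0.0 nifi.properties file.
old_text = """\
nifi.authority.provider.configuration.file=./conf/authority-providers.xml
nifi.security.user.authority.provider=file-provider
nifi.web.http.port=8080"""

print(rename_authorizer_properties(old_text))
```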

HTTP(S) Site-to-Site

  • The HTTP(S) protocol is now supported in Site-to-Site as an underlying transport protocol.

  • The HTTP(S) protocol is enabled by default (nifi.remote.input.http.enabled=true). Configuration details can be found in Site-to-Site Properties in Hortonworks DataFlow Administration.

    Of note:

    • With both socket and HTTP protocols supported, nifi.remote.input.socket.host has been renamed to nifi.remote.input.host.

    • nifi.remote.input.secure is now set to false by default.

Zero-Master Clustering

The master/slave clustering model has been replaced by a Zero-Master Clustering paradigm. Each node in a NiFi cluster performs the same tasks on the data, but each operates on a different set of data. A DataFlow manager can now interact with the NiFi cluster through the UI of any node.

ZooKeeper elects a single node as the Cluster Coordinator and also handles failover. All cluster nodes report heartbeat and status information to the Cluster Coordinator, which is responsible for disconnecting and connecting nodes. Additionally, every cluster has one Primary Node, also elected by ZooKeeper.

Configuration details can be found in the Clustering Configuration, the Cluster Common Properties, the Cluster Node Properties, and the ZooKeeper Properties sections of Hortonworks DataFlow Administration.

Of note for your upgrade:

  • NiFi Cluster Manager (NCM) configuration and properties are no longer relevant and have been removed.

  • The following properties should be set on each node:

    • nifi.web.http.port=<node port>

    • nifi.cluster.is.node=true

    • nifi.cluster.node.address=<fully qualified hostname of the node>

    • nifi.cluster.node.protocol.port=<node protocol port>

    • nifi.state.management.embedded.zookeeper.start=true

    • nifi.state.management.provider.cluster=zk-provider

    • nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties

    • nifi.zookeeper.connect.string=<A comma-separated list of host:port pairs to connect to ZooKeeper. For example, my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181>

  • Coordinated dataflow selection across cluster nodes. During startup, a cluster coordinator is selected at random and manages the distribution of the dataflow across all nodes. You should set the following two properties to ensure that the cluster coordinator and the other nodes have time to select the correct dataflow:

    • nifi.cluster.flow.election.max.wait.time=5 mins

    • nifi.cluster.flow.election.max.candidates=<number of NiFi nodes in the cluster>

  • Embedded ZooKeeper setup

    • The zookeeper.properties file needs to be populated with a list of each node's embedded ZooKeeper server. The servers are specified in the form of server.1, server.2, to server.n. Each of these servers is configured as <hostname>:<quorum port>[:<leader election port>]. For example, server.1=nifi-node1-hostname:2888:3888.

    • The zookeeper.properties file has a property named dataDir, which is set to ./state/zookeeper by default. For each node, create a file named myid and place it in this directory. The file should contain only the index of that node's server, that is, the <number> from its server.<number> entry. Configuration details can be found in Embedded ZooKeeper Server in Hortonworks DataFlow Administration.

  • State Management – In the state-management.xml file, set the “Connect String” property to the same list of ZooKeeper host:port pairs used for the nifi.zookeeper.connect.string property value.

  • Secure Clustered Environment – The identities for each node must be specified in the authorizers.xml file. The authorization policies required for the nodes to communicate will then be created during startup. Details on configuration can be found in Authorizers.xml Setup in Hortonworks DataFlow Administration.
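The embedded ZooKeeper settings above have to agree across files: the server.N entries in zookeeper.properties, the myid file on each node, and the connect string used in both nifi.properties and state-management.xml. The sketch below derives all three from a single list of hostnames so that they stay consistent. It assumes that every node runs an embedded ZooKeeper server, that all nodes use the same quorum and leader election ports (2888 and 3888, taken from the example above), and that the client port is 2181. The function names and hostnames are hypothetical.

```python
def zookeeper_server_entries(hostnames, quorum_port=2888, election_port=3888):
    """Build 'server.N=<hostname>:<quorum port>:<leader election port>' lines."""
    return [
        f"server.{index}={host}:{quorum_port}:{election_port}"
        for index, host in enumerate(hostnames, start=1)
    ]


def myid_contents(hostnames):
    """Map each hostname to the text of its myid file (its server index)."""
    return {host: str(index) for index, host in enumerate(hostnames, start=1)}


def connect_string(hostnames, client_port=2181):
    """Build the comma-separated host:port list for nifi.zookeeper.connect.string."""
    return ",".join(f"{host}:{client_port}" for host in hostnames)


# Hypothetical three-node cluster; replace with the real hostnames.
nodes = ["nifi-node1-hostname", "nifi-node2-hostname", "nifi-node3-hostname"]

# Lines to add to zookeeper.properties on every node.
print("\n".join(zookeeper_server_entries(nodes)))

# Contents of the myid file in each node's dataDir (./state/zookeeper by default).
for host, myid in myid_contents(nodes).items():
    print(f"{host}: myid contains {myid}")

# Value for nifi.zookeeper.connect.string and the "Connect String" property.
print("nifi.zookeeper.connect.string=" + connect_string(nodes))
```

Keeping the hostname order fixed matters here, because the position of a host in the list determines both its server.N entry and the number written to its myid file.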

Secured ZooKeeper

  • The username and password mechanism to provide ZooKeeper authentication is no longer supported. As a result, the “Username” and “Password” properties in the state-management.xml file have been removed.

  • The “Access Control” property in the state-management.xml file is now set to “Open” by default. It should be changed to “CreatorOnly” when ZooKeeper is secured via Kerberos.

QueryDatabaseTable Processor

The ‘SQL Pre-processing Strategy’ property has been replaced by the ‘Database Type’ property. This property sets the type of database for generating database-specific code. Property values include ‘Generic’ (default) and ‘Oracle’ (for custom SQL clauses).

TailFile Processor

TailFile originally stored state in a local file. State management was added in 0.5.0, along with support for reading the local state and moving it into the state manager. In 1.0.0, this automatic migration from the old state mechanism has been removed.

If you are upgrading from a pre-0.5.0 version of NiFi, first upgrade to a 0.x version that is 0.5.0 or later, and then upgrade to 1.0.0, so that existing TailFile processors do not lose their state.

LDAP referral strategy ‘IGNORE’ bug fix

Previously, errors occurred when the Referral Strategy was set to ‘IGNORE’ because a typo in the code caused it to accept ‘INGORE’ instead. The login-identity-providers.xml file should now be configured with the intended value of ‘IGNORE’ for the Referral Strategy property.
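If an existing login-identity-providers.xml file was configured with the misspelled value as a workaround, it needs to be corrected after the upgrade. The sketch below shows one way to locate and fix that value. It assumes the file uses property elements with a name attribute, as in the hypothetical sample. Because Python's ElementTree parser discards XML comments by default, treat the output as a preview of the change rather than a drop-in replacement for the file.

```python
import xml.etree.ElementTree as ET


def fix_referral_strategy(xml_text):
    """Return the XML with any 'INGORE' Referral Strategy value corrected."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        is_referral = prop.get("name") == "Referral Strategy"
        if is_referral and (prop.text or "").strip() == "INGORE":
            prop.text = "IGNORE"
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed provider definition for illustration only.
sample = """<loginIdentityProviders>
  <provider>
    <identifier>ldap-provider</identifier>
    <property name="Referral Strategy">INGORE</property>
  </provider>
</loginIdentityProviders>"""

print(fix_referral_strategy(sample))
```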