Configuring State Providers
When a component decides to store or retrieve state, it does so by providing a "Scope" - either Node-local or Cluster-wide. The mechanism that is used to store and retrieve this state is then determined based on this Scope, as well as the configured State Providers. The nifi.properties file contains three different properties that are relevant to configuring these State Providers.
Property | Description |
nifi.state.management.configuration.file | Specifies the external XML file used to configure the local and/or cluster-wide State Providers. This XML file may contain configurations for multiple providers. |
nifi.state.management.provider.local | Provides the identifier of the local State Provider configured in this XML file. |
nifi.state.management.provider.cluster | Provides the identifier of the cluster-wide State Provider configured in this XML file. |
This XML file consists of a top-level state-management element, which has one or more local-provider and zero or more cluster-provider elements. Each of these elements contains an id element that specifies the identifier that can be referenced in the nifi.properties file, as well as a class element that specifies the fully-qualified class name used to instantiate the State Provider. Finally, each of these elements may have zero or more property elements. Each property element has a name attribute that holds the name of a property supported by the State Provider. The textual content of the property element is the value of the property.
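The structure described above can be sketched as follows. This is an illustrative example, not the file shipped with NiFi: the provider identifiers (local-provider, zk-provider), the Directory property name, the relative directory path, and the fully-qualified class names are assumptions that should be checked against your installation, and the root element name simply follows the text above. The helper load_providers is hypothetical and only shows how the id, class, and property elements relate to one another.

```python
import xml.etree.ElementTree as ET

# Illustrative state-management.xml content. The identifiers, package names,
# the "Directory" property name, and all values are assumptions for this sketch.
SAMPLE_XML = """
<state-management>
    <local-provider>
        <id>local-provider</id>
        <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
        <property name="Directory">./state/local</property>
    </local-provider>
    <cluster-provider>
        <id>zk-provider</id>
        <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
        <property name="Connect String">my-zk-server1:2181,my-zk-server2:2181</property>
        <property name="Root Node">/nifi/example-team/production</property>
        <property name="Access Control">Open</property>
    </cluster-provider>
</state-management>
"""


def load_providers(xml_text):
    """Return a dict mapping each provider id to its kind, class, and properties."""
    root = ET.fromstring(xml_text.strip())
    providers = {}
    for kind in ("local-provider", "cluster-provider"):
        for element in root.findall(kind):
            provider_id = element.findtext("id", default="").strip()
            providers[provider_id] = {
                "kind": kind,
                "class": element.findtext("class", default="").strip(),
                # The name attribute is the property name; the text is its value.
                "properties": {
                    prop.get("name"): (prop.text or "").strip()
                    for prop in element.findall("property")
                },
            }
    return providers


if __name__ == "__main__":
    for provider_id, details in load_providers(SAMPLE_XML).items():
        print(provider_id, details["kind"], details["properties"])
```

The id values in this sketch are what the nifi.properties file would reference; the class and property values are passed to the provider implementation.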
Once these State Providers have been configured in the state-management.xml file (or whatever file is configured), those Providers may be referenced by their identifiers.
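As a sketch of how those identifiers tie the two files together, the following Python snippet reads the three properties from nifi.properties-style text and reports any referenced identifier that is not among the configured provider IDs. The property values, the identifiers, and the helper names (parse_properties, unresolved_provider_ids) are assumptions chosen for illustration; this is not how NiFi itself performs the lookup.

```python
# Illustrative nifi.properties excerpt. The values on the right-hand side are
# assumptions for this sketch, not values taken from the guide.
SAMPLE_PROPERTIES = """
# State management
nifi.state.management.configuration.file=./conf/state-management.xml
nifi.state.management.provider.local=local-provider
nifi.state.management.provider.cluster=zk-provider
"""

PROVIDER_KEYS = (
    "nifi.state.management.provider.local",
    "nifi.state.management.provider.cluster",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def unresolved_provider_ids(props, configured_ids):
    """Return {property key: identifier} for identifiers not found in the XML file."""
    return {
        key: props.get(key, "")
        for key in PROVIDER_KEYS
        if props.get(key, "") not in configured_ids
    }


if __name__ == "__main__":
    props = parse_properties(SAMPLE_PROPERTIES)
    # Pretend the XML file only defines a local provider.
    print(unresolved_provider_ids(props, {"local-provider"}))
```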
By default, the Local State Provider is configured to be a WriteAheadLocalStateProvider that persists the data to the $NIFI_HOME/state/local directory. The default Cluster State Provider is configured to be a ZooKeeperStateProvider. The default ZooKeeper-based provider must have its Connect String property populated before it can be used. If multiple NiFi instances will use the same ZooKeeper instance, it is also advisable to change the value of the Root Node property so that each instance stores its state under a distinct path. For instance, one might set the value to /nifi/<team name>/production. A Connect String takes the form of comma-separated <host>:<port> tuples, such as my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181. If a port is not specified for a host, the ZooKeeper default of 2181 is assumed.
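The Connect String format, including the default port rule, can be illustrated with a short Python sketch. The function name parse_connect_string is hypothetical, and the code only models the format described above; it is not NiFi's or ZooKeeper's own parsing logic and does not handle IPv6 literals.

```python
DEFAULT_ZK_PORT = 2181  # ZooKeeper default port, as stated in the text above


def parse_connect_string(connect_string):
    """Split a comma-separated list of <host>:<port> entries into (host, port) tuples.

    Entries without a port fall back to DEFAULT_ZK_PORT.
    """
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas or surrounding whitespace
        host, separator, port = entry.partition(":")
        if separator and port:
            servers.append((host, int(port)))
        else:
            servers.append((host, DEFAULT_ZK_PORT))
    return servers


if __name__ == "__main__":
    print(parse_connect_string("my-zk-server1:2181,my-zk-server2,my-zk-server3:2182"))
```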
When adding data to ZooKeeper, there are two options for Access Control: Open and CreatorOnly. If the Access Control property is set to Open, then anyone is allowed to log into ZooKeeper and have full permissions to see, change, delete, or administer the data. If CreatorOnly is specified, then only the user that created the data is allowed to read, change, delete, or administer the data. In order to use the CreatorOnly option, NiFi must provide some form of authentication.
If NiFi is configured to run in standalone mode, the cluster-provider element need not be populated in the state-management.xml file and will be ignored if it is populated. However, the local-provider element must always be present and populated. Additionally, if NiFi is run in a cluster, each node must also have the cluster-provider element present and properly configured. Otherwise, NiFi will fail to start up.
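The presence rules above can be summarized in a small Python sketch. The function startup_problems and its message strings are hypothetical; the code models the rules as stated here and is not NiFi's actual startup check.

```python
def startup_problems(clustered, provider_kinds):
    """Return a list of configuration problems for the given mode.

    clustered      -- True if the instance runs as a cluster node
    provider_kinds -- set of element names found in the XML file,
                      e.g. {"local-provider", "cluster-provider"}
    """
    problems = []
    # A local provider is required in every mode.
    if "local-provider" not in provider_kinds:
        problems.append("local-provider element is missing")
    # A cluster provider is only required (and only used) in a cluster.
    if clustered and "cluster-provider" not in provider_kinds:
        problems.append("cluster-provider element is required in a cluster")
    return problems


if __name__ == "__main__":
    print(startup_problems(clustered=False, provider_kinds={"local-provider"}))
    print(startup_problems(clustered=True, provider_kinds={"local-provider"}))
```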
Although few properties need to be configured for these providers, they were externalized into a separate state-management.xml file rather than placed in the nifi.properties file. Different implementations may require different properties, and it is easier to maintain and understand that configuration in an XML-based file than to mix Provider properties in with all of the other NiFi framework-specific properties.
Note that if Processors and other components save state using the Clustered scope, the Local State Provider will be used if the instance is a standalone instance (not in a cluster) or is disconnected from the cluster. This also means that if a standalone instance is later migrated to a cluster, the previously stored state will no longer be available, because the component will begin using the Clustered State Provider instead of the Local State Provider.
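The fallback behavior can be modeled with a minimal Python sketch. The function provider_for and the string values "local" and "cluster" are hypothetical names used for illustration; the sketch only restates the rule above and does not reflect NiFi's internal implementation.

```python
def provider_for(scope, clustered, connected):
    """Return which State Provider ("local" or "cluster") serves a request.

    scope     -- "local" or "cluster", as requested by the component
    clustered -- True if the instance is configured as a cluster node
    connected -- True if the node is currently connected to the cluster
    """
    if scope == "local":
        return "local"
    if scope == "cluster":
        # Cluster-scoped state falls back to the local provider when the
        # instance is standalone or disconnected from the cluster.
        return "cluster" if clustered and connected else "local"
    raise ValueError(f"unknown scope: {scope!r}")


if __name__ == "__main__":
    # Standalone instance: cluster-scoped state is kept by the local provider.
    print(provider_for("cluster", clustered=False, connected=False))
    # Same flow after joining a cluster: the cluster provider is consulted,
    # so state written earlier through the local provider is not found.
    print(provider_for("cluster", clustered=True, connected=True))
```

The two calls in the usage example correspond to the migration case: the same component is served by different providers before and after the instance joins a cluster.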