Configuring State Providers
When a component decides to store or retrieve state, it does so by providing a "Scope" - either Node-local or Cluster-wide. The mechanism that is used to store and retrieve this state is then determined based on this Scope, as well as the configured State Providers. The nifi.properties file contains three different properties that are relevant to configuring these State Providers.
The first is the nifi.state.management.configuration.file property, which specifies an external XML file used to configure the local and/or cluster-wide State Providers. This XML file may contain configurations for multiple providers.
The nifi.state.management.provider.local property provides the identifier of the local State Provider configured in this XML file.
Similarly, the nifi.state.management.provider.cluster property provides the identifier of the cluster-wide State Provider configured in this XML file.
This XML file consists of a top-level state-management element, which has one or more local-provider and zero or more cluster-provider elements. Each of these elements then contains an id element that is used to specify the identifier that can be referenced in the nifi.properties file, as well as a class element that specifies the fully-qualified class name to use in order to instantiate the State Provider. Finally, each of these elements may have zero or more property elements. Each property element has an attribute, name, that is the name of the property that the State Provider supports. The textual content of the property element is the value of the property.
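As a rough illustration of the structure described above, a file with one local and one cluster-wide provider might look like the following sketch. The identifiers, the com.example class names, and the property names are hypothetical placeholders, not values taken from a NiFi distribution. The element names follow the description in the text; the file shipped with your NiFi version remains the authoritative reference for exact spelling.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: ids, class names, and property names below are placeholders. -->
<state-management>
    <local-provider>
        <!-- Identifier that nifi.properties can reference -->
        <id>my-local-provider</id>
        <!-- Fully-qualified class used to instantiate the provider (hypothetical) -->
        <class>com.example.state.MyLocalStateProvider</class>
        <!-- name attribute = property name; text content = property value -->
        <property name="Example Property">example value</property>
    </local-provider>
    <cluster-provider>
        <id>my-cluster-provider</id>
        <class>com.example.state.MyClusterStateProvider</class>
        <property name="Another Property">another value</property>
    </cluster-provider>
</state-management>
```

The values of the id elements are what the provider properties in nifi.properties refer to when selecting a provider.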
Once these State Providers have been configured in the state-management.xml file (or whatever file is configured), those Providers may be referenced by their identifiers.
By default, the Local State Provider is configured to be a WriteAheadLocalStateProvider that persists the data to the $NIFI_HOME/state/local directory. The default Cluster State Provider is configured to be a ZooKeeperStateProvider. The default ZooKeeper-based provider must have its Connect String property populated before it can be used. It is also advisable, if multiple NiFi instances will use the same ZooKeeper instance, that the value of the Root Node property be changed. For instance, one might set the value to /nifi/<team name>/production. A Connect String takes the form of comma-separated <host>:<port> tuples, such as my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181. In the event a port is not specified for any of the hosts, the ZooKeeper default of 2181 is assumed.
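A cluster-provider entry along the following lines would satisfy the requirements above. This is a sketch, not a copy of the shipped file: the identifier zk-provider, the host names, and the team name in the Root Node are illustrative, and the fully-qualified class name is an assumption about the package layout that should be checked against your distribution.

```xml
<cluster-provider>
    <!-- Illustrative identifier; any value works as long as nifi.properties references it -->
    <id>zk-provider</id>
    <!-- Assumed package path for the ZooKeeper-based provider; verify in your distribution -->
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <!-- Comma-separated host:port tuples; a host without a port is assumed to use 2181 -->
    <property name="Connect String">my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181</property>
    <!-- Changed so that several NiFi instances can share one ZooKeeper instance -->
    <property name="Root Node">/nifi/my-team/production</property>
</cluster-provider>
```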
When adding data to ZooKeeper, there are two options for Access Control: Open and CreatorOnly. If the Access Control property is set to Open, then anyone is allowed to log into ZooKeeper and have full permissions to see, change, delete, or administer the data. If CreatorOnly is specified, then only the user that created the data is allowed to read, change, delete, or administer the data. In order to use the CreatorOnly option, NiFi must provide some form of authentication. See the ZooKeeper Access Control section below for more information on how to configure authentication.
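For example, restricting access to the creator could be a one-line change in the ZooKeeper-based provider's configuration. The fragment below is a sketch that assumes Access Control is set like any other property element of the provider.

```xml
<!-- Placed inside the cluster-provider element of the ZooKeeper-based provider. -->
<!-- Use Open instead of CreatorOnly to give anyone full permissions on the data. -->
<property name="Access Control">CreatorOnly</property>
```

With CreatorOnly in place, NiFi also needs authentication configured for ZooKeeper, as noted above.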
If NiFi is configured to run in standalone mode, the cluster-provider element need not be populated in the state-management.xml file and will be ignored if it is populated. However, the local-provider element must always be present and populated. Additionally, if NiFi is run in a cluster, each node must also have the cluster-provider element present and properly configured. Otherwise, NiFi will fail to start up.
Although these providers have few properties to configure, the properties were externalized into a separate state-management.xml file, rather than being placed in the nifi.properties file, because different implementations may require different properties. An XML-based file is also easier to maintain and understand than one that mixes the properties of the Provider with all of the other NiFi framework-specific properties.
Note that if Processors and other components save state using the Clustered scope, the Local State Provider will be used if the instance is standalone (not in a cluster) or is disconnected from the cluster. This also means that if a standalone instance is later migrated into a cluster, that state will no longer be available, because the component will begin using the Clustered State Provider instead of the Local State Provider.