Configuring State Providers
When a component decides to store or retrieve state, it does so by providing a "Scope" - either Node-local or Cluster-wide. The mechanism that is used to store and retrieve this state is then determined based on this Scope, as well as the configured State Providers. The nifi.properties file contains three different properties that are relevant to configuring these State Providers.
Property | Description |
nifi.state.management.configuration.file | The external XML file used to configure the local and/or cluster-wide State Providers. This XML file may contain configurations for multiple providers. |
nifi.state.management.provider.local | The identifier of the local State Provider configured in this XML file. |
nifi.state.management.provider.cluster | The identifier of the cluster-wide State Provider configured in this XML file. |
This XML file consists of a top-level state-management element, which has one or more local-provider and zero or more cluster-provider elements. Each of these elements then contains an id element that is used to specify the identifier that can be referenced in the nifi.properties file, as well as a class element that specifies the fully-qualified class name to use in order to instantiate the State Provider. Finally, each of these elements may have zero or more property elements. Each property element has an attribute, name, that is the name of the property that the State Provider supports. The textual content of the property element is the value of the property.
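As a rough illustration of the element layout described above, a single provider entry might look like the following fragment. This is a sketch only: the id value, the class package name, and the Directory property name are assumptions based on a typical installation and should be checked against the file shipped with your NiFi version.

```xml
<!-- Illustrative fragment: one provider element inside the state management file. -->
<local-provider>
    <!-- Identifier that nifi.state.management.provider.local would reference (assumed value) -->
    <id>local-provider</id>
    <!-- Fully-qualified class name used to instantiate the provider (package name assumed) -->
    <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
    <!-- The name attribute selects the property; the element text is its value (property name assumed) -->
    <property name="Directory">./state/local</property>
</local-provider>
```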
Once these State Providers have been configured in the state-management.xml file (or whatever file is configured), those Providers may be referenced by their identifiers.
By default, the Local State Provider is configured to be a WriteAheadLocalStateProvider that persists the data to the $NIFI_HOME/state/local directory. The default Cluster State Provider is configured to be a ZooKeeperStateProvider. The default ZooKeeper-based provider must have its Connect String property populated before it can be used. It is also advisable, if multiple NiFi instances will use the same ZooKeeper instance, that the value of the Root Node property be changed. For instance, one might set the value to /nifi/<team name>/production. A Connect String takes the form of comma-separated <host>:<port> tuples, such as my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181. In the event a port is not specified for any of the hosts, the ZooKeeper default of 2181 is assumed.
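A ZooKeeper-based cluster provider with these two properties populated might be sketched as follows. The id value and the class package name are assumptions, and team-a is a hypothetical team name standing in for <team name>; the host names are the ones used in the example above.

```xml
<!-- Illustrative fragment: cluster-wide provider backed by ZooKeeper. -->
<cluster-provider>
    <!-- Identifier that nifi.state.management.provider.cluster would reference (assumed value) -->
    <id>zk-provider</id>
    <!-- Package name assumed; verify against your NiFi distribution -->
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <!-- Comma-separated host:port tuples; 2181 is assumed for any host without a port -->
    <property name="Connect String">my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181</property>
    <!-- Changed from the default so that instances sharing this ZooKeeper do not collide -->
    <property name="Root Node">/nifi/team-a/production</property>
</cluster-provider>
```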
When adding data to ZooKeeper, there are two options for Access Control: Open and CreatorOnly. If the Access Control property is set to Open, then anyone is allowed to log into ZooKeeper and have full permissions to see, change, delete, or administer the data. If CreatorOnly is specified, then only the user that created the data is allowed to read, change, delete, or administer the data. In order to use the CreatorOnly option, NiFi must provide some form of authentication. See the ZooKeeper Access Control section below for more information on how to configure authentication.
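Assuming the property is set in the same way as the provider's other properties, restricting access to the creator might look like this single line inside the ZooKeeper-based provider's element. It only takes effect as intended once authentication has been configured.

```xml
<!-- Placed inside the ZooKeeper-based provider element; Open is the other accepted value -->
<property name="Access Control">CreatorOnly</property>
```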
If NiFi is configured to run in a standalone mode, the cluster-provider element need not be populated in the state-management.xml file and will actually be ignored if it is populated. However, the local-provider element must always be present and populated. Additionally, if NiFi is run in a cluster, each node must also have the cluster-provider element present and properly configured. Otherwise, NiFi will fail to start up.
Although these providers have few properties that need to be configured, the properties were externalized into a separate state-management.xml file rather than being placed in the nifi.properties file. Different implementations may require different properties, and an XML-based file such as this is easier to maintain and understand than one that mixes the properties of the Provider in with all of the other NiFi framework-specific properties.
It should be noted that if Processors and other components save state using the Clustered scope, the Local State Provider will be used if the instance is a standalone instance (not in a cluster) or is disconnected from the cluster. This also means that if a standalone instance is migrated to become a cluster, then that state will no longer be available, as the component will begin using the Clustered State Provider instead of the Local State Provider.