
Terminology

Agent

An Apache MiNiFi Java or C++ agent. MiNiFi implements the core features of Apache NiFi and focuses on collecting and processing data at the edge.

Bucket

A container in NiFi Registry that stores and organizes flows.

C2 server

The MiNiFi C2 (Command and Control) server is a sub-project of Apache NiFi that is currently under development. Its role is to provide a central point of configuration for hundreds or thousands of MiNiFi agents in the wild. The C2 server manages versioned classes of applications (MiNiFi flow configurations) and exposes them through a REST API. MiNiFi agents can connect to this API at a defined frequency to update their configuration.
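
The polling behavior described above can be sketched as follows. This is a minimal illustration, not the actual MiNiFi or C2 implementation: the `Agent` class, the `fetch_latest` callable (standing in for a call to the C2 REST API), and the integer version numbers are all hypothetical names introduced for the example.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical stand-in for a C2 REST call: given an agent class name,
# return (version, flow_configuration). The real API is not modeled here.
FetchLatest = Callable[[str], Tuple[int, str]]


class Agent:
    """Hypothetical agent that remembers which flow version it is running."""

    def __init__(self, agent_class: str) -> None:
        self.agent_class = agent_class
        self.flow_version: Optional[int] = None
        self.flow: Optional[str] = None

    def poll_once(self, fetch_latest: FetchLatest) -> bool:
        """Fetch the latest flow and apply it only if it is newer.

        Returns True when the configuration was updated.
        """
        version, flow = fetch_latest(self.agent_class)
        if self.flow_version is None or version > self.flow_version:
            self.flow_version, self.flow = version, flow
            return True
        return False


def run(agent: Agent, fetch_latest: FetchLatest, interval_seconds: float,
        cycles: int, sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll at a fixed frequency for a number of cycles; count the updates."""
    updates = 0
    for _ in range(cycles):
        if agent.poll_once(fetch_latest):
            updates += 1
        sleep(interval_seconds)
    return updates
```

In this sketch the agent initiates every exchange, so the server never needs to reach out to agents individually. Passing `sleep` as a parameter makes the loop easy to exercise without waiting in real time.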

Once data lands on company servers, in the cloud, or in the data center, a wide range of applications can be built on it. Real-time monitoring, process analysis and optimization, and predictive maintenance are some examples.

Class

A CEM class allows you to create a single configuration template for multiple CEM instances.

Connection

You create an automated dataflow by dragging components from the CEM Components toolbar to the canvas and then linking them with connections. Each connection consists of one or more relationships. For each connection that you draw, you choose which relationships it uses. This allows data to be routed in different ways based on its processing outcome.
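
The idea of routing by relationship can be sketched as follows. This is an illustration only, not CEM code: the `Connection` class, the `route` function, and the relationship names `success`, `failure`, and `retry` are assumptions chosen for the example.

```python
from typing import List


class Connection:
    """Hypothetical connection: a destination plus the relationships it carries."""

    def __init__(self, destination: str, relationships: List[str]) -> None:
        if not relationships:
            # Mirrors the text: a connection consists of one or more relationships.
            raise ValueError("a connection needs at least one relationship")
        self.destination = destination
        self.relationships = set(relationships)


def route(outcome: str, connections: List[Connection]) -> List[str]:
    """Return the destination of every connection that carries this outcome."""
    return [c.destination for c in connections if outcome in c.relationships]


# Example wiring: successful data goes to storage, everything else to alerting.
connections = [
    Connection("store", ["success"]),
    Connection("alert", ["failure", "retry"]),
]
```

With this wiring, `route("success", connections)` selects only the `store` destination, while `route("failure", connections)` selects only `alert`. An outcome that no connection carries is routed nowhere.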

Content Repository

The content repository is where the actual content bytes of a given flowfile are stored.

Dataflow

Dataflow is an automated and managed flow of information between systems.

Edge

The edge is the device that you want to manage, control, and monitor through CEM. To do so, you install the MiNiFi agent on the edge device to collect data and then push intelligence back to the same edge device.

Flowfile Repository

The flowfile repository is where CEM keeps track of the state of each flowfile that is currently active in the flow.

Heartbeat

Nodes communicate their health and status to the currently elected cluster coordinator through heartbeats, which tell the coordinator that they are still connected to the cluster and working properly. By default, nodes emit a heartbeat every 5 seconds. If the cluster coordinator does not receive a heartbeat from a node within 40 seconds, it disconnects the node due to lack of heartbeat. You can change the 5-second interval in the nifi.properties file. The coordinator disconnects the node because it must ensure that every node in the cluster is in sync, and if a node is not heard from regularly, the coordinator cannot be sure that the node is still in sync with the rest of the cluster. If the node sends a new heartbeat after the 40 seconds have passed, the coordinator automatically requests that the node rejoin the cluster, which includes revalidating the node's flow. Both the disconnection due to lack of heartbeat and the reconnection once a heartbeat is received are reported to the Dataflow Manager (DFM) in the user interface.
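
The disconnect and reconnect rule described above can be sketched as follows. This is a simplified model under stated assumptions, not the NiFi implementation: the `Coordinator` class, its method names, and the event strings are hypothetical, and time is passed in explicitly as seconds so that the logic can be followed without a real clock.

```python
from typing import Dict, List, Set

DISCONNECT_AFTER_SECONDS = 40  # default timeout stated in the text


class Coordinator:
    """Hypothetical coordinator that tracks the last heartbeat of each node."""

    def __init__(self, timeout: float = DISCONNECT_AFTER_SECONDS) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}
        self.disconnected: Set[str] = set()
        self.events: List[str] = []  # what would be reported in the user interface

    def receive_heartbeat(self, node: str, now: float) -> None:
        """Record a heartbeat; a disconnected node is asked to rejoin."""
        self.last_seen[node] = now
        if node in self.disconnected:
            self.disconnected.discard(node)
            self.events.append(f"{node}: reconnect requested")

    def check(self, now: float) -> None:
        """Disconnect every connected node that has been silent for the timeout."""
        for node, seen in self.last_seen.items():
            if node not in self.disconnected and now - seen >= self.timeout:
                self.disconnected.add(node)
                self.events.append(f"{node}: disconnected (no heartbeat)")
```

For example, a node whose last heartbeat arrived at second 0 is still connected when `check(now=39)` runs and is disconnected by `check(now=40)`. A later `receive_heartbeat` call for that node records a reconnect request. Revalidation of the node's flow is not modeled here.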

Metastore

The metastore is the central repository of Hive metadata. It stores the metadata for Hive tables and relations.

Provenance Repository

The provenance repository is where data from all provenance events is stored.
