Concepts
Learn about basic concepts of flow design and how those concepts relate to their NiFi counterparts.
- Controller Service
-
Controller Services are extension points that provide information for use by other components (such as processors or other controller services). The idea is that, rather than configure this information in every processor that might need it, the controller service provides it for any processor to use as needed.
- Draft
-
A draft is a flow definition in development, and its lifecycle is tied to the workspace it is created in. You can edit, update, and continuously test drafts by starting a Test Session. When your draft is ready to be deployed in production, you can publish it as a flow definition to the Cloudera DataFlow catalog.
- Workspace
- A workspace is where your drafts are stored. You can manage and delete drafts from this view. The workspace is automatically created when Cloudera DataFlow is enabled for an environment and is tied to the lifecycle of your Cloudera DataFlow service.
- Environment
- A logical environment defined with a specific virtual network and region on a customer’s cloud provider account. After enabling Cloudera DataFlow for an environment, its service components run in an environment.
- Parameter group
-
One default parameter group is auto-created when you create a new draft. You can then add parameters this one group, you cannot create additional ones. When you initiate a test session, a number of parameters are auto-generated in this parameter group. Depending on the configuration options you chose, these are necessary for your NiFi sandbox deployment, and also make it possible to integrate your NiFi instance with Cloudera Public Cloud components or external data sources.
This default group is automatically bound to each Process Group you create within your draft, making all parameters available to be used in any process group.
- Processor
- The Processor is the NiFi component that is used to listen for incoming data; pull data from
external sources; publish data to external sources; and route, transform, or extract
information from FlowFiles.
For more information, see Apache NiFi User Guide.
- Process group
-
When a data flow becomes complex, it often is beneficial to reason about the data flow at a higher, more abstract level. NiFi allows multiple components, such as Processors, to be grouped together into a Process Group.
In Cloudera DataFlow Flow Designer, creating a draft automatically creates a process group with the name you specified when creating the draft, acting as the root process group. You can create child process groups as you see fit to organize your data processing logic.
For more information, see Apache NiFi User Guide.
- Service
-
A logical environment defined with a specific virtual network and region on a customer’s cloud provider account. Cloudera DataFlow service components run in an environment. In the Cloudera DataFlow Service, you can provision a workload. This workload allows you to create a number of NiFi Deployments.
- Test session
- Starting a test session provisions NiFi resources, acting like a development sandbox for a particular draft flow. It allows you to work with live data to validate your data flow logic while updating your draft. You can suspend a test session any time and change the configuration of the NiFi cluster then resume testing with the updated configuration.