Understand the use case

Learn how you can use a Flow Management cluster connected to a Streams Messaging cluster to build an end-to-end flow that ingests data into Azure Data Lake Storage (ADLS). This example use case shows you how to use Apache NiFi to move data from Kafka to ADLS.

Why move data to object stores?

Cloud environments offer numerous deployment options and services. There are many ways to store data in the cloud, but the easiest option is to use object stores. Object stores are extremely robust and cost-effective storage solutions with multiple levels of durability and availability. You can include them in your data pipeline, both as an intermediate step and as an end state. Object stores are accessible to many tools and connecting systems, and you have a variety of options to control access.

Apache NiFi for cloud ingestion

You can use Apache NiFi to move data from a range of locations into object stores in CDP Public Cloud.

As part of Cloudera DataFlow, NiFi offers a scalable solution for managing data flows with guaranteed delivery, data buffering with back pressure and pressure release, and prioritized queuing. It supports data transfer from nearly any type of data source to cloud storage solutions in a secure and governed manner.

You can use NiFi to move data from various locations into Azure Data Lake Storage (ADLS) quickly and easily by dragging and dropping a series of processors onto the NiFi user interface. When the data flow is ready, you can visually monitor and control the pipeline you have built.
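To make the data path concrete, the sketch below performs the same Kafka-to-ADLS movement in plain Python using the kafka-python and azure-storage-file-datalake client libraries. It illustrates only what the flow accomplishes, not how NiFi does it internally; the broker address, topic name, storage account, file system, credential, and output path are placeholder assumptions you would replace with values from your own environment.

```python
# Illustrative sketch only: the same Kafka -> ADLS data path that the NiFi
# flow implements, written with kafka-python and azure-storage-file-datalake.
# All connection details below are placeholders, not values from this guide.
from kafka import KafkaConsumer
from azure.storage.filedatalake import DataLakeServiceClient

KAFKA_BROKERS = "broker-1.example.com:9092"      # placeholder broker
KAFKA_TOPIC = "customer-events"                   # placeholder topic
ADLS_ACCOUNT_URL = "https://mystorageaccount.dfs.core.windows.net"
ADLS_FILE_SYSTEM = "ingest"                       # placeholder container
ADLS_CREDENTIAL = "<storage-account-key>"         # placeholder credential

# Consume a bounded batch of messages from the Kafka topic.
consumer = KafkaConsumer(
    KAFKA_TOPIC,
    bootstrap_servers=KAFKA_BROKERS,
    auto_offset_reset="earliest",
    consumer_timeout_ms=10_000,  # stop iterating once the topic goes quiet
)
records = [message.value for message in consumer]
consumer.close()

# Merge the batch into a single object and land it in an ADLS folder --
# roughly what a ConsumeKafka -> MergeContent -> PutAzureDataLakeStorage
# chain of processors achieves inside NiFi.
service = DataLakeServiceClient(
    account_url=ADLS_ACCOUNT_URL, credential=ADLS_CREDENTIAL
)
file_system = service.get_file_system_client(ADLS_FILE_SYSTEM)
file_client = file_system.get_file_client("kafka-batches/batch-0001.txt")
file_client.upload_data(b"\n".join(records), overwrite=True)
```

In NiFi you achieve the same result without writing code, and the flow additionally gives you buffering, retry, and monitoring out of the box.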

This use case walks you through the steps of creating an ingest-focused data flow from Apache Kafka into ADLS folders. If you are moving data from a location other than Kafka, see Getting Started with Apache NiFi for information about how to build a data flow and about other processor options for getting and consuming data.