Understanding the use case

Learn how to use NiFi effectively to move data from various sources into an Iceberg table in a data warehouse on CDP Private Cloud Data Services.

Cloudera Data Warehouse (CDW) can be configured to use the Iceberg table format, which optimizes data querying through engines such as Impala and Hive. You can use Apache NiFi to move data from a range of locations into a data warehouse cluster running Hive or Impala in CDP Private Cloud Data Services.

This use case guides you through the creation of a data flow that generates FlowFiles containing randomized JSON data and writes this data into an Iceberg table in Hive or Impala. This simple design can get you started with creating an Iceberg ingest data flow. If your specific use case involves a different data source, see the other ingest documents for options and comprehensive guidance on using the appropriate processors.
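To make the idea of "randomized JSON data" concrete, the following minimal Python sketch shows the kind of content such a FlowFile might carry. It is an illustration only and does not use any NiFi API. The field names (`id`, `customer_id`, `amount`, `status`) and the helper functions are hypothetical; in a real flow, the record fields must match the schema of your target Iceberg table.

```python
import json
import random
import uuid


def make_record(rng: random.Random) -> dict:
    """Build one randomized record. All field names are hypothetical."""
    return {
        "id": str(uuid.UUID(int=rng.getrandbits(128))),
        "customer_id": rng.randint(1, 1000),
        "amount": round(rng.uniform(1.0, 500.0), 2),
        "status": rng.choice(["NEW", "SHIPPED", "CANCELLED"]),
    }


def make_flowfile_content(rng: random.Random, batch_size: int = 1) -> str:
    """Serialize a batch of records as a JSON array, similar to the
    payload a generated FlowFile could contain."""
    records = [make_record(rng) for _ in range(batch_size)]
    return json.dumps(records)


if __name__ == "__main__":
    # A seeded generator makes the sample reproducible between runs.
    print(make_flowfile_content(random.Random(42), batch_size=2))
```

Each call produces a JSON array of records. A record-oriented reader in the flow would parse content like this before the data is written to the Iceberg table.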

For more information on the Iceberg table format, see Apache Iceberg features.