Understanding the use case

This use case shows you how to move data from a Kafka topic into Apache Kudu in your CDP Public Cloud Real-time Data Mart cluster. You can learn how to create such a data flow easily by using the Kafka to Kudu ReadyFlow.

Time series use cases analyze data obtained during specified intervals and enable you to improve performance based on the available data. Examples include:

  • Optimizing yield or yield quality in a manufacturing plant

  • Dynamically optimizing network capacity during peak load for better telecommunications uptime and services

These use cases require storing events at high frequency while providing ad-hoc query and record update capabilities.

The Kafka to Kudu ReadyFlow consumes JSON, CSV, or Avro data from a source Kafka topic, parses the schema by looking up the schema name in the CDP Schema Registry, and ingests the records into a Kudu table. You can pick the Kudu operation (INSERT, INSERT_IGNORE, UPSERT, UPDATE, DELETE, UPDATE_IGNORE, DELETE_IGNORE) that best fits your use case. Failed Kudu write operations are retried automatically to handle transient issues. Define a KPI on the failure_WriteToKudu connection to monitor failed write operations.
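
The ReadyFlow handles all of this for you in NiFi, but the following minimal Python sketch illustrates the same end-to-end idea: consume JSON events from a Kafka topic and write them to Kudu with an UPSERT operation. The broker, topic, table, and host names are placeholders, and the sketch assumes the kafka-python and kudu-python client libraries rather than the components the ReadyFlow actually uses; it also omits the Schema Registry lookup and the retry handling that the ReadyFlow provides.

    # Minimal sketch (not the ReadyFlow itself): consume JSON events from a
    # Kafka topic and UPSERT them into a Kudu table. Host, topic, and table
    # names are placeholders; assumes the kafka-python and kudu-python packages.
    import json

    import kudu
    from kafka import KafkaConsumer

    # Connect to a Kudu master in the Real-time Data Mart cluster (placeholder host/port).
    client = kudu.connect(host="kudu-master.example.com", port=7051)
    table = client.table("sensor_readings")   # placeholder destination table
    session = client.new_session()

    # Consume JSON records from the source topic (placeholder broker and topic).
    consumer = KafkaConsumer(
        "sensor-readings-topic",
        bootstrap_servers="kafka-broker.example.com:9092",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )

    for message in consumer:
        record = message.value                 # e.g. {"sensor_id": 1, "ts": ..., "value": ...}
        # UPSERT inserts new rows and updates existing ones, matching the
        # UPSERT operation type selectable in the ReadyFlow.
        session.apply(table.new_upsert(record))
        session.flush()                        # surface write failures per record/batch

In this sketch, choosing table.new_insert, new_insert_ignore, new_update, or new_delete instead of new_upsert corresponds to picking a different Kudu operation type for the flow.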