ReadyFlow: Kafka to Kudu
This use case shows you how to move data from a Kafka topic into Apache Kudu in your Cloudera Public Cloud Real-time Data Mart cluster. The Kafka to Kudu ReadyFlow makes it easy to create this data flow.
This ReadyFlow consumes JSON, CSV, or Avro data from a source Kafka topic, parses the data by looking up the schema name in the Cloudera Schema Registry, and ingests it into a Kudu table. You can pick the Kudu operation (INSERT, INSERT_IGNORE, UPSERT, UPDATE, DELETE, UPDATE_IGNORE, or DELETE_IGNORE) that best fits your use case. Failed Kudu write operations are retried automatically to handle transient issues. Define a KPI on the failure_WriteToKudu connection to monitor failed write operations.
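The ReadyFlow itself is assembled from NiFi processors, but the sketch below illustrates the same consume-and-write pattern using the Apache Kafka and Kudu Java clients directly. The broker address, topic, table, and column names are hypothetical placeholders, and the example hard-codes UPSERT as the Kudu operation; treat it as a minimal illustration of the pattern, not the ReadyFlow implementation.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kudu.client.*;

public class KafkaToKuduSketch {
    public static void main(String[] args) throws KuduException {
        // Hypothetical endpoints: replace with your own broker and Kudu master.
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("group.id", "kafka-to-kudu-demo");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             KuduClient kudu =
                 new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            consumer.subscribe(List.of("source-topic"));
            KuduTable table = kudu.openTable("impala::default.events");
            KuduSession session = kudu.newSession();

            while (true) {
                for (ConsumerRecord<String, String> record :
                        consumer.poll(Duration.ofSeconds(1))) {
                    // UPSERT chosen here; the ReadyFlow lets you pick any
                    // of the supported Kudu operations instead.
                    Upsert upsert = table.newUpsert();
                    PartialRow row = upsert.getRow();
                    // Assumes a two-column table; real parsing would map the
                    // record's schema (JSON/CSV/Avro) onto the Kudu columns.
                    row.addString("id", record.key());
                    row.addString("payload", record.value());
                    session.apply(upsert);
                }
                session.flush(); // surfaces failed writes so they can be retried
            }
        }
    }
}
```

In the actual ReadyFlow, schema parsing, the operation choice, and the failure_WriteToKudu retry path are all handled by NiFi processors rather than hand-written code.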
| Kafka to Kudu ReadyFlow details | |
| --- | --- |
| Source | Kafka topic |
| Source Format | JSON, CSV, Avro |
| Destination | Kudu |
| Destination Format | Kudu Table |
Time series use cases analyze data obtained during specified intervals and enable you to improve performance based on available data. Examples include:

- Optimizing yield or yield quality in a manufacturing plant
- Dynamically optimizing network capacity during peak load for better telecommunications uptime and services

These use cases require that you store events at a high frequency while providing ad-hoc query and record update capabilities.
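As an illustration of this storage pattern, the sketch below uses the Apache Kudu Java client to create a table laid out for time-series workloads: hash partitioning to absorb high-frequency writes and a time-based range partition plus a composite primary key to support ad-hoc queries and record updates. The master address, table name, columns, and bucket count are hypothetical; this is a minimal sketch under those assumptions, not part of the ReadyFlow.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.kudu.ColumnSchema;
import org.apache.kudu.Schema;
import org.apache.kudu.Type;
import org.apache.kudu.client.CreateTableOptions;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;

public class CreateTimeSeriesTable {
    public static void main(String[] args) throws KuduException {
        try (KuduClient client =
                 new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            // Primary key (metric, event_time) lets individual records be
            // queried and updated ad hoc, as the use cases above require.
            Schema schema = new Schema(Arrays.asList(
                new ColumnSchema.ColumnSchemaBuilder("metric", Type.STRING)
                    .key(true).build(),
                new ColumnSchema.ColumnSchemaBuilder("event_time", Type.UNIXTIME_MICROS)
                    .key(true).build(),
                new ColumnSchema.ColumnSchemaBuilder("value", Type.DOUBLE)
                    .build()));

            CreateTableOptions options = new CreateTableOptions()
                // Hash partitioning spreads concurrent writes across tablets...
                .addHashPartitions(List.of("metric"), 4)
                // ...while range partitioning on time keeps interval scans cheap.
                .setRangePartitionColumns(List.of("event_time"));

            client.createTable("metrics", schema, options);
        }
    }
}
```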