Introducing streams messaging cluster on Cloudera on cloud
You can create a streams messaging cluster on Cloudera on cloud and produce data to and consume data from an Apache Kafka topic by using that cluster. The concepts and usage of streams messaging cluster on Cloudera on cloud are explained using a simple workflow.
In this example you use a simple application (vmstat
) to generate some raw
text data and a kafka-console-producer
command to send the data to a Kafka
topic. You then use a kafka-console-consumer
command to read data from that
Kafka topic. After you have the producer and consumer running, you use Streams Messaging Manager, which is also available on Cloudera on cloud, to monitor the functionality of the workflow that
you create.
Alternately, you can also structure the data in an object format and send it to Kafka in Avro format. To do this, you store a schema in Cloudera Schema Registry and understand how to use a simple Java Kafka client to send and read data using that schema.
You can also update your schema to fit your requirements, and reconfigure the consumer and producer to use the latest schema.
- Meeting the prerequisites
- Creating a Machine User
- Granting Machine User access to environment
- Creating a Kafka topic
- Creating Ranger policies
- Producing data to Kafka topic
- Consuming data from Kafka topic
- Using Kerberos authentication
- Monitoring Kafka activity in Streams Messaging Manager
- Using Schema Registry
- Monitoring end-to-end latency in Streams Messaging Manager
- Evolving your schema