You can use the registered Kafka provider to create Kafka tables that can be used as a source or a sink in your SQL Stream jobs.
Make sure that you have created a Kafka topic.
Make sure that there is data generated in the Kafka topic.
Make sure that you have the right permissions set in Ranger.
Go to your cluster in Cloudera Manager.
Click SQL Stream Builder from the list of
services.
Click SQLStreamBuilder Console.
The Streaming SQL Console opens in a new window.
Select Console from the left-side menu.
Go to the Tables tab.
Select Add table > Apache Kafka.
The Kafka Table window appears.
Provide a Name for the Table.
Select a registered Kafka provider as the Kafka cluster.
Select a Kafka topic from the list.
Select the Data format.
You can select either JSON or AVRO as the data format.
Determine the Schema for the Kafka table.
Add a customized schema to the Schema Definition field, or click Detect Schema to read a sample of the JSON messages and automatically infer the schema.
Customize your Kafka Table with the following options:
Configure the Event Time if you do not want to use the Kafka Timestamps (see the SQL sketch after this list):
Unselect the Use Kafka Timestamps checkbox.
Provide the name of the Input Timestamp Column.
Add a name for the Event Time Column.
Add a value to the Watermark Seconds.
Configure an Input Transform by adding the code on the Transformations tab.
Configure any required Kafka properties on the Properties tab.
For more information about how to configure the Kafka table, see the
Configuring Kafka tables section.
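Conceptually, the Event Time and Watermark Seconds settings correspond to an event time column and a watermark definition in the underlying Flink SQL table. The following sketch only illustrates that mapping: the table name orders, the input timestamp column order_ts, and the 5-second watermark are hypothetical examples, the connector properties of the WITH clause are omitted, and the actual DDL is generated by SQL Stream Builder from the values you enter.

    CREATE TABLE orders (
      order_id    STRING,
      order_ts    BIGINT,
      -- Event Time Column computed from the Input Timestamp Column
      event_time  AS TO_TIMESTAMP(FROM_UNIXTIME(order_ts)),
      -- Watermark Seconds: tolerate events arriving up to 5 seconds late
      WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    );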
Select Save Changes.
The Kafka table is ready to be selected as a source or a sink for your SQL Stream job. To use the Kafka table as a source, add it to the SQL query with the FROM statement. To use it as a sink, select the Kafka table from the Sink Table drop-down menu when creating the SQL job.
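For example, assuming a hypothetical Kafka table named my_kafka_table with sensor_id and reading columns, a SQL Stream job can use it as a source as follows:

    -- Read from the Kafka table and filter the incoming events
    SELECT sensor_id, reading
    FROM my_kafka_table
    WHERE reading > 100;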