Creating a Kafka source

You can use a registered Kafka data source to create a Kafka Virtual Table Source. To make the source available to your SQL Stream job, you select a Kafka topic and, optionally, use Schema Registry to supply the schema.

  • Make sure that you have created a Kafka topic.
  • Make sure that you have the right permissions set in Ranger.
  • Make sure there is generated data in the Kafka topic.
  • If you are using Schema Registry, make sure that you have added a schema.
  1. Go to your cluster in Cloudera Manager.
  2. Click on SQL Stream Builder from the list of Services.
  3. Click on SQLStreamBuilder Console.
    The Streaming SQL Console opens up in a new window.
  4. Select Console from the left-hand menu.
  5. Select the Virtual Tables tab.
  6. Select the Virtual Table Source sub-tab.
  7. Select Add Source > Apache Kafka.
    The Kafka Source window appears.
  8. Provide a name for the Virtual Table Source.
  9. Select a registered Kafka data source.
  10. Select a Kafka topic from the list.
  11. Select the Data Format.
    • You can select JSON as data format.
    • You can select Avro as data format.
  12. Determine the schema for the Virtual Table Source in one of the following ways:
    • Add a customized schema to the Schema Definition field.
    • Click Detect Schema to read a sample of the messages and automatically infer the schema.
    • Click Detect Schema if you are using Schema Registry.
  13. Customize your Virtual Table Source.
    1. To configure an Input Transform, add the code on the Transformations tab.
    2. Configure any required Kafka properties on the Properties tab.
  14. Select Save Changes.
The Kafka Virtual Table Source is ready for queries. To check its configuration, enter a DESC <tablename> command in the Compose tab and select Execute. To use the Kafka source in your SQL job, reference it in the FROM clause of the SQL statement, for example SELECT * FROM <tablename>.
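
Step 12 mentions that Detect Schema reads a sample of messages and infers a schema from them. The following Python sketch illustrates the general idea only; it is not the SQL Stream Builder implementation. It assumes flat JSON messages and an Avro-style record definition as the output, and the function names (avro_type, infer_schema) and the sample sensor messages are hypothetical.

```python
import json


def avro_type(value):
    """Return an Avro primitive type name for a decoded JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"  # strings, plus a simplifying fallback for nested values


def infer_schema(raw_messages, name="inferred"):
    """Infer a flat Avro-style record schema from sample JSON messages."""
    records = [json.loads(m) for m in raw_messages]
    seen = {}  # field name -> set of observed type names (first-seen order)
    for record in records:
        for field, value in record.items():
            seen.setdefault(field, set()).add(avro_type(value))

    fields = []
    for field, types in seen.items():
        # A field is nullable if it was null or absent in any sample.
        nullable = "null" in types or any(field not in r for r in records)
        concrete = types - {"null"}
        if concrete == {"long", "double"}:
            concrete = {"double"}  # widen mixed numeric samples
        # Fall back to string when there is no single concrete type.
        base = next(iter(concrete)) if len(concrete) == 1 else "string"
        fields.append({"name": field, "type": ["null", base] if nullable else base})
    return {"type": "record", "name": name, "fields": fields}


# Hypothetical sample of messages, as they might appear in a topic.
samples = [
    '{"sensor_id": 1, "temp": 21.5, "status": "ok"}',
    '{"sensor_id": 2, "temp": 22, "status": null}',
    '{"sensor_id": 3, "temp": 19.0}',
]
schema = infer_schema(samples, name="sensor_reading")
print(json.dumps(schema, indent=2))
```

Because a small sample can miss optional fields or rarely used types, review an inferred schema before saving it, and add a customized schema to the Schema Definition field when the inference does not match your data.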