Adding Kafka Data Source

To create Kafka tables in SQL Stream Builder (SSB) you need to register Kafka as a Data Source, using the Streaming SQL Console.

A default Local Kafka is added to the SSB data sources during installation, using a Kafka service within the same cluster as SSB. This Local Kafka data source cannot be updated or deleted, because the Streaming SQL Console uses it for sampling results and cleaning up sample topics. To add your own, customizable Kafka data source instead, follow the steps in this task.
  • Make sure that you have the right permissions set in Ranger.
  1. Navigate to the Streaming SQL Console.
    1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
    2. Select the Streaming Analytics cluster from the list of Data Hub clusters.
    3. Select Streaming SQL Console from the list of services.
      The Streaming SQL Console opens in a new window.
  2. Open a project from the Projects page of Streaming SQL Console, using one of the following options:
    • Select an existing project from the list by clicking the Open button or Switch button.
    • Create a new project by clicking the New Project button.
    • Import a project by clicking the Import button.
    You are redirected to the Explorer view of the project.
  3. Open Data Sources from the Explorer view.
  4. Click next to Kafka.
  5. Select New Kafka Source.
    The Kafka Source window appears.
  6. Add a Name to your Kafka provider.
  7. Add the broker host name(s) to Brokers.
    You need to copy the Kafka broker name(s) from Cloudera Manager.
    1. Go to the Streams Messaging cluster in your environment.
    2. Select Cloudera Manager from the list of services.
    3. Click Kafka from the list of services.
    4. Click Instances.
    5. Copy the hostname of the Kafka broker(s) you want to use.
    6. Go back to the Kafka Source window.
    7. Paste the broker hostname(s) into the Brokers field.
    8. Append the default Kafka port to each hostname, separated by a colon. Separate multiple brokers with commas.
      Example:
      docs-test-1.vpc.cloudera.com:9092,
      docs-test-2.vpc.cloudera.com:9092
  8. Select the security Protocol.
    The connection protocol must match the protocol configured for the Kafka cluster in Cloudera Manager.

    You can choose from the following protocols:

    PLAINTEXT
    1. Click Validate.
    2. Click Create after validation is successful.

    SSL
    1. Choose your authentication method:
      By default, auto-discovery of the CDP TrustStore is enabled. Auto-discovery can be used for Kafka sources that are located in the same CDP Private Cloud Base cluster as SSB. In this case, the default TrustStore path is used for authentication, which can be customized in Cloudera Manager using the local.kafka.truststore.location and local.kafka.truststore.password parameters. If you disable auto-discovery, you need to provide the following configurations:
      • Kafka TrustStore path
      • Kafka TrustStore Password
      You also have the option to provide the Kafka KeyStore path and KeyStore Password that belong to the Kafka source.
    2. Click Validate.
    3. Click Create after validation is successful.

    SASL_SSL
    1. Choose your authentication method:
      By default, auto-discovery of the CDP TrustStore is enabled. Auto-discovery can be used for Kafka sources that are located in the same CDP Private Cloud Base cluster as SSB. In this case, the default TrustStore path is used for authentication, which can be customized in Cloudera Manager using the local.kafka.truststore.location and local.kafka.truststore.password parameters. If you disable auto-discovery, you need to provide the following configurations:
      • Kafka TrustStore path
      • Kafka TrustStore Password
      You also have the option to provide the Kafka KeyStore path and KeyStore Password that belong to the Kafka source.
    2. Choose a SASL Mechanism.
    3. Click Validate.
    4. Click Create after validation is successful.

    SASL_PLAINTEXT
    1. Choose a SASL Mechanism.
    2. Provide the Username for SASL.
    3. Provide the Password for SASL.
    4. Click Validate.
    5. Click Create after validation is successful.
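The value pasted into the Brokers field in step 7 can be prepared and sanity-checked ahead of time. The following is a minimal Python sketch, not part of SSB: the function name format_brokers and the constant DEFAULT_KAFKA_PORT are illustrative, and 9092 is assumed only because it is the port used in the example above. The hostname check is deliberately loose and does not verify that a broker is reachable.

```python
import re

# Assumption: 9092 is the port used in the example above; adjust it if your
# brokers listen on a different port.
DEFAULT_KAFKA_PORT = 9092

# Loose hostname check: dot-separated labels made of letters, digits, and
# hyphens, where no label starts or ends with a hyphen.
_LABEL = r"[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?"
_HOSTNAME = re.compile(rf"^{_LABEL}(\.{_LABEL})*$")


def format_brokers(hostnames, port=DEFAULT_KAFKA_PORT):
    """Return a comma-separated host:port string for the Brokers field.

    Entries that already carry a port keep it; bare hostnames get `port`.
    """
    entries = []
    for raw in hostnames:
        host = raw.strip()
        if not host:
            continue  # skip blank entries left over from copy and paste
        entry_port = port
        if ":" in host:
            host, _, port_text = host.partition(":")
            entry_port = int(port_text)  # raises ValueError if not numeric
        if not _HOSTNAME.match(host):
            raise ValueError(f"not a valid hostname: {host!r}")
        if not 1 <= entry_port <= 65535:
            raise ValueError(f"port out of range: {entry_port}")
        entries.append(f"{host}:{entry_port}")
    if not entries:
        raise ValueError("at least one broker is required")
    return ",".join(entries)


if __name__ == "__main__":
    print(format_brokers([
        "docs-test-1.vpc.cloudera.com",
        " docs-test-2.vpc.cloudera.com ",
    ]))
    # prints docs-test-1.vpc.cloudera.com:9092,docs-test-2.vpc.cloudera.com:9092
```

The sketch joins entries with a plain comma; the whitespace and line break shown in the example above are dropped.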
You have registered Kafka as a data source, so you can now add Kafka as a table in your SQL queries. Existing Kafka topics can be selected when adding Kafka as a table.
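For reference, the inputs that step 8 asks for under each security protocol can be summarized as a small checklist. The Python sketch below assumes the four standard Kafka security protocols (PLAINTEXT, SSL, SASL_SSL, SASL_PLAINTEXT) and mirrors the fields listed in this task. The field names such as truststore_path and sasl_username are hypothetical labels chosen for illustration, not SSB or Kafka configuration keys, and the optional KeyStore fields are left out.

```python
# Fields this task always asks for under each protocol.
# Assumption: the keys are the four standard Kafka security protocols.
REQUIRED_FIELDS = {
    "PLAINTEXT": [],
    "SSL": [],
    "SASL_SSL": ["sasl_mechanism"],
    "SASL_PLAINTEXT": ["sasl_mechanism", "sasl_username", "sasl_password"],
}

# Protocols that use TLS and therefore involve a TrustStore.
TLS_PROTOCOLS = {"SSL", "SASL_SSL"}


def missing_fields(protocol, settings, auto_discover_truststore=True):
    """Return the names of required fields that are absent or empty.

    `settings` is a dict keyed by the hypothetical field names used here.
    The TrustStore path and password are only required when auto-discovery
    of the CDP TrustStore is disabled.
    """
    if protocol not in REQUIRED_FIELDS:
        raise ValueError(f"unknown security protocol: {protocol!r}")
    required = list(REQUIRED_FIELDS[protocol])
    if protocol in TLS_PROTOCOLS and not auto_discover_truststore:
        required += ["truststore_path", "truststore_password"]
    return [name for name in required if not settings.get(name)]


if __name__ == "__main__":
    print(missing_fields("SSL", {}, auto_discover_truststore=False))
    # prints ['truststore_path', 'truststore_password']
```

An empty list means every input the procedure lists for that protocol is present. It does not replace the Validate button, which checks the actual connection.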
After registering the Kafka data source, you can edit, duplicate, and delete it from the Streaming SQL Console:
  1. Open Data Sources from the Explorer view.
  2. Click next to Kafka.
  3. Select Manage.

    The Kafka Sources tab opens where the registered Kafka providers are listed. You have the following options to manage the Kafka sources:

    • Click on one of the existing Kafka providers to edit its configurations.
    • Click to remove the Kafka provider.
    • Click to duplicate the Kafka provider with its configurations.