Creating Data Transformations

Input Transforms are a powerful way to clean, modify, and arrange data that is poorly organized, changes format, contains fields that are not needed, or is otherwise hard to use. With the Input Transform feature of SQL Stream Builder, you can create a JavaScript function to transform the data after it has been consumed from a Kafka topic and before you run SQL queries on the data.
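As a rough illustration of what such a function might look like, the sketch below parses a raw message, drops a field that is not needed, and normalizes a field whose format varies. The `record.value` input shape, the `transformRecord` wrapper, and all field names are assumptions made for this example and are not confirmed by this page; the template installed in the Data Transformation box shows the form that SQL Stream Builder actually expects.

```javascript
// Hypothetical sketch: the shape of `record` and all field names are assumptions.
function transformRecord(record) {
  // Assume the raw Kafka message arrives as a JSON string in record.value.
  var payload = JSON.parse(record.value);

  // Remove data that the SQL queries do not need.
  delete payload.debug_info;

  // Normalize a field whose format changes between producers:
  // accept either a number or a numeric string.
  payload.temperature = Number(payload.temperature);

  // Return the cleaned record as a JSON string.
  return JSON.stringify(payload);
}

// Example usage with a sample message.
var sample = { value: '{"sensor_id":"a1","temperature":"21.5","debug_info":"x"}' };
console.log(transformRecord(sample));
// prints {"sensor_id":"a1","temperature":21.5}
```

Keeping the logic in a single function that takes a record and returns a string makes it easy to try out with sample messages before pasting the body into the console.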

  1. Navigate to the Streaming SQL Console.
    1. Go to your cluster in Cloudera Manager.
    2. Select SQL Stream Builder from the list of services.
    3. Click SQLStreamBuilder Console.
    The Streaming SQL Console opens in a new window.
  2. Click Create Job or select a previous job on the Getting Started page.
    You are redirected to the Console page.
  3. Open the Kafka table configurations.
    You can add the Input Transform to the Kafka table when you create the Kafka table:
    1. Choose Apache Kafka from the Add table drop-down.
    You can add the Input Transform to an already existing Kafka table:
    1. Select the edit button for the Kafka table to which you want to add a transformation.
    The Kafka table wizard appears.
  4. Click Data Transformation.
    You have the following options to insert your Input Transform:
    1. Add your JavaScript transformation code to the Data Transformation box.
      Make sure the output of your transform matches the Schema definition detected or defined for the Kafka table.
    2. Click Install default template and schema.
      The Install default template and schema option fills the Data Transformation box with a template that you can use to create the Input Transform, and matches the schema with the format.
  5. Click Review and Create.
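Step 4 asks that the output of the transform match the schema of the Kafka table. One way to sanity-check this outside the console is to compare the top-level keys of a sample output with the field names the schema is expected to contain. The sketch below assumes a flat schema expressed as a list of field names; `findSchemaMismatches`, `expectedFields`, and the sample data are hypothetical and not part of SQL Stream Builder.

```javascript
// Hypothetical helper: compares the top-level keys of a transform's output
// with the field names the Kafka table schema is expected to contain.
function findSchemaMismatches(outputJson, expectedFields) {
  var output = JSON.parse(outputJson);
  var actualFields = Object.keys(output);

  return {
    // Fields the schema expects but the output lacks.
    missing: expectedFields.filter(function (f) {
      return actualFields.indexOf(f) === -1;
    }),
    // Fields the output contains but the schema does not define.
    unexpected: actualFields.filter(function (f) {
      return expectedFields.indexOf(f) === -1;
    })
  };
}

// Example: the output uses "temp" where the schema expects "temperature".
var expectedFields = ["sensor_id", "temperature"];
var result = findSchemaMismatches('{"sensor_id":"a1","temp":21.5}', expectedFields);
console.log(JSON.stringify(result));
// prints {"missing":["temperature"],"unexpected":["temp"]}
```

This check covers field names only; it does not verify data types or nested structures.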