Creating Input Transforms

Input Transforms are a powerful way to clean, modify, and arrange data that is poorly organized, has changing format, has data that is junk or otherwise hard to use. You create a javascript function to transform the data after it has been consumed from the source and before queries run in SQL Stream Builder.

You define your schema on the output of the transform. You can use the Input Transforms in the following situations:
  • The source is not in your control, for example, data feed from a third-party provider
  • The format is hard to change, for example, a legacy feed, other teams of feeds within your organization
  • The messages are inconsistent
  • The data from the sources do not have uniform keys, or without keys (like nested arrays), but are still in a valid JSON format
  • The schema you want does not match the incoming topic
In case your input data is in the JSON format, and it is hard to express with a schema, you can use Input Transformations. The Input Transformations are defined for a virtual table source have the following characteristics:
  • Allows one transformation per source.
  • Takes record as a JSON-formatted string input variable. The input is always named record.
  • Emits the output of the last line to the calling JVM. It could be any variable name. In the following examples, out and emit is used as a JSON-formatted string.
A basic input transformation looks like this:
var out = JSON.parse(record);     // record is input, parse JSON formatted string to object
                                          // do whatever you want here..
JSON.stringify(out);                     // emit JSON formatted string of object
  1. Select Console from the left-hand menu.
  2. Go to the Virtual Tables tab.
  3. Select the edit button (pencil icon) for the Virtual Table Name you want to add a transform.
  4. Click Display/Edit Transformations.
  5. Add your transform code.
  6. Make sure the output of your transform matches the Manual Schema definition.