Creating Input Transforms
Input Transforms are a powerful way to clean, modify, and arrange data that is poorly organized, has changing format, has data that is junk or otherwise hard to use. You create a javascript function to transform the data after it has been consumed from the source and before queries run in SQL Stream Builder.
- The source is not in your control, for example, data feed from a third-party provider
- The format is hard to change, for example, a legacy feed, other teams of feeds within your organization
- The messages are inconsistent
- The data from the sources do not have uniform keys, or without keys (like nested arrays), but are still in a valid JSON format
- The schema you want does not match the incoming topic
In case your input data is in the JSON format, and it is hard to express with a
schema, you can use Input Transformations. The Input Transformations are defined for
a virtual table source have the following characteristics:
- Allows one transformation per source.
- Takes record as a JSON-formatted string input variable. The input is always named record.
- Emits the output of the last line to the calling JVM. It could be any variable name. In the following examples, out and emit is used as a JSON-formatted string.
A basic input transformation looks like
this:
var out = JSON.parse(record); // record is input, parse JSON formatted string to object
// do whatever you want here..
JSON.stringify(out); // emit JSON formatted string of object
- Select Console from the left-hand menu.
- Go to the Virtual Tables tab.
- Select the edit button (pencil icon) for the Virtual Table Name you want to add a transform.
- Click Display/Edit Transformations.
- Add your transform code.
- Make sure the output of your transform matches the Manual Schema definition.