Using Filesystem source

You can use the Filesystem connector in the Community Edition to process files that are local to the container. In the long run, bind mounting a volume is the better approach, but copying a file into the container is enough to get you started.

Learning goal

  • How to use the Filesystem connector

Learning path

Before using the Filesystem connector, you need a local file with data in it, which you will select in Streaming SQL Console. As an example, open a CLI tool and create a weblog_test.json file with the following example data. After the file is created, copy it into the container.

$>cat weblog_test.json 
{ "foo": "baz"}
{ "foo": "baz1"}
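
The cat command above only prints the file. If the file does not exist yet, one way to create it is with a here-document. This is a minimal sketch, assuming a POSIX shell and the current directory as the target location:

```shell
# Create the sample file with one JSON object per line
# (same content as the cat output shown above).
cat > weblog_test.json <<'EOF'
{ "foo": "baz"}
{ "foo": "baz1"}
EOF

# Print the number of lines to confirm the file holds both records
# before copying it into the container.
wc -l < weblog_test.json
```

If you create the file somewhere other than /Users/testuser/Downloads, adjust the source path in the docker cp command accordingly.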

$>docker ps  # find the container ID of the taskmanager
$>docker cp /Users/testuser/Downloads/weblog_test.json <containerid>:/weblog_test.json

After setting up the local file, open Streaming SQL Console and execute the following example:

DROP TABLE IF EXISTS kg;
CREATE TABLE `ssb`.`ssb_default`.`kg` (
  `foo` VARCHAR(2147483647)
) WITH (
  'format' = 'json',
  'path' = 'file:///weblog_test.json',
  'connector' = 'filesystem'
);

SELECT * FROM kg;
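
This example uses the JSON format, which parses each line into columns. You can also use the RAW data format, which returns each line as a single unparsed string. The two formats produce the following output:
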
JSON: [{"foo":"baz"},{"foo":"baz1"}]
RAW: [{"foo":"{ \"foo\": \"baz\"}"},{"foo":"{ \"foo\": \"baz1\"}"}]

With the RAW data format, you will most likely want to write your own deserializer. For example, when parsing log files, a custom function such as my_split can split each raw line into fields:

SELECT t.hostname, t.datetime, t.url, t.browser, ...
FROM(
  SELECT my_split(log) as t FROM nginx_log
);

Next step

You can use the created Filesystem table in your queries.