Running SQL Stream jobs

Every SQL statement that you execute in the SQL Stream console becomes a job and runs on the deployment as a Flink job. You can manage running jobs on the Jobs tab of the UI.

Running a job involves two logical phases:

Parse: The SQL is parsed and checked for validity, then compared against the virtual table schema(s) to verify types, keys, and columns.
Execution: If the parse phase succeeds, a job is created dynamically and runs on an open slot on your cluster. The job is a valid Flink job.
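
The two phases can be pictured with a small conceptual model. The sketch below is not how SSB is implemented: `VirtualTable`, `Cluster`, `parse`, and `submit` are hypothetical names, and the "parse" step only checks that the referenced table and columns exist in a registered schema, which is far simpler than real SQL parsing. It also assumes that a job takes as many slots as its parallelism, with a default of 1.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTable:
    """Hypothetical stand-in for a virtual table schema: name plus column types."""
    name: str
    columns: dict  # column name -> type name, e.g. {"temp": "DOUBLE"}


@dataclass
class Cluster:
    """Hypothetical cluster that only tracks how many slots are free."""
    free_slots: int
    jobs: list = field(default_factory=list)


def parse(table_name, selected_columns, catalog):
    """Parse phase (simplified): check the query's references against the schema."""
    if table_name not in catalog:
        raise ValueError(f"unknown virtual table: {table_name}")
    missing = [c for c in selected_columns if c not in catalog[table_name].columns]
    if missing:
        raise ValueError(f"unknown columns in {table_name}: {missing}")


def submit(job_name, table_name, selected_columns, catalog, cluster, parallelism=1):
    """Run the parse phase, then the execution phase only if parsing succeeded."""
    parse(table_name, selected_columns, catalog)  # raises on invalid references
    if cluster.free_slots < parallelism:
        raise RuntimeError("not enough open slots for this job")
    cluster.free_slots -= parallelism  # assumed: one slot per unit of parallelism
    cluster.jobs.append(job_name)
    return job_name


catalog = {"sensors": VirtualTable("sensors", {"id": "INT", "temp": "DOUBLE"})}
cluster = Cluster(free_slots=2)
submit("hot_sensors", "sensors", ["id", "temp"], catalog, cluster)
print(cluster.free_slots, cluster.jobs)
```

In this model, a query that references an unknown column fails in `parse` and never consumes a slot, which mirrors the rule that execution starts only after a successful parse.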

Make sure that you have registered a data source.
Make sure that you have created a Virtual Table Source.

Go to your cluster in Cloudera Manager.
Click SQL Stream Builder in the list of Services.
Click SQLStreamBuilder Console.
The Streaming SQL Console opens up in a new window.
Provide a name for the SQL job.
1. Optionally, you can click on the Random Name button to generate a name for the SQL job.
Select a Virtual Table Sink.
1. Optionally, you can leave the sink set to None.
  
  Note: The Sink Virtual Table is an optional job argument that specifies the destination for the continuous results. If you select None, the results are not sent to a sink but are sampled to the screen or to a materialized view.
Add a SQL query to the SQL window.

Note: You cannot start a job without a SQL statement in the SQL editor window. If you click Execute without providing one, the following error message is displayed: You must provide a SQL query.

When starting a job, the number of slots consumed on the specified cluster is equal to the Parallelism setting. The default is 1 slot. To change the parallelism setting, click Advanced settings.
Click Execute.
The Logs window shows the updated status of SSB.
Click Results to check the sampled data.
These results are only samples, not the entire result of the new stream being created from the output of the query. The entire result set is sent to the Sink Virtual Table and/or a Materialized View.
Note: Because SSB queries unbounded data streams, the job keeps running until you click the Stop button.

A job is generated that runs the SQL continuously on the stream of data from the Source Virtual Table and pushes the results to the Sink Virtual Table, if one is selected.
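
As a rough mental model of this behavior, the sketch below treats the source as an endless Python generator, applies a filter that plays the role of the SQL, sends every matching row to a sink, and keeps only a few rows as the on-screen sample. All names (`source`, `run_job`, `sample_size`, `max_events`) are hypothetical. The `max_events` cutoff merely stands in for clicking Stop, and taking the first few rows as the sample is an assumption made for simplicity; SSB itself runs the query as a Flink job.

```python
import itertools


def source():
    """Unbounded stream of hypothetical sensor readings (never ends on its own)."""
    for i in itertools.count():
        yield {"id": i, "temp": 60 + (i * 7) % 50}


def run_job(stream, predicate, sink, sample_size=3, max_events=20):
    """Apply the predicate to each event until the job is 'stopped'.

    Every matching row goes to the sink; only the first few are kept as a sample.
    max_events stands in for the user clicking Stop.
    """
    sample = []
    for event in itertools.islice(stream, max_events):
        if predicate(event):
            sink.append(event)  # full result set goes to the sink
            if len(sample) < sample_size:
                sample.append(event)  # sampled rows shown on screen
    return sample


sink = []
sample = run_job(source(), lambda e: e["temp"] > 90, sink)
print(len(sample), len(sink))
```

The sample is always a subset of what reaches the sink, which is why the Results view should not be read as the complete output of the query.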