Running SQL Stream jobs
Every SQL statement that you run in the SQL Stream console becomes a job and runs on the deployment as a Flink job. You can manage running jobs on the Jobs tab of the UI. A job runs in two logical phases:
- Parse: The SQL is parsed, checked for validity, and then compared against the virtual table schemas for correct types, keys, and columns.
- Execution: If the parse phase succeeds, a job is created dynamically and runs on an open slot on your cluster. The job is a valid Flink job.
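The ordering of the two phases can be pictured with a small conceptual sketch. It is not how SQL Stream Builder is implemented: the function names `parse_phase` and `submit`, the `sensors` schema, and the status strings are all hypothetical, and the check only compares column names, whereas the real parse phase also validates the SQL itself, types, and keys.

```python
def parse_phase(schema, columns):
    """Toy stand-in for the parse phase: list columns absent from the schema."""
    return [name for name in columns if name not in schema]


def submit(schema, columns):
    """Reach the execution phase only if the parse phase found no problems."""
    unknown = parse_phase(schema, columns)
    if unknown:
        return "rejected: unknown columns " + ", ".join(unknown)
    return "job created"


# Hypothetical virtual table schema: column name -> type.
sensors = {"sensor_id": "INT", "temperature": "DOUBLE"}

print(submit(sensors, ["sensor_id", "temperature"]))  # job created
print(submit(sensors, ["sensor_id", "humidity"]))     # rejected: unknown columns humidity
```

The point of the sketch is only that a statement that fails the schema comparison never becomes a job.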
- Make sure that you have registered a Data Provider.
- Make sure that you have created a Table that can be used as a source in the SQL query.
- Go to your cluster in Cloudera Manager.
- Click SQL Stream Builder from the list of services.
Click SQLStreamBuilder Console.
The Streaming SQL Console opens in a new window.
Provide a name for the SQL job.
- Optionally, you can click Random Name to generate a name for the SQL job.
Select a Sink Table.
- Optionally, you can leave the sink as None.
Add a SQL query to the SQL window.
When you start a job, it consumes a number of slots on the specified cluster equal to its parallelism setting. The default is one slot. To change the parallelism setting, click Advanced settings.
The Logs window updates with the status of SQL Stream Builder (SSB).
Click Results to check the sampled data.
The results shown are only a sample of the new stream that the query produces, not its full output. The full result set is sent to the Sink Table, to a Materialized View, or to both.
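When planning capacity, the parallelism rule from the procedure above (one job consumes as many slots as its parallelism setting, one by default) reduces to simple arithmetic. The following is a minimal sketch under that rule only. The helper name `slots_after_start` is hypothetical, and raising an error when too few slots are free is an assumption of the sketch, not documented SSB behavior.

```python
def slots_after_start(free_slots: int, parallelism: int = 1) -> int:
    """Return the free slots left after starting one job.

    Assumes a job consumes exactly `parallelism` slots (default 1).
    Raising on insufficient capacity is this sketch's own choice.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    if parallelism > free_slots:
        raise RuntimeError(
            f"job needs {parallelism} slots but only {free_slots} are free"
        )
    return free_slots - parallelism


# Example: a cluster with 4 free slots.
remaining = slots_after_start(4)                         # default parallelism: 3 left
remaining = slots_after_start(remaining, parallelism=2)  # 1 left
print(remaining)  # 1
```

With four free slots, a default job and a job with parallelism 2 leave one slot open, so a further job with parallelism 2 would not fit.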