Configuring SQL job settings
To further customize your SQL Stream job, you can configure advanced settings: the job restart method and time, the number of threads for parallelism, the sampling behavior, exactly-once processing, and restoring from a savepoint.
Job Parallelism (Threads)
The number of threads started to process the job. Each thread consumes a slot on the cluster. When Job Parallelism is set to 1, the job consumes the least resources. If the data provider supports parallel reads, increasing the parallelism can raise the maximum throughput. For example, when using Kafka as a data provider, setting the parallelism equal to the number of partitions in the topic is a good starting point for performance tuning.
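The Kafka rule of thumb can be sketched as a small calculation. This is an illustrative helper only, not part of the product: the function name `starting_parallelism` and the `available_slots` parameter are hypothetical, and capping the value at the number of free slots is an assumption that follows from each thread consuming one slot.

```python
def starting_parallelism(partition_count: int, available_slots: int) -> int:
    """Suggest an initial Job Parallelism value for a Kafka-backed job.

    Hypothetical helper for illustration; the real value is entered
    manually in the job settings.
    """
    if partition_count < 1 or available_slots < 1:
        raise ValueError("partition_count and available_slots must be at least 1")
    # Starting point: one thread per topic partition.
    # Assumption: each thread needs its own slot, so free slots cap the value.
    return min(partition_count, available_slots)


print(starting_parallelism(partition_count=6, available_slots=10))  # 6
print(starting_parallelism(partition_count=12, available_slots=8))  # 8
```

Treat the result as a first value to try and adjust it based on the throughput you observe.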
Sample Count
The number of sample entries shown under the Results tab. To have an unlimited number of sample entries, set the Sample Count value to 0.
Sample Window Size
The number of sample entries to keep under the Results tab. To have an unlimited number of sample entries, set the Sample Window Size value to 0.
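One way to picture a sample window is as a bounded buffer. The following is a minimal sketch of that idea, not the product's implementation: the helper `make_sample_window` is hypothetical, and it assumes that once the window is full the oldest entries are dropped in favor of the newest ones.

```python
from collections import deque


def make_sample_window(window_size: int) -> deque:
    """Create a buffer that keeps at most window_size entries (hypothetical helper)."""
    if window_size < 0:
        raise ValueError("window_size must be 0 (unlimited) or a positive number")
    # 0 means unlimited, mirroring the setting; deque(maxlen=None) is unbounded.
    return deque(maxlen=window_size or None)


window = make_sample_window(3)
for i in range(5):
    # Assumption: when the window is full, the oldest entry is discarded.
    window.append(f"row-{i}")

print(list(window))  # ['row-2', 'row-3', 'row-4']
```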
Sample Behavior
Determines how often messages are sampled. Choose one of the following options:
- Sample all messages
- Sample one message every second
- Sample one message every five seconds
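The three options differ only in the minimum time between sampled messages. The sketch below illustrates that idea under stated assumptions: the function `sample_messages` and the `(timestamp, payload)` message shape are hypothetical, and an interval of 0 is used here to stand for "Sample all messages".

```python
from typing import Iterable, Iterator, Tuple

Message = Tuple[float, str]  # (timestamp in seconds, payload); assumed shape


def sample_messages(messages: Iterable[Message], interval_seconds: float) -> Iterator[Message]:
    """Yield at most one message per interval; an interval of 0 yields every message."""
    next_allowed = float("-inf")
    for timestamp, payload in messages:
        if interval_seconds == 0 or timestamp >= next_allowed:
            yield timestamp, payload
            # The next sample is allowed once the interval has elapsed.
            next_allowed = timestamp + interval_seconds


stream = [(0.0, "a"), (0.4, "b"), (1.0, "c"), (1.2, "d"), (5.0, "e"), (5.5, "f"), (6.1, "g")]

print([p for _, p in sample_messages(stream, 0)])  # all seven messages
print([p for _, p in sample_messages(stream, 1)])  # ['a', 'c', 'e', 'g']
print([p for _, p in sample_messages(stream, 5)])  # ['a', 'e']
```

Longer intervals reduce the number of entries shown under the Results tab, which can help when the source produces messages faster than they can be inspected.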
Restore From Savepoint
You can enable or disable restoring a SQL job from a Flink savepoint after stopping it. The savepoint is saved under hdfs:///user/flink/savepoints by default.