Sampling
How to enable sampling for your SQL queries.
To enable sampling, create a secret that contains the connection parameters of the Kafka instance used for sampling before you run helm install.
Example for non-secure setup:
kubectl create secret generic ssb-sampling-kafka -n flink \
  --from-literal=SSB_SAMPLING_BOOTSTRAP_SERVERS=kafka.example.com:9092 \
  --from-literal=SSB_SAMPLING_SECURITY_PROTOCOL=PLAINTEXT
Example for secure setup:
kubectl create secret generic ssb-sampling-kafka -n flink \
  --from-literal=SSB_SAMPLING_BOOTSTRAP_SERVERS=kafka-ssl.example.com:9092 \
  --from-literal=SSB_SAMPLING_SECURITY_PROTOCOL=SSL \
  --from-file=sampling_kafka_truststore.jks=[*** YOUR PATH ***]/truststore.jks \
  --from-literal=SSB_SAMPLING_TRUSTSTORE_PASSWORD=[*** PASSWORD ***]
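After creating the secret, it can help to confirm that it holds the expected keys before running helm install. The following is a minimal sketch and not part of the product itself: list_secret_keys is a hypothetical helper name, and the KUBECTL override variable is an assumption added so that the command can be swapped out. It relies only on the standard kubectl get secret command with go-template output.

```shell
# Hypothetical helper: print the key names stored in a Kubernetes secret,
# one per line, without printing the secret values.
# Usage: list_secret_keys <secret-name> <namespace>
# KUBECTL is an assumed override variable; it defaults to kubectl.
list_secret_keys() {
  "${KUBECTL:-kubectl}" get secret "$1" -n "$2" \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
}
```

For example, list_secret_keys ssb-sampling-kafka flink should list SSB_SAMPLING_BOOTSTRAP_SERVERS and SSB_SAMPLING_SECURITY_PROTOCOL, plus sampling_kafka_truststore.jks and SSB_SAMPLING_TRUSTSTORE_PASSWORD for the secure setup.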
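In the values.yaml file, set sampling enabled to true and set secretRef to the name of your secret. The following example also sets secure to true, as used with the secure setup: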
ssb:
  sampling:
    enabled: true
    secure: true
    secretRef: ssb-sampling-kafka
Sample results
How to view sample results from your SQL queries.
Because Cloudera Streaming Analytics - Kubernetes Operator does not install Kafka, you cannot see any rows from the Flink jobs in the Cloudera SQL Stream Builder UI. To see sampled results from your SQL queries, you need a Kafka cluster that is installed and accessible by both the Cloudera SQL Stream Builder and Flink pods, and you must change ssbConfiguration to configure Cloudera SQL Stream Builder to use Kafka for data sampling:
ssbConfiguration:
  application.properties: |+
    kafka.enabled=true
    spring.kafka.bootstrap-servers=example-kafka:9092
    spring.kafka.jaas.enabled=false
    spring.kafka.properties.security.protocol=PLAINTEXT