Architecture
The Community Edition of Cloudera Streaming Analytics consists of preconfigured Docker images for Zookeeper, Kafka and PostgreSQL to make getting started easier. Each component can be reached on its dedicated port. Storage for the Community Edition is handled by Docker volumes, and PostgreSQL is integrated for database management and for storing the Materialized Views.
PostgreSQL is used by SQL Stream Builder components internally. It is also used as the underlying database for the Materialized View Engine. The PostgreSQL database for the Materialized View tables (eventador_snapper database) can be accessed by using the user eventador_snapper. The default password for the database is cloudera.
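As a minimal sketch of how the Materialized View database could be opened from the command line, the helper below wraps psql inside the PostgreSQL container. The service name postgresql is an assumption (check your docker-compose.yml), and mv_psql, COMPOSE and PG_SERVICE are hypothetical names introduced only for this example. If psql prompts for a password, the default stated above is cloudera.

```shell
# Compose command; set COMPOSE="docker compose" if you use the Compose v2 plugin.
COMPOSE="${COMPOSE:-docker-compose}"
# Assumption: the PostgreSQL service is called "postgresql" in docker-compose.yml.
PG_SERVICE="${PG_SERVICE:-postgresql}"

# Run psql as the eventador_snapper user against the eventador_snapper database.
# Any extra arguments are passed straight through to psql.
mv_psql() {
  # $COMPOSE is intentionally unquoted so that "docker compose" splits into two words.
  $COMPOSE exec "$PG_SERVICE" psql -U eventador_snapper -d eventador_snapper "$@"
}
```

Calling mv_psql with no arguments opens an interactive session, while mv_psql -c '\dt' lists the tables in the database.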
The containers use the following Docker volumes to provide persistent local storage between restarts. If the volumes do not exist in your local environment, they are created when you run the docker-compose up command.
- ssb-volume: Persisted in the Flink TaskManager and JobManager containers. It stores the savepoints of the jobs. When using the Filesystem connector, it is also recommended to use a volume.
- pg-volume: Used by the PostgreSQL database. It stores the internal tables required for SQL Stream Builder to work, as well as the created Materialized Views.
- kf-volume: Used by the Kafka container to store the topics.
- zk-volume: Used by Zookeeper.
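To check which of these volumes already exist locally, the list from docker volume ls can be filtered by name. This is a sketch only: Docker Compose usually prefixes volume names with the project name (for example myproj_pg-volume) unless the compose file sets an explicit name, so the filter accepts both forms. filter_csa_volumes is a hypothetical helper name.

```shell
# Keep only volume names that are one of the four volumes listed above,
# either bare (pg-volume) or with a Compose project prefix (myproj_pg-volume).
filter_csa_volumes() {
  grep -E '(^|_)(ssb|pg|kf|zk)-volume$'
}

# Usage (requires a running Docker daemon):
#   docker volume ls --format '{{.Name}}' | filter_csa_volumes
```

An empty result suggests that the volumes have not been created yet and will be created by the next docker-compose up.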
For a fresh start, you can delete the Docker volumes either by shutting down all of the containers with the docker-compose down --volumes command, or by removing them individually with the docker volume rm <volume name> command.
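The two reset paths can be sketched as one small helper. Note that docker volume rm refuses to remove a volume that is still referenced by a container, so the sketch runs docker-compose down first in the individual case. reset_csa, COMPOSE and DOCKER are hypothetical names for this example, and the volume names you pass must match what docker volume ls reports in your environment.

```shell
COMPOSE="${COMPOSE:-docker-compose}"
DOCKER="${DOCKER:-docker}"

# With no arguments: stop the containers and delete all volumes of the project.
# With arguments: stop the containers, then delete only the named volumes.
reset_csa() {
  if [ "$#" -eq 0 ]; then
    $COMPOSE down --volumes
  else
    # Containers must be removed first, otherwise the volume is reported as in use.
    $COMPOSE down && $DOCKER volume rm "$@"
  fi
}
```

For example, reset_csa removes everything, while reset_csa myproj_kf-volume (an illustrative name) clears only the Kafka topics and keeps the PostgreSQL data.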
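By default, the Kafka container is preconfigured in SQL Stream Builder as the Local Kafka data provider. The following commands create a new topic in that container and write messages read from standard input to the airplanes topic: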

docker-compose exec kafka /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka:9092 --create --topic myNewTopic
docker-compose exec -T kafka /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka:9092 --topic airplanes
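Because the producer command above runs with -T, it reads its messages from standard input. A minimal sketch of sending one message and reading it back follows, assuming the same kafka service name, /opt/kafka/bin path and kafka:9092 address as the commands above. produce_line and read_topic are hypothetical helper names, and the sample payload in the usage note is invented for illustration.

```shell
COMPOSE="${COMPOSE:-docker-compose}"
KAFKA_BIN=/opt/kafka/bin
BOOTSTRAP=kafka:9092

# produce_line TOPIC MESSAGE: send a single line to the topic through stdin.
produce_line() {
  printf '%s\n' "$2" | $COMPOSE exec -T kafka "$KAFKA_BIN/kafka-console-producer.sh" \
    --bootstrap-server "$BOOTSTRAP" --topic "$1"
}

# read_topic TOPIC [COUNT]: print COUNT messages (default 1) from the start of the topic.
read_topic() {
  $COMPOSE exec -T kafka "$KAFKA_BIN/kafka-console-consumer.sh" \
    --bootstrap-server "$BOOTSTRAP" --topic "$1" --from-beginning --max-messages "${2:-1}"
}
```

For example, produce_line airplanes '{"icao":"ABC123"}' followed by read_topic airplanes 1 should print a message from the topic, which is a quick way to confirm that the Local Kafka data provider has data to read.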