The Community Edition of Cloudera Streaming Analytics consists of preconfigured Docker images for ZooKeeper, Kafka, and PostgreSQL to make getting started easier. The components can be reached using their dedicated ports. Storage for the Community Edition is handled by Docker volumes, while PostgreSQL provides database management and stores the Materialized Views.
PostgreSQL is used by SQL Stream Builder components internally. It is also used as the underlying database for the Materialized View Engine. The PostgreSQL database for the Materialized View tables (eventador_snapper database) can be accessed by using the user eventador_snapper. The default password for the database is cloudera.
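As a rough sketch of how the credentials above fit together, the following shell snippet assembles a libpq-style connection URI for the Materialized View database. The host localhost and the port 5432 (PostgreSQL's default) are assumptions, since the port published by the Community Edition is not stated here, and the helper name mv_db_uri is hypothetical.

```shell
# Build a connection URI for the Materialized View database.
# Database name, user, and default password come from the text above;
# host and port are assumptions (5432 is PostgreSQL's default port).
mv_db_uri() {
  host="${1:-localhost}"
  port="${2:-5432}"
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    eventador_snapper cloudera "$host" "$port" eventador_snapper
}

# Print the URI with the assumed defaults.
mv_db_uri
```

If a psql client is installed on your computer, the URI can be passed to it directly, for example psql "$(mv_db_uri)". Pass a different host or port as the first and second argument if your environment differs.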
The containers use the following Docker volumes to provide persistent local storage between
restarts. If the volumes do not exist in your local environment, they are created when you run
the docker-compose up command.
- Persistent in the Flink TaskManager and JobManager containers. It is used for storing savepoints of the jobs. When using the Filesystem connector, it is also recommended to use a volume.
- Used by the Streaming SQL Engine for persistent storage under the Streaming SQL Engine container.
- Used by the PostgreSQL database. It stores the internal tables required for SQL Stream Builder to work, as well as the created Materialized Views.
- Used by the Kafka container to store the topics.
- Used by Zookeeper.
It is possible to delete the Docker volumes for a fresh start by shutting down all of the
containers with the docker-compose down --volumes command, or by individually removing
volumes with the docker volume rm <volume name> command. The
containers use a Docker network (named ssb-net) to communicate.
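The volume cleanup described above can be sketched as a small helper that only prints the command by default and runs it when explicitly asked. The function name reset_ssb and the DRY_RUN variable are hypothetical; the two Docker commands are the ones named in the text.

```shell
# Print (or run) the command for a fresh start.
# With no argument: shut down all containers and remove all volumes.
# With a volume name: remove only that volume.
# DRY_RUN defaults to 1, so nothing is deleted unless DRY_RUN=0 is set.
reset_ssb() {
  if [ -n "$1" ]; then
    set -- docker volume rm "$1"
  else
    set -- docker-compose down --volumes
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"      # dry run: show the command only
  else
    "$@"           # real run: execute the command
  fi
}

# Show what a full reset would execute.
reset_ssb
```

Running DRY_RUN=0 reset_ssb from the directory that holds the compose file would perform the full reset. Note that Docker refuses to remove a volume that is still in use by a container, so removing a single volume generally requires removing the container that uses it first.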
By default, the Kafka container is preconfigured in SQL Stream Builder as the Local Kafka data provider.
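To create a new topic, run the kafka-topics.sh script inside the Kafka container:
docker-compose exec kafka /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka:9092 --create --topic myNewTopic
To write messages to a topic (airplanes in this example) from standard input, run the console producer with the -T option, which disables pseudo-TTY allocation so that input can be piped in:
docker-compose exec -T kafka /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka:9092 --topic airplanes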
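Because the console producer reads from standard input, records can be piped into it. The sketch below generates a few JSON lines to use as test input. The field names (icao, altitude) and the helper name sample_airplanes are made up for illustration and are not a schema defined by SQL Stream Builder.

```shell
# Emit N sample JSON records, one per line (N defaults to 3).
# The record layout is hypothetical and only serves as test input.
sample_airplanes() {
  n="${1:-3}"
  i=1
  while [ "$i" -le "$n" ]; do
    printf '{"icao":"TEST%d","altitude":%d}\n' "$i" "$((i * 1000))"
    i=$((i + 1))
  done
}

# Print two sample records to standard output.
sample_airplanes 2
```

Piping the output into the producer command shown above, as in sample_airplanes 10 | docker-compose exec -T kafka /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka:9092 --topic airplanes, would send each line as one message to the airplanes topic.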
The Kafka container is also accessible from outside the Docker network. However, only
kafka:9092 has been set as the advertised listener. To connect to Kafka from
your computer (outside the network), you need to add an entry to your hosts
file (/etc/hosts on Linux and macOS) to resolve the kafka domain name to localhost: