Cloudera Streaming as a Dockerized Application
Cloudera Streaming Community Edition consists of preconfigured Docker images of the components. The Streams Messaging and Streaming Analytics components share Zookeeper and PostgreSQL. The components can be reached using their dedicated ports. Storage for Cloudera Streaming Community Edition is handled by docker volumes, while PostgreSQL is integrated for database management and storing metadata.
The containers use the following docker volumes to provide persistent local storage between
restarts. If the volumes do not exist in your local environment, they are created when running
the docker-compose up
command.
- kf-volume
- Used by the Kafka container to store the topics.
- kfc-volume
- Used for Kafka Connect.
- prom-volume
- Used for Streams Messaging Manager Metrics.
- sr-volume
- Used for Schema Registry.
- flink-volume
- Persistent in the Flink TaskManager and JobManager containers. It is used for storing savepoints of the jobs. When using the Filesystem connector, it is also recommended to use this volume for file management.
- ssb-volume
- Used by the Cloudera SQL Stream Builder engine for persistent storage under the Streaming SQL Engine container.
- pg-volume
- Used by the PostgreSQL database.
- zk-volume
- Used by ZooKeeper.
It is possible to delete the docker volumes for a fresh start by shutting down all of the
containers with docker-compose down --volumes
command, or individually removing
them with docker volume rm <volume name>
command. The
containers use a docker network (named ssb-net) to communicate.