CSP as a Dockerized Application

The Community Edition of Cloudera Stream Processing (CSP) consists of preconfigured Docker images of the components. The Streams Messaging and Streaming Analytics components share Zookeeper and PostgreSQL. The components can be reached using their dedicated ports. Storage for the Community Edition is handled by docker volumes, while PostgreSQL is integrated for database management and storing metadata.

The containers use the following docker volumes to provide persistent local storage between restarts. If the volumes do not exist in your local environment, they are created when running the docker-compose up command.

kf-volume
Used by the Kafka container to store the topics.
When used with Streaming Analytics:
  • The Kafka container is by default preconfigured in SQL Stream Builder as the Local Kafka data provider.
kfc-volume
Used for Kafka Connect.
prom-volume
Used for Streams Messaging Manager Metrics.
sr-volume
Used for Schema Registry.
flink-volume
Persistent in the Flink TaskManager and JobManager containers. It is used for storing savepoints of the jobs. When using the Filesystem connector, it is also recommended to use this volume for file management.
ssb-volume
Used by the Streaming SQL Engine for persistent storage under the Streaming SQL Engine container.
pg-volume
Used by the PostgreSQL database.
When used with Streaming Analytics:
  • It stores the internal tables required for SQL Stream Builder to work, as well as the created Materialized Views.
  • PostgreSQL is used by SQL Stream Builder components internally. It is also used as the underlying database for the Materialized View Engine. The PostgreSQL database for the Materialized View tables (eventador_snapper database) can be accessed by using the user eventador_snapper. The default password for the database is cloudera.
zk-volume
Used by Zookeeper.

It is possible to delete the docker volumes for a fresh start by shutting down all of the containers with docker-compose down --volumes command, or individually removing them with docker volume rm <volume name> command. The containers use a docker network (named ssb-net) to communicate.