Storm Flink MigrationPDF version

Differences in architecture

The basic architecture of task execution is similar in Storm and Flink. The main difference between the two systems is that Workers and Executors are responsible for executing the tasks in Storm, while in Flink the execution is done by only the Task Managers. The Task Managers also manage the state backend, which is a durable storage for storing states.

Flink has a simpler architecture compared to Storm as the Task Managers fulfill the jobs of Workers and Executors. The process of task execution is similar: a Process Controller/Job Manager on a master node starts a worker node. On a worker node the Workers and Executors in Storm, Task Managers in Flink are responsible for running the Tasks. As the Executor, the Task Manager can also run more than one tasks at the same time. The resource management for the tasks are completed by the Process Controller in Storm and by the Job manager in Flink.

In a Storm cluster, the Nimbus runs the Storm topology and distributes it to the Supervisor from which the processes are delegated to workers. ZooKeeper is needed to coordinate the communication between the Nimbus and Supervisor node. In a Flink cluster, Flink jobs are executed as YARN applications. HDFS is used to store recovery and log data, while ZooKeeper is used for high availability coordination for jobs. The following illustrations detail the architecture, task execution, and cluster layout in Storm and Flink.