Rebalancing partitions
Learn what rebalancing is, when it can occur, and how its propagated to the client.
There are multiple points in the protocol between consumers and brokers where failures can occur. There are points in the normal operation of the system where you need to change the consumer group assignments. For example, to consume a new partition or to respond to a consumer going offline. The process or responding to cluster information changing is called rebalance. It can occur in the following cases:
- A consumer leaves. It can be a software failure where the session times out or a connection stalls for too long, but it can also be a graceful shutdown.
- A consumer joins. It can be a new consumer but an old one that just recovered from a software failure (automatically or manually).
- Partition is adjusted. A partition can simply go offline because of a broker failure or a partition coming back online. Alternatively an administrator can add or remove partitions to/from the broker. In these cases the consumers must reassign who is consuming.
- The cluster is adjusted. When a broker goes offline, the partitions that are lead by this broker will be reassigned. In turn the consumers must rebalance so that they consume from the new leader. When a broker comes back, then eventually a preferred leader election happens which restores the original leadership. The consumers must follow this change as well.
On the consumer side, this rebalance is propagated to the client via the
ConsumerRebalanceListener
interface. It has two methods. The first,
onPartitionsRevoked
, will be invoked when any partition goes offline. This
call happens before the changes would reflect in any of the consumers, so this is the chance
to save offsets if manual offset commit is used. On the other hand
onPartitionsAssigned
is invoked after partition reassignment. This would
allow for the programmer to detect which partitions are currently assigned to the current
consumer. Complete examples can be found in the development section.