Mirroring data between Kafka clusters

Avoiding Data Loss

A MirrorMaker process can lose data if its producer fails to deliver messages that its consumer has already consumed and committed.

To prevent data loss, use the following settings. (Note: these are the default settings.)

  • For consumers:

    • auto.commit.enable=false (enable.auto.commit=false when using the new consumer)

  • For producers:

    • max.in.flight.requests.per.connection=1

    • retries=2147483647 (Int.MaxValue)

    • acks=-1

    • block.on.buffer.full=true

  • Specify the --abort.on.send.failure option to MirrorMaker
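
Because these values are defaults, the practical risk is a properties file that overrides one of them. The following is a minimal Python sketch of a check for such overrides, not part of MirrorMaker itself. The names parse_properties and find_deviations are hypothetical. The sketch assumes the consumer key is spelled auto.commit.enable (the old consumer's name for the property), that Int.MaxValue is written as 2147483647, and that acks=all is accepted as equivalent to acks=-1.

```python
# Settings from the list above. Each key maps to the set of accepted values.
REQUIRED_CONSUMER = {
    "auto.commit.enable": {"false"},  # assumption: old-consumer spelling of the key
}
REQUIRED_PRODUCER = {
    "max.in.flight.requests.per.connection": {"1"},
    "retries": {"2147483647"},        # Int.MaxValue written out
    "acks": {"-1", "all"},            # "all" is treated as equivalent to -1
    "block.on.buffer.full": {"true"},
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def find_deviations(props, required):
    """Return explicitly set keys whose value differs from the safe setting.

    Keys that are absent are not reported, because the defaults apply.
    """
    return {
        key: props[key]
        for key, accepted in required.items()
        if key in props and props[key] not in accepted
    }


if __name__ == "__main__":
    producer = parse_properties("acks=1\nretries=2147483647\n")
    print(find_deviations(producer, REQUIRED_PRODUCER))
```

In this example only the acks override is reported, since retries matches the safe value and the remaining keys are left at their defaults.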

With these settings, MirrorMaker behaves as follows:

  • MirrorMaker keeps at most one request in flight to each broker at any given time.

  • If any exception is caught in the MirrorMaker thread, MirrorMaker tries to commit the acknowledged offsets and then exits immediately.

  • On a RetriableException, the producer retries indefinitely. If the retries keep failing, MirrorMaker eventually halts once the producer buffer is full.

  • On a non-retriable exception, if --abort.on.send.failure is specified, MirrorMaker stops.

    If --abort.on.send.failure is not specified, the producer callback mechanism records the message that was not sent, and MirrorMaker continues running. In this case, the message is not replicated to the target cluster.
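
The failure handling described above can be illustrated with a small simulation. This is a simplified Python sketch of the decision logic only, not MirrorMaker's actual code. The exception classes, mirror, and make_send are hypothetical names introduced for illustration, and the abort_on_send_failure parameter stands in for the command-line option.

```python
class RetriableSendError(Exception):
    """Stands in for a retriable producer error (hypothetical name)."""


class FatalSendError(Exception):
    """Stands in for a non-retriable producer error (hypothetical name)."""


def mirror(messages, send, abort_on_send_failure=True):
    """Send each message in order; return (delivered, dropped, stopped)."""
    delivered, dropped = [], []
    for msg in messages:
        while True:
            try:
                send(msg)
                delivered.append(msg)
                break
            except RetriableSendError:
                # Retriable error: retry the same message, with no retry limit.
                continue
            except FatalSendError:
                if abort_on_send_failure:
                    # Stop before any later message is sent.
                    return delivered, dropped, True
                # Record the failed message and keep going; it is never
                # replicated to the target.
                dropped.append(msg)
                break
    return delivered, dropped, False


def make_send(fatal=(), flaky=()):
    """Build a fake send(): 'flaky' messages fail once, 'fatal' ones always fail."""
    pending = set(flaky)

    def send(msg):
        if msg in fatal:
            raise FatalSendError(msg)
        if msg in pending:
            pending.discard(msg)
            raise RetriableSendError(msg)

    return send


if __name__ == "__main__":
    print(mirror(["a", "b", "c"], make_send(fatal={"b"}, flaky={"a"}), True))
    print(mirror(["a", "b", "c"], make_send(fatal={"b"}, flaky={"a"}), False))
```

With abort_on_send_failure enabled, the run stops at the failed message and nothing after it is sent. With it disabled, the run completes, but the failed message appears only in the dropped list, which corresponds to the data-loss case described above.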