Storage container reconciliation overview
Learn how storage container reconciliation identifies corrupted or diverged container replicas and repairs them to ensure consistency and identical data across all replicas.
Overview
Under normal conditions, all replicas of a container contain the same data. Container reconciliation is a repair mechanism that lets administrators detect container replicas that are corrupted or have otherwise diverged from each other, and repair them so that they are identical again. The data within each replica is summarized as a single data checksum. When the data within the replicas diverges, the replicas report different data checksums. After reconciliation, the replicas contain the same data and each replica's data checksum matches the others. The data checksum is designed to remain constant even as data is deleted from the container in the background.
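The checksum comparison described above can be illustrated with a minimal sketch. The function names, datanode names, and checksum values below are hypothetical and are not part of Ozone. The sketch only shows how differing data checksums signal divergence, not how Ozone computes the checksum.

```python
from collections import defaultdict


def group_by_checksum(replica_checksums):
    """Group datanode names by the data checksum their replica reports."""
    groups = defaultdict(list)
    for datanode, checksum in replica_checksums.items():
        groups[checksum].append(datanode)
    return dict(groups)


def has_diverged(replica_checksums):
    """Replicas have diverged when more than one distinct checksum is reported."""
    return len(set(replica_checksums.values())) > 1


# Hypothetical checksums for three replicas of one container.
before = {"dn1": "a1b2", "dn2": "a1b2", "dn3": "ffff"}
after = {"dn1": "a1b2", "dn2": "a1b2", "dn3": "a1b2"}

print(has_diverged(before))       # one replica reports a different checksum
print(group_by_checksum(before))  # shows which datanodes agree with each other
print(has_diverged(after))        # all replicas report the same checksum
```

A mismatch only indicates that the replicas differ. It does not by itself identify which replica holds the correct data.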
Restrictions and limitations
- Reconciliation is not supported for containers that are in the OPEN, DELETING, or DELETED states. Reconciling containers in these states fails immediately and makes no modifications. Reconciliation can be used to repair data in UNHEALTHY replicas, where it updates the replicas' data and corresponding checksums, but not their state.
- Reconciliation is currently supported for Ratis-replicated containers only. Reconciling an EC container fails immediately and makes no modifications.
- Containers with a replication factor of 1 cannot be reconciled because there are no peer replicas to compare against. Similarly, containers with a replication factor of 3 that have zero or one replica available cannot be reconciled.
- After running ozone admin container reconcile on a CLOSED RATIS/THREE container, the checksums can match, but the replica state will not change, even if it was UNHEALTHY. If the container is corrupted again later and the checksums diverge once more, you can run the ozone admin container reconcile command again, as many times as needed, until the container is repaired.
Storage Container Manager (SCM) can only use replication to automatically recover from failures. SCM will not automatically schedule reconciliation. Since replication requires at least one source replica not in the UNHEALTHY state, reconciliation will need to be run manually if all replicas are UNHEALTHY.
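The restrictions above can be summarized as a pre-check, sketched here under stated assumptions. The can_reconcile helper, its parameters, and its reason strings are hypothetical. It is not an Ozone API and only encodes the rules listed in this section.

```python
# States for which the section says reconciliation fails immediately.
UNSUPPORTED_STATES = {"OPEN", "DELETING", "DELETED"}


def can_reconcile(state, replication_type, replication_factor, available_replicas):
    """Return (eligible, reason) based on the documented restrictions."""
    if state in UNSUPPORTED_STATES:
        return False, f"containers in the {state} state are not supported"
    if replication_type != "RATIS":
        return False, "only Ratis-replicated containers are supported"
    if replication_factor == 1:
        return False, "no peer replicas to compare against"
    if available_replicas < 2:
        return False, "at least two replicas must be available"
    return True, "eligible"


# Hypothetical inputs illustrating each rule.
print(can_reconcile("CLOSED", "RATIS", 3, 3))
print(can_reconcile("OPEN", "RATIS", 3, 3))
print(can_reconcile("CLOSED", "EC", 3, 3))
print(can_reconcile("CLOSED", "RATIS", 1, 1))
print(can_reconcile("CLOSED", "RATIS", 3, 1))
```

The check takes the container state as an input rather than the health of individual replicas, because UNHEALTHY replicas are valid reconciliation targets according to the restrictions above.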
