Snapshot support in Ozone
Learn about different scenarios where you can use snapshots, the snapshot APIs that are available for use, and the snapshot architecture.
- Backup and restore
Create hourly, daily, weekly, or monthly snapshots for backup and recovery.
- Archival and compliance
Take snapshots for compliance purposes and archive them.
- Replication and disaster recovery (DR)
Snapshots provide frozen immutable images of the bucket on the source Ozone cluster. Snapshots can be used for replicating these immutable bucket images to remote DR sites.
- Incremental replication
DistCp with SnapshotDiff offers an efficient way to incrementally sync up source and destination buckets.
Snapshot APIs
The snapshot feature is available through the ozone fs and ozone sh CLIs. It can also be accessed programmatically from the Ozone ObjectStore Java client. The feature provides the following functionality:
- Create an instantaneous snapshot of a given bucket: ozone sh snapshot create [-hV] <bucket> [<snapshotName>]
- List all snapshots of a given bucket: ozone sh snapshot list [-hV] <bucket>
- Delete a specific snapshot of a given bucket: ozone sh snapshot delete [-hV] <bucket> <snapshotName>
- Given two snapshots, list all the keys that differ between them (SnapshotDiff): ozone sh snapshot diff [-chV] [-p=<pageSize>] [-t=<continuation-token>] <bucket> <fromSnapshot> <toSnapshot>
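As a rough illustration of the command syntax listed above, the following Python sketch assembles the argument lists for each ozone sh snapshot subcommand. The helper function names and the sample values (/vol1/bucket1, snap1, snap2) are hypothetical placeholders, not part of Ozone; only the subcommands and the -p and -t options come from the synopsis above.

```python
from typing import List, Optional

BASE = ["ozone", "sh", "snapshot"]


def snapshot_create_cmd(bucket: str, name: Optional[str] = None) -> List[str]:
    """Build 'ozone sh snapshot create <bucket> [<snapshotName>]'."""
    cmd = BASE + ["create", bucket]
    if name is not None:
        cmd.append(name)  # the snapshot name is optional in the synopsis
    return cmd


def snapshot_list_cmd(bucket: str) -> List[str]:
    """Build 'ozone sh snapshot list <bucket>'."""
    return BASE + ["list", bucket]


def snapshot_delete_cmd(bucket: str, name: str) -> List[str]:
    """Build 'ozone sh snapshot delete <bucket> <snapshotName>'."""
    return BASE + ["delete", bucket, name]


def snapshot_diff_cmd(
    bucket: str,
    from_snapshot: str,
    to_snapshot: str,
    page_size: Optional[int] = None,
    token: Optional[str] = None,
) -> List[str]:
    """Build 'ozone sh snapshot diff [-p=..] [-t=..] <bucket> <from> <to>'."""
    cmd = BASE + ["diff"]
    if page_size is not None:
        cmd.append(f"-p={page_size}")
    if token is not None:
        cmd.append(f"-t={token}")
    return cmd + [bucket, from_snapshot, to_snapshot]


if __name__ == "__main__":
    # Placeholder bucket and snapshot names, for illustration only.
    print(" ".join(snapshot_diff_cmd("/vol1/bucket1", "snap1", "snap2", page_size=100)))
```

On a host where the ozone CLI is installed, each returned list could be passed to subprocess.run(cmd, check=True). Building a list rather than a single string avoids shell quoting problems with bucket or snapshot names.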
The SnapshotDiff functionality in the CLI and API is asynchronous. The first time the
API is invoked, Ozone Manager (OM) starts a background thread to calculate the
SnapshotDiff and returns Retry with a suggested duration for the retry operation.
After the SnapshotDiff is computed, the API returns the differences in multiple pages.
With each SnapshotDiff response, OM also returns a continuation token that the client
uses to continue from the last batch of SnapshotDiff results. The API is safe to call
multiple times for a given pair of source and destination snapshots. Internally, each
OM computes the SnapshotDiff only once and stores it for future invocations of the
same SnapshotDiff API.
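The retry-then-paginate flow described above can be sketched from the client's side as follows. This is a minimal simulation under stated assumptions, not the Ozone client API: FakeOM, DiffResponse, the RETRY and DONE status strings, and the numeric continuation tokens are all hypothetical stand-ins that only mimic the behavior described in the text.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class DiffResponse:
    status: str                       # "RETRY" or "DONE" (hypothetical labels)
    retry_after_s: float = 0.0        # suggested wait before retrying
    entries: List[str] = field(default_factory=list)
    next_token: Optional[str] = None  # continuation token; None on the last page


class FakeOM:
    """In-memory stand-in for OM: answers RETRY until the diff is 'computed'."""

    def __init__(self, diff_entries: List[str], polls_until_ready: int = 2):
        self._entries = list(diff_entries)
        self._polls_left = polls_until_ready
        self._cache: Dict[Tuple[str, str], List[str]] = {}
        self.compute_count = 0

    def snapshot_diff(self, from_snap: str, to_snap: str,
                      token: Optional[str] = None, page_size: int = 2) -> DiffResponse:
        pair = (from_snap, to_snap)
        if pair not in self._cache:
            if self._polls_left > 0:
                # Simulates the background computation still running.
                self._polls_left -= 1
                return DiffResponse("RETRY", retry_after_s=0.01)
            # Computed once, then stored for later invocations.
            self._cache[pair] = self._entries
            self.compute_count += 1
        all_entries = self._cache[pair]
        start = int(token) if token is not None else 0
        page = all_entries[start:start + page_size]
        end = start + len(page)
        next_token = str(end) if end < len(all_entries) else None
        return DiffResponse("DONE", entries=page, next_token=next_token)


def fetch_full_diff(om: FakeOM, from_snap: str, to_snap: str, page_size: int = 2,
                    sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Poll until the diff is ready, then follow continuation tokens to the end."""
    results: List[str] = []
    token: Optional[str] = None
    while True:
        resp = om.snapshot_diff(from_snap, to_snap, token, page_size)
        if resp.status == "RETRY":
            sleep(resp.retry_after_s)  # wait for the suggested duration
            continue
        results.extend(resp.entries)
        if resp.next_token is None:
            return results
        token = resp.next_token
```

Calling fetch_full_diff a second time for the same snapshot pair returns the stored result without recomputation, which mirrors the statement that the API is safe to call repeatedly.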
Snapshot architecture
The Ozone snapshot architecture leverages the fact that data blocks, once written, remain
immutable for their lifetime. These data blocks are reclaimed only when the object key metadata
that references them is deleted from the Ozone namespace. All Ozone metadata is stored on
the OM nodes in the Ozone cluster. When you take a snapshot of an Ozone bucket, the system
internally takes a snapshot of the Ozone metadata on the OM nodes. Because Ozone does not allow
updates to DataNode blocks, the integrity of the data blocks referenced by the metadata snapshot
remains intact. The Ozone key deletion service is also aware of snapshots: it does not reclaim
a key as long as the key is referenced by the active object store bucket or any of its
snapshots. When snapshots are deleted, a background garbage collection service reclaims any key
that is not part of any snapshot or the active object store.
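The reclamation rule described above can be illustrated with a small in-memory model. This is a toy sketch, assuming a bucket is just a mapping from key names to block IDs and a snapshot is a frozen copy of that mapping; the Bucket class and its method names are hypothetical and do not reflect how OM stores metadata.

```python
from typing import Dict, Set


class Bucket:
    """Toy bucket: metadata maps key names to immutable block IDs."""

    def __init__(self) -> None:
        self.active: Dict[str, str] = {}                # live key -> block ID
        self.snapshots: Dict[str, Dict[str, str]] = {}  # name -> frozen metadata
        self.pending_delete: Set[str] = set()           # blocks awaiting reclamation

    def put(self, key: str, block_id: str) -> None:
        # Blocks are never updated in place; an overwrite points the key
        # at a new block and queues the old one for reclamation.
        old = self.active.get(key)
        self.active[key] = block_id
        if old is not None and old != block_id:
            self.pending_delete.add(old)

    def delete(self, key: str) -> None:
        self.pending_delete.add(self.active.pop(key))

    def create_snapshot(self, name: str) -> None:
        # Only metadata is copied; the data blocks themselves are shared.
        self.snapshots[name] = dict(self.active)

    def delete_snapshot(self, name: str) -> None:
        del self.snapshots[name]

    def referenced_blocks(self) -> Set[str]:
        refs = set(self.active.values())
        for snap in self.snapshots.values():
            refs.update(snap.values())
        return refs

    def run_gc(self) -> Set[str]:
        """Reclaim queued blocks that no snapshot or live key references."""
        reclaimable = self.pending_delete - self.referenced_blocks()
        self.pending_delete -= reclaimable
        return reclaimable
```

In this model, deleting a key that a snapshot still references leaves its block queued but unreclaimed; the block is returned by run_gc only after the last snapshot referencing it is deleted.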
Ozone also provides the SnapshotDiff API. Whenever a user issues a SnapshotDiff between
two snapshots, Ozone efficiently calculates all the keys that differ between the two
snapshots and returns a paginated list of SnapshotDiff results.
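To make the shape of a SnapshotDiff result concrete, the sketch below compares two snapshots modeled as plain dictionaries and splits the result into pages. It is a naive full comparison for illustration only and is not the efficient method Ozone uses internally; the "+", "-", and "M" labels, the function names, and the dictionary model are assumptions.

```python
from typing import Dict, List, Tuple

DiffEntry = Tuple[str, str]  # (change type, key name)


def snapshot_diff(from_snap: Dict[str, str], to_snap: Dict[str, str]) -> List[DiffEntry]:
    """Naively list keys added (+), deleted (-), or modified (M) between snapshots."""
    diff: List[DiffEntry] = []
    for key in sorted(from_snap.keys() | to_snap.keys()):
        if key not in from_snap:
            diff.append(("+", key))
        elif key not in to_snap:
            diff.append(("-", key))
        elif from_snap[key] != to_snap[key]:
            diff.append(("M", key))
    return diff


def paginate(entries: List[DiffEntry], page_size: int) -> List[List[DiffEntry]]:
    """Split the diff into fixed-size pages, as a paginated API would return them."""
    return [entries[i:i + page_size] for i in range(0, len(entries), page_size)]


if __name__ == "__main__":
    snap1 = {"a": "v1", "b": "v1", "c": "v1"}
    snap2 = {"a": "v1", "b": "v2", "d": "v1"}
    for page in paginate(snapshot_diff(snap1, snap2), page_size=2):
        print(page)
```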