Ozone architecture

Ozone separates namespace management from storage management, which helps it scale effectively. The Ozone Manager (OM) manages the namespaces, while the Storage Container Manager (SCM) manages the storage containers.

The following diagram shows the basic architecture of Ozone:
Blocks
Blocks are the basic unit of storage. In Ozone, each block is 256 MB in size. A collection of blocks forms a storage container. SCM allocates blocks inside storage containers for the client to store data.
Storage Containers
A storage container is a group of unrelated blocks managed together as a single entity. A container resides on a DataNode and is the basic unit of replication. Its capacity ranges from 2 GB to 16 GB.
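The block and container sizes above lend themselves to a quick back-of-the-envelope calculation. The following is a minimal sketch, assuming that "MB" and "GB" mean binary units (MiB and GiB) and that every block is filled to the full 256 MB; the function names are hypothetical and are not part of any Ozone API.

```python
# Sizes in bytes. Treating MB/GB as binary units is an assumption of this sketch.
MIB = 1024 * 1024
GIB = 1024 * MIB

BLOCK_SIZE = 256 * MIB          # block size stated in the text
MIN_CONTAINER_SIZE = 2 * GIB    # lower end of the stated container capacity
MAX_CONTAINER_SIZE = 16 * GIB   # upper end of the stated container capacity


def blocks_needed(key_size_bytes: int) -> int:
    """Number of blocks required to hold a key of the given size."""
    if key_size_bytes < 0:
        raise ValueError("key size must be non-negative")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-key_size_bytes // BLOCK_SIZE)


def full_blocks_per_container(container_size_bytes: int) -> int:
    """Number of completely filled blocks that fit in one container."""
    return container_size_bytes // BLOCK_SIZE
```

Under these assumptions, a 1 GB key needs 4 blocks, and a container holds between 8 and 64 full blocks depending on its capacity.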
Ozone Manager
The Ozone Manager (OM) is the metadata manager for Ozone. OM manages the following storage elements:
  • The list of volumes for each user
  • The list of buckets for each volume
  • The list of keys for each bucket
OM also handles metadata operations from client applications. Clients request keys (file names) to perform read and write operations. OM maintains the mappings between keys and their corresponding block IDs. OM also obtains from SCM the information about the blocks relevant to each read or write operation, and passes this information to the client.
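The metadata that OM keeps can be pictured as a set of nested lookups: users to volumes, volumes to buckets, buckets to keys, and keys to block IDs. The following is a minimal in-memory sketch of that hierarchy. The class `ToyOzoneManager`, its methods, and the block ID strings are all hypothetical illustrations; they do not reflect the real OM interfaces or its storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyOzoneManager:
    """Hypothetical stand-in for OM metadata; not the real OM API."""

    # user -> volumes owned by that user
    volumes_by_user: Dict[str, List[str]] = field(default_factory=dict)
    # volume -> buckets in that volume
    buckets_by_volume: Dict[str, List[str]] = field(default_factory=dict)
    # (volume, bucket) -> {key name -> ordered list of block IDs}
    keys_by_bucket: Dict[Tuple[str, str], Dict[str, List[str]]] = field(
        default_factory=dict
    )

    def create_volume(self, user: str, volume: str) -> None:
        self.volumes_by_user.setdefault(user, []).append(volume)
        self.buckets_by_volume.setdefault(volume, [])

    def create_bucket(self, volume: str, bucket: str) -> None:
        if volume not in self.buckets_by_volume:
            raise KeyError(f"unknown volume: {volume}")
        self.buckets_by_volume[volume].append(bucket)
        self.keys_by_bucket[(volume, bucket)] = {}

    def put_key(self, volume: str, bucket: str, key: str,
                block_ids: List[str]) -> None:
        # Record which blocks hold the key's data, in order.
        self.keys_by_bucket[(volume, bucket)][key] = list(block_ids)

    def list_keys(self, volume: str, bucket: str) -> List[str]:
        return sorted(self.keys_by_bucket[(volume, bucket)])

    def lookup_key(self, volume: str, bucket: str, key: str) -> List[str]:
        # A read resolves the key name to the block IDs the client must fetch.
        return list(self.keys_by_bucket[(volume, bucket)][key])
```

In this sketch, a write would end with `put_key("vol1", "logs", "app.log", ["blk-1", "blk-2"])`, and a later read would call `lookup_key("vol1", "logs", "app.log")` to learn which blocks to fetch. In the architecture described above, the block information itself comes from SCM; the sketch simply takes the IDs as input.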
Storage Container Manager
Ozone is built on a highly available, replicated block storage layer called Hadoop Distributed Data Store (HDDS). The Storage Container Manager (SCM) is the container manager of HDDS. SCM manages the DataNodes and allocates storage containers and blocks that are replicated through pipelines.
A storage container is a collection of blocks. SCM manages these block collections and ensures that the blocks maintain the required level of replication. SCM also manages the addition and removal of DataNodes, which host the storage containers. In addition, SCM executes recovery actions when DataNodes or disks fail.
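The replication bookkeeping described above can be illustrated with a small sketch. It assumes that a DataNode is treated as failed once its last heartbeat is older than a timeout, and that every container should have three live replicas. Both values, along with the function names, are assumptions made for illustration and are not taken from Ozone's configuration or code.

```python
from typing import Dict, List, Set

REQUIRED_REPLICAS = 3        # assumed replication target for this sketch
HEARTBEAT_TIMEOUT_S = 60.0   # hypothetical timeout, not an Ozone default


def live_datanodes(last_heartbeat: Dict[str, float], now: float) -> Set[str]:
    """DataNodes whose most recent heartbeat falls within the timeout."""
    return {
        dn for dn, ts in last_heartbeat.items()
        if now - ts <= HEARTBEAT_TIMEOUT_S
    }


def under_replicated(replicas: Dict[str, List[str]],
                     live: Set[str]) -> Dict[str, int]:
    """Map each under-replicated container ID to its number of missing replicas."""
    missing: Dict[str, int] = {}
    for container_id, datanodes in replicas.items():
        # Count only replicas that sit on DataNodes still considered live.
        live_count = sum(1 for dn in set(datanodes) if dn in live)
        if live_count < REQUIRED_REPLICAS:
            missing[container_id] = REQUIRED_REPLICAS - live_count
    return missing
```

A recovery action would then copy each flagged container to additional live DataNodes until the target is met again; that step is omitted here.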
SCM allocates blocks to clients through OM for read and write operations. SCM provides the following abstractions:
  • Pipelines determine the replication strategy for the blocks associated with a write operation.
  • DataNodes contain storage containers, which in turn hold data blocks. SCM monitors DataNodes through heartbeats.
Recon Server
Recon is the management interface for Ozone and provides a unified management API.