Calculating Ozone heap sizing

Learn about the heuristic approach to configuring heap memory for Apache Ozone.

Unlike HDFS, which scales heap linearly with the number of blocks or files, Ozone utilizes RocksDB for metadata persistence and a container-based replication mechanism through the Storage Container Manager (SCM).

This architectural shift moves the bottleneck from raw data volume to workload intensity, allowing the cluster to scale beyond 1 billion objects. The following guidelines provide calculation formulas for the Ozone Manager (OM), SCM, and DataNodes, combining a static base heap with dynamic multipliers based on peak operational metrics.

  • Ozone Manager Heap
    OM_heap = 8 GB + (write_ops_peak × 2 KB) + (delete_ops_peak × 2 KB) + (read_ops_peak × 200 B) + 2 GB (listing overhead)

    where,

    • write_ops_peak = peak value of the om_metrics_num_key_allocate metric in Prometheus
    • delete_ops_peak = peak value of the om_metrics_num_key_deletes metric in Prometheus
    • read_ops_peak = peak value of the om_metrics_num_key_lookup metric in Prometheus
  • Storage Container Manager Heap
    SCM_heap = 6 GB + (number_of_containers × 1.4 KB) + (peak_delete_transactions × 500 B)

    where,

    • number_of_containers = Total number of containers in a cluster
    • peak_delete_transactions = peak value of the scm_block_deleting_service_num_processed_transactions metric in Prometheus
  • DataNode Heap
    DN_heap = 4 GB + (active_pipelines × 150 MB) + (containers_on_node × 200 B)

    where,

    • active_pipelines = Number of pipelines active on the DataNode; obtain it with the ozone admin pipeline list | grep --count $(hostname) command
    • containers_on_node = Total container count from the Recon Web UI divided by the number of DataNodes in the cluster
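The three formulas above can be sketched as a small calculator. This is a minimal illustration, not a supported tool; the metric values passed in the example are hypothetical placeholders, and you should substitute your own peak values from Prometheus, the Recon Web UI, and the ozone admin CLI.

```python
# Heuristic heap-size calculator for the OM, SCM, and DataNode formulas above.
# Uses binary units (1 KB = 1024 B), an assumption on my part; adjust if your
# monitoring reports decimal units.

KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def om_heap(write_ops_peak, delete_ops_peak, read_ops_peak):
    """OM_heap = 8 GB base + writes x 2 KB + deletes x 2 KB + reads x 200 B + 2 GB listing overhead."""
    return (8 * GB
            + write_ops_peak * 2 * KB
            + delete_ops_peak * 2 * KB
            + read_ops_peak * 200
            + 2 * GB)

def scm_heap(number_of_containers, peak_delete_transactions):
    """SCM_heap = 6 GB base + containers x 1.4 KB + delete transactions x 500 B."""
    return 6 * GB + number_of_containers * 1.4 * KB + peak_delete_transactions * 500

def dn_heap(active_pipelines, containers_on_node):
    """DN_heap = 4 GB base + pipelines x 150 MB + containers x 200 B."""
    return 4 * GB + active_pipelines * 150 * MB + containers_on_node * 200

if __name__ == "__main__":
    # Placeholder peaks: 1M writes, 1M deletes, 5M reads; 2M containers;
    # 500K delete transactions; 6 pipelines and 200K containers per DataNode.
    print(f"OM heap:  {om_heap(1_000_000, 1_000_000, 5_000_000) / GB:.1f} GB")
    print(f"SCM heap: {scm_heap(2_000_000, 500_000) / GB:.1f} GB")
    print(f"DN heap:  {dn_heap(6, 200_000) / GB:.1f} GB")
```

Round the results up to a whole number of gigabytes when setting the JVM `-Xmx` value for each role.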