Hierarchical metastore event processing

Hierarchical metastore event processor uses a multi-layered approach to improve synchronization speed and handle event dependencies.

In Cloudera Data Warehouse Runtime 2025.0.21.0 and higher versions, the metastore event processor supports a multi-layered approach to improve synchronization speed and handle event dependencies.

By default, the metastore event processor is single-threaded. Notification events are processed sequentially with a maximum limit of 1000 events fetched and processed in a single batch. Multiple locks address concurrency issues that can arise when catalog DDL operation processing and metastore event processing try to access or update catalog objects at the same time.

Waiting for a lock or file metadata loading of a table can slow event processing and affect subsequent events, even if those events are not dependent on the previous one. This results in a long synchronization time for Hive Metastore (HMS) events.

Multi-level event processing

You can convert existing metastore event processing into multi-level event processing by using the enable_hierarchical_event_processing flag. This method partitions events based on their dependency, maintains the order of events within that dependency, and processes them independently when possible.

The hierarchical model consists of the following layers:

  • Event dispatcher – Dispatches metastore events and maintains a fixed pool of database event executors.
  • Database event executor – Manages table event dispatching, processes database events, and manages table event executors.
  • Table event executor – Processes events for multiple tables by using table processors.

Multi-level event processing ensures linearizability for a specific table by processing all table-specific events in the order they occurred.

Event synchronization and ordering

To maintain metadata consistency, certain events act as synchronization barriers and are not processed in parallel:

  • A database synchronization barrier restricts the table processors of the database from processing events that occur after the database event, until the database event is fully processed.
  • A rename synchronization barrier wraps an ALTER TABLE rename event. It synchronizes the source and target table processors to ensure the source processor removes the table before the target processor creates the renamed table.
  • A transaction synchronization barrier is used when the system creates pseudo-commit and pseudo-abort transaction events for each table involved, because COMMIT_TXN and ABORT_TXN can involve multiple tables. These pseudo events are then processed independently at their respective table processors.