HBase metadata collection

Atlas can collect metadata from HBase that describes the data assets HBase manages.

An Atlas hook runs in each HBase instance. This hook sends metadata to Atlas for HBase data assets. HBase namespaces, tables, columns, and column families are represented by entities in Atlas.

  1. When an action occurs in the HBase instance...
  2. The corresponding Atlas hook collects information for the action into metadata entities.
  3. The hook publishes the metadata on a Kafka topic.
  4. Atlas reads the message from the topic and determines which information creates new entities and which information updates existing entities.
  5. Atlas creates and updates the appropriate entities.
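
The flow in the steps above can be sketched as a small in-memory simulation. This is an illustration only, not the Atlas hook implementation: a `Queue` stands in for the Kafka topic, a plain dictionary stands in for the Atlas entity store, and the function names `hook_publish` and `atlas_consume`, the message layout, and the sample qualified name are all assumptions made for the example.

```python
import json
from queue import Queue

# Stand-in for the Kafka topic (assumption: a real deployment uses Kafka).
topic = Queue()

def hook_publish(type_name, qualified_name, attributes):
    """Hook side: wrap one action's metadata as an entity message and publish it."""
    message = {
        "typeName": type_name,
        "attributes": {"qualifiedName": qualified_name, **attributes},
    }
    topic.put(json.dumps(message))

def atlas_consume(store):
    """Atlas side: drain the topic, creating or updating entities.

    Entities are matched on (typeName, qualifiedName); this matching rule
    is a simplification chosen for the sketch.
    """
    created, updated = [], []
    while not topic.empty():
        entity = json.loads(topic.get())
        key = (entity["typeName"], entity["attributes"]["qualifiedName"])
        if key in store:
            # Known entity: merge the new attribute values into it.
            store[key]["attributes"].update(entity["attributes"])
            updated.append(key)
        else:
            # Unknown entity: create it.
            store[key] = entity
            created.append(key)
    return created, updated
```

Publishing two messages for the same hypothetical table, for example a create followed by an alter, yields one created entity and one update to it when `atlas_consume` runs.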

The Atlas bridge for HBase pulls the same metadata as the hook, but instead of sending the metadata through Kafka, it passes the messages in bulk in an API call. The bridge creates entities in Atlas for all of the existing HBase namespaces, tables, columns, and column families.
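
As a rough sketch of the bulk approach, the following builds a single request body covering every existing namespace, table, and column family, rather than one message per action. It is not the bridge's actual code: the function names, the input dictionary shape, the qualified-name patterns, and the `{"entities": [...]}` body layout are assumptions for illustration, and the example only constructs the payload without contacting Atlas.

```python
def make_entity(type_name, qualified_name, name):
    """Build one entity record (layout assumed for this sketch)."""
    return {
        "typeName": type_name,
        "attributes": {"qualifiedName": qualified_name, "name": name},
    }

def build_bulk_payload(cluster, namespaces):
    """Collect all existing HBase assets into one request body.

    `namespaces` maps namespace -> {table -> [column family, ...]}.
    """
    entities = []
    for ns, tables in namespaces.items():
        entities.append(make_entity("hbase_namespace", f"{ns}@{cluster}", ns))
        for table, families in tables.items():
            entities.append(
                make_entity("hbase_table", f"{ns}:{table}@{cluster}", table)
            )
            for cf in families:
                entities.append(
                    make_entity(
                        "hbase_column_family",
                        f"{ns}:{table}.{cf}@{cluster}",
                        cf,
                    )
                )
    return {"entities": entities}
```

For a hypothetical cluster `cl1` with one table `orders` in the `default` namespace and two column families, `build_bulk_payload("cl1", {"default": {"orders": ["cf1", "cf2"]}})` returns a body with four entities that could then be submitted in one API call.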