Kafka metadata collection

Atlas can collect metadata from Kafka using the concept called metadata namespace.

When the Kafka cluster is configured to audit using Atlas, the Kafka brokers start notifying Atlas about metadata changes in the Kafka cluster. Clients do not have to be integrated with Atlas.

Atlas is used to track the metadata of multiple clusters. Metadata namespace in Atlas is used to categorise entities into groups. To avoid entity collision between clusters, entities belonging to different clusters can be grouped together, into separate metadata namespaces.

An Atlas hook runs in each Kafka instance. This hook transfers the metadata of Kafka assets to Atlas. The Kafka-Atlas auditing involves two metadata namespaces, one each for topics and clients.

The topic entities are created in the topic metadata namespace. Producer, Consumer and ConsumerGroup entities are created in the client metadata namespace.

For a simple use case, as a default configuration, the auditor can be configured to use the same namespace for both types.

Client metadata namespace can be used to insert entities representing an application spanning multiple Kafka clusters in the same namespace.

For example, if an application is connecting to a couple of Kafka clusters, and they have separate namespaces, the client entities are created twice, because each Kafka cluster will create them in its own client namespace. If Kafka clusters use the same client namespace, they will create and update the same client entity, so the application will be represented by a single producer and/or consumer entity.