Configuring the Atlas hook in Kafka
You can enable and configure an Atlas hook in Kafka that collects and transfers metadata to Atlas. Enabling the hook lets you manage, govern, and monitor Kafka metadata and metadata lineage in Atlas.
The following instructions walk you through enabling the Atlas hook and configuring the topic and client metadata namespaces used by the hook. The namespaces you configure in Kafka for the hook are used in Atlas to group and represent Kafka entities. Once the hook is enabled and configured, Kafka metadata becomes available in Atlas.
Enabling and configuring the hook only imports and exposes newly created Kafka topics. Existing topics are not imported automatically. If you want access to existing topics, you must import them into Atlas manually with the import-kafka.sh tool.
- Ensure that an Atlas service is deployed on the Kafka cluster or the data context cluster.
- Optional: Import existing Kafka topics into Atlas with the import-kafka.sh tool.
- In Cloudera Manager, select the Kafka service.
- Go to Configuration.
- Find and enable the Enable Auditing to Atlas property.
The property enables or disables the Atlas hook in Kafka.
- Specify the topic and client metadata namespaces.
The topic and client metadata namespaces are configured with the Atlas metadata namespace for Kafka Topics and Atlas metadata namespace for Kafka Clients properties. How you configure these properties depends on your environment and use case. Cloudera recommends that you follow these guidelines when configuring namespaces:
- Use the same topic namespace you used with the import-kafka.sh tool if you manually imported topics.
- Use identical client and topic namespaces if only a single Kafka cluster is audited.
In this case, you can also consider using the default namespace which is
- Use unique topic namespaces in environments where multiple Kafka clusters are audited by Atlas.
This is a recommended practice that prevents topic entities from different clusters from colliding.
- Configure client namespaces based on your use case.
Unlike topic namespaces, client namespaces do not have to be unique even if there are multiple Kafka clusters in your environment. For example, if there is an application communicating with multiple Kafka clusters, and it is using the same client.id, the client metadata namespace can be set to the same value for all Kafka clusters. This way, the application is represented as a single producer or consumer entity in Atlas.
- Click Save Changes.
- Restart the Kafka service.
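
The namespace guidelines in the steps above can be illustrated with a small sketch. It is not part of the hook or of Atlas. It only models entity identity under the assumption that an entity is identified by a qualified name of the form name@namespace, which is how Atlas commonly identifies kafka_topic entities. The cluster names, topic names, client.id, and namespace values are all hypothetical.

```python
from collections import defaultdict


def qualified_name(name: str, namespace: str) -> str:
    """Build a qualified name in the assumed form name@namespace."""
    return f"{name}@{namespace}"


def group_entities(records):
    """Group (cluster, name, namespace) records by qualified name.

    Records that share a qualified name would be represented
    by a single entity in this simplified model.
    """
    entities = defaultdict(set)
    for cluster, name, namespace in records:
        entities[qualified_name(name, namespace)].add(cluster)
    return dict(entities)


# Two clusters each have an "orders" topic and share one topic namespace:
# both topics map to the same qualified name, so they collide.
shared_ns = group_entities([
    ("cluster-a", "orders", "shared"),
    ("cluster-b", "orders", "shared"),
])

# The same topics with a unique topic namespace per cluster stay distinct.
unique_ns = group_entities([
    ("cluster-a", "orders", "kafka-a"),
    ("cluster-b", "orders", "kafka-b"),
])

# One application with the same client.id on both clusters and a common
# client namespace is intentionally represented as a single entity.
same_client_ns = group_entities([
    ("cluster-a", "billing-app", "apps"),
    ("cluster-b", "billing-app", "apps"),
])

print(sorted(shared_ns))       # one qualified name for two clusters
print(sorted(unique_ns))       # one qualified name per cluster
print(sorted(same_client_ns))  # one qualified name for the application
```

In this model, sharing a topic namespace across clusters merges unrelated topics into one entity, while sharing a client namespace merges the records of one application, which is the outcome the client namespace guideline describes.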