An Atlas hook collects metadata from Kafka and transfers it to Atlas so that you can
manage, govern, and monitor Kafka metadata and metadata lineage in the Atlas UI. You can enable
and configure an Atlas hook in Kafka.
The following instructions walk you through how you enable the Atlas hook, and how
you configure the topic and client metadata namespaces used by the hook. The namespaces you
configure for the hook in Kafka, are used in Atlas, to group and represent Kafka entities.
Once the hook is set up and configured, Kafka metadata will be available in Atlas.
Enabling and configuring the hook only imports and exposes newly created Kafka topics.
Existing topics are not imported automatically. To access existing topics, you must use the
kafka-import.sh
tool and import them manually into Atlas.
- Ensure that an Atlas service is deployed on the Kafka cluster or the data context
cluster.
- Import existing Kafka topics into
Atlas.
Existing topics can be imported using either the Import Kafka Topics
Into Atlas action in Cloudera Manager or the
import-kafka.sh
tool. For more information regarding the
Import Kafka Topics Into Atlas action, see
Importing Kafka entities into Atlas. For more
information regrading the usage of the import-kafka.sh
tool, see
Setting up Atlas Kafka import tool.
- In Cloudera Manager, select the Kafka service.
- Go to Configuration.
- Find and enable the Enable Auditing to Atlas property.
This property enables or disables the Atlas hook in Kafka.
- Specify the topic and client metadata namespaces.
The topic and client
metadata namespaces are configured with the
Atlas metadata namespace for Kafka
Topics and
Atlas metadata namespace for Kafka Clients
properties. How you configure these properties depends on your environment and
use case. Cloudera recommends that you to follow these guidelines when configuring namespaces:
- Use the same topic namespace you used with the
kafka-topics.sh
tool
if you manually imported topics.
- Use identical client and topics namespaces if only a single Kafka cluster is audited
by Atlas.
In this case, you can also consider using the default namespace, which is
cm
.
- Use unique topic namespaces in environments where there are multiple Kafka clusters
audited by Atlas.
This is a recommended practice to avoid the conflict of topic
entities.
- Configure client namespaces based on your use case.
Unlike topic namespaces,
client namespaces do not have to be unique even if there are multiple Kafka clusters
in your environment. For example, if there is an application communicating with
multiple Kafka clusters, and it is using the same client.id, the client metadata
namespace can be set to the same value for all Kafka clusters. This way, the
application is represented as a single producer or consumer entity in Atlas.
- Click Save Changes.
- Restart the Kafka service.
The Atlas hook in Kafka is enabled and configured. Kafka topic, consumer, producer,
and consumer group metadata can be monitored using the Atlas UI.
Learn more about the Atlas hook as well as Kafka metadata collection. For more
information see Kafka metadata collection.