Configuring the Atlas hook in Kafka

An Atlas hook collects metadata from Kafka and transfers it to Atlas so that you can manage, govern, and monitor Kafka metadata and metadata lineage in the Atlas UI. You can enable and configure an Atlas hook in Kafka.

The following instructions walk you through how you enable the Atlas hook, and how you configure the topic and client metadata namespaces used by the hook. The namespaces you configure for the hook in Kafka, are used in Atlas, to group and represent Kafka entities. Once the hook is set up and configured, Kafka metadata will be available in Atlas.

Enabling and configuring the hook only imports and exposes newly created Kafka topics. Existing topics are not imported automatically. To access existing topics, you must use the kafka-import.sh tool and import them manually into Atlas.

  • Ensure that an Atlas service is deployed on the Kafka cluster or the data context cluster.
  • Import existing Kafka topics into Atlas.

    Existing topics can be imported using either the Import Kafka Topics Into Atlas action in Cloudera Manager or the import-kafka.sh tool. For more information regarding the Import Kafka Topics Into Atlas action, see Importing Kafka entities into Atlas. For more information regrading the usage of the import-kafka.sh tool, see Setting up Atlas Kafka import tool.

  1. In Cloudera Manager, select the Kafka service.
  2. Go to Configuration.
  3. Find and enable the Enable Auditing to Atlas property.
    This property enables or disables the Atlas hook in Kafka.
  4. Specify the topic and client metadata namespaces.
    The topic and client metadata namespaces are configured with the Atlas metadata namespace for Kafka Topics and Atlas metadata namespace for Kafka Clients properties. How you configure these properties depends on your environment and use case. Cloudera recommends that you to follow these guidelines when configuring namespaces:
    • Use the same topic namespace you used with the kafka-topics.sh tool if you manually imported topics.
    • Use identical client and topics namespaces if only a single Kafka cluster is audited by Atlas.

      In this case, you can also consider using the default namespace, which is cm.

    • Use unique topic namespaces in environments where there are multiple Kafka clusters audited by Atlas.

      This is a recommended practice to avoid the conflict of topic entities.

    • Configure client namespaces based on your use case.

      Unlike topic namespaces, client namespaces do not have to be unique even if there are multiple Kafka clusters in your environment. For example, if there is an application communicating with multiple Kafka clusters, and it is using the same client.id, the client metadata namespace can be set to the same value for all Kafka clusters. This way, the application is represented as a single producer or consumer entity in Atlas.

  5. Click Save Changes.
  6. Restart the Kafka service.
The Atlas hook in Kafka is enabled and configured. Kafka topic, consumer, producer, and consumer group metadata can be monitored using the Atlas UI.
Learn more about the Atlas hook as well as Kafka metadata collection. For more information see Kafka metadata collection.