Setting up the Atlas-Kafka import tool
Configure the Atlas-Kafka import tool to map Kafka topics to Apache Atlas entities by managing client configurations and Ranger security policies.
First set up the tool, then run it manually to create or update the Kafka
topic entities in Atlas. The tool reads the deployed client configuration to obtain the settings it needs,
such as the Atlas endpoint and ZooKeeper.
The Kafka type definition in Atlas contains multiple fields. The import tool fills out the following field values:
- clusterName: Contains the value provided by the atlas.metadata.namespace property in the application configuration file. The default value is cm.
- topic, name, description, URI: Contain the topic name.
- qualifiedName: Contains the topic name and the clusterName, joined by @. For example, my-topic@cm. This serves as the unique identifier for topics.
- partitionCount: Displays the number of partitions of the topic.
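The qualifiedName rule described above can be illustrated with a small shell helper. This is only a sketch of the naming convention, not part of the import tool: the function name kafka_topic_qualified_name is hypothetical, and its second argument stands in for the value of atlas.metadata.namespace.

```shell
# Build the qualifiedName for a Kafka topic entity: <topic>@<clusterName>.
# The second argument stands in for atlas.metadata.namespace and falls back
# to the documented default "cm" when it is omitted.
# (Hypothetical helper for illustration; the import tool does this itself.)
kafka_topic_qualified_name() {
  local topic="$1"
  local cluster_name="${2:-cm}"
  printf '%s@%s\n' "$topic" "$cluster_name"
}

kafka_topic_qualified_name my-topic       # prints my-topic@cm
kafka_topic_qualified_name orders prod    # prints orders@prod
```

Because the cluster name is part of the identifier, the same topic name imported under two different namespaces yields two distinct entities.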
To set up the Atlas-Kafka import tool, follow these steps:
- Select the Atlas service in Cloudera Manager.
- Deploy the client configuration: select Actions > Deploy Client Configuration.
- If Apache Ranger is enabled, create a Ranger policy in cm_atlas that allows the user running the tool to create, update, delete, and read Kafka entities:
  - Log in to Ranger.
  - Navigate to the cm_atlas policies.
  - Select the policy that has the create_entity and delete_entity permissions.
  - Add the Kerberos user that is used with the import-kafka.sh tool.
- SSH into a host where the client configuration is deployed.
- If Kerberos is used, run kinit as the Kafka user to obtain a Ticket Granting Ticket (TGT).
- Set JAVA_HOME.
- Run the command /opt/cloudera/parcels/CDH/lib/atlas/hook-bin/import-kafka.sh to import Kafka topics into Atlas.
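The last three steps (kinit, JAVA_HOME, running the script) could be wrapped in a small shell function like the one below. Treat it as a sketch under assumptions: the function names, the DRY_RUN switch, the keytab-based kinit call, and the /usr/java/default fallback are illustrative and do not come from the tool. Only the import-kafka.sh path is taken from the steps above.

```shell
# Sketch of the manual run on a host where the client configuration is deployed.
# Hypothetical pieces: run_step, import_kafka_topics, DRY_RUN, the keytab and
# principal arguments, and the JAVA_HOME fallback path.
IMPORT_SCRIPT=/opt/cloudera/parcels/CDH/lib/atlas/hook-bin/import-kafka.sh

# Run a command, or only print it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

import_kafka_topics() {
  local keytab="${1:-}"
  local principal="${2:-}"

  # Kerberized clusters only: obtain a TGT from a keytab before the import.
  if [ -n "$keytab" ] && [ -n "$principal" ]; then
    run_step kinit -kt "$keytab" "$principal" || return 1
  fi

  # Keep an existing JAVA_HOME; otherwise use a placeholder path (assumption).
  export JAVA_HOME="${JAVA_HOME:-/usr/java/default}"

  # Import the Kafka topics into Atlas.
  run_step "$IMPORT_SCRIPT"
}
```

To preview the commands without touching the cluster, call the function with DRY_RUN=1, for example `DRY_RUN=1 import_kafka_topics /path/to/kafka.keytab kafka@EXAMPLE.COM`, where both arguments are placeholders. On a cluster without Kerberos, call it with no arguments to skip the kinit step.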
