Installing Kafka
Kafka is distributed in a parcel that is independent of the CDH parcel and integrates with Cloudera Manager using a Custom Service Descriptor (CSD).
To install Apache Kafka:
- Download the Kafka CSD here.
- Install the CSD into Cloudera Manager as instructed in Custom Service Descriptor Files. This adds a new parcel repository to your Cloudera Manager configuration. The CSD can only be installed on parcel-deployed clusters.
- Download, distribute, and activate the Kafka parcel, following the instructions in Managing Parcels. After you activate the Kafka parcel, Cloudera Manager prompts you to restart the cluster. Click the Close button to ignore this prompt. You do not need to restart the cluster after installing Kafka.
- Add the Kafka service to your cluster, following the instructions in Adding a Service.
Cloudera strongly recommends that you deploy Kafka on dedicated hosts that are not used for other cluster roles.
Kafka Command-line Tools
- kafka-topics
Create, alter, list, and describe topics. For example:
$ /usr/bin/kafka-topics --list --zookeeper zk01.example.com:2181
sink1
t1
t2
- kafka-console-consumer
Read data from a Kafka topic and write it to standard output. For example:
$ /usr/bin/kafka-console-consumer --zookeeper zk01.example.com:2181 --topic t1
- kafka-console-producer
Read data from standard input and write it to a Kafka topic. For example:
$ /usr/bin/kafka-console-producer --broker-list kafka02.example.com:9092,kafka03.example.com:9092 --topic t1
- kafka-consumer-offset-checker
Check the number of messages read and written, as well as the lag for each consumer in a specific consumer group. For example:
$ /usr/bin/kafka-consumer-offset-checker --group flume --topic t1 --zookeeper zk01.example.com:2181
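The per-consumer lag that kafka-consumer-offset-checker reports can be totalled with a small filter. The following is a minimal sketch, not part of Kafka: the function name sum_lag is hypothetical, and it assumes the tool prints a whitespace-separated table whose header row contains a column named Lag.

```shell
# Hypothetical helper: add up the Lag column of the offset checker output.
# Assumption: the output is a whitespace-separated table with a header row
# that contains a column named "Lag". Lines before that header are ignored.
sum_lag() {
  awk '
    !col { for (i = 1; i <= NF; i++) if ($i == "Lag") col = i; next }
    $col ~ /^-?[0-9]+$/ { total += $col }   # skip rows without a numeric lag
    END { print total + 0 }
  '
}
```

For example, piping the offset checker command shown above into sum_lag prints one number for the flume group on topic t1. If the header row is never found, the function prints 0, so treat a zero result with caution.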
Logs
The Kafka parcel is configured to log all Kafka log messages to a single file, /var/log/kafka/server.log by default. You can view, filter, and search this log using Cloudera Manager.
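For debugging purposes, you can create a separate file with TRACE-level logs of a specific component (such as the controller) or of the state changes. The following log4j configuration defines a separate appender and log file for the server, state-change, request, log-cleaner, and controller logs: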
log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.kafkaAppender.File=${log.dir}/kafka_server.log
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.stateChangeAppender.File=${log.dir}/state-change.log
log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.requestAppender.File=${log.dir}/kafka-request.log
log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.cleanerAppender.File=${log.dir}/log-cleaner.log
log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.controllerAppender.File=${log.dir}/controller.log
log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Turn on all our debugging info
#log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG, kafkaAppender
#log4j.logger.kafka.client.ClientUtils=DEBUG, kafkaAppender
#log4j.logger.kafka.perf=DEBUG, kafkaAppender
#log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender
#log4j.logger.org.I0Itec.zkclient.ZkClient=DEBUG
log4j.logger.kafka=INFO, kafkaAppender

log4j.logger.kafka.network.RequestChannel$=WARN, requestAppender
log4j.additivity.kafka.network.RequestChannel$=false

#log4j.logger.kafka.network.Processor=TRACE, requestAppender
#log4j.logger.kafka.server.KafkaApis=TRACE, requestAppender
#log4j.additivity.kafka.server.KafkaApis=false
log4j.logger.kafka.request.logger=WARN, requestAppender
log4j.additivity.kafka.request.logger=false

log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.additivity.kafka.controller=false

log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
log4j.additivity.kafka.log.LogCleaner=false

log4j.logger.state.change.logger=TRACE, stateChangeAppender
log4j.additivity.state.change.logger=false
Alternatively, you can add only the appenders you need.
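One way to pick out a single appender is to filter the full configuration mechanically. The sketch below is only an illustration under stated assumptions: the function name select_appender and the file name kafka-log4j.properties are hypothetical, the configuration is saved with one setting per line, and each additivity line follows the logger line it belongs to.

```shell
# Hypothetical helper: read a full log4j snippet on stdin and keep only the
# settings for one appender ($1) plus the loggers that write to it.
# Assumptions: one setting per line; additivity lines come after their logger.
select_appender() {
  awk -v app="$1" '
    /^[ \t]*#/ || !/=/ { next }                      # skip comments and blanks
    index($0, "log4j.appender." app "=") == 1 ||
    index($0, "log4j.appender." app ".") == 1 { print; next }
    index($0, "log4j.logger.") == 1 && $0 ~ (",[ \t]*" app "[ \t]*$") {
      name = substr($0, length("log4j.logger.") + 1)
      sub(/=.*/, "", name)                           # logger name before "="
      keep[name] = 1; print; next
    }
    index($0, "log4j.additivity.") == 1 {
      name = substr($0, length("log4j.additivity.") + 1)
      sub(/=.*/, "", name)
      if (name in keep) print                        # only for kept loggers
    }
  '
}
```

For example, select_appender controllerAppender < kafka-log4j.properties keeps the controllerAppender settings together with the kafka.controller logger and additivity lines, and drops everything else. Review the result before using it.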
More Information
For more information, see the official Kafka documentation.
When working with Kafka on a cluster managed by Cloudera Manager, note the following:
- Use Cloudera Manager to start and stop Kafka and ZooKeeper services. Do not use the kafka-server-start, kafka-server-stop, zookeeper-server-start, and zookeeper-server-stop commands.
- All Kafka command-line tools are located in /opt/cloudera/parcels/KAFKA/lib/kafka/bin/.
- Set the JAVA_HOME environment variable to your JDK installation directory before using the command-line tools. For example:
export JAVA_HOME=/usr/java/jdk1.7.0_55-cloudera
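The JAVA_HOME check and the tool location from the notes above can be combined into a small shell function. This is a minimal sketch under stated assumptions: the function name kafka_env and the KAFKA_BIN variable are hypothetical, and it treats JAVA_HOME as valid only when it contains an executable bin/java.

```shell
# Hypothetical helper: verify JAVA_HOME, then put the parcel tool directory
# on PATH. KAFKA_BIN defaults to the location given in the notes above.
KAFKA_BIN=${KAFKA_BIN:-/opt/cloudera/parcels/KAFKA/lib/kafka/bin}

kafka_env() {
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "JAVA_HOME is not set to a JDK installation directory" >&2
    return 1
  fi
  case ":$PATH:" in
    *":$KAFKA_BIN:"*) ;;                       # already on PATH
    *) PATH="$KAFKA_BIN:$PATH"; export PATH ;; # prepend the tool directory
  esac
}
```

After the export shown above, calling kafka_env returns successfully and makes the tools in that directory available by name. If JAVA_HOME is missing or wrong, it prints a message to standard error and returns a non-zero status.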