Known Issues in Cloudera Distribution of Apache Kafka
The following sections describe known issues in Cloudera Distribution of Apache Kafka:
- Unsupported features
- Kafka client jars included in CDH may not match the newest Kafka parcel jar
- Source cluster not definable in Kafka 1.x
- Kafka fails when configured with Sentry and an old Kafka version
- Kafka stuck with under-replicated partitions after ZooKeeper session expires
- The Flume and Spark connectors to Kafka shipped with CDH 5.7 and higher only work with Kafka 2.0 and higher
- Only new Java clients support authentication and authorization
- Requests fail when sending to a nonexistent topic with auto.create.topics.enable set to true
- Custom Kerberos principal names must not be used for Kerberized ZooKeeper and Kafka instances
- Performance degradation when SSL is enabled
- AdminUtils is not binary-compatible between Cloudera Distribution of Apache Kafka 1.x and 2.x
- Monitoring is not supported in Cloudera Manager 5.4
- MirrorMaker does not start when Sentry is enabled
- Authenticated Kafka clients may impersonate other users
- Authenticated clients may interfere with data replication
Unsupported features
- Kafka Connect is included with Cloudera Distribution of Apache Kafka 2.0.0, but is not supported at this time.
- The Kafka default authorizer is included with Cloudera Distribution of Apache Kafka 2.0.0, but is not supported at this time. This includes setting ACLs and all related APIs, broker functionality, and command-line tools.
Kafka client jars included in CDH may not match the newest Kafka parcel jar
The Kafka client jars included in CDH may not match the newest Kafka parcel jar that is released. This is done to maintain compatibility across CDH 5.7 and higher for integrations such as Spark and Flume.
Source cluster not definable in Kafka 1.x
In Kafka 1.x, the source cluster is assumed to be the cluster that MirrorMaker is running on. In Kafka 2.0, you can define a custom source and target cluster.
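For illustration, a minimal MirrorMaker invocation under Kafka 2.0 might look like the following sketch, where the property file names, host names, and topic whitelist are all placeholders:

kafka-mirror-maker --consumer.config source-cluster.properties --producer.config target-cluster.properties --whitelist "example-topic"

Here source-cluster.properties points the consumer at the source cluster (for example, bootstrap.servers=source-broker:9092 for the new consumer, or zookeeper.connect=source-zk:2181 for the old consumer), and target-cluster.properties points the producer at the target cluster.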
Kafka fails when configured with Sentry and an old Kafka version
If Kafka is configured with Sentry but the Kafka version is lower than 2.1, Kafka fails with the following exception:
java.lang.ClassNotFoundException: org.apache.sentry.kafka.authorizer.SentryKafkaAuthorizer
Workaround: Unset the Sentry dependency in the Kafka service configuration.
Kafka stuck with under-replicated partitions after ZooKeeper session expires
This problem might occur when your Kafka cluster includes a large number of under-replicated Kafka partitions. One or more broker logs include messages such as the following:
[2016-01-17 03:36:00,888] INFO Partition [__samza_checkpoint_event-creation_1,3] on broker 3: Shrinking ISR for partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 (kafka.cluster.Partition)
[2016-01-17 03:36:00,891] INFO Partition [__samza_checkpoint_event-creation_1,3] on broker 3: Cached zkVersion [66] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
There will also be an indication of the ZooKeeper session expiring in one or more Kafka broker logs around the same time as the previous errors:
INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)
The log is typically in /var/log/kafka on each host where a Kafka broker is running. The location is set by the kafka.log4j.dir property in Cloudera Manager. The log name is kafka-broker-hostname.log. In diagnostic bundles, the log is under logs/hostname-ip-address/.
Affected Versions: CDK 1.4.x, 2.0.x, 2.1.x, 2.2.x Powered By Apache Kafka
Partial Fix: CDK 3.0.0 and later Powered By Apache Kafka are less likely to encounter this issue.
Workaround: To move forward after seeing this problem, restart the affected Kafka brokers. You can restart individual brokers from the Instances tab on the Kafka service page in Cloudera Manager.
To reduce the chances of this issue happening again, do what you can to make sure ZooKeeper sessions do not expire; an illustrative configuration sketch follows this list:
- Reduce the potential for long garbage collection pauses by brokers:
- Use a more efficient garbage collection mechanism in the JVM, such as G1GC, by adding -XX:+UseG1GC to broker_java_opts.
- Increase the broker heap size if it is too small (broker_max_heap_size), taking care not to choose a heap size that can cause out-of-memory problems given all the services running on the node.
- Increase the ZooKeeper session timeout configuration on brokers (zookeeper.session.timeout.ms), to reduce the likelihood that sessions expire.
- Ensure ZooKeeper itself is well resourced and not overwhelmed, so it can respond promptly. For example, it is highly recommended to locate the ZooKeeper log directory on its own disk.
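As an illustrative sketch only (the values below are placeholder assumptions, not tuned recommendations), the settings discussed above might be combined as follows in Cloudera Manager:

broker_java_opts: -XX:+UseG1GC -XX:MaxGCPauseMillis=20
broker_max_heap_size: 4 GiB
zookeeper.session.timeout.ms: 30000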
Cloudera JIRA: CDH-42514
Apache JIRA: KAFKA-2729
The Flume and Spark connectors to Kafka shipped with CDH 5.7 and higher only work with Kafka 2.0 and higher
Use Kafka 2.0 or higher to be compatible with the Flume and Spark connectors included with CDH 5.7 and higher.
Only new Java clients support authentication and authorization
The older Scala producer and consumer clients do not support authentication or authorization.
Workaround: Migrate to the new Java producer and consumer APIs.
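As a hedged illustration of such a migration, the following sketch uses the new Java consumer API on a Kerberos-enabled cluster; the broker address, group ID, and topic are placeholder assumptions:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SecureConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:9092"); // placeholder
        props.put("group.id", "example-group");             // placeholder
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Security settings honored only by the new Java clients:
        props.put("security.protocol", "SASL_PLAINTEXT"); // Kerberos without SSL
        props.put("sasl.kerberos.service.name", "kafka");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic")); // placeholder
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.value());
            }
        }
    }
}

The older Scala clients have no equivalent of these security properties, which is why migration is required on secured clusters.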
Requests fail when sending to a nonexistent topic with auto.create.topics.enable set to true
The first few produce requests fail when sending to a nonexistent topic with auto.create.topics.enable set to true.
Workaround: Increase the number of retries in the Producer configuration settings.
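As a minimal sketch of this workaround (the broker address, topic, and retry values are placeholder assumptions, not recommendations), the retry count is raised through the retries producer property:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RetryingProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:9092"); // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Retry enough times that the producer outlasts automatic topic creation.
        props.put("retries", "10");           // placeholder value
        props.put("retry.backoff.ms", "500"); // placeholder value
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("brand-new-topic", "key", "value")); // nonexistent topic
        }
    }
}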
Custom Kerberos principal names must not be used for Kerberized ZooKeeper and Kafka instances
When using ZooKeeper authentication and a custom Kerberos principal, Kerberos-enabled Kafka does not start.
Workaround: None. You must disable ZooKeeper authentication for Kafka or use the default Kerberos principals for ZooKeeper and Kafka.
Performance degradation when SSL is enabled
Significant performance degradation can occur when SSL is enabled. The impact varies depending on your CPU type and JVM version, and the reduction is generally in the range of 20% to 50%.
AdminUtils is not binary-compatible between Cloudera Distribution of Apache Kafka 1.x and 2.x
The AdminUtils APIs have changed between Cloudera Distribution of Apache Kafka 1.x and 2.x. If your application uses AdminUtils APIs, you must modify your application code to use the new APIs before you compile your application against Cloudera Distribution of Apache Kafka 2.x.
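As a hedged sketch of what the change looks like in practice, assuming the Apache Kafka 0.9 APIs underlying Cloudera Distribution of Apache Kafka 2.0 (the ZooKeeper connection string, timeouts, and topic settings are placeholders): AdminUtils methods now take a ZkUtils handle instead of the bare ZkClient they took in 1.x.

import java.util.Properties;
import kafka.admin.AdminUtils;
import kafka.utils.ZkUtils;

public class AdminUtilsMigrationSketch {
    public static void main(String[] args) {
        // 2.x: construct a ZkUtils handle (1.x passed a ZkClient directly).
        ZkUtils zkUtils = ZkUtils.apply("zk-host:2181", 30000, 30000, false); // placeholders
        try {
            // Topic name, partition count, and replication factor are placeholders.
            AdminUtils.createTopic(zkUtils, "example-topic", 3, 2, new Properties());
        } finally {
            zkUtils.close();
        }
    }
}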
Monitoring is not supported in Cloudera Manager 5.4
If you use Cloudera Distribution of Apache Kafka 1.2 with Cloudera Manager 5.4, you must disable monitoring.
MirrorMaker does not start when Sentry is enabled
When MirrorMaker is used in conjunction with Sentry, MirrorMaker reports an authorization issue and does not start. This occurs because Sentry cannot authorize the kafka_mirror_maker principal, which is created automatically.
Workaround:
- Create the kafka_mirror_maker Linux user ID and the kafka_mirror_maker Linux group ID on the MirrorMaker hosts, using the following command:
useradd kafka_mirror_maker
- Create the necessary Sentry rules for the kafka_mirror_maker group, as in the sketch after this list.
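As a hedged sketch using the kafka-sentry command-line tool, where the role name is a placeholder and the privilege shown is an assumption (grant only what MirrorMaker actually needs in your environment):

kafka-sentry -cr -r mirror_maker_role
kafka-sentry -arg -r mirror_maker_role -g kafka_mirror_maker
kafka-sentry -gpr -r mirror_maker_role -p "Host=*->Cluster=kafka-cluster->action=ALL"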
Affected Versions: CDK 2.1.1 and higher Powered by Apache Kafka
Fixed Versions: N/A
Apache JIRA: N/A
Cloudera JIRA: CDH-53706
Authenticated Kafka clients may impersonate other users
Authenticated Kafka clients may impersonate any other user via a manually crafted protocol message with SASL/PLAIN or SASL/SCRAM authentication when using the built-in PLAIN or SCRAM server implementations in Apache Kafka.
Note that the SASL authentication mechanisms to which this issue applies are neither recommended nor supported by Cloudera. Cloudera Manager (CM) offers four choices: PLAINTEXT, SSL, SASL_PLAINTEXT, and SASL_SSL. The SASL/PLAIN mechanism described in this issue is not the same as the SASL_PLAINTEXT option in CM; that option uses Kerberos and is not affected. As a result, it is highly unlikely that Kafka is susceptible to this issue when managed by CM, unless the authentication protocol has been overridden by an Advanced Configuration Snippet (Safety Valve).
Products affected: CDK Powered by Apache Kafka
Releases affected: CDK 2.1.0 to 2.2.0, CDK 3.0.0
Users affected: All users
Detected by: Rajini Sivaram (rsivaram@apache.org)
Severity (Low/Medium/High): 8.3 (High) (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:H)
Impact: Privilege escalation.
CVE: CVE-2017-12610
Immediate action required: Upgrade to a newer version of CDK Powered by Apache Kafka where the issue has been fixed.
Addressed in release/refresh/patch: CDK 3.1, CDH 6.0 and higher
Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2018-332: Two Kafka Security Vulnerabilities: Authenticated Kafka clients may impersonate other users and may interfere with data replication
Authenticated clients may interfere with data replication
Authenticated Kafka users may perform an action reserved for the broker via a manually crafted fetch request that interferes with data replication, resulting in data loss.
Products affected: CDK Powered by Apache Kafka
Releases affected: CDK 2.0.0 to 2.2.0, CDK 3.0.0
Users affected: All users
Detected by: Rajini Sivaram (rsivaram@apache.org)
Severity (Low/Medium/High): 6.3 (Medium) (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L)
Impact: Potential data loss due to improper replication.
CVE: CVE-2018-1288
Immediate action required: Upgrade to a newer version of CDK Powered by Apache Kafka where the issue has been fixed.
Addressed in release/refresh/patch: CDK 3.1, CDH 6.0 and higher
Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2018-332: Two Kafka Security Vulnerabilities: Authenticated Kafka clients may impersonate other users and may interfere with data replication