Cloudera DataFlow for Data Hub
7.0.2
(Public Cloud • Technical Preview)
Release Notes
Fixed issues
Summary of fixed issues for this release.
Fixed issues in Apache Kafka
This section lists the issues that have been fixed since the previous release.
Fixed issues in Streams Messaging Manager
This section lists the issues that have been fixed since the previous release.
Fixed issues in Schema Registry
This section lists the issues that have been fixed since the previous release.