ReadyFlow: Kafka to Cloudera Operational Database

Learn about the Kafka to Cloudera Operational Database ReadyFlow.

This ReadyFlow consumes JSON, CSV, or Avro data from a source Kafka topic, parses the data using the schema it looks up by name in the CDP Schema Registry, and ingests the data into an HBase table in Cloudera Operational Database (COD). Failed HBase write operations are retried automatically to handle transient issues. Define a KPI on the failure_WriteToCOD connection to monitor failed write operations.
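The retry behavior described above can be illustrated with a minimal Python sketch. This is not the ReadyFlow's implementation: the function names, the TransientWriteError class, the attempt limit, and the backoff are all hypothetical and only show the general idea of retrying a write and routing the record to a failure path when retries do not help.

```python
import time


class TransientWriteError(Exception):
    """Hypothetical stand-in for a temporary HBase write failure."""


def write_with_retry(write, record, max_attempts=3, backoff_seconds=0.0):
    """Call write(record) until it succeeds or the attempts run out.

    Returns "success" or "failure". The "failure" result loosely mirrors
    a record ending up on a failure connection; the attempt limit is an
    assumption made for this sketch, not a ReadyFlow setting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            write(record)
            return "success"
        except TransientWriteError:
            if attempt < max_attempts:
                # Wait a little longer after each failed attempt.
                time.sleep(backoff_seconds * attempt)
    return "failure"


def make_flaky_writer(failures_before_success):
    """Build a fake writer that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def write(record):
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientWriteError("simulated transient error")

    return write
```

Counting how often such a write path returns "failure" is comparable in spirit to watching a KPI on the failure_WriteToCOD connection.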

ReadyFlow details
Source: Kafka topic
Source Format: JSON, CSV, Avro
Destination: Cloudera Operational Database
Destination Format: HBase Table
Table 1. Kafka to Cloudera Operational Database ReadyFlow configuration parameters
Parameter Name Description Example
CDP Workload User Specify the CDP machine user or workload user name that you want to use to authenticate to Kafka and COD. Ensure this user has the appropriate access rights to the Kafka topics and HBase table.
CDP Workload User Password Specify the password of the CDP machine user or workload user you are using to authenticate against Kafka and COD.
CDPEnvironment Use this parameter to upload the hbase-site.xml file of your target HBase cluster. DataFlow will also use this parameter to auto-populate the Flow Deployment with additional Hadoop configuration files required to interact with HBase. DataFlow automatically adds all required configuration files to interact with Data Lake services. Unnecessary files that are added do not impact the deployment process.

COD Column Family Name Specify the column family to use when inserting data into Cloudera Operational Database.
COD Row Identifier Field Name Specify the name of a record field whose value should be used as the row ID for the given record.
COD Table Name Specify the target table name in Cloudera Operational Database.
CSV Delimiter If your source data is CSV, specify the delimiter here.
Data Input Format Specify the format of your input data.
  • CSV
  • JSON
  • AVRO
Kafka Broker Endpoint Specify the Kafka bootstrap servers string as a comma-separated list.
Kafka Consumer Group ID Specify the ID for the consumer group used for the source topic you are consuming from.
Kafka Source Topic Specify the topic name from which you want to read.
Schema Name Specify the schema name that you want to look up in Schema Registry.
Schema Registry Hostname Specify the hostname of the Schema Registry you want to connect to. This must be the direct hostname of the Schema Registry itself, not the Knox Endpoint.
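
To make the parameters above more concrete, the following Python sketch shows one possible set of values and a few sanity checks on them. Every value (host names, port, topic, table, field names) is a made-up placeholder, and the check_params and to_hbase_cells helpers are hypothetical. The row mapping assumes that the row identifier field becomes the HBase row key and that the remaining fields are stored as columns in the configured column family; the source does not spell out this detail, so verify it against your deployment.

```python
# Hypothetical example values; none of these come from a real environment.
# Credentials and the CDPEnvironment file upload are left out on purpose.
params = {
    "COD Column Family Name": "cf1",
    "COD Row Identifier Field Name": "customer_id",
    "COD Table Name": "customers",
    "CSV Delimiter": ",",
    "Data Input Format": "CSV",
    "Kafka Broker Endpoint": "broker1.example.com:9093,broker2.example.com:9093",
    "Kafka Consumer Group ID": "cod-ingest",
    "Kafka Source Topic": "customers-topic",
    "Schema Name": "customer",
    "Schema Registry Hostname": "schema-registry.example.com",
}


def check_params(p):
    """Return a list of problems found in the parameter values."""
    problems = []
    if p["Data Input Format"] not in {"CSV", "JSON", "AVRO"}:
        problems.append("Data Input Format must be CSV, JSON, or AVRO")
    if p["Data Input Format"] == "CSV" and not p.get("CSV Delimiter"):
        problems.append("CSV Delimiter is required for CSV input")
    # Each bootstrap server entry is expected to look like host:port.
    for broker in p["Kafka Broker Endpoint"].split(","):
        host, _, port = broker.strip().rpartition(":")
        if not host or not port.isdigit():
            problems.append(f"Broker entry is not host:port: {broker!r}")
    # The hostname must be bare, without a scheme or a path.
    if "/" in p["Schema Registry Hostname"]:
        problems.append("Schema Registry Hostname must be a plain hostname")
    return problems


def to_hbase_cells(record, p):
    """Map one record to (row_key, {"family:qualifier": value}).

    Assumption: the row identifier field is used only as the row key
    and is not repeated as a column.
    """
    row_field = p["COD Row Identifier Field Name"]
    family = p["COD Column Family Name"]
    row_key = str(record[row_field])
    cells = {
        f"{family}:{name}": value
        for name, value in record.items()
        if name != row_field
    }
    return row_key, cells
```

For example, calling to_hbase_cells on a record with the fields customer_id and name and the values above yields the customer_id value as the row key and a single cell under cf1:name.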