Deploying Kafka to Kudu ReadyFlow

Learn how to use the Deployment wizard to deploy the Kafka to Kudu ReadyFlow using the information you collected with the help of the prerequisites checklist.

The CDF Catalog is where you manage the flow definition lifecycle, from initial import, through versioning, to deploying a flow definition.

  1. In DataFlow, from the left navigation pane, click Catalog.
    Flow definitions available for you to deploy are displayed, one definition per row.
  2. Launch the Deployment wizard.
    1. Click the row to display the flow definition details and versions.
    2. Click a row representing a flow definition version to display the flow definition version details and the Deploy New Flow button.
    3. Click Deploy New Flow to launch the Deployment wizard.
  3. From the Deployment wizard, select the environment to which you want to deploy this version of your flow definition.
  4. From the Overview, give your flow deployment a unique name and pick the NiFi Runtime Version for your flow deployment.
    • You can use this name to distinguish between different versions of a flow definition, flow definitions deployed to different environments, and similar.

    • Cloudera recommends that you always use the latest available NiFi Runtime Version, if possible.

  5. In Parameters, specify parameter values such as connection strings and usernames, and upload files such as truststores.

    To deploy the Kafka to Kudu flow, configure the following parameters using the information you collected in Meeting the prerequisites.

    Table 1. Kafka to Kudu ReadyFlow configuration parameters

    Parameter Name | Description
    CDP Workload User | Specify the CDP machine user or workload user name that you want to use to authenticate to Kafka and Kudu. Ensure this user has the appropriate access rights to the Kafka topics and Kudu table.
    CDP Workload User Password | Specify the password of the CDP machine user or workload user you are using to authenticate against Kafka and Kudu.
    CSV Delimiter | If your source data is CSV, specify the delimiter here.
    Data Input Format | Specify the format of your input data. You can use "CSV", "JSON" or "AVRO" with this ReadyFlow.
    Kafka Broker Endpoint | Specify the Kafka bootstrap servers string as a comma separated list.
    Kafka Consumer Group ID | Specify the ID for the consumer group used for the source topic you are consuming from.
    Kafka Source Topic | Specify a topic name that you want to read from.
    Kudu Master Hosts | Specify the Kudu Master hostnames in a comma separated list.
    Kudu Operation Type | Specify the operation that you want to use when writing data to Kudu. Valid values are: INSERT, UPSERT, UPDATE, DELETE.
    Kudu Table Name | Specify the Kudu table name you want to write to.
    Schema Name | Specify the schema name to be looked up in the Schema Registry.
    Schema Registry Hostname | Specify the hostname of the Schema Registry you want to connect to. This must be the direct hostname of the Schema Registry itself, not the Knox Endpoint.
  6. Specify your Sizing & Scaling configurations:
    • The size of your cluster, from Extra Small to Large

    • Whether you want to automatically scale your cluster according to flow deployment capacity requirements

    • The number of nodes, from 1 to 64

  7. From KPIs, you may choose to identify key performance indicators (KPIs), the metrics to track those KPIs, and when and how to receive alerts about the KPI metrics tracking.

    See Working with KPIs for complete information about the KPIs available to you and how to monitor them.

  8. Review a summary of the information provided and make any necessary edits by clicking Previous. When you are finished, complete your flow deployment by clicking Deploy.

After you click Deploy, you are redirected to the Alerts tab in the detail view for the deployment, where you can track its progress.
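
Before you enter values in the Parameters step, it can help to check them against the constraints listed in Table 1. The following Python sketch is one possible way to do that and is not part of CDF. The function name, the dictionary layout, and the assumption that every parameter except CSV Delimiter needs a value are illustrative only. The keys mirror the parameter names in Table 1.

```python
# Hypothetical pre-flight check for the values entered in the Parameters step.
# Nothing here calls a CDF API; the keys simply mirror the labels in Table 1.

VALID_INPUT_FORMATS = {"CSV", "JSON", "AVRO"}
VALID_KUDU_OPERATIONS = {"INSERT", "UPSERT", "UPDATE", "DELETE"}

# Assumption: every parameter below needs a value. CSV Delimiter is handled
# separately because it only applies to CSV input.
REQUIRED_PARAMETERS = [
    "CDP Workload User",
    "CDP Workload User Password",
    "Data Input Format",
    "Kafka Broker Endpoint",
    "Kafka Consumer Group ID",
    "Kafka Source Topic",
    "Kudu Master Hosts",
    "Kudu Operation Type",
    "Kudu Table Name",
    "Schema Name",
    "Schema Registry Hostname",
]

# Parameters that Table 1 describes as comma-separated lists.
COMMA_SEPARATED_PARAMETERS = ["Kafka Broker Endpoint", "Kudu Master Hosts"]


def validate_parameters(params):
    """Return a list of problems; an empty list means none were found."""
    problems = []

    for name in REQUIRED_PARAMETERS:
        if not params.get(name, "").strip():
            problems.append(f"{name}: value is missing")

    data_format = params.get("Data Input Format", "").strip()
    if data_format and data_format not in VALID_INPUT_FORMATS:
        problems.append(f"Data Input Format: unsupported value {data_format!r}")
    # The delimiter is not stripped, because a tab or space can be a valid delimiter.
    if data_format == "CSV" and params.get("CSV Delimiter", "") == "":
        problems.append("CSV Delimiter: required when Data Input Format is CSV")

    operation = params.get("Kudu Operation Type", "").strip()
    if operation and operation not in VALID_KUDU_OPERATIONS:
        problems.append(f"Kudu Operation Type: unsupported value {operation!r}")

    for name in COMMA_SEPARATED_PARAMETERS:
        value = params.get(name, "").strip()
        if value and any(not entry.strip() for entry in value.split(",")):
            problems.append(f"{name}: empty entry in comma-separated list")

    return problems
```

To use the sketch, call validate_parameters with a dictionary of your collected values and review any problems it returns before you start the Deployment wizard. The check covers only the value constraints stated in Table 1. It does not verify that hosts are reachable or that the workload user has the required access rights.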