Meeting the prerequisites

Learn how to collect the information you need to deploy the ListenHTTP filter to Kafka ReadyFlow, and meet other prerequisites.

For your data ingest source

For DataFlow

For your data ingest target

  • You have created a Streams Messaging cluster in CDP Public Cloud to host your Schema Registry.

    For information on how to create a Streams Messaging cluster, see Creating your First Streams Messaging Cluster in CDP Public Cloud.

  • You have created at least one Kafka topic.

    1. Navigate to Management Console > Environments and select your environment.
    2. Select your Streams Messaging cluster.
    3. Click the Streams Messaging Manager icon.
    4. Navigate to the Topics page.
    5. Click Add New and provide the following information:
      • Topic name
      • Number of partitions
      • Level of availability
      • Cleanup policy
    6. Click Save.
  • You have created a schema for your data and have uploaded it to the Schema Registry in the Streams Messaging cluster.

    For information on how to create a new schema, see Creating a new schema. For example:
    
    {
       "type":"record",
       "name":"SensorReading",
       "namespace":"com.cloudera.example",
       "doc":"This is a sample sensor reading",
       "fields":[
          {
             "name":"sensor_id",
             "doc":"Sensor identification number.",
             "type":"int"
          },
          {
             "name":"sensor_ts",
             "doc":"Timestamp of the collected readings.",
             "type":"long"
          },
          {
             "name":"sensor_0",
             "doc":"Reading #0.",
             "type":"int"
          },
          {
             "name":"sensor_1",
             "doc":"Reading #1.",
             "type":"int"
          },
          {
             "name":"sensor_2",
             "doc":"Reading #2.",
             "type":"int"
          },
          {
             "name":"sensor_3",
             "doc":"Reading #3.",
             "type":"int"
          }
       ]
    }
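
Before uploading the schema, it can help to check that a sample record matches it. The following is a minimal sketch in plain Python, not an Avro library or any Cloudera tool. It assumes an abbreviated copy of the SensorReading schema above (doc strings omitted); the helper name check_record and the sample values are hypothetical, and rejecting unexpected fields is a strictness choice of this sketch.

```python
import json

# Abbreviated copy of the SensorReading schema shown above (doc strings omitted).
SCHEMA_JSON = """
{
  "type": "record",
  "name": "SensorReading",
  "namespace": "com.cloudera.example",
  "fields": [
    {"name": "sensor_id", "type": "int"},
    {"name": "sensor_ts", "type": "long"},
    {"name": "sensor_0", "type": "int"},
    {"name": "sensor_1", "type": "int"},
    {"name": "sensor_2", "type": "int"},
    {"name": "sensor_3", "type": "int"}
  ]
}
"""

# Value ranges for the two Avro primitive types used by this schema.
AVRO_INT_RANGES = {
    "int": (-2**31, 2**31 - 1),
    "long": (-2**63, 2**63 - 1),
}


def check_record(record, schema):
    """Return a list of problems; an empty list means the record fits."""
    problems = []
    expected = {field["name"]: field["type"] for field in schema["fields"]}
    for name, avro_type in expected.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{name}: expected {avro_type}, got {type(value).__name__}")
            continue
        low, high = AVRO_INT_RANGES[avro_type]
        if not low <= value <= high:
            problems.append(f"{name}: {value} out of range for {avro_type}")
    # Strictness choice of this sketch: flag fields the schema does not define.
    for name in record:
        if name not in expected:
            problems.append(f"unexpected field: {name}")
    return problems


schema = json.loads(SCHEMA_JSON)
# Hypothetical sample reading used only for illustration.
sample = {
    "sensor_id": 7,
    "sensor_ts": 1700000000000,
    "sensor_0": 10,
    "sensor_1": 11,
    "sensor_2": 12,
    "sensor_3": 13,
}
print(check_record(sample, schema))
```

An empty list means every field in the schema is present and holds an integer within the range of its declared type.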
    
  • You have the Schema Registry Host Name.

    1. From the Management Console, go to Data Hub Clusters and select the Streams Messaging cluster you are using.
    2. Navigate to the Hardware tab and locate the Master node FQDN. Schema Registry always runs on the Master node, so copy the Master node FQDN.
  • You have the Kafka broker endpoints.

    1. From the Management Console, click Data Hub Clusters.
    2. Select the Streams Messaging cluster that you are using as your data ingest target.
    3. Click the Hardware tab.
    4. Note the Kafka Broker FQDNs for each node in your cluster. Kafka broker FQDNs are listed under the Core_broker section.
    5. Construct your Kafka Broker Endpoints by joining each FQDN and port number 9093 with a colon, and separate the endpoints with commas. For example:
      broker1.fqdn:9093,broker2.fqdn:9093,broker3.fqdn:9093
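
The endpoint construction described in the steps above can be sketched in a few lines of Python. This is an illustration only: the function name build_broker_endpoints is hypothetical, and the FQDNs are the placeholder values from the example, not real hosts.

```python
# Port number given in the steps above for the Kafka brokers.
KAFKA_BROKER_PORT = 9093


def build_broker_endpoints(broker_fqdns, port=KAFKA_BROKER_PORT):
    """Join broker FQDNs into a comma-separated list of host:port endpoints."""
    # Drop surrounding whitespace and skip empty entries copied by mistake.
    cleaned = [fqdn.strip() for fqdn in broker_fqdns if fqdn.strip()]
    if not cleaned:
        raise ValueError("at least one broker FQDN is required")
    return ",".join(f"{fqdn}:{port}" for fqdn in cleaned)


# Placeholder FQDNs; replace them with the values noted on the Hardware tab.
endpoints = build_broker_endpoints(["broker1.fqdn", "broker2.fqdn", "broker3.fqdn"])
print(endpoints)
```

The printed string can be pasted as the Kafka Broker Endpoints value. Stripping whitespace guards against stray spaces picked up when copying FQDNs from the console.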

  • You have the Kafka Consumer Group ID.

    This ID is defined by the user. Pick an ID and then create a Ranger policy for it. Use the ID when deploying the flow in DataFlow.

  • You have assigned the CDP Workload User policies to access the consumer group ID and topic.

    1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
    2. Select Ranger. You are redirected to the Ranger Service Manager page.
    3. Select your Streams Messaging cluster under the Kafka folder.
    4. Create a policy to enable your Workload User to access the Kafka source topic.
    5. On the Create Policy page, give the policy a name, select topic from the drop-down list, add the user, and assign the Consume permission.
    6. Create another policy to give your Workload User access to the consumer group ID.
    7. On the Create Policy page, give the policy a name, select consumergroup from the drop-down list, add the user, and assign the Consume permission.
  • You have assigned the CDP Workload User read-access to the schema.

    1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
    2. Select Ranger. You are redirected to the Ranger Service Manager page.
    3. Select your Streams Messaging cluster under the Schema Registry folder.
    4. Click Add New Policy.
    5. On the Create Policy page, give the policy a name, specify the schema details, add the user, and assign the Read permission.