Chapter 2. Prepare Your Environment
Deploying Your HDF Clusters
About This Task
Now that you have reviewed the reference architecture and planned the deployment of your trucking application, you can begin installing HDF according to your use case specifications. To fully build the trucking application as described in this Getting Started with Stream Analytics document, use the following steps.
Steps
Install Ambari 2.5.1.
Install HDP 2.6.1 Cluster via Ambari.
Install HDF 3.0 Management Pack.
Update HDF 3.0 Base URL.
Add HDF 3.0 Services to HDP 2.6.1 cluster.
Find instructions for these installation steps in Installing HDF Services on a New HDP Cluster.
Registering Schemas in Schema Registry
The trucking application streams raw events, serialized as Avro, from the two sensors to their respective Kafka topics. NiFi consumes from these topics, and then routes, enriches, and delivers the events to another set of Kafka topics for consumption by the streaming analytics applications. To do this, you must perform the following tasks:
Create the four Kafka topics
Register schemas for each of the Kafka topics in Schema Registry
Create the Kafka Topics
About This Task
Kafka topics are categories or feed names to which records are published.
Steps
Log in to the node where the Kafka broker is running.
Create the Kafka topics using the following commands:
cd /usr/[hdf|hdp]/current/kafka-broker/bin/

./kafka-topics.sh \
  --create \
  --zookeeper <zookeeper-host>:2181 \
  --replication-factor 2 \
  --partitions 3 \
  --topic raw-truck_events_avro

./kafka-topics.sh \
  --create \
  --zookeeper <zookeeper-host>:2181 \
  --replication-factor 2 \
  --partitions 3 \
  --topic raw-truck_speed_events_avro

./kafka-topics.sh \
  --create \
  --zookeeper <zookeeper-host>:2181 \
  --replication-factor 2 \
  --partitions 3 \
  --topic truck_events_avro

./kafka-topics.sh \
  --create \
  --zookeeper <zookeeper-host>:2181 \
  --replication-factor 2 \
  --partitions 3 \
  --topic truck_speed_events_avro
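The four topic-creation commands differ only in the topic name, so they can be scripted as a loop. The following is a minimal sketch, not part of the official procedure. The KAFKA_BIN default (an HDF install path), the ZK placeholder host, and the DRY_RUN switch are assumptions introduced for illustration; the script uses the full --partitions option name. By default it only prints the commands so you can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch: create the four trucking-application topics in a loop, then list topics.
# Assumptions: KAFKA_BIN points at an HDF install (use /usr/hdp/... on HDP),
# and ZK is a placeholder that you replace with your ZooKeeper host and port.
KAFKA_BIN="${KAFKA_BIN:-/usr/hdf/current/kafka-broker/bin}"
ZK="${ZK:-zookeeper-host:2181}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Issue one --create call per topic, with the same settings as the manual steps.
create_topics() {
  for topic in raw-truck_events_avro raw-truck_speed_events_avro \
               truck_events_avro truck_speed_events_avro; do
    run "$KAFKA_BIN/kafka-topics.sh" --create \
      --zookeeper "$ZK" \
      --replication-factor 2 \
      --partitions 3 \
      --topic "$topic"
  done
}

create_topics

# List the topics afterwards to confirm that all four exist.
run "$KAFKA_BIN/kafka-topics.sh" --list --zookeeper "$ZK"
```

To create the topics for real, set ZK to your ZooKeeper address and run the script with DRY_RUN=0.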
Register Schemas for the Kafka Topics
About This Task
Register the schemas for the two Kafka topics that NiFi consumes from and the two Kafka topics to which NiFi publishes the enriched events. Registering the Kafka topic schemas is beneficial in several ways. Schema Registry provides a centralized schema location, allowing you to stream records into topics without having to attach the schema to each record.
Steps
Go to the Schema Registry UI by selecting the Registry service in Ambari and, under 'Quick Links', selecting 'Registry UI'.
Click the "+" button to add a schema, schema group and schema metadata for the Raw Geo Event Sensor Kafka topic:
Name = raw-truck_events_avro
Description = Raw Geo events from trucks in Kafka Topic
Type = Avro schema provider
Schema Group = truck-sensors-kafka
Compatibility: BACKWARD
Check the evolve check box
Copy the schema from here and paste it into the Schema Text area.
Click Save
Click the "+" button to add a schema, schema group (exists from previous step), and schema metadata for the Raw Speed Event Sensor Kafka topic:
Name = raw-truck_speed_events_avro
Description = Raw Speed Events from trucks in Kafka Topic
Type = Avro schema provider
Schema Group = truck-sensors-kafka
Compatibility: BACKWARD
Check the evolve check box
Copy the schema from here and paste it into the Schema Text area.
Click Save
Click the "+" button to add a schema, schema group (exists from previous step), and schema metadata for the Geo Event Sensor Kafka topic:
Name = truck_events_avro
Description = Schema for the Kafka topic named 'truck_events_avro'
Type = Avro schema provider
Schema Group = truck-sensors-kafka
Compatibility: BACKWARD
Check the evolve check box
Copy the schema from here and paste it into the Schema Text area.
Click Save
Click the "+" button to add a schema, schema group (exists from previous step), and schema metadata for the Speed Event Sensor Kafka topic:
Name = truck_speed_events_avro
Description = Schema for the Kafka topic named 'truck_speed_events_avro'
Type = Avro schema provider
Schema Group = truck-sensors-kafka
Compatibility: BACKWARD
Check the evolve check box
Copy the schema from here and paste it into the Schema Text area.
Click Save
More Information
If you want to create these schemas programmatically using the Schema Registry client via REST rather than through the UI, you can find examples at this GitHub location.
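As a rough illustration of the REST route, the sketch below builds the schema metadata for the four topics with the same values entered in the UI steps and posts each one with curl. It is not taken from the referenced examples. The REGISTRY_URL default (placeholder host, port 7788, and the /api/v1/schemaregistry path), the JSON field names, and the DRY_RUN switch are assumptions that you should verify against your Schema Registry version. Uploading the schema text itself is a separate step that is not shown here.

```shell
#!/usr/bin/env bash
# Sketch: register schema metadata for the four Kafka topics over REST.
# Assumptions: the base URL below (host is a placeholder, port and path are
# assumed) and the JSON field names, which mirror the fields of the UI form.
REGISTRY_URL="${REGISTRY_URL:-http://registry-host:7788/api/v1/schemaregistry}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the requests only, 0 = send them with curl

# Build the metadata JSON for one schema: $1 = name, $2 = description.
schema_metadata() {
  printf '{"type":"avro","schemaGroup":"truck-sensors-kafka","name":"%s","description":"%s","compatibility":"BACKWARD","evolve":true}\n' "$1" "$2"
}

# Print or send the registration request for one schema.
register_schema() {
  local payload
  payload="$(schema_metadata "$1" "$2")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "POST $REGISTRY_URL/schemas $payload"
  else
    curl -sS -X POST -H 'Content-Type: application/json' \
      -d "$payload" "$REGISTRY_URL/schemas"
  fi
}

register_schema raw-truck_events_avro "Raw Geo events from trucks in Kafka Topic"
register_schema raw-truck_speed_events_avro "Raw Speed Events from trucks in Kafka Topic"
register_schema truck_events_avro "Schema for the Kafka topic named 'truck_events_avro'"
register_schema truck_speed_events_avro "Schema for the Kafka topic named 'truck_speed_events_avro'"
```

With the default DRY_RUN=1 the script only prints each request, which makes it easy to compare the payloads with the values entered in the UI before sending anything.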