Enrichment and Normalization of Model Features
Now that the model has been added to the model registry, you can use it in the streaming application through the PMML processor. Before the model can be executed, you must enrich the streaming events with the features the model requires and normalize them. As the above diagram illustrates, the model has seven features, none of which arrive as part of the stream from the two sensors. Therefore, based on the driverId and the latitude and longitude of the event, enrich the streaming event with these features and then normalize them as required by the model. The table below describes each feature, its enrichment store, and the normalization required.
Feature | Description | Enrichment Store | Normalization |
Model_Feature_Certification | Identifies if the driver is certified or not | HBase/Phoenix table called drivers | "yes" → normalize to 1; "no" → normalize to 0 |
Model_Feature_WagePlan | Identifies if the driver is on an hourly or by miles wage plan | HBase/Phoenix table called drivers | "Hourly" → normalize to 1; "Miles" → normalize to 0 |
Model_Feature_FatigueByHours | The total number of hours driven by the driver in the last week | HBase/Phoenix table called timesheet | Scale by 100 to improve algorithm performance (e.g., hours/100) |
Model_Feature_FatigueByMiles | The total number of miles driven by the driver in the last week | HBase/Phoenix table called timesheet | Scale by 1000 to improve algorithm performance (e.g., miles/1000) |
Model_Feature_FoggyWeather | Indicates whether conditions are foggy at the given time and location | API to WeatherService | if (foggy) → normalize to 1, else 0 |
Model_Feature_RainyWeather | Indicates whether conditions are rainy at the given time and location | API to WeatherService | if (raining) → normalize to 1, else 0 |
Model_Feature_WindyWeather | Indicates whether conditions are windy at the given time and location | API to WeatherService | if (windy) → normalize to 1, else 0 |
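
The normalization rules in the table can be sketched in a few lines of Python. This is only an illustration of the rules, not how the streaming application implements them: the EnrichedEvent class, its field names, and the normalize function are hypothetical, and the sketch assumes the enrichment lookups (the drivers and timesheet tables and the weather service) have already returned their raw values. Comparing strings case-insensitively, and mapping any unrecognized value to 0, are also assumptions.

```python
from dataclasses import dataclass


@dataclass
class EnrichedEvent:
    """Raw enrichment values for one streaming event (hypothetical shape)."""
    certified: str          # "yes" or "no", from the drivers table
    wage_plan: str          # "Hourly" or "Miles", from the drivers table
    hours_last_week: float  # from the timesheet table
    miles_last_week: float  # from the timesheet table
    foggy: bool             # from the weather service
    raining: bool           # from the weather service
    windy: bool             # from the weather service


def normalize(event: EnrichedEvent) -> dict:
    """Apply the per-feature normalization rules from the table."""
    return {
        # Categorical values become 1/0 flags (case-insensitive match is an assumption).
        "Model_Feature_Certification": 1 if event.certified.strip().lower() == "yes" else 0,
        "Model_Feature_WagePlan": 1 if event.wage_plan.strip().lower() == "hourly" else 0,
        # Numeric values are scaled down: hours/100 and miles/1000.
        "Model_Feature_FatigueByHours": event.hours_last_week / 100,
        "Model_Feature_FatigueByMiles": event.miles_last_week / 1000,
        # Weather conditions become 1/0 flags.
        "Model_Feature_FoggyWeather": 1 if event.foggy else 0,
        "Model_Feature_RainyWeather": 1 if event.raining else 0,
        "Model_Feature_WindyWeather": 1 if event.windy else 0,
    }


# Example: a certified driver paid by miles, 70 hours and 3300 miles last week, in rain.
sample = EnrichedEvent("yes", "Miles", 70, 3300, foggy=False, raining=True, windy=False)
features = normalize(sample)
```

In this example the certification flag is 1, the wage plan flag is 0, the fatigue features are 70/100 and 3300/1000, and only the rainy weather flag is set.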