Description

This service enriches record-oriented data with values predicted by Cloudera Machine Learning (CML). It is compatible with models whose API uses JSON.

Usage

Service specific configuration on the LookupRecord processor

The service requires the cml.payload dynamic property to be present on the LookupRecord processor. Its value is a RecordPath that selects which parts of the incoming record are sent to the CML service. With /, the whole record is used as the input. The CMLLookupService converts the selected input to JSON, as this is the format CML requires. The expected structure, however, varies between models/AMPs, so we advise studying the documentation of the given model/AMP.
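
The effect of cml.payload can be illustrated with a minimal Python sketch. It is not the service's implementation: select_payload is a hypothetical helper that only understands plain child paths such as / or /customer (real RecordPath supports much more), and the record and its field names are made up for illustration.

```python
import json


def select_payload(record: dict, path: str):
    """Resolve a simple child-only path ("/" or "/a/b") against a record.

    Hypothetical helper: it mimics only the most basic RecordPath form.
    """
    node = record
    for part in path.strip("/").split("/"):
        if part:  # "/" yields one empty part, so the whole record is kept
            node = node[part]
    return node


# Made-up record; a real model/AMP defines its own input structure.
record = {"id": 17, "customer": {"tenure": 12, "monthly_charges": 70.5}}

# cml.payload = /          -> the whole record is serialized to JSON
whole_payload = json.dumps(select_payload(record, "/"))

# cml.payload = /customer  -> only the selected sub-record is serialized
partial_payload = json.dumps(select_payload(record, "/customer"))

print(whole_payload)
print(partial_payload)
```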

Configuration of the CMLLookupService

Project Hostname

Hostname of the CML project. It can be obtained from the browser address bar or from the Sample Code section of the Overview tab on the CML UI. For the "Churn Modeling with scikit-learn" AMP, it is modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site, which is extracted from https://modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site/model.
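
As a small illustration, the hostname can be separated from the model URL with Python's standard library. The URL is the masked example from above; the variable names are only illustrative.

```python
from urllib.parse import urlparse

# Masked model URL as copied from the CML UI (placeholder x's kept as-is).
model_url = "https://modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site/model"

# The Project Hostname property takes only the host part, without scheme or path.
project_hostname = urlparse(model_url).hostname

print(project_hostname)
```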

CML Access Key

Unique access key associated with the CML model.

CML Api Key

Unique API key used for authentication with the CML model.

Web Client Service Provider

The service makes HTTP calls under the hood. This controller service provides the HTTP client support and additional configuration for those calls.

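How these properties could fit together in a single HTTP call is sketched below in Python. This is an assumption, not a description of the service internals: it follows the request shape that CML shows in its Sample Code section (a POST to /model with accessKey and request fields in the JSON body, and the API key sent as a Bearer token). The key values are placeholders, the input fields are made up, and the request is only built, never sent.

```python
import json
import urllib.request

# Placeholder values; real ones come from the CML UI.
project_hostname = "modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site"
access_key = "<access key>"
api_key = "<api key>"

# Hypothetical model input; the real structure depends on the model/AMP.
model_input = {"tenure": 12, "monthly_charges": 70.5}

# Assumed body layout, taken from the CML sample code convention.
body = json.dumps({"accessKey": access_key, "request": model_input}).encode("utf-8")

request = urllib.request.Request(
    url=f"https://{project_hostname}/model",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
    method="POST",
)

# Inspect the prepared request without sending it.
print(request.get_method(), request.full_url)
```
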
Record Path

An optional RecordPath that defines where in the response record to find the data to merge into the record being enriched. Since the response from CML can be large and may contain unnecessary information, this property can be used to select only the required fields. For the "Churn Modeling with scikit-learn" AMP, we may be interested only in the probability of the customer's churn, which can be selected with the following RecordPath: /response/prediction/probability
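
The selection can be pictured with a short Python sketch. The response below is a made-up example whose nesting simply mirrors the path /response/prediction/probability; the actual response of a model may contain other fields. extract is a hypothetical helper that handles only plain child paths, not the full RecordPath language.

```python
def extract(response: dict, record_path: str):
    """Follow a simple child-only path such as "/a/b/c" into a nested dict."""
    node = response
    for part in record_path.strip("/").split("/"):
        node = node[part]
    return node


# Made-up response; only the nesting along the path is taken from the text above.
sample_response = {
    "response": {
        "prediction": {"probability": 0.21},
        "other": "not needed for enrichment",
    }
}

probability = extract(sample_response, "/response/prediction/probability")
print(probability)
```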

Date Format

Specifies the format to use when reading/writing Date fields. If not specified, Date fields are assumed to be the number of milliseconds since epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the Java Simple Date Format (for example, MM/dd/yyyy for a two-digit month, followed by a two-digit day, followed by a four-digit year, all separated by '/' characters, as in 01/01/2017).

Time Format

Specifies the format to use when reading/writing Time fields. If not specified, Time fields are assumed to be the number of milliseconds since epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the Java Simple Date Format (for example, HH:mm:ss for a two-digit hour in 24-hour format, followed by a two-digit minute, followed by a two-digit second, all separated by ':' characters, as in 18:04:15).

Timestamp Format

Specifies the format to use when reading/writing Timestamp fields. If not specified, Timestamp fields are assumed to be the number of milliseconds since epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the Java Simple Date Format (for example, MM/dd/yyyy HH:mm:ss for a two-digit month, followed by a two-digit day, followed by a four-digit year, all separated by '/' characters; and then followed by a two-digit hour in 24-hour format, followed by a two-digit minute, followed by a two-digit second, all separated by ':' characters, as in 01/01/2017 18:04:15).
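
The two interpretations described for the Date, Time and Timestamp formats can be compared in a short Python sketch. The properties themselves take Java Simple Date Format patterns; Python's strptime uses different directive letters, so the mapping below is only an illustrative translation and not something the service accepts. All names are hypothetical.

```python
from datetime import datetime, timezone

# Illustrative translation of the example Java patterns to Python directives.
JAVA_TO_PYTHON = {
    "MM/dd/yyyy": "%m/%d/%Y",
    "HH:mm:ss": "%H:%M:%S",
    "MM/dd/yyyy HH:mm:ss": "%m/%d/%Y %H:%M:%S",
}


def from_epoch_millis(millis: int) -> datetime:
    """Default interpretation: milliseconds since Jan 1, 1970 GMT."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# With a format specified, the field value is parsed as text.
date_value = datetime.strptime("01/01/2017", JAVA_TO_PYTHON["MM/dd/yyyy"]).date()
time_value = datetime.strptime("18:04:15", JAVA_TO_PYTHON["HH:mm:ss"]).time()
timestamp_value = datetime.strptime(
    "01/01/2017 18:04:15", JAVA_TO_PYTHON["MM/dd/yyyy HH:mm:ss"]
)

# Without a format, the same instant arrives as a number of milliseconds.
epoch_value = from_epoch_millis(1483293855000)

print(date_value, time_value, timestamp_value, epoch_value)
```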