CMLLookupService 2.3.0.4.10.0.0-147

Bundle
com.cloudera | nifi-cdf-cml-services-nar
Description
Looks up a record from CML associated with the specified key. The coordinates passed to the lookup must contain the key 'cml.payload'.
Tags
cdp, cml, enrich, lookup, machine-learning
Input Requirement
Supports Sensitive Dynamic Properties
false
  • Additional Details for CMLLookupService 2.3.0.4.10.0.0-147

    CMLLookupService

    Description

    This service can be used to enrich record-oriented data with values predicted by Cloudera Machine Learning (CML). It is compatible with models whose API uses JSON.

    Usage

    Service specific configuration on the LookupRecord processor

    The service requires the cml.payload dynamic property to be present on the LookupRecord processor. Its value is a RecordPath that selects which parts of the incoming record are sent to the CML service. If the value is /, the whole record is used as the input. The CMLLookupService converts the selected input to JSON, which is the format CML requires. The expected structure of the input varies between models and AMPs, so we advise studying the documentation of the given model or AMP.
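As a rough illustration of what the cml.payload selection amounts to, the Python sketch below mimics it outside NiFi. A record is modeled as a plain dict, a simplified slash-separated path picks the part to send, and the result is serialized to JSON. The helper names (select_payload, to_cml_json), the sample record, and its field names are hypothetical. Real RecordPath syntax supports much more than this, and the exact request body the service sends to CML is not described here, so no envelope is assumed.

```python
import json


def select_payload(record, record_path):
    """Return the part of `record` addressed by a simplified, slash-separated path.

    Only plain child-field paths such as "/" or "/customer/features" are handled.
    This is an illustration, not a RecordPath implementation.
    """
    node = record
    for field in [part for part in record_path.split("/") if part]:
        node = node[field]
    return node


def to_cml_json(record, record_path="/"):
    """Serialize the selected part of the record to a JSON string."""
    return json.dumps(select_payload(record, record_path))


# Hypothetical incoming record; the field names are made up for this example.
record = {"id": 42, "customer": {"features": {"tenure": 12, "contract": "Monthly"}}}

whole = to_cml_json(record, "/")                   # "/" selects the whole record
part = to_cml_json(record, "/customer/features")   # only the nested features
print(part)
```

Which of the two forms is appropriate depends on the input structure the given model or AMP expects.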

    Configuration of the CMLLookupService

    Property name Property value
    Project Hostname Hostname of the CML project. It can be obtained from the browser address bar or from the Sample Code section of the Overview tab in the CML UI. For the “Churn Modeling with scikit-learn” AMP, it is modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site, which is extracted from https://modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site/model.
    CML Access Key Unique access key associated with the CML model.
    CML Api Key Unique API key used for authentication with the CML model.
    Web Client Service Provider The lookup service uses HTTP calls under the hood. This provider service supplies support and additional configuration for those calls.
    Record Path An optional RecordPath that defines where in the CML response to find the data to merge into the record being enriched. Because the response from CML can be large and may contain unnecessary information, this property can be used to select only the required fields. For the “Churn Modeling with scikit-learn” AMP, we may be interested only in the probability of the customer’s churn, which can be selected with the following RecordPath: /response/prediction/probability
    Date Format Specifies the format to use when reading/writing Date fields. If not specified, Date fields are assumed to be the number of milliseconds since the epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the Java Simple Date Format (for example, MM/dd/yyyy for a two-digit month, followed by a two-digit day, followed by a four-digit year, all separated by ‘/’ characters, as in 01/01/2017).
    Time Format Specifies the format to use when reading/writing Time fields. If not specified, Time fields are assumed to be the number of milliseconds since the epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the Java Simple Date Format (for example, HH:mm:ss for a two-digit hour in 24-hour format, followed by a two-digit minute, followed by a two-digit second, all separated by ‘:’ characters, as in 18:04:15).
    Timestamp Format Specifies the format to use when reading/writing Timestamp fields. If not specified, Timestamp fields are assumed to be the number of milliseconds since the epoch (Midnight, Jan 1, 1970 GMT). If specified, the value must match the Java Simple Date Format (for example, MM/dd/yyyy HH:mm:ss for a two-digit month, followed by a two-digit day, followed by a four-digit year, all separated by ‘/’ characters; and then followed by a two-digit hour in 24-hour format, followed by a two-digit minute, followed by a two-digit second, all separated by ‘:’ characters, as in 01/01/2017 18:04:15).
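The Project Hostname and Record Path rows above can be tried out with a small Python sketch. It derives the hostname from the model URL quoted in the table and applies the /response/prediction/probability path to a made-up response. The helper names (hostname_from_model_url, select_field) and the response values are hypothetical. Only the path itself comes from the table, and the path handling is a simplified stand-in for RecordPath.

```python
from urllib.parse import urlparse


def hostname_from_model_url(model_url):
    """Return only the host part of a model URL (no scheme, no path)."""
    return urlparse(model_url).hostname


def select_field(response, path):
    """Follow a simplified, slash-separated path through nested dicts."""
    node = response
    for field in [part for part in path.split("/") if part]:
        node = node[field]
    return node


# Placeholder URL exactly as it appears in the table above.
model_url = "https://modelservice.ml-xxxxxxxx-xxx.se-sandb.xxxx-xxxx.cloudera.site/model"
host = hostname_from_model_url(model_url)

# Hypothetical response: the values and any keys beyond the documented path
# are assumptions made for this example.
response = {
    "response": {"prediction": {"probability": 0.27, "details": {"note": "unused"}}},
}
probability = select_field(response, "/response/prediction/probability")
print(host, probability)
```

Selecting a single field this way keeps the rest of the response out of the enriched record, which is the purpose of the Record Path property.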
Properties