Tracking Model Metrics
Tracking model metrics without deploying a model
We recommend that you develop and test model metrics in a workbench session before deploying the model. This workflow avoids the need to rebuild and redeploy a model to test every change. Metrics tracked in this way are stored in a local, in-memory datastore instead of the metrics database, and are discarded when the session exits. You can access these metrics in the same session using the regular metrics API in the cdsw.py file.
The following example demonstrates how to track metrics locally within a session, and how to use the read_metrics function to read those metrics back in the same session by querying over a time window.
- use_model_metrics.py
- predict_with_metrics.py
The predict function in the predict_with_metrics.py file, shown in the following example, is similar to the function of the same name in the predict.py file. It takes input and returns output, and can be deployed as a model. Unlike the function in the predict.py file, however, the predict function in predict_with_metrics.py also tracks metrics. These can include information such as the input, the output, feature values, convergence metrics, and error estimates. In this simple example, only the input and output are tracked. The function is equipped to track metrics by applying the cdsw.model_metrics decorator.
import cdsw

# model is assumed to be a fitted estimator loaded earlier in the
# session, for example with joblib.

@cdsw.model_metrics
def predict(args):
    # Track the input.
    cdsw.track_metric("input", args)
    # If this model involved features, i.e. transformations of the
    # raw input, they could be tracked as well.
    # cdsw.track_metric("feature_vars", {"a": 1, "b": 23})
    petal_length = float(args.get('petal_length'))
    result = model.predict([[petal_length]])
    # Track the output.
    cdsw.track_metric("predict_result", result[0][0])
    return result[0][0]

predict({"petal_length": 3})
To read metrics tracked in a session, set the dev keyword argument to True in the use_model_metrics.py file. You can query the metrics by model, model build, or model deployment using the variables cdsw.dev_model_crn, cdsw.dev_model_build_crn, or cdsw.dev_model_deployment_crn, respectively. For example:

end_timestamp_ms = int(round(time.time() * 1000))
cdsw.read_metrics(model_deployment_crn=cdsw.dev_model_deployment_crn,
                  start_timestamp_ms=0,
                  end_timestamp_ms=end_timestamp_ms,
                  dev=True)
Here, CRN denotes a Cloudera Resource Name, a unique identifier from CDP, analogous to an Amazon ARN.
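Putting the pieces together, the following is a minimal dev-mode round trip. It assumes the predict function above has already been defined in the session, and that read_metrics returns the tracked predictions under a "metrics" key; treat that response layout as an assumption if your version differs.

import time
import cdsw

# Make a few tracked predictions; in dev mode these land in the
# local, in-memory datastore.
for petal_length in [1.5, 3.0, 4.5]:
    predict({"petal_length": petal_length})

# Read back everything tracked since the epoch.
end_timestamp_ms = int(round(time.time() * 1000))
data = cdsw.read_metrics(model_deployment_crn=cdsw.dev_model_deployment_crn,
                         start_timestamp_ms=0,
                         end_timestamp_ms=end_timestamp_ms,
                         dev=True)

# Each entry corresponds to one tracked prediction.
for entry in data["metrics"]:
    print(entry["metrics"])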
Tracking metrics for deployed models
When you have finished developing your metrics tracking code and
the code that consumes the metrics, simply deploy the predict function from predict_with_metrics.py
as a model. No code changes
are necessary.
Calls to read_metrics, track_delayed_metrics, and track_aggregate_metrics need to be changed to take the CRN of the deployed model, model build, or model deployment, as in the sketch below. These CRNs can be found on the model’s Overview page.
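The following sketch illustrates this against a deployed model. The CRN value is a placeholder for one copied from the Overview page, and the metric names (actual_result, accuracy) and the predictionUuid response field are illustrative assumptions rather than required names.

import time
import cdsw

# Placeholder: copy the real value from the model's Overview page.
model_deployment_crn = "crn:cdp:ml:..."  # hypothetical value

end_timestamp_ms = int(round(time.time() * 1000))

# Read metrics for the deployed model; note there is no dev=True here.
data = cdsw.read_metrics(model_deployment_crn=model_deployment_crn,
                         start_timestamp_ms=0,
                         end_timestamp_ms=end_timestamp_ms)

# Attach a ground-truth value to one previously tracked prediction,
# identified by its UUID.
prediction_uuid = data["metrics"][0]["predictionUuid"]
cdsw.track_delayed_metrics({"actual_result": 1.0}, prediction_uuid)

# Record an aggregate metric over the same time window.
cdsw.track_aggregate_metrics({"accuracy": 0.95},
                             0, end_timestamp_ms,
                             model_deployment_crn=model_deployment_crn)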
Calls to call_model will also require the model’s access key (model_access_key in use_model_metrics.py) from the model’s Settings page. If authentication has been enabled for the model (the default), a model API key for the user (model_api_token in use_model_metrics.py) is also required. This can be obtained from the user’s Settings page.
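As a final sketch, the call below assumes call_model accepts the access key, the input payload, and an api_key keyword argument when authentication is enabled, as in use_model_metrics.py; both key values are placeholders.

import cdsw

# Placeholders: copy the real values from the model's Settings page
# and the user's Settings page, respectively.
model_access_key = "..."  # hypothetical value
model_api_token = "..."   # hypothetical value

# Call the deployed model; metrics tracked by the predict function are
# now written to the metrics database instead of the in-memory store.
response = cdsw.call_model(model_access_key,
                           {"petal_length": 3},
                           api_key=model_api_token)
print(response)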