Deploying models with Canary deployment using API
Cloudera AI Inference service lets you control the percentage of traffic that is routed to a specific model deployment.
-
Deploy a model. The traffic value defaults to 100. The following is an example
payload:
# cat ./examples/mlflow/model-spec-cml-registry.json
{ "namespace": "serving-default", "name": "mlflow-wine-test-from-registry-onnx", "source": { "registry_source": { "version": 1, "model_id": "yf0o-hrxq-l0xj-8tk9" } } }
-
After deploying a model to the endpoint, direct traffic to the canary
model by updating the model with the new version and the desired traffic value:
# cat ./examples/mlflow/model-spec-cml-registry-canary.json
{ "namespace": "serving-default", "name": "mlflow-wine-test-from-registry-onnx", "source": { "registry_source": { "version": 2, "model_id": "yf0o-hrxq-l0xj-8tk9" } }, "traffic": "35" }
The percentage set in the traffic field is routed to the newly deployed model version, and the remaining traffic goes to the previously deployed model version. With a traffic value of 35, for example, 35% of requests reach version 2 and 65% continue to reach version 1.
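The split described above is simple arithmetic, sketched here in Python. The function name `traffic_split` and the 0 to 100 range check are assumptions for illustration; they are not part of the Cloudera AI Inference API.

```python
def traffic_split(traffic):
    """Return the share of requests expected at each model version.

    `traffic` is the value of the "traffic" field; the payloads in this
    section pass it as a string, so it is converted to an integer first.
    """
    new_share = int(traffic)
    # Assumption: the value is a percentage, so it must lie in 0..100.
    if not 0 <= new_share <= 100:
        raise ValueError(f"traffic must be between 0 and 100, got {traffic!r}")
    return {"new_version": new_share, "previous_version": 100 - new_share}


print(traffic_split("35"))  # {'new_version': 35, 'previous_version': 65}
```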
-
When you want the canary model to serve all incoming requests, update the
model endpoint with the traffic value set to 100:
# cat ./examples/mlflow/model-spec-cml-registry-promote.json
{ "namespace": "serving-default", "name": "mlflow-wine-test-from-registry-onnx", "source": { "registry_source": { "version": 2, "model_id": "yf0o-hrxq-l0xj-8tk9" } }, "traffic": "100" }
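The canary and promotion payloads differ only in the traffic value, so one way to avoid hand-editing JSON is to generate both from a single helper. The following Python sketch does that; `build_model_spec` is a hypothetical helper name, and the field layout simply mirrors the example payloads in this section.

```python
import json


def build_model_spec(namespace, name, model_id, version, traffic=None):
    """Build a model spec dictionary shaped like the examples in this section."""
    spec = {
        "namespace": namespace,
        "name": name,
        "source": {"registry_source": {"version": version, "model_id": model_id}},
    }
    if traffic is not None:
        # The example payloads pass traffic as a string, so do the same here.
        spec["traffic"] = str(traffic)
    return spec


# Canary step: send 35% of requests to version 2.
canary = build_model_spec(
    "serving-default",
    "mlflow-wine-test-from-registry-onnx",
    "yf0o-hrxq-l0xj-8tk9",
    version=2,
    traffic=35,
)
# Promotion step: same spec, with all traffic sent to version 2.
promote = dict(canary, traffic="100")

print(json.dumps(canary))
print(json.dumps(promote))
```

Serializing with `json.dumps` guarantees syntactically valid JSON, which is easy to get wrong when adding a field by hand.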
-
Use the following command to send an inference request to the model:
curl -H "Content-Type: application/json" -H "Authorization: Bearer ${CDP_TOKEN}" https://${DOMAIN}/namespaces/serving-default/endpoints/mlflow-wine-test-from-registry-onnx/v2/models/yf0o-hrxq-l0xj-8tk9/infer -d @./examples/url/wine-input.json
{"model_name":"yf0o-hrxq-l0xj-8tk9","model_version":"2","outputs":[{"name":"variable","datatype":"FP32","shape":[1,1],"data":[0.0]}]}
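The example response includes a `model_version` field. Assuming that field identifies the version that handled the request (the source does not state this explicitly), counting it across several responses gives a rough check of the observed split. A minimal Python sketch, with `tally_versions` as a hypothetical helper name:

```python
import json
from collections import Counter


def tally_versions(response_bodies):
    """Count how many responses report each model_version."""
    counts = Counter()
    for body in response_bodies:
        counts[json.loads(body)["model_version"]] += 1
    return counts


# Response body copied from the example in this section.
sample = (
    '{"model_name":"yf0o-hrxq-l0xj-8tk9","model_version":"2",'
    '"outputs":[{"name":"variable","datatype":"FP32","shape":[1,1],"data":[0.0]}]}'
)

counts = tally_versions([sample, sample])
print(dict(counts))  # {'2': 2}
```

With a small number of requests the observed ratio can differ noticeably from the configured traffic value, so treat the tally as a sanity check rather than a precise measurement.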