Deploying Additional Model Frameworks

Cloudera AI Inference service now supports direct deployment for XGBoost, PyTorch, and TensorFlow models using the Cloudera AI Registry. Learn how to train, register, and deploy these models.

Prerequisites

To ensure compatibility with the inference runtime, use the following package versions in your Cloudera AI Workbench session:
Package        Version   Reason
mlflow         2.19.0    Serialization compatibility
torch          2.5.1     Matches runtime environment
tensorflow     2.18.0    Matches runtime environment
xgboost        3.1.2     API compatibility
scikit-learn   1.8.0     Pickle format compatibility
transformers   4.46.3    Model loading compatibility
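These versions can be pinned in a single command at the start of a Workbench session (a sketch; adjust to your environment and package index):

```shell
pip install \
  mlflow==2.19.0 \
  torch==2.5.1 \
  tensorflow==2.18.0 \
  xgboost==3.1.2 \
  scikit-learn==1.8.0 \
  transformers==4.46.3
```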

Steps

  1. Build and register the model artifact

    Follow the instructions for your specific framework to train the model and log it to the Cloudera AI Registry.

    XGBoost: Run your training script in the Cloudera AI Workbench. Generate the mandatory signature with infer_signature and register the artifact with mlflow.xgboost.log_model.

    Following is an example script to train a model, generate the mandatory signature, and log it to the registry:
    import xgboost as xgb
    import mlflow
    import mlflow.xgboost
    from mlflow.models.signature import infer_signature
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    
    # 1. Prepare Data
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X = X.astype('float32') # Use float32 to match runtime requirements
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    
    # 2. Train Model
    dtrain = xgb.DMatrix(X_train, label=y_train)
    params = {'objective': 'binary:logistic', 'eval_metric': 'logloss'}
    model = xgb.train(params, dtrain, num_boost_round=100)
    
    # 3. Create Signature (REQUIRED)
    # Use a sample input to infer the schema
    sample_input = X_train[:5]
    sample_output = model.predict(xgb.DMatrix(sample_input))
    signature = infer_signature(sample_input, sample_output)
    
    # 4. Log to Registry
    mlflow.set_experiment("xgboost-binary-classifier")
    with mlflow.start_run(run_name="xgboost-test"):
        mlflow.xgboost.log_model(
            xgb_model=model,
            artifact_path="model",
            signature=signature,
            registered_model_name="xgboost-binary-classifier"
        )

    PyTorch: Convert your model to TorchScript format using torch.jit.trace, then use mlflow.pytorch.log_model to log the traced artifact to the registry.

    Following is an example that traces your model and logs it as a TorchScript artifact:
    import torch
    import torch.nn as nn
    import mlflow
    import mlflow.pytorch
    from mlflow.models.signature import infer_signature
    
    # 1. Define Model
    class TinyCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(3, 16, 3, padding=1)
            self.fc = nn.Linear(16 * 224 * 224, 10) # Simplified for example
    
        def forward(self, x):
            x = torch.relu(self.conv(x))
            x = x.view(x.size(0), -1)
            return self.fc(x)
    
    # 2. Create Dummy Input
    model = TinyCNN().eval()
    x = torch.randn(1, 3, 224, 224)
    
    # 3. Convert to TorchScript (Tracing)
    ts_model = torch.jit.trace(model, x)
    
    # 4. Log to Registry
    mlflow.set_experiment("tiny-torchscript-model")
    with mlflow.start_run():
        mlflow.pytorch.log_model(
            pytorch_model=ts_model,
            artifact_path="model",
            signature=infer_signature(x.numpy(), model(x).detach().numpy()),
            registered_model_name="pytorch-torchscript-test"
        )
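Before logging, it is worth confirming that the traced module reproduces the eager model's outputs. The following is a quick self-contained check (TinyCNN is repeated here so the snippet runs on its own):

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.fc = nn.Linear(16 * 224 * 224, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = x.view(x.size(0), -1)
        return self.fc(x)

model = TinyCNN().eval()
x = torch.randn(1, 3, 224, 224)
ts_model = torch.jit.trace(model, x)

# Tracing records the exact operations, so for a model with no
# data-dependent control flow the outputs should agree
with torch.no_grad():
    eager_out = model(x)
    traced_out = ts_model(x)
print(torch.allclose(eager_out, traced_out, atol=1e-6))  # True
```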

    TensorFlow: Ensure the model is saved in the SavedModel format. Log the model using mlflow.tensorflow.log_model with a defined input signature.

    Following is an example that builds the model and logs it with a defined input signature:

    import tensorflow as tf
    import numpy as np
    import mlflow
    import mlflow.tensorflow
    from mlflow.models.signature import infer_signature
    
    # 1. Create and Compile Model
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224, 224, 3)),
        tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10)
    ])
    
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    
    # 2. Create Signature
    sample_input = np.random.rand(1, 224, 224, 3).astype(np.float32)
    sample_output = model.predict(sample_input)
    signature = infer_signature(sample_input, sample_output)
    
    # 3. Log to Registry
    mlflow.set_experiment("tensorflow-savedmodel-test")
    with mlflow.start_run():
        mlflow.tensorflow.log_model(
            model=model,
            artifact_path="model",
            signature=signature,
            registered_model_name="tensorflow-savedmodel-test"
        )
  2. Deploy to Cloudera AI Inference service

    Once your model is successfully registered, follow these steps to deploy it.

    1. Navigate to the Registered Models page on the Cloudera AI control plane UI.
    2. Click on your model name.
    3. Click Deploy. The model endpoint creation dialog box is displayed.
    4. Select the Cloudera AI Inference service cluster you wish to deploy it to, and click Deploy.
    5. Alternatively, create the model endpoint programmatically using the API.
  3. Inference Payload Structure
    All supported frameworks require flattened input data. Below is the standard JSON payload format for your deployed endpoint.
    {
      "inputs": [{
        "name": "INPUT__0",
        "shape": [1, 224, 224, 3],
        "datatype": "FP32",
        "data": [/* Flattened image data */]
      }],
      "outputs": [{"name": "OUTPUT__0"}]
    }
    • XGBoost: Typically uses [batch_size, n_features]
    • TensorFlow: Typically uses [batch, height, width, channels] (channels-last)
    • PyTorch: Typically uses [batch, channels, height, width] (channels-first)
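As an illustration, the payload above can be built from a NumPy array by flattening it in row-major order. The helper below is a sketch; the tensor names INPUT__0 and OUTPUT__0 follow the example payload and should be matched to your deployed model's actual tensor names:

```python
import json
import numpy as np

def build_payload(arr, input_name="INPUT__0", output_name="OUTPUT__0"):
    """Serialize a float32 tensor into the flattened inference payload."""
    return json.dumps({
        "inputs": [{
            "name": input_name,
            "shape": list(arr.shape),
            "datatype": "FP32",
            # Row-major (C-order) flattening of the tensor values
            "data": arr.astype(np.float32).ravel().tolist(),
        }],
        "outputs": [{"name": output_name}],
    })

image = np.zeros((1, 224, 224, 3), dtype=np.float32)
parsed = json.loads(build_payload(image))
print(parsed["inputs"][0]["shape"])      # [1, 224, 224, 3]
print(len(parsed["inputs"][0]["data"]))  # 150528
```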