Deploying Additional Model Frameworks
Cloudera AI Inference service now supports direct deployment of XGBoost, PyTorch, and TensorFlow models through the Cloudera AI Registry. Learn how to train, register, and deploy these models.
Prerequisites
| Package | Version | Reason |
|---|---|---|
| mlflow | 2.19.0 | Serialization compatibility |
| torch | 2.5.1 | Matches runtime environment |
| tensorflow | 2.18.0 | Matches runtime environment |
| xgboost | 3.1.2 | API compatibility |
| scikit-learn | 1.8.0 | Pickle format compatibility |
| transformers | 4.46.3 | Model loading compatibility |
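The pinned versions above can be installed in your Workbench session before training, for example (a sketch; adjust to the package manager your session uses):

```shell
pip install mlflow==2.19.0 torch==2.5.1 tensorflow==2.18.0 \
    xgboost==3.1.2 scikit-learn==1.8.0 transformers==4.46.3
```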
Steps
- Build and register the model artifact
Follow the instructions for your specific framework to train the model and log it to the Cloudera AI Registry.
Run your training script in the Cloudera AI Workbench. For XGBoost models, ensure you generate a mandatory signature using infer_signature and use mlflow.xgboost.log_model to register the artifact.
The following example script trains a model, generates the mandatory signature, and logs it to the registry:

```python
import xgboost as xgb
import mlflow
import mlflow.xgboost
from mlflow.models.signature import infer_signature
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# 1. Prepare Data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X = X.astype('float32')  # Use float32 to match runtime requirements
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# 2. Train Model
dtrain = xgb.DMatrix(X_train, label=y_train)
params = {'objective': 'binary:logistic', 'eval_metric': 'logloss'}
model = xgb.train(params, dtrain, num_boost_round=100)

# 3. Create Signature (REQUIRED)
# Use a sample input to infer the schema
sample_input = X_train[:5]
sample_output = model.predict(xgb.DMatrix(sample_input))
signature = infer_signature(sample_input, sample_output)

# 4. Log to Registry
mlflow.set_experiment("xgboost-binary-classifier")
with mlflow.start_run(run_name="xgboost-test"):
    mlflow.xgboost.log_model(
        xgb_model=model,
        artifact_path="model",
        signature=signature,
        registered_model_name="xgboost-binary-classifier"
    )
```

For PyTorch models, convert your model to TorchScript format using torch.jit.trace, then log the traced artifact to the registry with mlflow.pytorch.log_model.
The following example traces the model and logs it as a TorchScript artifact:

```python
import torch
import torch.nn as nn
import mlflow
import mlflow.pytorch
from mlflow.models.signature import infer_signature

# 1. Define Model
class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.fc = nn.Linear(16 * 224 * 224, 10)  # Simplified for example

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = x.view(x.size(0), -1)
        return self.fc(x)

# 2. Create Dummy Input
model = TinyCNN().eval()
x = torch.randn(1, 3, 224, 224)

# 3. Convert to TorchScript (Tracing)
ts_model = torch.jit.trace(model, x)

# 4. Log to Registry
mlflow.set_experiment("tiny-torchscript-model")
with mlflow.start_run():
    mlflow.pytorch.log_model(
        pytorch_model=ts_model,
        artifact_path="model",
        signature=infer_signature(x.numpy(), model(x).detach().numpy()),
        registered_model_name="pytorch-torchscript-test"
    )
```

For TensorFlow models, ensure the model is saved in the SavedModel format and log it using mlflow.tensorflow.log_model with a defined input signature.
The following example saves the model and logs it with a defined input signature:

```python
import tensorflow as tf
import numpy as np
import mlflow
import mlflow.tensorflow
from mlflow.models.signature import infer_signature

# 1. Create and Compile Model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# 2. Create Signature
sample_input = np.random.rand(1, 224, 224, 3).astype(np.float32)
sample_output = model.predict(sample_input)
signature = infer_signature(sample_input, sample_output)

# 3. Log to Registry
mlflow.set_experiment("tensorflow-savedmodel-test")
with mlflow.start_run():
    mlflow.tensorflow.log_model(
        model=model,
        artifact_path="model",
        signature=signature,
        registered_model_name="tensorflow-savedmodel-test"
    )
```

- Deploy to Cloudera AI Inference service
Once your model is successfully registered, follow these steps to deploy it.
- Navigate to the Registered Models page on the Cloudera AI control plane UI.
- Click on your model name.
- Click Deploy. The model endpoint creation dialog box is displayed.
- Select the Cloudera AI Inference service cluster to which you want to deploy the model, and click Deploy.
- Create the model endpoint using either the UI or API.
- Inference Payload Structure

All supported frameworks require flattened input data. The following is the standard JSON payload format for your deployed endpoint:

```json
{
  "inputs": [{
    "name": "INPUT__0",
    "shape": [1, 224, 224, 3],
    "datatype": "FP32",
    "data": ["<flattened image data>"]
  }],
  "outputs": [{"name": "OUTPUT__0"}]
}
```

Typical input shapes by framework:

- XGBoost: [batch_size, n_features]
- TensorFlow: [batch, height, width, channels]
- PyTorch: [batch, channels, height, width]
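When calling the endpoint, the input tensor must be flattened row-major into the data field. The following is a minimal sketch of building such a payload and the corresponding HTTP request in Python; the endpoint URL and token are placeholders (assumptions), not values from this document:

```python
import json
import urllib.request

import numpy as np

# Placeholder values -- copy the real URL and token from your model
# endpoint details in the Cloudera AI control plane UI.
ENDPOINT_URL = "https://<your-endpoint-url>"
AUTH_TOKEN = "<your-token>"

# A single 224x224 RGB image in [batch, height, width, channels] layout
# (the TensorFlow-style shape used in the payload example above).
image = np.random.rand(1, 224, 224, 3).astype(np.float32)

payload = {
    "inputs": [{
        "name": "INPUT__0",
        "shape": list(image.shape),
        "datatype": "FP32",
        "data": image.flatten().tolist(),  # row-major flattening
    }],
    "outputs": [{"name": "OUTPUT__0"}],
}

# Build the request; pass it to urllib.request.urlopen() to invoke
# the deployed endpoint.
request = urllib.request.Request(
    ENDPOINT_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {AUTH_TOKEN}",
    },
    method="POST",
)
```

For an XGBoost endpoint, the same structure applies with a [batch_size, n_features] shape and the feature matrix flattened into data.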
