Deploying Additional Model Frameworks

Cloudera AI Inference service now supports direct deployment for XGBoost, PyTorch, and TensorFlow models using the Cloudera AI Registry. Learn how to train, register, and deploy these models.

Prerequisites

To ensure compatibility with the inference runtime, use the following package versions in your Cloudera AI Workbench session:
Package        Version   Reason
mlflow         2.19.0    Serialization compatibility
torch          2.5.1     Matches runtime environment
tensorflow     2.18.0    Matches runtime environment
xgboost        3.1.2     API compatibility
scikit-learn   1.8.0     Pickle format compatibility
transformers   4.46.3    Model loading compatibility
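These versions can be pinned in a single command at the start of a Workbench session (a sketch; adjust to your environment and package index):

```shell
pip install \
  mlflow==2.19.0 \
  torch==2.5.1 \
  tensorflow==2.18.0 \
  xgboost==3.1.2 \
  scikit-learn==1.8.0 \
  transformers==4.46.3
```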

Steps

  1. Build and register the model artifact

    Follow the instructions for your specific framework to train the model and log it to the Cloudera AI Registry.

    XGBoost: Run your training script in the Cloudera AI Workbench. Generate the mandatory signature with infer_signature and register the artifact with mlflow.xgboost.log_model.

    Following is an example script to train a model, generate the mandatory signature, and log it to the registry:
    import xgboost as xgb
    import mlflow
    import mlflow.xgboost
    from mlflow.models.signature import infer_signature
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    
    # 1. Prepare Data
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X = X.astype('float32') # Use float32 to match runtime requirements
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    
    # 2. Train Model
    dtrain = xgb.DMatrix(X_train, label=y_train)
    params = {'objective': 'binary:logistic', 'eval_metric': 'logloss'}
    model = xgb.train(params, dtrain, num_boost_round=100)
    
    # 3. Create Signature (REQUIRED)
    # Use a sample input to infer the schema
    sample_input = X_train[:5]
    sample_output = model.predict(xgb.DMatrix(sample_input))
    signature = infer_signature(sample_input, sample_output)
    
    # 4. Log to Registry
    mlflow.set_experiment("xgboost-binary-classifier")
    with mlflow.start_run(run_name="xgboost-test"):
        mlflow.xgboost.log_model(
            xgb_model=model,
            artifact_path="model",
            signature=signature,
            registered_model_name="xgboost-binary-classifier"
        )

    PyTorch: Convert your model to TorchScript format using torch.jit.trace, then use mlflow.pytorch.log_model to log the traced artifact to the registry.

    Following is an example that traces your model and logs it as a TorchScript artifact:
    import torch
    import torch.nn as nn
    import mlflow
    import mlflow.pytorch
    from mlflow.models.signature import infer_signature
    
    # 1. Define Model
    class TinyCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(3, 16, 3, padding=1)
            self.fc = nn.Linear(16 * 224 * 224, 10) # Simplified for example
    
        def forward(self, x):
            x = torch.relu(self.conv(x))
            x = x.view(x.size(0), -1)
            return self.fc(x)
    
    # 2. Create Dummy Input
    model = TinyCNN().eval()
    x = torch.randn(1, 3, 224, 224)
    
    # 3. Convert to TorchScript (Tracing)
    ts_model = torch.jit.trace(model, x)
    
    # 4. Log to Registry
    mlflow.set_experiment("tiny-torchscript-model")
    with mlflow.start_run():
        mlflow.pytorch.log_model(
            pytorch_model=ts_model,
            artifact_path="model",
            signature=infer_signature(x.numpy(), model(x).detach().numpy()),
            registered_model_name="pytorch-torchscript-test"
        )
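Before logging, it is worth confirming that the traced module reproduces the eager model's outputs. The following is a quick self-contained check (TinyCNN is repeated here so the snippet runs on its own):

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.fc = nn.Linear(16 * 224 * 224, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = x.view(x.size(0), -1)
        return self.fc(x)

model = TinyCNN().eval()
x = torch.randn(1, 3, 224, 224)
ts_model = torch.jit.trace(model, x)

# Tracing records the exact operations, so for a model with no
# data-dependent control flow the outputs should agree
with torch.no_grad():
    eager_out = model(x)
    traced_out = ts_model(x)
print(torch.allclose(eager_out, traced_out, atol=1e-6))  # True
```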

    TensorFlow: Ensure the model is saved in the SavedModel format. Log the model using mlflow.tensorflow.log_model with a defined input signature.

    Following is an example that builds the model and logs it with a defined input signature:

    import tensorflow as tf
    import numpy as np
    import mlflow
    import mlflow.tensorflow
    from mlflow.models.signature import infer_signature
    
    # 1. Create and Compile Model
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224, 224, 3)),
        tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10)
    ])
    
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    
    # 2. Create Signature
    sample_input = np.random.rand(1, 224, 224, 3).astype(np.float32)
    sample_output = model.predict(sample_input)
    signature = infer_signature(sample_input, sample_output)
    
    # 3. Log to Registry
    mlflow.set_experiment("tensorflow-savedmodel-test")
    with mlflow.start_run():
        mlflow.tensorflow.log_model(
            model=model,
            artifact_path="model",
            signature=signature,
            registered_model_name="tensorflow-savedmodel-test"
        )
  2. Deploy to Cloudera AI Inference service

    Once your model is successfully registered, follow these steps to deploy it.

    1. Navigate to the Registered Models page on the Cloudera AI control plane UI.
    2. Click on your model name.
    3. Click Deploy. The model endpoint creation dialog box is displayed.
    4. Select the Cloudera AI Inference service cluster you wish to deploy it to, and click Deploy.
    5. Alternatively, create the model endpoint programmatically using the API.
  3. Inference Payload Structure
    All supported frameworks require flattened input data. Below is the standard JSON payload format for your deployed endpoint.
    {
      "inputs": [{
        "name": "INPUT__0",
        "shape": [1, 224, 224, 3],
        "datatype": "FP32",
        "data": [/* Flattened image data */]
      }],
      "outputs": [{"name": "OUTPUT__0"}]
    }
    • XGBoost: Typically uses [batch_size, n_features]
    • TensorFlow: Typically uses [batch, height, width, channels] (channels-last)
    • PyTorch: Typically uses [batch, channels, height, width] (channels-first)
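As an illustration, the payload above can be built from a NumPy array by flattening it in row-major order. The helper below is a sketch; the tensor names INPUT__0 and OUTPUT__0 follow the example payload and should be matched to your deployed model's actual tensor names:

```python
import json
import numpy as np

def build_payload(arr, input_name="INPUT__0", output_name="OUTPUT__0"):
    """Serialize a float32 tensor into the flattened inference payload."""
    return json.dumps({
        "inputs": [{
            "name": input_name,
            "shape": list(arr.shape),
            "datatype": "FP32",
            # Row-major (C-order) flattening of the tensor values
            "data": arr.astype(np.float32).ravel().tolist(),
        }],
        "outputs": [{"name": output_name}],
    })

image = np.zeros((1, 224, 224, 3), dtype=np.float32)
parsed = json.loads(build_payload(image))
print(parsed["inputs"][0]["shape"])      # [1, 224, 224, 3]
print(len(parsed["inputs"][0]["data"]))  # 150528
```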