Supported Model Artifact Formats

Cloudera AI Inference service supports the following model artifact formats:

  • Text-generating Large Language Models (LLMs), embedding, ranking and object detection models packaged as NVIDIA NIM.
  • Hugging Face transformer models supported by the vLLM engine.
  • Predictive models in the ONNX representation, registered to Cloudera AI Registry from a Cloudera AI Workbench. See Register an ONNX model to Cloudera AI Registry for an example that converts a scikit-learn model to ONNX and then registers it to the Cloudera AI Registry. For models built with PyTorch or TensorFlow, refer to Export a PyTorch model to ONNX or Getting Started Converting TensorFlow to ONNX for how to convert them to the ONNX representation.