Supported Model Artifact Formats

Cloudera AI Inference service supports the following model artifact formats:

  • Text-generating Large Language Models (LLMs), embedding, ranking and object detection models packaged as NVIDIA NIM.
  • Hugging Face transformer models supported by the vLLM engine.
  • Predictive models in the ONNX representation, registered to Cloudera AI Registry from a Cloudera AI Workbench. For an example that converts a scikit-learn model to ONNX and then registers it, see Register an ONNX model to Cloudera AI Registry. For models built with PyTorch or TensorFlow, see Export a PyTorch model to ONNX or Getting Started Converting TensorFlow to ONNX for how to convert them to the ONNX representation.
