Interacting with Model Endpoints

You can interact with the Cloudera AI Inference service API using an HTTP/REST client, such as cURL.

Making an inference call to a Model Endpoint with an OpenAI API

Language models for text generation are deployed using NVIDIA's NIM microservices. These model endpoints are compliant with the OpenAI Protocol. See the NVIDIA NIM documentation for the supported OpenAI APIs and for NIM-specific extensions such as Function Calling and Structured Generation. A sketch of such a call is shown after this section.

Making an inference call to a Model Endpoint with Open Inference Protocol

Cloudera AI Inference service serves predictive ONNX models using the NVIDIA Triton Inference Server. The deployed model endpoints are compliant with Open Inference Protocol version 2.
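As an illustration of the OpenAI-compatible path described above, the following is a minimal cURL sketch of a chat completion request. The endpoint base URL, the model identifier, and the bearer-token authentication are placeholders, not values from this document; substitute the details shown for your own Model Endpoint.

    # Minimal sketch of an OpenAI-compatible chat completion call.
    # <endpoint-base-url>, <model-id>, and $CDP_TOKEN are placeholders
    # for values from your environment.
    curl "https://<endpoint-base-url>/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $CDP_TOKEN" \
      -d '{
            "model": "<model-id>",
            "messages": [
              {"role": "user", "content": "Write a haiku about GPUs."}
            ],
            "max_tokens": 128
          }'

A successful response follows the standard OpenAI chat completion schema, with the generated text in the choices array.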
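For the Open Inference Protocol path, a request is an HTTP POST to the endpoint's /v2/models/<model-name>/infer route with a JSON body describing the input tensors, as defined by Open Inference Protocol version 2. In this sketch the tensor name, shape, and datatype are hypothetical and must match the deployed ONNX model's actual signature.

    # Minimal sketch of an Open Inference Protocol v2 inference call.
    # The tensor name "input", the shape [1, 4], and the FP32 datatype
    # are placeholders; they must match your ONNX model's signature.
    curl "https://<endpoint-base-url>/v2/models/<model-name>/infer" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $CDP_TOKEN" \
      -d '{
            "inputs": [
              {
                "name": "input",
                "shape": [1, 4],
                "datatype": "FP32",
                "data": [6.8, 2.8, 4.8, 1.4]
              }
            ]
          }'

The response carries an outputs array in the same tensor format. The protocol's model metadata route (GET /v2/models/<model-name>) can be used to inspect the input and output names, shapes, and datatypes the model expects.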