Cloudera AI Inference service

OpenAI Inference Protocol Using Curl

The following example shows how to use curl to send an inference request to a model endpoint that uses the OpenAI inference protocol.

An example inference payload for the OpenAI protocol, saved as llama-input.json:

# cat ./llama-input.json
{
    "messages": [
        {
            "content": "You are a polite and respectful chatbot helping people plan a vacation.",
            "role": "system"
        },
        {
            "content": "What should I do for a 4 day vacation in Spain?",
            "role": "user"
        }
    ],
    "model": "meta/llama-3_1-8b-instruct",
    "max_tokens": 200,
    "top_p": 1,
    "n": 1,
    "stream": false,
    "stop": "\n",
    "frequency_penalty": 0.0
}

curl -H "Content-Type: application/json" \
     -H "Authorization: Bearer ${CDP_TOKEN}" \
     "https://${DOMAIN}/namespaces/serving-default/endpoints/llama-3-1/v1/chat/completions" \
     -d @./llama-input.json

You will receive a response similar to the following:
Spain offers a diverse range of experiences, making it perfect for a 4-day vacation…
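
The curl call above can be wrapped in a small script that checks its inputs before sending anything. The following is a minimal sketch, not part of the product: the function names build_url and send_request and the NAMESPACE and ENDPOINT variables are hypothetical, and the sketch assumes that the URL segments after "namespaces/" and "endpoints/" in the example are the namespace and the endpoint name. DOMAIN and CDP_TOKEN are the same variables the curl example uses.

```shell
#!/usr/bin/env bash
# Sketch only: build_url, send_request, NAMESPACE and ENDPOINT are
# hypothetical names introduced for illustration.

# Assemble the chat completions URL from its parts, following the
# pattern of the URL in the curl example.
build_url() {
  local domain="$1" namespace="$2" endpoint="$3"
  printf 'https://%s/namespaces/%s/endpoints/%s/v1/chat/completions\n' \
    "$domain" "$namespace" "$endpoint"
}

# Send a JSON payload file to the endpoint.
# Returns 1 without calling curl if a required input is missing.
send_request() {
  local payload="$1"
  local url

  if [ -z "${DOMAIN:-}" ] || [ -z "${CDP_TOKEN:-}" ]; then
    echo "DOMAIN and CDP_TOKEN must be set" >&2
    return 1
  fi
  if [ ! -r "$payload" ]; then
    echo "Cannot read payload file: $payload" >&2
    return 1
  fi

  # Defaults are the values from the example URL (assumed meaning).
  url="$(build_url "$DOMAIN" "${NAMESPACE:-serving-default}" "${ENDPOINT:-llama-3-1}")"

  # --fail makes curl exit non-zero on HTTP error responses;
  # -sS hides the progress meter but still prints error messages.
  curl --fail -sS \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${CDP_TOKEN}" \
    "$url" \
    -d @"$payload"
}
```

With DOMAIN and CDP_TOKEN exported, running send_request ./llama-input.json sends the payload shown earlier. Keeping the URL assembly in its own function makes it easier to point the same script at a different endpoint.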
