Supported transformer model architectures for vLLM 0.8.4
vLLM 0.8.4 supports the following transformer model architectures for Cloudera AI Inference service.
- aquilaforcausallm
- aquilamodel
- arcticforcausallm
- ariaforconditionalgeneration
- ayavisionforconditionalgeneration
- baichuanforcausallm
- bambaforcausallm
- bartforconditionalgeneration
- bartmodel
- bertforsequenceclassification
- bertmodel
- blip2forconditionalgeneration
- bloomforcausallm
- chameleonforconditionalgeneration
- chatglmforconditionalgeneration
- chatglmmodel
- cohere2forcausallm
- cohereforcausallm
- dbrxforcausallm
- decilmforcausallm
- deepseekforcausallm
- deepseekmtpmodel
- deepseekv2forcausallm
- deepseekv3forcausallm
- deepseekvlv2forcausallm
- eaglellamaforcausallm
- eaglemodel
- exaoneforcausallm
- falconforcausallm
- falconmambaforcausallm
- fairseq2llamaforcausallm
- florence2forconditionalgeneration
- fuyuforcausallm
- gemma2forcausallm
- gemma2model
- gemma3forcausallm
- gemma3forconditionalgeneration
- gemmaforcausallm
- glm4forcausallm
- glm4vforcausallm
- glmforcausallm
- gpt2lmheadmodel
- gptbigcodeforcausallm
- gptjforcausallm
- gptneoxforcausallm
- graniteforcausallm
- granitemoeforcausallm
- granitemoesharedforcausallm
- gritlm
- grok1modelforcausallm
- h2ovlchatmodel
- idefics3forconditionalgeneration
- internlm2forcausallm
- internlm2forrewardmodel
- internlm2veforcausallm
- internlm3forcausallm
- internlmforcausallm
- internvlchatmodel
- jaislmheadmodel
- jambaforcausallm
- jambaforsequenceclassification
- llama4forconditionalgeneration
- llamaforcausallm
- llamamodel
- llavaforconditionalgeneration
- llavanextforconditionalgeneration
- llavanextvideoforconditionalgeneration
- llavaonevisionforconditionalgeneration
- mambaforcausallm
- mamba2forcausallm
- mantisforconditionalgeneration
- medusamodel
- minicpm3forcausallm
- minicpmforcausallm
- minicpmo
- minicpmv
- minimaxtext01forcausallm
- mistral3forconditionalgeneration
- mistralforcausallm
- mistralmodel
- mixtralforcausallm
- mlpspeculatorpretrainedmodel
- mllamaforconditionalgeneration
- molmoforcausallm
- mptforcausallm
- nemotronforcausallm
- nvlm_d
- olmoforcausallm
- olmo2forcausallm
- olmoeforcausallm
- optforcausallm
- orionforcausallm
- paligemmaforconditionalgeneration
- persimmonforcausallm
- phi3forcausallm
- phi3smallforcausallm
- phi3vforcausallm
- phi4mmforcausallm
- phiforcausallm
- phimoeforcausallm
- pixtralforconditionalgeneration
- prithvigeospatialmae
- qwen2_5_vlforconditionalgeneration
- qwen2audioforconditionalgeneration
- qwen2forcausallm
- qwen2forprocessrewardmodel
- qwen2forrewardmodel
- qwen2forsequenceclassification
- qwen2model
- qwen2moeforcausallm
- qwen2vlforconditionalgeneration
- qwen3forcausallm
- qwen3moeforcausallm
- qwenvlforconditionalgeneration
- qwenlmheadmodel
- quantmixtralforcausallm
- robertaformaskedlm
- robertaforsequenceclassification
- robertamodel
- rwforcausallm
- skyworkr1vchatmodel
- smolvlmforconditionalgeneration
- solarforcausallm
- stablelmepochforcausallm
- stablelmforcausallm
- starcoder2forcausallm
- telechat2forcausallm
- teleflmforcausallm
- transformersforcausallm
- ultravoxmodel
- whisperforconditionalgeneration
- xlmrobertaforsequenceclassification
- xlmrobertamodel
- xverseforcausallm
- zamba2forcausallm
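
To check a model against this list before deploying it, one option is to compare the `architectures` field of the model's `config.json` with the names above. The following is a minimal sketch, not an official Cloudera or vLLM utility. It assumes that the names in this list are the lowercased form of the architecture class names declared in `config.json` (for example, `LlamaForCausalLM` becomes `llamaforcausallm`). The helper names `supported_architectures` and `check_model_dir` are hypothetical, and the set in the code is only an excerpt of the full list.

```python
import json
from pathlib import Path

# Excerpt of the list above; add the remaining entries as needed.
SUPPORTED_ARCHITECTURES = {
    "llamaforcausallm",
    "mistralforcausallm",
    "qwen2forcausallm",
    "gemma3forconditionalgeneration",
}


def supported_architectures(declared, supported=SUPPORTED_ARCHITECTURES):
    """Return the declared architecture names that appear in the supported set.

    The comparison is case-insensitive. Assumption: the documented list is
    the lowercased form of the names found in a model's config.json.
    """
    return [name for name in declared if name.lower() in supported]


def check_model_dir(model_dir):
    """Read config.json from a local model directory and return the matches."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # A missing "architectures" field is treated as no declared architecture.
    return supported_architectures(config.get("architectures", []))
```

For a model downloaded to a local directory, `check_model_dir("/path/to/model")` (a placeholder path) returns the declared architectures that match the list. An empty result suggests that the model is not covered by this list.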
