Commit Graph

972 Commits

Author SHA1 Message Date
2fc9075b82 [V1] Structured Outputs + Thinking compatibility (#16577)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-05-14 15:45:24 -07:00
964472b966 [Doc] Update prefix cache metrics to counting tokens (#18138)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-05-14 15:23:30 +00:00
9ccc6ded42 [doc] add missing import (#18133)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-14 10:57:34 +00:00
82e7f9bb03 [Misc] replace does not exist model (#18119)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-05-14 02:13:47 -07:00
63ad622233 [New Model]: support GTE NewModel (#17986) 2025-05-14 01:31:31 -07:00
0189a65a2e [Docs] Expand security doc with firewall info (#18081)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-05-13 19:36:00 +00:00
906f0598fc [doc] add download/list/delete HF model CLI usage (#17940)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-13 11:15:51 +00:00
acee8f48aa [Model] Support MiMo-7B inference with MTP (#17433)
Signed-off-by: wp-alpha <wangpeng66@xiaomi.com>
Co-authored-by: wangpeng66 <wangpeng66@xiaomi.com>
2025-05-12 23:25:33 +00:00
98ea35601c [Lora][Frontend]Add default local directory LoRA resolver plugin. (#16855)
Signed-off-by: jberkhahn <jaberkha@us.ibm.com>
2025-05-12 10:39:10 -07:00
3a5ea75129 [Feature] Support DeepSeekV3 Function Call (#17784)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Signed-off-by: Xu Wenqing <xuwq1993@qq.com>
2025-05-12 00:45:21 -07:00
021c16c7ca [Model] Broadcast Ovis2 implementation to fit Ovis1.6 (#17861)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-05-11 17:56:30 -07:00
e4b8713380 [New Model]: nomic-embed-text-v2-moe (#17785) 2025-05-11 00:59:43 -07:00
9cea90eab4 [Frontend] Add /classify endpoint (#17032)
Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>
2025-05-11 07:57:07 +00:00
d1110f5b5a [doc] update lora doc (#17936)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-11 15:56:21 +08:00
ec61ea20a8 [Misc] add dify integration (#17895)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-09 03:42:39 -07:00
ff8c400502 [Doc] remove visible token in doc (#17884)
Signed-off-by: yan <yanma1@habana.ai>
2025-05-09 01:21:31 -07:00
89a0315f4c [Doc] Update several links in reasoning_outputs.md (#17846)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-05-09 01:20:55 -07:00
3d1e387652 [Docs] Add Slides from NYC Meetup (#17879)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-05-08 21:46:54 -07:00
53d0cb7423 [Misc] add chatbox integration (#17828)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-08 10:05:26 +00:00
96722aa81d [Frontend] Chat template fallbacks for multimodal models (#17805)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-07 23:05:54 -07:00
7ea2adb802 [Core] Support full cuda graph in v1 (#16072)
Signed-off-by: Chanh Nguyen <cnguyen@linkedin.com>
Co-authored-by: Chanh Nguyen <cnguyen@linkedin.com>
2025-05-07 22:30:15 -07:00
66ab3b13c9 Don't call the venv vllm (#17810)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-08 04:06:39 +00:00
7377dd0307 [doc] update the issue link (#17782)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-07 20:29:05 +08:00
8a15c2603a [Frontend] Add missing chat templates for various MLLMs (#17758)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-07 00:10:01 -07:00
022afbeb4e Fix doc build performance (#17748)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-07 00:36:41 +00:00
6115b11582 Make right sidebar more readable in "Supported Models" (#17723)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-06 16:48:26 +00:00
7525d5f3d5 [doc] Add RAG Integration example (#17692)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-06 16:10:23 +00:00
0d115460a7 [Docs] Use gh-file to add links to tool_calling.md (#17709)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-05-06 15:27:19 +00:00
05e1f96419 Fix dockerfilegraph pre-commit hook (#17698)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-06 08:56:48 +00:00
63ced7b43f [Doc] Update notes for H2O-VL and Gemma3 (#17219)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-06 07:51:02 +00:00
999328be0d [Model] Add GraniteMoeHybrid 4.0 model (#17497)
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
Signed-off-by: Stanislaw Wozniak <stw@zurich.ibm.com>
Co-authored-by: Thomas Ortner <boh@zurich.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
2025-05-06 12:00:31 +08:00
98834fefaa Update nm to rht in doc links + refine fp8 doc (#17678)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-05-06 00:41:14 +00:00
d6484ef3c3 Add full API docs and improve the UX of navigating them (#17485)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-03 19:42:43 -07:00
182f40ea8b Add NVIDIA TensorRT Model Optimizer in vLLM documentation (#17561) 2025-05-02 11:36:46 -07:00
3a500cd0b6 [doc] miss result (#17589)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-02 07:04:49 -07:00
6d1479ca4b [doc] add the print result (#17584)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-02 05:24:45 -07:00
afb12e4294 [Doc] note that not all unit tests pass on CPU platforms (#17554)
Signed-off-by: David Xia <david@davidxia.com>
2025-05-02 02:57:21 +00:00
4acfa3354a [ROCm] update installation guide to include build aiter from source instructions (#17542)
Signed-off-by: Hongxia Yang <hongxia.yang@amd.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-05-01 11:01:28 -07:00
98060b001d [Feature][Frontend]: Deprecate --enable-reasoning (#17452)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-05-01 06:46:16 -07:00
7169f87ad0 [doc] add streamlit integration (#17522)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-01 13:34:02 +00:00
1144a8efe7 [Bugfix] Temporarily disable gptq_bitblas on ROCm (#17411)
Signed-off-by: Yan Cangang <nalanzeyu@gmail.com>
2025-04-30 19:51:45 -07:00
2ac74d098e [doc] add install tips (#17373)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-04-30 17:02:41 +00:00
0b7e701dd4 [Docs] Update optimization.md doc (#17482)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-04-30 09:34:02 -07:00
39317cf42b [Docs] Add command for running mypy tests from CI (#17475)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-30 08:06:09 -07:00
77073c77bc [Core] Prevent side-channel attacks via cache salting (#17045)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
2025-04-30 20:27:21 +08:00
54072f315f [MODEL ADDITION] Ovis2 Model Addition (#15826)
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-04-30 07:33:29 +00:00
ed6cfb90c8 [Hardware][Intel GPU] Upgrade to torch 2.7 (#17444)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Qiming Zhang <qiming1.zhang@intel.com>
2025-04-30 00:03:58 -07:00
a44c4f1d2f Support LoRA for Mistral3 (#17428)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-04-29 21:10:30 -07:00
2c4f59afc3 Update PyTorch to 2.7.0 (#16859) 2025-04-29 19:08:04 -07:00
792595b59d [TPU][V1][CI] Replace python3 setup.py develop with standard pip install --e on TPU (#17374)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-04-29 10:36:48 -07:00