1597 Commits

Author SHA1 Message Date
33a0ea5f32 [Docs] add Shanghai Meetup - 2025/10 (#27545)
Signed-off-by: Kebe <mail@kebe7jun.com>
Signed-off-by: esmeetu <jasonailu87@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: esmeetu <jasonailu87@gmail.com>
2025-10-31 00:33:13 +08:00
9956aae4ea [Model][Ouro] Support Ouro Model (#27794)
Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: yinfan.1024 <yinfan.1024@bytedance.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-10-30 22:34:41 +08:00
4e68cc9b6a [Model] Introduce Kimi Linear to vLLM (#27809)
Signed-off-by: lizhiyuan <lizhiyuan@moonshot.cn>
Signed-off-by: Zhiyuan Li <uniartisan2017@gmail.com>
2025-10-30 21:02:27 +08:00
4464723f22 [Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-30 12:13:05 +00:00
31b55ffc62 use stringData in secret yaml to store huggingface token (#25685)
Signed-off-by: yiting.jiang <yiting.jiang@daocloud.io>
2025-10-30 00:47:36 -07:00
8bff831f0a [Benchmark] Cleanup deprecated nightly benchmark and adjust the docstring for performance benchmark (#25786)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
2025-10-30 04:43:37 +00:00
17d055f527 [Feat] Adds runai distributed streamer (#27230)
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: omer-dayan <omdayan@nvidia.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-29 21:09:10 -07:00
b5bae42f91 [XPU] Update latest IPEX 2.8 release (#27735)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-10-30 11:17:13 +08:00
ecca3fee76 [Frontend] Add vllm bench sweep to CLI (#27639)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-29 05:59:48 -07:00
130aa8cbcf Add load pattern configuration guide to benchmarks (#26886)
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Signed-off-by: Matvei Pashkovskii <matvei.pashkovskii@amd.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-28 10:49:15 -07:00
5b3c35a68e [ROCm] [Doc] Update ROCm installation docs (#27327)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
2025-10-28 13:00:50 +08:00
69f064062b Code quality improvements: version update, type annotation enhancement, and enum usage simplification (#27581)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
2025-10-27 17:50:22 +00:00
4f882be4a0 [Model] Siglip2 Model Support (#27566)
Signed-off-by: piood <2477084691@qq.com>
2025-10-27 06:57:37 -07:00
7c2bdb83dc [Misc] Clean up utils (#27552)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-27 09:05:40 +00:00
2d631d28c6 [Doc] Slight improvement to M2 and beyond (#27554)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-27 09:02:10 +00:00
181bf5bbde [Docs] reemove the incorrect enable_reasoning parameter (#27550)
Signed-off-by: zxw <1020938856@qq.com>
2025-10-26 23:17:19 -07:00
8fb7b2fab9 [Doc] Fix links to GH projects (#27530)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-26 17:55:51 +08:00
be7b55a83d [Doc] Remove Molmo warning (#27527)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-26 16:22:52 +08:00
17af6aa0da [Document] Add ms-swift library to rlhf.md (#27469)
2025-10-24 20:31:50 +00:00
6454afec90 [Doc] Fix minor issues in docs/design/metrics.md (#27436)
Signed-off-by: Lifan Shen <lifans@meta.com>
2025-10-24 05:40:54 -07:00
88d3141ec6 [Docs] remove v1 column for embedding models (#27446)
Signed-off-by: piood <2477084691@qq.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-23 23:55:03 -07:00
85fee74b33 [Bugfix][CI] Move resolving cudagraph_mode before initializing attn_metadata_builder (#27427)
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
2025-10-23 20:31:14 -07:00
0552cfb195 [Model] Siglip Embedding Support (#27324)
Signed-off-by: piood <2477084691@qq.com>
2025-10-23 20:19:48 +00:00
3fa2c12185 [Frontend][4/N] Improve all pooling task | Add plugin pooling task (#26973)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Christian Pinto <christian.pinto@ibm.com>
2025-10-23 14:46:18 +00:00
fe2016de2d [CI/Build] Remove unnecessary flags from test registry (#27353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-23 14:42:40 +00:00
ff93cc8c84 [CORE] Support Prefix Caching with Prompt Embeds (#27219)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
2025-10-22 22:18:07 -07:00
1cb8c6c5fe [Doc] Fix numbering sequence in prefix caching (#27357)
Signed-off-by: William Song <jinwook@umich.edu>
2025-10-22 17:35:47 +00:00
e05a6754a8 [Model] Revert PR #26715: Restore custom PaliGemma and Gemma3-MM impl… (#27309)
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
2025-10-22 10:05:34 -07:00
58fab50d82 [Frontend] Require flag for loading text and image embeds (#27204)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-22 15:52:02 +00:00
675aa2ec64 [Model] Upstream Deepseek-OCR model (#27247)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-22 07:59:15 -07:00
141d3b9fc5 [docs] Update v1 metrics design doc (#27332)
Signed-off-by: Simon Mo <simon.mo@hey.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: atalhens <sneh.lata@nutanix.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: atalhens <sneh.lata@nutanix.com>
2025-10-22 06:29:15 -07:00
ceacedc1f9 [Benchmark] Add plot utility for parameter sweep (#27168)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-21 20:30:03 -07:00
265ecb05fb [DOC] [ROCm] Add ROCm quickstart guide (#26505)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
2025-10-22 03:10:48 +00:00
becb7de40b Update PyTorch to 2.9.0+cu129 (#24994)
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-10-21 17:20:18 -04:00
f32bf7582e [Model][VLM] Support Bee-8B Model (#27012)
Signed-off-by: uyzhang <yi.zhang.4096@gmail.com>
Signed-off-by: Yi Zhang <zhangyi970819@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-20 02:31:26 +00:00
b3aba04e5a [Benchmark] Convenience script for multiple parameter combinations (#27085)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-18 23:57:01 -07:00
83e760c57d [V1][Metrics][Plugin] Add plugin support for custom StatLoggerBase implementations (#22456)
Signed-off-by: tovam <tovam@pliops.com>
2025-10-18 15:12:46 -07:00
a1946c9f61 [Chore] Separate out profiling utilities from vllm.utils (#27150)
Signed-off-by: dongbo910220 <1275604947@qq.com>
2025-10-18 19:12:01 +00:00
12e21701e7 [DOC][FEATURES][CPU]update cpu feature for v1 (#27135)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-10-18 01:10:45 -07:00
b038d9c40c [Data-parallel] Allow DP>1 for world_size > num_gpus on node (8) (#26367)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Rui Qiao <ruisearch42@gmail.com>
2025-10-17 08:24:42 -07:00
6c9fdbf725 [Docs] Replace rst style double-backtick with md single-backtick (#27091)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:47:34 -07:00
483ea64611 [Docs] Replace all explicit anchors with real links (#27087)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:22:06 -07:00
bbc1b29665 Update troubleshooting.md and remind VLLM_TRACE_FUNCTION usage (#27069)
Signed-off-by: cong-meta <prowindy@hotmail.com>
2025-10-17 01:53:06 -07:00
acb1bfa601 [CI] fix docs build failed (#27082)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-10-17 07:53:40 +00:00
3aeb19a39e [Model] Add support for LightOnOCR (#26916)
Signed-off-by: Said Taghadouini <taghadouinisaid@gmail.com>
Signed-off-by: Said Taghadouini <84044788+staghado@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-17 05:05:24 +00:00
8c017b3490 [Model] Always use Transformers backend for PaliGemma and Gemma3-MM (#26715)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-17 05:03:35 +00:00
4ffd6e8942 [Docs] Reduce custom syntax used in docs (#27009)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-16 20:05:34 -07:00
fb0571b077 [GPTOSS][DP/EP][Marlin] Enable GPTOSS Batched DP/EP using Marlin kernels (#25997)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-10-16 12:53:11 -07:00
02d709a6f1 [docs] standardize Hugging Face env var to HF_TOKEN (deprecates HUGGING_FACE_HUB_TOKEN) (#27020)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-10-16 15:31:02 +01:00
509cdc0370 [DOC][XPU]update feature parity with Intel GPU (#26954)
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-10-15 20:07:10 -07:00