|
|
f54f85129e
|
[Model][2/N] Improve all pooling task | Support multi-vector retrieval (#25370)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-15 11:14:41 +00:00 |
|
|
|
6256697997
|
[Doc] ruff format remaining Python examples (#26795)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-15 01:25:49 -07:00 |
|
|
|
96b9aa5aa0
|
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): name change compilation level to compilation mode, deprecation compilation level (#26355)
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-15 02:51:16 +00:00 |
|
|
|
8317f72354
|
[Misc][DP] support customized aggregated logger for dp (#24354)
Signed-off-by: Lu Fang <fanglu@fb.com>
|
2025-10-13 17:45:59 -07:00 |
|
|
|
d2a7938582
|
[Frontend][1/N] Improve all pooling task | Support FP16 Embedding Base64 (Still uses fp32 by default). (#26414)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-10-13 19:06:43 +00:00 |
|
|
|
767c3ab869
|
[Model][0/N] Improve all pooling task | clean up (#25817)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-13 16:44:50 +08:00 |
|
|
|
3cd36660f7
|
docs: wrong command in structured_outputs README (#26677)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
|
2025-10-12 20:59:01 -07:00 |
|
|
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
|
|
c6187f55f7
|
Refactor MistralTokenizer (#26358)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-10-09 22:48:58 +00:00 |
|
|
|
e09d1753ec
|
Remove Python 3.9 support ahead of PyTorch 2.9 in v0.11.1 (#26416)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-08 10:40:42 -07:00 |
|
|
|
08d26a1b7e
|
[Model] Use merge_by_field_config for MM models (Ovis family) (#26308)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-07 12:54:22 +00:00 |
|
|
|
7e4cd070b0
|
[V0 Deprecation] Remove VLLM_USE_V1 from docs and scripts (#26336)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-07 16:46:44 +08:00 |
|
|
|
46b0779996
|
[BugFix] Update KV block hash type from BlockHash to ExternalBlockHash in kv_events_subscriber - #26264 (#26265)
Signed-off-by: atalhens <sneh.lata@nutanix.com>
|
2025-10-07 08:42:28 +00:00 |
|
|
|
19a00eb210
|
[Model] Use merge_by_field_config for MM models (Llava family) (#26280)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-06 09:45:26 +00:00 |
|
|
|
6c04638214
|
Fix per file ruff ignores related to line length (#26262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-06 05:12:40 +00:00 |
|
|
|
4e256cadc2
|
Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 09:18:11 -07:00 |
|
|
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
|
|
59a85c366e
|
[Model] Use merge_by_field_config for MM models (H-L) (#26230)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-05 11:54:17 +08:00 |
|
|
|
4570535ec4
|
[Model] CLIP Embedding Support (#26010)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-04 06:21:42 -07:00 |
|
|
|
f9a8084e48
|
[Model] Use merge_by_field_config for MM models (InternVL family) (#26153)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-03 01:59:06 -07:00 |
|
|
|
d00d652998
|
[CI/Build] Replace vllm.entrypoints.openai.api_server entrypoint with vllm serve command (#25967)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-02 10:04:57 -07:00 |
|
|
|
9a9f48dff7
|
[V1] [P/D] Add Support for KV Load Failure Recovery (#19330)
Signed-off-by: David Ben-David <davidb@pliops.com>
Co-authored-by: David Ben-David <davidb@pliops.com>
|
2025-09-30 14:57:08 -07:00 |
|
|
|
2f652e6cdf
|
[Doc] Improve MM Pooling model documentation (#25966)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-30 18:58:29 +00:00 |
|
|
|
8eb0a1d906
|
[Doc] Polish example for torchrun dp (#25899)
|
2025-09-29 21:31:34 +00:00 |
|
|
|
23b8ee672d
|
[Misc] Update openai client example file for multimodal (#25795)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-09-27 07:57:07 +00:00 |
|
|
|
c70ac4b8ff
|
[spec decode] Consolidate speculative decode method name for MTP (#25232)
Signed-off-by: zixi-qi <qizixi@meta.com>
|
2025-09-26 22:27:05 +00:00 |
|
|
|
6e30010d2f
|
fix: print outputt offline_inference/base/chat.py example (#25744)
Signed-off-by: Iceber Gu <caiwei95@hotmail.com>
|
2025-09-26 01:18:24 -07:00 |
|
|
|
8c853050e7
|
[Docs] Enable fail_on_warning for the docs build in CI (#25580)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-24 19:30:33 +00:00 |
|
|
|
867ecdd1c8
|
[Spec Decode][CI] Add e2e test for examples/spec_decode.py and prevent breaking Acceptance Length (#24531)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-09-23 10:46:40 -07:00 |
|
|
|
4c966e440e
|
[XPU] Fix MOE DP accuracy issue on XPU (#25465)
|
2025-09-23 14:32:57 +00:00 |
|
|
|
922979bfcc
|
[DP] support torchrun external launcher with Data Parallelism (#24899)
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2025-09-22 12:06:05 -07:00 |
|
|
|
7b57a433da
|
[Model] Support Dots OCR (#24645)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: yinz-aizip <yinz@aizip.ai>
|
2025-09-22 02:24:40 +00:00 |
|
|
|
bc6e542d9f
|
Remove V0 attention backends (#25351)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-21 16:03:28 -07:00 |
|
|
|
52c2a8d4ad
|
[V0 Deprecation] Remove LLMEngine (#25033)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-20 17:56:30 -07:00 |
|
|
|
6c117cff7d
|
[Frontend] Pass API server count to each process (#23717)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-20 01:15:19 +08:00 |
|
|
|
058525b997
|
Move PoolerConfig from config/__init__.py to config/pooler.py (#25181)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-19 11:02:55 +00:00 |
|
|
|
c4cb0af98a
|
[spec decode] Fix MTP inference path for MiMo-7B model (#25136)
Signed-off-by: zixi-qi <qizixi@meta.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-09-18 09:12:19 -07:00 |
|
|
|
5f696c33b1
|
[New Model] Support BertForTokenClassification / Named Entity Recognition (NER) task (#24872)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-18 23:22:01 +08:00 |
|
|
|
29283e8976
|
[Chore] Cleanup guided namespace, move to structured outputs config (#22772)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-18 09:20:27 +00:00 |
|
|
|
7ae9887542
|
[V1] Logits processor docs (#22919)
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: Joseph Marinier <Joseph.Marinier@gmail.com>
|
2025-09-17 11:53:12 -07:00 |
|
|
|
0f7acdd73c
|
[Model] Support Qwen3-VL Model Series (#24727)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-17 05:01:04 +00:00 |
|
|
|
567939953b
|
[Core/DBO][1/N] Add Dual-Batch Overlap mechanism to VLLM (#23693)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2025-09-16 12:21:48 -04:00 |
|
|
|
de3e53a75b
|
feat: Add Grafana and Perces monitoring dashboards for vLLM (#23498)
|
2025-09-16 05:53:40 -07:00 |
|
|
|
759ef49b15
|
Remove V0 Encoder-Decoder Support (#24907)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-15 21:17:14 -07:00 |
|
|
|
0e219cd50b
|
[Bugfix] Fix GLM4.1V multimodal processor with compatability for Transformers v4.56 (#24822)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-15 20:45:06 +08:00 |
|
|
|
bf214ca226
|
[Misc] Fix examples openai_pooling_client.py (#24853)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-15 11:57:30 +00:00 |
|
|
|
41ae4a1eab
|
[Doc]: fix typos in various files (#24798)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
|
2025-09-13 00:43:33 -07:00 |
|
|
|
7f2ea7074e
|
[Frontend][Multimodal] Allow skipping media data when UUIDs are provided. (#23950)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.me>
|
2025-09-13 02:16:06 +00:00 |
|
|
|
37e8182bfe
|
[v1] Add Whisper model support (encoder-decoder) (#21088)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
|
2025-09-10 13:53:35 -07:00 |
|
|
|
3d2a2de8f7
|
[RL] fast weight update with zmq + ipc handles (#24295)
Signed-off-by: huangweixiao <huangweixiao@msh.team>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-09-09 16:57:46 +08:00 |
|