|
|
c11013db8b
|
[Meta] Llama4 EAGLE Support (#20591)
Signed-off-by: qizixi <qizixi@meta.com>
Co-authored-by: qizixi <qizixi@meta.com>
|
2025-07-15 21:14:15 -07:00 |
|
|
|
e7e3e6d263
|
Voxtral (#20970)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-07-15 07:35:30 -07:00 |
|
|
|
33d560001e
|
[Docs] Improve documentation for ray cluster launcher helper script (#20602)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-07-15 03:55:45 -07:00 |
|
|
|
235bfd5dfe
|
[Docs] Improve documentation for RLHF example (#20598)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-07-15 01:54:10 -07:00 |
|
|
|
251595368f
|
Fix DeepSeek-R1-0528 chat template (#20717)
Signed-off-by: Benjamin Merkel <benjamin.merkel@tngtech.com>
Co-authored-by: Benjamin Merkel <benjamin.merkel@tngtech.com>
|
2025-07-10 17:47:36 +00:00 |
|
|
|
4bed167768
|
[Model][VLM] Support JinaVL Reranker (#20260)
Signed-off-by: shineran96 <shinewang96@gmail.com>
|
2025-07-10 10:43:43 -07:00 |
|
|
|
853487bc1b
|
[Docs] Improve docs for RLHF co-location example (#20599)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-09 08:06:43 -07:00 |
|
|
|
977180c912
|
[Docs] Improve documentation for multi-node service helper script (#20600)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-07-08 19:44:26 -07:00 |
|
|
|
b91cb3fa5c
|
[Docs] Improve documentation for Deepseek R1 on Ray Serve LLM (#20601)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-07-08 02:09:06 -07:00 |
|
|
|
72d14d0eed
|
[Frontend] [Core] Integrate Tensorizer into S3 loading machinery, allow passing arbitrary arguments during save/load (#19619)
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Co-authored-by: Eta <esyra@coreweave.com>
|
2025-07-07 22:47:43 -07:00 |
|
|
|
e60d422f19
|
[Docs] Improve docstring for ray data llm example (#20597)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-07-07 20:06:26 -07:00 |
|
|
|
110df74332
|
[Model][Last/4] Automatic conversion of CrossEncoding model (#19675)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-07-07 14:46:04 +00:00 |
|
|
|
9fb52e523a
|
[V1] Support any head size for FlexAttention backend (#20467)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-06 09:54:36 -07:00 |
|
|
|
e202dd2736
|
[V0 deprecation] Remove V0 CPU/XPU/TPU backends (#20412)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
|
2025-07-06 08:48:13 -07:00 |
|
|
|
fe1e924811
|
[Frontend] Support image object in llm.chat (#19635)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Flora Feng <4florafeng@gmail.com>
|
2025-07-06 06:47:13 +00:00 |
|
|
|
1caca5a589
|
[Misc] Add SPDX-FileCopyrightText (#20428)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-04 07:40:42 +00:00 |
|
|
|
a7bab0c9e5
|
[Misc] small update (#20462)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-03 20:33:44 -07:00 |
|
|
|
25950dca9b
|
Add ignore consolidated file in mistral example code (#20420)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
|
2025-07-04 02:55:07 +00:00 |
|
|
|
359200f6ac
|
[doc] fix link (#20417)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-03 00:21:57 -07:00 |
|
|
|
363528de27
|
[Feature] Support MiniMax-M1 function calls features (#20297)
Signed-off-by: QscQ <qscqesze@gmail.com>
Signed-off-by: qingjun <qingjun@minimaxi.com>
|
2025-07-03 06:48:27 +00:00 |
|
|
|
8452946c06
|
[Model][VLM] Support Keye-VL-8B-Preview (#20126)
Signed-off-by: Kwai-Keye <Keye@kuaishou.com>
|
2025-07-01 23:35:04 -07:00 |
|
|
|
314af8617c
|
[Docs] Update transcriptions API to use openai client with stream=True (#20271)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-07-01 15:47:13 +00:00 |
|
|
|
ed70f3c64f
|
Add GLM4.1V model (Draft) (#19331)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-01 12:48:26 +00:00 |
|
|
|
92ee7baaf9
|
[Example] add one-click runnable example for P2P NCCL XpYd (#20246)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
|
2025-06-30 21:03:55 -07:00 |
|
|
|
7151f92241
|
[Misc] Fix spec decode example (#20296)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-06-30 21:01:48 -07:00 |
|
|
|
2965c99c86
|
[Spec Decode] Clean up spec decode example (#20240)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-06-30 08:28:13 -07:00 |
|
|
|
d45417b804
|
fix ci issue distributed 4 gpu test (#20204)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-06-27 22:50:00 -07:00 |
|
|
|
9502c38138
|
[Benchmark][Bug] Fix multiple bugs in bench and add args to spec_decode offline (#20083)
|
2025-06-25 22:06:27 -07:00 |
|
|
|
e795d723ed
|
[Frontend] Add /v1/audio/translations OpenAI API endpoint (#19615)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-06-25 17:54:14 +00:00 |
|
|
|
26d34eb67e
|
refactor example - qwen3_reranker (#19847)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-24 14:03:20 +00:00 |
|
|
|
c3649e4fee
|
[Docs] Fix syntax highlighting of shell commands (#19870)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-06-23 17:59:09 +00:00 |
|
|
|
b82e0f82cb
|
[doc] use MkDocs collapsible blocks - supplement (#19973)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-23 10:54:16 +00:00 |
|
|
|
c3bf9bad11
|
[New model support]Support Tarsier2 (#19887)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
|
2025-06-21 04:01:51 +00:00 |
|
|
|
e384f2f108
|
[Misc] refactor example - openai_transcription_client (#19851)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-20 08:02:21 +00:00 |
|
|
|
089a306f19
|
[Misc] update cuda version (#19526)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-20 07:25:15 +00:00 |
|
|
|
1d0ae26c85
|
Add xLAM tool parser support (#17148)
|
2025-06-19 14:26:41 +08:00 |
|
|
|
799397ee4f
|
Support embedding models in V1 (#16188)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-06-18 21:36:33 -07:00 |
|
|
|
eccdc8318c
|
[V1][P/D] A native implementation of xPyD based on P2P NCCL (#18242)
Signed-off-by: Abatom <abzhonghua@gmail.com>
|
2025-06-18 06:32:36 +00:00 |
|
|
|
aed8468642
|
[Doc] Add missing llava family multi-image examples (#19698)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-06-17 07:05:21 +00:00 |
|
|
|
3e7506975c
|
[DOC] Add reasoning capability to vLLM streamlit code (#19557)
|
2025-06-16 07:09:12 -04:00 |
|
|
|
7b3c9ff91d
|
[Doc] uses absolute links for structured outputs (#19582)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-06-13 03:35:17 +00:00 |
|
|
|
dba68f9159
|
[Doc] Unify structured outputs examples (#18196)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-06-12 22:50:31 +00:00 |
|
|
|
017ef648e9
|
[Spec Decode][Benchmark] Generalize spec decode offline benchmark to more methods and datasets (#18847)
|
2025-06-12 10:30:56 -07:00 |
|
|
|
dff680001d
|
Fix typo (#19525)
Signed-off-by: 2niuhe <carlton2tang@gmail.com>
|
2025-06-12 09:24:45 +00:00 |
|
|
|
943ffa5703
|
[Bugfix] Update the example code, make it work with the latest lmcache (#19453)
Signed-off-by: Runzhen Wang <wangrunzhen@gmail.com>
|
2025-06-11 12:42:20 +00:00 |
|
|
|
3952731e8f
|
[New Model]: Support Qwen3 Embedding & Reranker (#19260)
|
2025-06-10 20:07:30 -07:00 |
|
|
|
6b1391ca7e
|
[Misc] refactor neuron_multimodal and profiling (#19397)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-10 06:12:42 +00:00 |
|
|
|
122cdca5f6
|
[Misc] refactor context extension (#19246)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-07 05:13:21 +00:00 |
|
|
|
c8dcc15921
|
Allow AsyncLLMEngine.generate to target a specific DP rank (#19102)
Signed-off-by: Jon Swenson <jmswen@gmail.com>
|
2025-06-04 08:26:47 -07:00 |
|
|
|
02658c2dfe
|
Add DeepSeek-R1-0528 function call chat template (#18874)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
|
2025-06-04 13:24:18 +00:00 |
|