| Commit | Message | Date |
| --- | --- | --- |
| 9b7edc0343 | cleanup data_parallel.py<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-07-03 13:02:12 +00:00 |
| 0e499c4f4d | first round of cleanups<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-07-02 21:11:28 +00:00 |
| 0767d9863f | fix data_parallel.py<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-07-02 19:25:59 +00:00 |
| c0efbbb5de | misc changes<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-07-02 16:56:30 +00:00 |
| f7a3ee0ea1 | Merge remote-tracking branch 'origin/main' into lwilkinson/attn-slicing<br>Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> | 2025-07-02 16:52:19 +00:00 |
| d833982e48 | random push<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-30 17:08:51 +00:00 |
| 2965c99c86 | [Spec Decode] Clean up spec decode example (#20240)<br>Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-06-30 08:28:13 -07:00 |
| 4672c72f44 | capture works replay does not<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-28 19:14:48 +00:00 |
| d45417b804 | fix ci issue distributed 4 gpu test (#20204)<br>Signed-off-by: yewentao256 <zhyanwentao@126.com> | 2025-06-27 22:50:00 -07:00 |
| 9502c38138 | [Benchmark][Bug] Fix multiple bugs in bench and add args to spec_decode offline (#20083) | 2025-06-25 22:06:27 -07:00 |
| 26d34eb67e | refactor example - qwen3_reranker (#19847)<br>Signed-off-by: reidliu41 <reid201711@gmail.com><br>Co-authored-by: reidliu41 <reid201711@gmail.com> | 2025-06-24 14:03:20 +00:00 |
| c3649e4fee | [Docs] Fix syntax highlighting of shell commands (#19870)<br>Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> | 2025-06-23 17:59:09 +00:00 |
| c3bf9bad11 | [New model support]Support Tarsier2 (#19887)<br>Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> | 2025-06-21 04:01:51 +00:00 |
| 799397ee4f | Support embedding models in V1 (#16188)<br>Signed-off-by: Max de Bayser <mbayser@br.ibm.com><br>Signed-off-by: Max de Bayser <maxdebayser@gmail.com><br>Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com><br>Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com> | 2025-06-18 21:36:33 -07:00 |
| 0889f66297 | Merge branch 'main' of https://github.com/neuralmagic/vllm into lwilkinson/attn-slicing | 2025-06-18 13:56:24 +00:00 |
| aed8468642 | [Doc] Add missing llava family multi-image examples (#19698)<br>Signed-off-by: Isotr0py <2037008807@qq.com> | 2025-06-17 07:05:21 +00:00 |
| 017ef648e9 | [Spec Decode][Benchmark] Generalize spec decode offline benchmark to more methods and datasets (#18847) | 2025-06-12 10:30:56 -07:00 |
| dff680001d | Fix typo (#19525)<br>Signed-off-by: 2niuhe <carlton2tang@gmail.com> | 2025-06-12 09:24:45 +00:00 |
| 3952731e8f | [New Model]: Support Qwen3 Embedding & Reranker (#19260) | 2025-06-10 20:07:30 -07:00 |
| 6b1391ca7e | [Misc] refactor neuron_multimodal and profiling (#19397)<br>Signed-off-by: reidliu41 <reid201711@gmail.com><br>Co-authored-by: reidliu41 <reid201711@gmail.com> | 2025-06-10 06:12:42 +00:00 |
| 642bf2dd8b | Merge branch 'main' of https://github.com/neuralmagic/vllm into lwilkinson/attn-slicing | 2025-06-08 18:02:06 +00:00 |
| 122cdca5f6 | [Misc] refactor context extension (#19246)<br>Signed-off-by: reidliu41 <reid201711@gmail.com><br>Co-authored-by: reidliu41 <reid201711@gmail.com> | 2025-06-07 05:13:21 +00:00 |
| f8848bb201 | misc fixes. lm_eval still gets a wrong answer but it no longer hangs<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-04 22:46:18 +00:00 |
| 3336c8cfbe | Fix #19130 (#19132)<br>Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> | 2025-06-04 01:42:06 -07:00 |
| 2e3484c237 | debugging<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-03 19:25:01 +00:00 |
| 02f0c7b220 | [Misc] Add SPDX-FileCopyrightText (#19100)<br>Signed-off-by: simon-mo <simon.mo@hey.com> | 2025-06-03 11:20:17 -07:00 |
| 1282bd812e | Add tarsier model support (#18985)<br>Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> | 2025-06-03 13:13:13 +08:00 |
| 18e7d6c7b8 | Merge branch 'main' of https://github.com/neuralmagic/vllm into lwilkinson/attn-slicing | 2025-06-03 00:52:39 +00:00 |
| 9112b443a0 | [Hardware][TPU] Initial support of model parallelism with single worker using SPMD (#18011)<br>Signed-off-by: Siyuan Liu <lsiyuan@google.com><br>Co-authored-by: Hossein Sarshar <hossein.sarshar@gmail.com><br>Co-authored-by: Chengji Yao <chengjiyao@google.com> | 2025-06-03 00:06:20 +00:00 |
| c57d577e8d | add an absolute path for run.sh (#18258)<br>Signed-off-by: calvin chen <120380290@qq.com> | 2025-06-02 19:38:23 +00:00 |
| 8332924320 | dp format<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-02 19:15:23 +00:00 |
| 8ea80fca4a | revert offline_inference/basic.py<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-02 18:05:48 +00:00 |
| 21d9529a79 | revert offline_inference/basic.py<br>Signed-off-by: Sage Moore <sage@neuralmagic.com> | 2025-06-02 18:05:26 +00:00 |
| 9a1b9b99d7 | [BugFix] Fix multi-node offline data-parallel (#18981)<br>Signed-off-by: Nick Hill <nhill@redhat.com><br>Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com> | 2025-05-31 08:34:52 -07:00 |
| 2a50ef5760 | [Neuron] Add Multi-Modal model support for Neuron (#18921)<br>Signed-off-by: Satyajith Chilappagari <satchill@amazon.com><br>Co-authored-by: Ashraf Mahgoub <ashymahg@amazon.com><br>Co-authored-by: Rohith Nallamaddi <nalrohit@amazon.com><br>Co-authored-by: FeliciaLuo <luof@amazon.com><br>Co-authored-by: Elaine Zhao <elaineyz@amazon.com> | 2025-05-31 10:39:11 +00:00 |
| 62da375465 | more fixes | 2025-05-30 21:17:06 +00:00 |
| 435fa95444 | [Frontend] add run batch to CLI (#18804)<br>Signed-off-by: reidliu41 <reid201711@gmail.com><br>Co-authored-by: reidliu41 <reid201711@gmail.com> | 2025-05-28 07:08:57 -07:00 |
| 3e9ce609bd | [Bugfix] Fix nomic max_model_len (#18755) | 2025-05-27 20:29:53 -07:00 |
| 06a0338015 | [V1][Metrics] Add API for accessing in-memory Prometheus metrics (#17010)<br>Signed-off-by: Mark McLoughlin <markmc@redhat.com> | 2025-05-27 09:37:06 +00:00 |
| fc6d0c290f | [Misc] improve docs (#18734)<br>Signed-off-by: reidliu41 <reid201711@gmail.com><br>Co-authored-by: reidliu41 <reid201711@gmail.com> | 2025-05-27 07:07:01 +00:00 |
| 753944fa9b | [Doc] Update reproducibility doc and example (#18741)<br>Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2025-05-27 07:03:13 +00:00 |
| 27bebcd897 | Convert examples to ruff-format (#18400)<br>Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-05-26 16:57:54 +00:00 |
| 75f81750f3 | [VLM] Initialize video input support for InternVL models (#18499)<br>Signed-off-by: Isotr0py <2037008807@qq.com><br>Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> | 2025-05-25 04:51:25 +00:00 |
| 4fc1bf813a | [Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454)<br>Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com><br>Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com> | 2025-05-23 16:16:26 -07:00 |
| 04eb88dc80 | Re-submit: Fix: Proper RGBA -> RGB conversion for PIL images. (#18569)<br>Signed-off-by: Chenheli Hua <huachenheli@outlook.com> | 2025-05-23 01:59:18 +00:00 |
| ffb740ae95 | manually manage stream<br>Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> | 2025-05-22 20:51:36 +00:00 |
| f93bdd3151 | support more args in dp example<br>Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> | 2025-05-22 20:51:35 +00:00 |
| df8f889f37 | support MLA<br>Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> | 2025-05-22 20:51:35 +00:00 |
| 37c9babaa0 | enable naive microbatching<br>Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> | 2025-05-22 20:51:35 +00:00 |
| cb506ecb5a | [Misc] improve Automatic Prefix Caching example (#18554)<br>Signed-off-by: reidliu41 <reid201711@gmail.com><br>Co-authored-by: reidliu41 <reid201711@gmail.com> | 2025-05-22 14:50:46 +00:00 |