8f18feb191
Remove last level references not removed in #26355 (#27260)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-22 09:18:17 +00:00
647214f3d5
[V0 Deprecation] Remove V0 executors (#27142)
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-10-21 11:09:37 -07:00
7a6c8c3fa1
[Chore] Separate out vllm.utils.network_utils (#27164)
...
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
2025-10-19 03:06:32 -07:00
4d4d6bad19
[Chore] Separate out vllm.utils.importlib (#27022)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-17 00:48:59 +00:00
e3fdb627d9
[FrontEnd] UNREVERT CompilationConfig overhaul (#20283): deprecate use_inductor in favor of backend, simplify custom_ops (#26502)
...
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
2025-10-13 22:47:16 +00:00
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
5728da11ea
Revert #26113 "[Frontend] CompilationConfig overhaul (#20283): deprecate use_inductor in favor of backend, simplify custom_ops" (#26472)
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-10-09 05:43:55 -07:00
0c824fc46f
[Frontend] CompilationConfig overhaul (#20283): deprecate use_inductor in favor of backend, simplify custom_ops (#26113)
...
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
2025-10-07 12:53:43 -07:00
4e256cadc2
Remove all references to yapf as it's no longer used (#26251)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 09:18:11 -07:00
d6953beb91
Convert formatting to use ruff instead of yapf + isort (#26247)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
502640c3f9
[Perf] Fix and reapply move apply w8a8 block fp8 linear to class (#25696)
...
Signed-off-by: ElizaWszola <ewszola@redhat.com>
Signed-off-by: ElizaWszola <elizaw.9289@gmail.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
2025-10-02 19:35:13 +00:00
be22bb6f3d
Run:ai model streamer add GCS package support (#24909)
...
Signed-off-by: Peter Schuurman <psch@google.com>
2025-10-01 20:59:13 -07:00
6a113d9aed
[V0 Deprecation] Remove vllm.worker and update according imports (#25901)
2025-09-29 23:26:11 +00:00
d346ec695e
[CI/Build] Consolidate model loader tests and requirements (#25765)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-26 21:45:20 -07:00
bc9d7b5595
[CI/Build] Split up Distributed Tests (#25572)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-26 14:49:33 +02:00
1260180c67
Revert "[Performance] Move apply_w8a8_block_fp8_linear to an op class… (#25607)
...
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
2025-09-25 08:05:21 +00:00
63400259d0
[Performance] Move apply_w8a8_block_fp8_linear to an op class (#24666)
...
Signed-off-by: ElizaWszola <ewszola@redhat.com>
Signed-off-by: ElizaWszola <elizaw.9289@gmail.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
2025-09-23 12:03:10 -07:00
d588cd2406
[Bugfix] fix custom op test (#25429)
...
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
2025-09-23 00:07:43 +00:00
6c036615dc
[V0 Deprecation] Remove misc V0 tests (#25118)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 19:41:55 -07:00
e3db5ebb66
[CI Bugfix] Fix failing test_model_load_with_params tests due to tokenizer refactor (#25086)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-17 11:15:05 -07:00
f36355abfd
Move LoadConfig from config/__init__.py to config/load.py (#24566)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-10 06:14:18 -07:00
7c195d43da
[ROCm][Bugfix] Fix Aiter RMSNorm (#23412)
...
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
2025-09-10 21:08:03 +08:00
37efc63b64
[V0 deprecation] Guided decoding (#21347)
...
Signed-off-by: Reza Barazesh <rezabarazesh@meta.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 03:15:30 -07:00
86ae693f20
[Deprecation][2/N] Replace --task with --runner and --convert (#21470)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-27 19:42:40 -07:00
1cd6eaba54
Support encoder-only models without KV-Cache (#21270)
...
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-07-26 21:09:52 +08:00
610852a423
[Core] Support model loader plugins (#21067)
...
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-07-24 01:49:44 -07:00
c401c64b4c
[CI/Build] Fix model executor tests (#21387)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-22 20:25:37 -07:00
d97841078b
[Misc] unify variable for LLM instance (#20996)
...
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-07-21 12:18:33 +01:00
c81259d33a
Fix/remove some broken model executor tests (#21224)
...
Signed-off-by: Rabi Mishra <ramishra@redhat.com>
2025-07-19 12:15:07 +00:00
d6902ce79f
[V0][V1][Core] Add outlines integration for V1, and update V0 integration. (#15975)
...
Signed-off-by: Nathan Hoos <thwackyy.y@gmail.com>
2025-07-10 15:30:26 -04:00
aafabaa0d5
[Fix][torch.compile] Enable custom ops by default when Inductor off (#20102)
...
Signed-off-by: luka <luka@neuralmagic.com>
2025-06-27 09:00:42 -06:00
61f4fc5dc6
[Bugfix][v1] Fix step pooler implementation and step pooling usage in v1 (#19956)
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-23 18:38:06 +00:00
02f0c7b220
[Misc] Add SPDX-FileCopyrightText (#19100)
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
b78f844a67
[Bugfix][FailingTest]Fix test_model_load_with_params.py (#18758)
...
Signed-off-by: rabi <ramishra@redhat.com>
2025-05-28 05:42:54 +00:00
a9944aabfa
fix: typos (#18151)
...
Signed-off-by: omahs <73983677+omahs@users.noreply.github.com>
2025-05-15 02:16:15 -07:00
612c2edb4f
[FEAT] [ROCm]: Add AITER CK 2 Stages MoE support (#17110)
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-05-14 03:03:11 -07:00
40de1ef455
[FEAT] [ROCm]: Add AITER Block-Scaled GEMM Feature (#14968)
...
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-05-13 19:08:20 -07:00
a6977dbd15
Simplify (and fix) passing of guided decoding backend options (#17008)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-29 19:02:23 +00:00
eef364723c
[FEAT] [ROCm]: AITER Fused MOE V1 Support (#16752)
...
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-04-25 11:06:50 +08:00
5ebf66748b
[FEAT][ROCm] Integrate Fused MoE Kernels from AITER (#14967)
...
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-03-26 16:30:30 +08:00
ec870fba9a
[FEAT] [ROCm]: Add AITER RMS Norm (Layer Norm) Feature (#14959)
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-03-21 22:36:14 -07:00
1f16b7fe74
[Core][V0] Add guidance backend for structured output (#14589)
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Loc Huynh <lohuynh@microsoft.com>
Co-authored-by: Michal Moskal <michal@moskal.me>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
2025-03-19 21:33:51 -07:00
c0efdd655b
[Fix][Structured Output] using vocab_size to construct matcher (#14868)
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2025-03-17 11:42:45 -04:00
cf069aa8aa
Update deprecated Python 3.8 typing (#13971)
2025-03-02 17:34:51 -08:00
bf33700ecd
[v0][structured output] Support reasoning output (#12955)
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
2025-03-02 14:49:42 -05:00
76c89fcadd
Use smaller embedding model when not testing model specifically (#13891)
2025-02-28 00:50:43 -08:00
bfbc0b32c6
[Frontend] Add backend-specific options for guided decoding (#13505)
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2025-02-20 15:07:58 -05:00
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files (#12628)
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com>
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com>
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
59a0192fb9
[Core] Interface for accessing model from VllmRunner (#10353)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-20 15:00:59 +08:00
d697dc01b4
[Bugfix] Fix RobertaModel loading (#11940)
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-01-11 14:05:09 +00:00