09e974d483
[Bugfix] Check dimensions of multimodal embeddings in V1 ( #15816 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-03-31 09:01:35 -07:00
e5ef4fa99a
Upgrade transformers to v4.50.3 ( #13905 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-03-31 08:59:37 -07:00
cf5c8f1686
Separate base model from TransformersModel ( #15467 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
Signed-off-by: Isotr0py <2037008807@qq.com >
Co-authored-by: Isotr0py <2037008807@qq.com >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2025-03-26 18:13:38 +08:00
97cfa65df7
Add pipeline parallel support to TransformersModel ( #12832 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
Signed-off-by: Isotr0py <2037008807@qq.com >
Co-authored-by: Isotr0py <2037008807@qq.com >
2025-03-25 10:41:45 +08:00
2bb0e1a799
[Bugfix][ROCm] running new process using spawn method for rocm in tests. ( #14810 )
...
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com >
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com >
Co-authored-by: TJian <tunjian.tan@embeddedllm.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2025-03-17 11:33:35 +00:00
b9b5bdfc7d
[Misc] Catching Ray Compiled Graph PP test failures for V1 ( #14847 )
2025-03-16 15:46:42 -07:00
d4d93db2c5
[V1] V1 Enablement Oracle ( #13726 )
...
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com >
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com >
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com >
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com >
Co-authored-by: Michael Goin <michael@neuralmagic.com >
2025-03-14 22:02:20 -07:00
601bd3268e
[Misc] Clean up type annotation for SupportsMultiModal ( #14794 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-03-14 00:59:56 -07:00
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
c9944acbf9
[misc] Rename Ray ADAG to Compiled Graph ( #13928 )
2025-02-26 20:03:28 -08:00
5d2965b7d7
[Bugfix] Fix 2 Node and Spec Decode tests ( #13341 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2025-02-16 22:20:22 +08:00
f2b20fe491
Consolidate Llama model usage in tests ( #13094 )
2025-02-13 22:18:03 -08:00
1bc3b5e71b
[VLM] Separate text-only and vision variants of the same model architecture ( #13157 )
2025-02-13 06:19:15 -08:00
9605c1256e
[V1][core] Implement pipeline parallel on Ray ( #12996 )
2025-02-13 08:02:46 +00:00
08b2d845d6
[Model] Ultravox Model: Support v0.5 Release ( #12912 )
...
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai >
2025-02-10 22:02:48 +00:00
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com >
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help
ensure
we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com >
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com >
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com >
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com >
2025-02-02 11:58:18 -08:00
d927dbcd88
[Model] Refactor Ultravox to use merged input processor ( #11198 )
...
Signed-off-by: Isotr0py <2037008807@qq.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2024-12-16 10:09:53 +00:00
ffa48c9146
[Model] PP support for Mamba-like models ( #10992 )
...
Signed-off-by: mzusman <mor.zusmann@gmail.com >
2024-12-10 21:53:37 -05:00
c889d5888b
[Doc] Explicitly state that PP isn't compatible with speculative decoding yet ( #10975 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-12-07 17:20:49 +00:00
9db713a1dc
[Model] Add OLMo November 2024 model ( #10503 )
2024-11-25 17:26:40 -05:00
972112d82f
[Bugfix] Fix unable to load some models ( #10312 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-11-14 16:55:54 -08:00
73b9083e99
[misc] improve cloudpickle registration and tests ( #10202 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
2024-11-11 00:10:53 +00:00
2b5bf20988
[torch.compile] Adding torch compile annotations to some models ( #9876 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
Co-authored-by: youkaichao <youkaichao@gmail.com >
2024-11-01 00:25:47 -07:00
d27cfbf791
[torch.compile] Adding torch compile annotations to some models ( #9641 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
Co-authored-by: youkaichao <youkaichao@gmail.com >
2024-10-24 09:31:42 -07:00
8a02cd045a
[torch.compile] Adding torch compile annotations to some models ( #9639 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
Co-authored-by: youkaichao <youkaichao@gmail.com >
2024-10-24 00:54:57 -07:00
836e8ef6ee
[Bugfix] Fix PP for ChatGLM and Molmo ( #9422 )
2024-10-24 06:12:05 +00:00
fc6c274626
[Model] Add Qwen2-Audio model support ( #9248 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-10-23 17:54:22 +00:00
b729901139
[Bugfix]: serialize config by value for --trust-remote-code ( #6751 )
...
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2024-10-21 19:46:24 -07:00
051eaf6db3
[Model] Add user-configurable task for models that support both generation and embedding ( #9424 )
2024-10-18 11:31:58 -07:00
b22b798471
[Model] PP support for embedding models and update docs ( #9090 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
2024-10-06 16:35:27 +08:00
0f6d7a9a34
[Models] Add remaining model PP support ( #7168 )
...
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai >
Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai >
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-10-04 10:56:58 +08:00
bc4eb65b54
[Bugfix] Fix Fuyu tensor parallel inference ( #8986 )
2024-10-01 17:51:41 +08:00
4b377d6feb
[BugFix] Fix test breakages from transformers 4.45 upgrade ( #8829 )
2024-09-26 16:46:43 -07:00
a79e522984
[Model] Support pp for qwen2-vl ( #8696 )
2024-09-23 13:46:59 +00:00
9b4a3b235e
[CI/Build] Enable InternVL2 PP test only on single node ( #8437 )
2024-09-13 06:35:20 +00:00
b61bd98f90
[CI/Build] Disable multi-node test for InternVL2 ( #8428 )
2024-09-12 15:05:35 -07:00
1230263e16
[Bugfix] Fix InternVL2 vision embeddings process with pipeline parallel ( #8299 )
2024-09-11 10:11:01 +08:00
8685ba1a1e
Inclusion of InternVLChatModel In PP_SUPPORTED_MODELS(Pipeline Parallelism) ( #7860 )
2024-09-05 11:33:37 +00:00
4706eb628e
[aDAG] Unflake aDAG + PP tests ( #7600 )
2024-08-16 20:49:30 -07:00
4cd7d47fed
[ci/test] rearrange tests and make adag test soft fail ( #7572 )
2024-08-15 19:39:04 -07:00
198d6a2898
[Core] Shut down aDAG workers with clean async llm engine exit ( #7224 )
...
Signed-off-by: Rui Qiao <ruisearch42@gmail.com >
2024-08-12 17:57:16 -07:00
a0d164567c
[ci][distributed] disable ray dag tests ( #7099 )
2024-08-02 22:32:04 -07:00
05308891e2
[Core] Pipeline parallel with Ray ADAG ( #6837 )
...
Support pipeline-parallelism with Ray accelerated DAG.
Signed-off-by: Rui Qiao <ruisearch42@gmail.com >
2024-08-02 13:55:40 -07:00
252357793d
[ci][distributed] try to fix pp test ( #7054 )
2024-08-01 22:03:12 -07:00
443c7cf4cf
[ci][distributed] fix flaky tests ( #6806 )
2024-07-25 17:44:09 -07:00
5e8ca973eb
[Bugfix] fix flashinfer cudagraph capture for PP ( #6708 )
2024-07-24 01:49:44 +00:00
b5672a112c
[Core] Multiprocessing Pipeline Parallel support ( #6130 )
...
Co-authored-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai >
2024-07-18 19:15:52 -07:00
f53b8f0d05
[ci][test] add correctness test for cpu offloading ( #6549 )
2024-07-18 23:41:06 +00:00
b5af8c223c
[Model] Pipeline parallel support for Mixtral ( #6516 )
2024-07-17 19:26:04 -07:00
5fa6e9876e
[Bugfix] Fix for multinode crash on 4 PP ( #6495 )
...
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai >
2024-07-17 08:25:10 +00:00