|
|
1b886aa104
|
[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1 (#9944)
Signed-off-by: FurtherAI <austin.veselka@lighton.ai>
Co-authored-by: FurtherAI <austin.veselka@lighton.ai>
|
2024-11-13 08:28:13 +00:00 |
|
|
|
d58268c56a
|
[V1] Make v1 more testable (#9888)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
|
2024-11-06 11:57:35 -08:00 |
|
|
|
21063c11c7
|
[CI/Build] drop support for Python 3.8 EOL (#8464)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2024-11-06 07:11:55 +00:00 |
|
|
|
cc98f1e079
|
[CI/Build] VLM Test Consolidation (#9372)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-10-30 09:32:17 -07:00 |
|
|
|
6650e6a930
|
[Model] Add classification Task with Qwen2ForSequenceClassification (#9704)
Signed-off-by: Kevin-Yang <ykcha9@gmail.com>
Co-authored-by: Kevin-Yang <ykcha9@gmail.com>
|
2024-10-26 17:53:35 +00:00 |
|
|
|
c866e0079d
|
[CI/Build] Fix VLM test failures when using transformers v4.46 (#9666)
|
2024-10-25 01:40:40 +08:00 |
|
|
|
3ff57ebfca
|
[Model] Initialize Florence-2 language backbone support (#9555)
|
2024-10-23 10:42:47 +00:00 |
|
|
|
831540cf04
|
[Model] Support E5-V (#9576)
|
2024-10-23 11:35:29 +08:00 |
|
|
|
3ddbe25502
|
[Hardware][CPU] using current_platform.is_cpu (#9536)
|
2024-10-22 00:50:43 -07:00 |
|
|
|
d11bf435a0
|
[MISC] Consolidate cleanup() and refactor offline_inference_with_prefix.py (#9510)
|
2024-10-18 14:30:55 -07:00 |
|
|
|
051eaf6db3
|
[Model] Add user-configurable task for models that support both generation and embedding (#9424)
|
2024-10-18 11:31:58 -07:00 |
|
|
|
7abba39ee6
|
[Model] VLM2Vec, the first multimodal embedding model in vLLM (#9303)
|
2024-10-16 14:31:00 +08:00 |
|
|
|
8c6de96ea1
|
[Model] Explicit interface for vLLM models and support OOT embedding models (#9108)
|
2024-10-07 06:10:35 +00:00 |
|
|
|
18b296fdb2
|
[core] remove beam search from the core (#9105)
|
2024-10-07 05:47:04 +00:00 |
|
|
|
168cab6bbf
|
[Frontend] API support for beam search (#9087)
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-10-05 23:39:03 -07:00 |
|
|
|
15986f598c
|
[Model] Support Gemma2 embedding model (#9004)
|
2024-10-05 06:57:05 +00:00 |
|
|
|
26aa325f4f
|
[Core][VLM] Test registration for OOT multimodal models (#8717)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-10-04 10:38:25 -07:00 |
|
|
|
e1a3f5e831
|
[CI/Build] Update models tests & examples (#8874)
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-09-28 09:54:35 -07:00 |
|
|
|
4b377d6feb
|
[BugFix] Fix test breakages from transformers 4.45 upgrade (#8829)
|
2024-09-26 16:46:43 -07:00 |
|
|
|
300da09177
|
[Kernel] Fullgraph and opcheck tests (#8479)
|
2024-09-25 08:35:52 -06:00 |
|
|
|
01b6f9e1f0
|
[Core][Bugfix] Support prompt_logprobs returned with speculative decoding (#8047)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
|
2024-09-24 17:29:56 -07:00 |
|
|
|
0250dd68c5
|
re-implement beam search on top of vllm core (#8726)
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
|
2024-09-23 22:08:12 -07:00 |
|
|
|
a8c1d161a7
|
[Core] *Prompt* logprobs support in Multi-step (#8199)
|
2024-09-18 08:38:43 -07:00 |
|
|
|
9d104b5beb
|
[CI/Build] Update Ruff version (#8469)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-09-18 11:00:56 +00:00 |
|
|
|
a84e598e21
|
[CI/Build] Reorganize models tests (#7820)
|
2024-09-13 10:20:06 -07:00 |
|
|
|
d31174a4e1
|
[Hotfix][Pixtral] Fix multiple images bugs (#8415)
|
2024-09-12 15:21:51 -07:00 |
|
|
|
6a512a00df
|
[model] Support for Llava-Next-Video model (#7559)
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-09-10 22:21:36 -07:00 |
|
|
|
2f707fcb35
|
[Model] Multi-input support for LLaVA (#8238)
|
2024-09-07 02:57:24 +00:00 |
|
|
|
4664ceaad6
|
support bitsandbytes 8-bit and FP4 quantized models (#7445)
|
2024-08-29 19:09:08 -04:00 |
|
|
|
5340a2dccf
|
[Model] Add multi-image input support for LLaVA-Next offline inference (#7230)
|
2024-08-28 07:09:02 +08:00 |
|
|
|
1ca0d4f86b
|
[Model] Add UltravoxModel and UltravoxConfig (#7615)
|
2024-08-21 22:49:39 +00:00 |
|
|
|
aae6927be0
|
[VLM][Model] Add test for InternViT vision encoder (#7409)
|
2024-08-20 23:10:20 +08:00 |
|
|
|
16422ea76f
|
[misc][plugin] add plugin system implementation (#7426)
|
2024-08-13 16:24:17 -07:00 |
|
|
|
7025b11d94
|
[Bugfix] Fix weight loading for Chameleon when TP>1 (#7410)
|
2024-08-13 05:33:41 +00:00 |
|
|
|
7eb4a51c5f
|
[Core] Support serving encoder/decoder models (#7258)
|
2024-08-09 10:39:41 +08:00 |
|
|
|
fd95e026e0
|
[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-08-06 16:51:47 -04:00 |
|
|
|
7b86e7c9cd
|
[Model] Add multi-image support for minicpmv (#7122)
Co-authored-by: hezhihui <hzh7269@modelbest.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-08-05 09:23:17 +08:00 |
|
|
|
fb3db61688
|
[CI/Build] Remove sparseml requirement from testing (#7037)
|
2024-08-01 12:00:51 -07:00 |
|
|
|
9e169a4c61
|
[Model] Adding support for MiniCPM-V (#4087)
|
2024-07-24 20:59:30 -07:00 |
|
|
|
97234be0ec
|
[Misc] Manage HTTP connections in one place (#6600)
|
2024-07-22 21:32:02 -07:00 |
|
|
|
9042d68362
|
[Misc] Consolidate and optimize logic for building padded tensors (#6541)
|
2024-07-20 04:17:24 +00:00 |
|
|
|
7bd82002ae
|
[Core] Allow specifying custom Executor (#6557)
|
2024-07-20 01:25:06 +00:00 |
|
|
|
38ef94888a
|
[CI/Build] Remove "boardwalk" image asset (#6460)
|
2024-07-16 08:59:36 -07:00 |
|
|
|
d97011512e
|
[CI/Build] vLLM cache directory for images (#6444)
|
2024-07-15 23:12:25 -07:00 |
|
|
|
9831aec49f
|
[Core] Dynamic image size support for VLMs (#5276)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: ywang96 <ywang@roblox.com>
Co-authored-by: xwjiang2010 <87673679+xwjiang2010@users.noreply.github.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2024-07-02 20:34:00 -07:00 |
|
|
|
4d26d806e1
|
Update conftest.py (#6076)
|
2024-07-02 20:14:22 +00:00 |
|
|
|
98d6682cd1
|
[VLM] Remove image_input_type from VLM config (#5852)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-07-02 07:57:09 +00:00 |
|
|
|
3476ed0809
|
[Core] Optimize block_manager_v2 vs block_manager_v1 (to make V2 default) (#5602)
|
2024-07-01 20:10:37 -07:00 |
|
|
|
2be6955a3f
|
[ci][distributed] fix device count call
[ci][distributed] fix some cuda init that makes it necessary to use spawn (#5991)
|
2024-06-30 08:06:13 +00:00 |
|
|
|
8dbfcd35bf
|
[ CI/Build ] Added E2E Test For Compressed Tensors (#5839)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
|
2024-06-29 21:12:58 +08:00 |
|