|
|
3f04a7fbf2
|
[Doc] Update V1 user guide for multi-modality (#15460)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-25 11:01:58 +00:00 |
|
|
|
97cfa65df7
|
Add pipeline parallel support to TransformersModel (#12832)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-03-25 10:41:45 +08:00 |
|
|
|
6dd55af6c9
|
[Doc] Update docs on handling OOM (#15357)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-03-24 14:29:34 -07:00 |
|
|
|
3eb08ed9b1
|
[DOC] Add Kubernetes deployment guide with CPUs (#14865)
|
2025-03-24 10:48:43 -07:00 |
|
|
|
761702fd19
|
[Core] Integrate fastsafetensors loader for loading model weights (#10647)
Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com>
|
2025-03-24 08:08:02 -07:00 |
|
|
|
3892e58ad7
|
[Misc] Upgrade BNB version (#15183)
|
2025-03-24 05:51:42 +00:00 |
|
|
|
9c5c81b0da
|
[Misc][Doc] Add note regarding loading generation_config by default (#15281)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-03-23 14:00:55 -07:00 |
|
|
|
d6cd59f122
|
[Frontend] Support tool calling and reasoning parser (#14511)
Signed-off-by: WangErXiao <863579016@qq.com>
|
2025-03-23 14:00:07 -07:00 |
|
|
|
50c9636d87
|
[V1][Usage] Refactor speculative decoding configuration and tests (#14434)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
|
2025-03-22 19:28:10 -10:00 |
|
|
|
b877031d80
|
Remove openvino support in favor of external plugin (#15339)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-22 14:06:39 -07:00 |
|
|
|
2f4bd358f1
|
[Model] Support Tele-FLM Model (#15023)
Signed-off-by: Naitong Yu <ntyu@baai.ac.cn>
Signed-off-by: jiangxin <horizon94@outlook.com>
Co-authored-by: Jason Fang <jasonfang3900@gmail.com>
Co-authored-by: jiangxin <horizon94@outlook.com>
|
2025-03-22 02:04:44 -07:00 |
|
|
|
baec0d4de9
|
Revert "[Feature] specify model in config.yaml (#14855)" (#15293)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-21 08:30:23 -07:00 |
|
|
|
61e8c18350
|
[Misc] Add cProfile helpers (#15074)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-21 04:56:09 -07:00 |
|
|
|
0fa3970deb
|
[Feature] specify model in config.yaml (#14855)
Signed-off-by: weizeng <weizeng@roblox.com>
|
2025-03-21 00:26:03 -07:00 |
|
|
|
7297941b38
|
[Doc] Update LWS docs (#15163)
Signed-off-by: Edwinhr716 <Edandres249@gmail.com>
|
2025-03-20 21:18:47 -07:00 |
|
|
|
6edbfa924d
|
Mention extra_body as a way top pass vLLM only parameters using the OpenAI client (#15240)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-20 19:18:36 -07:00 |
|
|
|
10f55fe6c5
|
[Misc] Clean up the BitsAndBytes arguments (#15140)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-03-20 19:17:12 -07:00 |
|
|
|
4cb1c05c9e
|
[Doc] Clarify run vllm only on one node in distributed inference (#15148)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
|
2025-03-20 09:55:59 +08:00 |
|
|
|
073d1ed354
|
[Doc] Update tip info on using latest transformers when creating a custom Dockerfile (#15070)
|
2025-03-19 13:33:40 +00:00 |
|
|
|
61f412187d
|
[Bugfix] Re-enable Gemma3 for V1 (#14980)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-18 23:58:22 -07:00 |
|
|
|
228b768db6
|
[Doc] Minor v1_user_guide update (#15064)
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2025-03-18 16:10:45 -07:00 |
|
|
|
452e8fd968
|
[MODEL] Add support for Zamba2 models (#13185)
Signed-off-by: Yury Tokpanov <yury@zyphra.com>
Signed-off-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-03-18 08:56:21 -07:00 |
|
|
|
f863ffc965
|
[Mistral-Small 3.1] Update docs and tests (#14977)
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-03-18 03:29:42 -07:00 |
|
|
|
d1695758b2
|
[Doc][V1] Fix V1 APC doc (#14920)
|
2025-03-18 08:15:46 +00:00 |
|
|
|
37e3806132
|
[Bugfix] Make Gemma3 MM V0 only for now (#14971)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-03-17 10:04:21 -07:00 |
|
|
|
90df7f23aa
|
[Doc] Add guidance for using ccache with pip install -e . in doc (#14901)
|
2025-03-16 23:10:04 +00:00 |
|
|
|
9ed6ee92d6
|
[Bugfix] EAGLE output norm bug (#14464)
Signed-off-by: Bryan Lu <yuzhelu@amazon.com>
|
2025-03-15 06:50:33 +00:00 |
|
|
|
aaacf17324
|
[Doc] V1 user guide (#13991)
Signed-off-by: Jennifer Zhao <7443418+JenZhao@users.noreply.github.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Co-authored-by: Jennifer Zhao <7443418+JenZhao@users.noreply.github.com>
Co-authored-by: Jennifer Zhao <JenZhao@users.noreply.github.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-03-14 22:17:59 -07:00 |
|
|
|
a2ae496589
|
[CPU] Support FP8 KV cache (#14741)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-03-14 22:07:36 -07:00 |
|
|
|
877e352262
|
[Docs] Add new East Coast vLLM Meetup slides to README and meetups.md (#14852)
|
2025-03-14 22:06:38 -07:00 |
|
|
|
54a8804455
|
[Doc] More neutral K8s deployment guide (#14084)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-03-14 16:12:36 -07:00 |
|
|
|
9d2b4a70f4
|
[V1][Metrics] Updated list of deprecated metrics in v0.8 (#14695)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-03-15 00:45:25 +08:00 |
|
|
|
601bd3268e
|
[Misc] Clean up type annotation for SupportsMultiModal (#14794)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-14 00:59:56 -07:00 |
|
|
|
95d680b862
|
[Bugfix][IPEX] Add VLLM_CPU_MOE_PREPACK to allow disabling MoE prepack when CPU does not support it (#14681)
Signed-off-by: Thien Tran <gau.nernst@yahoo.com.sg>
|
2025-03-13 20:43:18 -07:00 |
|
|
|
60c872d4b6
|
[Doc] Fix small typo in Transformers fallback (#14791)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-03-13 20:33:12 -07:00 |
|
|
|
3fb17d26c8
|
[Doc] Fix typo in documentation (#14783)
Signed-off-by: yasu52 <tsuguro4649@gmail.com>
|
2025-03-13 20:33:09 -07:00 |
|
|
|
b1cc4dfef5
|
[VLM] Support loading InternVideo2.5 models as original InternVLChatModel (#14738)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-03-13 03:10:02 -07:00 |
|
|
|
382403921f
|
[VLM] Support pan-and-scan for Gemma3 multi-modal processor (#14672)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-03-13 02:23:12 -07:00 |
|
|
|
c0c25e25fa
|
[Model] Add support for Gemma 3 (#14660)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-12 08:36:33 -07:00 |
|
|
|
c6e14a61ab
|
[Hardware][Intel GPU] upgrade IPEX dependency to 2.6.10. (#14564)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-03-11 17:11:47 +00:00 |
|
|
|
07964e2f30
|
docs: Add documentation for s390x cpu implementation (#14198)
Signed-off-by: Dilip Gowda Bhagavan <dilip.bhagavan@ibm.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-11 17:02:17 +00:00 |
|
|
|
af295e9b01
|
[Bugfix] Update --hf-overrides for Alibaba-NLP/gte-Qwen2 (#14609)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-11 07:59:43 -07:00 |
|
|
|
bc2d4473bf
|
[Docs] Make installation URLs nicer (#14556)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-10 10:43:08 -07:00 |
|
|
|
3b352a2f92
|
Correct capitalisation: VLLM -> vLLM (#14562)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-10 16:36:21 +00:00 |
|
|
|
001a9c7b0d
|
[Doc] Update PaliGemma note to a warning (#14565)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-10 15:02:28 +00:00 |
|
|
|
b0746fae3d
|
[Frontend] support image embeds (#13955)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-03-10 12:36:03 +00:00 |
|
|
|
60a98b2de5
|
[Docs] Mention model_impl arg when explaining Transformers fallback (#14552)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-10 12:13:10 +00:00 |
|
|
|
206e2577fa
|
Move requirements into their own directory (#12547)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-08 16:44:35 +00:00 |
|
|
|
cfd0ae8234
|
Add RLHF document (#14482)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-08 09:51:39 +00:00 |
|
|
|
be0b399d74
|
Add training doc signposting to TRL (#14439)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-08 07:35:07 +00:00 |
|