3f674a49b5  2024-08-14 17:55:42 +00:00
    [VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)

00c3d68e45  2024-08-13 17:39:33 +00:00
    [Frontend][Core] Add plumbing to support audio language models (#7446)

e6e42e4b17  2024-08-12 16:16:06 +08:00
    [Core][VLM] Support image embeddings as input (#6613)

757ac70a64  2024-08-08 14:02:41 +00:00
    [Model] Rename MiniCPMVQwen2 to MiniCPMV2.6 (#7273)

0e12cd67a8  2024-08-07 09:58:02 -07:00
    [Doc] Add online speculative decoding example (#7243)

789937af2e  2024-08-05 23:29:43 +00:00
    [Doc][SpecDecode] Update MLPSpeculator documentation (#7100)
    Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>

179a6a36f2  2024-08-04 08:12:41 +00:00
    [Model] Refactor MiniCPMV (#7020)
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

2f4e108f75  2024-07-31 14:39:19 +00:00
    [Bugfix] Clean up MiniCPM-V (#6939)
    Co-authored-by: hezhihui <hzh7269@modelbest.cn>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

7cbd9ec7a9  2024-07-29 10:16:30 +00:00
    [Model] Initialize support for InternVL2 series models (#6514)
    Co-authored-by: Roger Wang <ywang@roblox.com>

1ad86acf17  2024-07-27 11:53:07 +00:00
    [Model] Initial support for BLIP-2 (#5920)
    Co-authored-by: ywang96 <ywang@roblox.com>

ecb33a28cb  2024-07-27 09:54:14 +00:00
    [CI/Build][Doc] Update CI and Doc for VLM example changes (#6860)

281977bd6e  2024-07-26 17:32:44 -04:00
    [Doc] Add Nemotron to supported model docs (#6843)

9e169a4c61  2024-07-24 20:59:30 -07:00
    [Model] Add support for MiniCPM-V (#4087)

cb1362a889  2024-07-23 08:18:15 -07:00
    [Docs] Announce Llama 3.1 support (#6688)

22fa2e35cb  2024-07-22 23:50:48 -07:00
    [VLM][Model] Support image input for Chameleon (#6633)

739b61a348  2024-07-22 10:13:53 -07:00
    [Frontend] Refactor prompt processing (#4028)
    Co-authored-by: Roger Wang <ywang@roblox.com>

5bf35a91e4  2024-07-17 07:43:21 +00:00
    [Doc][CI/Build] Update docs and tests to use vllm serve (#6431)

94162beb9f  2024-07-16 10:11:04 -07:00
    [Doc] Fix the LoRA adapter path in server startup script (#6230)

6ef3bf912c  2024-07-14 07:58:09 +00:00
    Remove unnecessary trailing period in spec_decode.rst (#6405)

540c0368b1  2024-07-14 05:27:14 +00:00
    [Model] Initialize Fuyu-8B support (#3924)
    Co-authored-by: Roger Wang <ywang@roblox.com>

6206dcb29e  2024-07-07 09:25:50 +08:00
    [Model] Add PaliGemma (#5189)
    Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

9389380015  2024-07-06 17:18:59 +08:00
    [Doc] Move guide for multimodal model and other improvements (#6168)

175c43eca4  2024-07-06 05:59:36 +00:00
    [Doc] Reorganize Supported Models by Type (#6167)

ae96ef8fbd  2024-07-04 16:37:23 -07:00
    [VLM] Calculate maximum number of multi-modal tokens by model (#6121)

d9e98f42e4  2024-07-03 22:14:16 +00:00
    [VLM] Remove vision language config (#6089)
    Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
    Co-authored-by: Roger Wang <ywang@roblox.com>

9831aec49f  2024-07-02 20:34:00 -07:00
    [Core] Dynamic image size support for VLMs (#5276)
    Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
    Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
    Co-authored-by: ywang96 <ywang@roblox.com>
    Co-authored-by: xwjiang2010 <87673679+xwjiang2010@users.noreply.github.com>
    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>

9d6a8daa87  2024-07-02 23:11:29 +00:00
    [Model] Jamba support (#4115)
    Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
    Co-authored-by: Erez Schwartz <erezs@ai21.com>
    Co-authored-by: Mor Zusman <morz@ai21.com>
    Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
    Co-authored-by: Tomer Asida <tomera@ai21.com>
    Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
    Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>

98d6682cd1  2024-07-02 07:57:09 +00:00
    [VLM] Remove image_input_type from VLM config (#5852)
    Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: Roger Wang <ywang@roblox.com>

5cbe8d155c  2024-06-28 12:09:56 +00:00
    [Core] Registry for processing model inputs (#5214)
    Co-authored-by: ywang96 <ywang@roblox.com>

79c92c7c8a  2024-06-27 13:33:56 -07:00
    [Model] Add Gemma 2 (#5908)

96354d6a29  2024-06-27 16:03:04 +08:00
    [Model] Add base class for LoRA-supported models (#5018)

3aa7b6cf66  2024-06-25 20:34:25 -07:00
    [Misc][Doc] Add example of using OpenAI Server with VLM (#5832)

f23871e9ee  2024-06-25 01:25:03 -07:00
    [Doc] Add notice about breaking changes to VLMs (#5818)

1744cc99ba  2024-06-24 10:48:55 -07:00
    [Doc] Add Phi-3-medium to list of supported models (#5788)

daef218b55  2024-06-17 19:34:33 -07:00
    [Model] Initialize Phi-3-vision support (#4986)

0ce7b952f8  2024-06-13 11:22:07 -07:00
    [Doc] Update LLaVA docs (#5437)
    Co-authored-by: Roger Wang <ywang@roblox.com>

89ec06c33b  2024-06-11 10:31:56 -07:00
    [Docs][Spec decode] Fix docs error in code example (#5427)

4c2ffb28ff  2024-06-11 10:15:40 -07:00
    [Speculative decoding] Initial spec decode docs (#5400)

246598a6b1  2024-06-11 01:28:50 -07:00
    [CI] Doc fix (#5410)
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
    Co-authored-by: ywang96 <ywang@roblox.com>

856c990041  2024-06-10 09:53:50 -07:00
    [Docs] Add docs on limitations of VLM support (#5383)

6b29d6fe70  2024-06-10 12:47:15 +00:00
    [Model] Initial support for LLaVA-NeXT (#4199)
    Co-authored-by: Roger Wang <ywang@roblox.com>

7a9cb294ae  2024-06-07 11:23:32 -07:00
    [Frontend] Add OpenAI Vision API support (#5237)
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>

7a64d24aad  2024-06-02 22:56:41 -07:00
    [Core] Support image processor (#4197)

657579113f  2024-05-31 17:20:19 -07:00
    [Doc] Add checkmark for GPTBigCodeForCausalLM LoRA support (#5171)

8e192ff967  2024-05-24 22:00:52 -07:00
    [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)
    Co-authored-by: beagleski <yunanzhang@microsoft.com>
    Co-authored-by: bapatra <bapatra@microsoft.com>
    Co-authored-by: Barun Patra <codedecde@users.noreply.github.com>
    Co-authored-by: Michael Goin <michael@neuralmagic.com>

f12c3b5b3d  2024-05-21 14:24:17 +09:00
    [Model] Add Phi-2 LoRA support (#4886)

ac1fbf7fd2  2024-05-13 16:23:54 -07:00
    [Doc] Shorten README by removing supported model list (#4796)

e7c46b9527  2024-05-13 23:50:44 +09:00
    [Scheduler] Warning upon preemption and swapping (#4647)
    Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>

51d4094fda  2024-05-10 14:13:23 +09:00
    Fix chunked prefill doc syntax (#4603)
    Fix the docs: https://docs.vllm.ai/en/latest/models/performance.html
    Co-authored-by: sang <rkooo567@gmail.com>

36fb68f947  2024-05-04 00:18:00 -07:00
    [Doc] Chunked Prefill Documentation (#4580)