74f8c2cf5f
Add openai.beta.chat.completions.parse example to structured_outputs.rst ( #10433 )
2024-11-19 04:37:46 +00:00
31894a2155
[Doc] Add documentation for Structured Outputs ( #9943 )
...
Signed-off-by: ismael-dm <ismaeldm99@gmail.com >
2024-11-18 09:52:12 -08:00
4186be8111
[Doc] Update doc for LoRA support in GLM-4V ( #10425 )
...
Signed-off-by: B-201 <Joy25810@foxmail.com >
2024-11-18 15:08:30 +00:00
755b85359b
[doc] add doc for the plugin system ( #10372 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
2024-11-15 21:46:27 -08:00
2ac6d0e75b
[Misc] Consolidate pooler config overrides ( #10351 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-11-15 06:59:00 +00:00
b40cf6402e
[Model] Support Qwen2 embeddings and use tags to select model tests ( #10184 )
2024-11-14 20:23:09 -08:00
0b8bb86bf1
[1/N] Initial prototype for multi-modal processor ( #10044 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-11-13 12:39:03 +00:00
d909acf9fe
[Model][LoRA]LoRA support added for idefics3 ( #10281 )
...
Signed-off-by: B-201 <Joy25810@foxmail.com >
2024-11-13 17:25:59 +08:00
1b886aa104
[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1 ( #9944 )
...
Signed-off-by: FurtherAI <austin.veselka@lighton.ai >
Co-authored-by: FurtherAI <austin.veselka@lighton.ai >
2024-11-13 08:28:13 +00:00
3945c82346
[Model] Add support for Qwen2-VL video embeddings input & multiple image embeddings input with varied resolutions ( #10221 )
...
Signed-off-by: imkero <kerorek@outlook.com >
2024-11-13 07:07:22 +00:00
40d0e7411d
[Doc] Update FAQ links in spec_decode.rst ( #9662 )
...
Signed-off-by: whyiug <whyiug@hotmail.com >
2024-11-08 04:44:58 +00:00
406d4cc480
[Model][LoRA]LoRA support added for Qwen2VLForConditionalGeneration ( #10022 )
...
Signed-off-by: ericperfect <ericperfectttt@gmail.com >
2024-11-06 14:13:15 +00:00
a5bba7d234
[Model] Add Idefics3 support ( #9767 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com >
Signed-off-by: B-201 <Joy25810@foxmail.com >
Co-authored-by: B-201 <Joy25810@foxmail.com >
2024-11-06 11:41:17 +00:00
2003cc3513
[Model][LoRA]LoRA support added for LlamaEmbeddingModel ( #10071 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com >
2024-11-06 09:49:19 +00:00
6e056bcf04
[Doc] Update VLM doc about loading from local files ( #9999 )
...
Signed-off-by: Roger Wang <ywang@roblox.com >
2024-11-04 19:47:11 +00:00
54597724f4
[Model] Add support for H2OVL-Mississippi models ( #9747 )
...
Signed-off-by: Shanshan Wang <shanshan.wang@h2o.ai >
Signed-off-by: Roger Wang <ywang@roblox.com >
Co-authored-by: Roger Wang <ywang@roblox.com >
2024-11-04 00:15:36 +00:00
eed92f12fc
[Docs] Update Granite 3.0 models in supported models table ( #9930 )
...
Signed-off-by: Nick Hill <nhill@redhat.com >
Signed-off-by: Nick Hill <nickhill@us.ibm.com >
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2024-11-02 09:02:18 +00:00
ba0d892074
[Frontend] Use a proper chat template for VLM2Vec ( #9912 )
2024-11-01 14:09:07 +00:00
06386a64dd
[Frontend] Chat-based Embeddings API ( #9759 )
2024-11-01 08:13:35 +00:00
d3aa2a8b2f
[Doc] Update multi-input support ( #9906 )
2024-11-01 07:34:49 +00:00
2b5bf20988
[torch.compile] Adding torch compile annotations to some models ( #9876 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
Co-authored-by: youkaichao <youkaichao@gmail.com >
2024-11-01 00:25:47 -07:00
5608e611c2
[Doc] Update Qwen documentation ( #9869 )
2024-10-31 08:54:18 +00:00
c5d7fb9ddc
[Doc] fix third-party model example ( #9771 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com >
2024-10-28 19:39:21 -07:00
6650e6a930
[Model] Add classification Task with Qwen2ForSequenceClassification ( #9704 )
...
Signed-off-by: Kevin-Yang <ykcha9@gmail.com >
Co-authored-by: Kevin-Yang <ykcha9@gmail.com >
2024-10-26 17:53:35 +00:00
b979143d5b
[Doc] Move additional tips/notes to the top ( #9647 )
2024-10-24 09:43:59 +00:00
8a02cd045a
[torch.compile] Adding torch compile annotations to some models ( #9639 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com >
Co-authored-by: youkaichao <youkaichao@gmail.com >
2024-10-24 00:54:57 -07:00
836e8ef6ee
[Bugfix] Fix PP for ChatGLM and Molmo ( #9422 )
2024-10-24 06:12:05 +00:00
fc6c274626
[Model] Add Qwen2-Audio model support ( #9248 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-10-23 17:54:22 +00:00
831540cf04
[Model] Support E5-V ( #9576 )
2024-10-23 11:35:29 +08:00
bb392ea2d2
[Model][VLM] Initialize support for Mono-InternVL model ( #9528 )
2024-10-22 16:01:46 +00:00
f7db5f0fa9
[Doc] Use shell code-blocks and fix section headers ( #9508 )
...
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com >
2024-10-22 06:43:24 +00:00
f6b97293aa
[Model] FalconMamba Support ( #9325 )
2024-10-21 12:50:16 -04:00
3921a2f29e
[Model] Support Pixtral models in the HF Transformers format ( #9036 )
2024-10-18 13:29:56 -06:00
051eaf6db3
[Model] Add user-configurable task for models that support both generation and embedding ( #9424 )
2024-10-18 11:31:58 -07:00
81ede99ca4
[Core] Deprecating block manager v1 and make block manager v2 default ( #8704 )
...
Removing the block manager v1. This is the initial piece of prefix-caching-centric design. In order to achieve prefix-caching-centric design, we need to simplify the code path so that we only use v2 block manager (which has much higher performance on prefix caching).
2024-10-17 11:38:15 -05:00
5b8a1fde84
[Model][Bugfix] Add FATReLU activation and support for openbmb/MiniCPM-S-1B-sft ( #9396 )
2024-10-16 16:40:24 +00:00
59230ef32b
[Misc] Consolidate example usage of OpenAI client for multimodal models ( #9412 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk >
2024-10-16 11:20:51 +00:00
7abba39ee6
[Model] VLM2Vec, the first multimodal embedding model in vLLM ( #9303 )
2024-10-16 14:31:00 +08:00
8e836d982a
[Doc] Fix code formatting in spec_decode.rst ( #9348 )
2024-10-14 21:29:11 -07:00
169b530607
[Bugfix] Clean up some cruft in mamba.py ( #9343 )
2024-10-15 00:24:25 +00:00
dfe43a2071
[Model] Molmo vLLM Integration ( #9016 )
...
Co-authored-by: sanghol <sanghol@allenai.org >
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
Co-authored-by: Roger Wang <ywang@roblox.com >
2024-10-14 07:56:24 -07:00
8baf85e4e9
[Doc] Compatibility matrix for mutual exclusive features ( #8512 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com >
2024-10-11 11:18:50 -07:00
6cf1167c1a
[Model] Add GLM-4v support and meet vllm==0.6.2 ( #9242 )
2024-10-11 17:36:13 +00:00
7342a7d7f8
[Model] Support Mamba ( #6484 )
2024-10-11 15:40:06 +00:00
e808156f30
[Misc] Collect model support info in a single process per model ( #9233 )
2024-10-11 11:08:11 +00:00
04de9057ab
[Model] support input image embedding for minicpmv ( #9237 )
2024-10-10 15:00:47 +00:00
dc4aea677a
[Doc] Fix VLM prompt placeholder sample bug ( #9170 )
2024-10-09 08:59:42 +00:00
1874c6a1b0
[Doc] Update vlm.rst to include an example on videos ( #9155 )
...
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com >
2024-10-08 18:12:29 +00:00
151ef4efd2
[Model] Support NVLM-D and fix QK Norm in InternViT ( #9045 )
...
Co-authored-by: Roger Wang <ywang@roblox.com >
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn >
2024-10-07 11:55:12 +00:00
b22b798471
[Model] PP support for embedding models and update docs ( #9090 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
2024-10-06 16:35:27 +08:00