a5bba7d234 | [Model] Add Idefics3 support (#9767) | 2024-11-06 11:41:17 +00:00
    Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
    Signed-off-by: B-201 <Joy25810@foxmail.com>
    Co-authored-by: B-201 <Joy25810@foxmail.com>

21063c11c7 | [CI/Build] drop support for Python 3.8 EOL (#8464) | 2024-11-06 07:11:55 +00:00
    Signed-off-by: Aaron Pham <contact@aarnphm.xyz>

2bcbae704c | [Bugfix] Fix edge-case crash when using chat with the Mistral Tekken Tokenizer (#10051) | 2024-11-06 04:28:29 +00:00
    Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

02462465ea | [CI] Prune tests/models/decoder_only/language/* tests (#9940) | 2024-11-05 16:02:23 -05:00
    Signed-off-by: mgoin <michael@neuralmagic.com>

54597724f4 | [Model] Add support for H2OVL-Mississippi models (#9747) | 2024-11-04 00:15:36 +00:00
    Signed-off-by: Shanshan Wang <shanshan.wang@h2o.ai>
    Signed-off-by: Roger Wang <ywang@roblox.com>
    Co-authored-by: Roger Wang <ywang@roblox.com>

a78dd3303e | [Encoder Decoder] Add flash_attn kernel support for encoder-decoder models (#9559) | 2024-11-01 23:22:49 -07:00

6c0b7f548d | [Core][VLM] Add precise multi-modal placeholder tracking (#8346) | 2024-11-01 16:21:10 -07:00
    Signed-off-by: Peter Salas <peter@fixie.ai>

30a2e80742 | [CI/Build] Add Model Tests for PixtralHF (#9813) | 2024-11-01 07:55:29 -06:00

16b8f7a86f | [CI/Build] Add Model Tests for Qwen2-VL (#9846) | 2024-10-31 09:10:52 -07:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>

cc98f1e079 | [CI/Build] VLM Test Consolidation (#9372) | 2024-10-30 09:32:17 -07:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

ab6f981671 | [CI][Bugfix] Skip chameleon for transformers 4.46.1 (#9808) | 2024-10-29 11:12:43 -07:00

5f8d8075f9 | [Model][VLM] Add multi-video support for LLaVA-Onevision (#8905) | 2024-10-28 18:04:10 +00:00
    Co-authored-by: litianjian <litianjian@bytedance.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>

4e2d95e372 | [Hardware][ROCM] using current_platform.is_rocm (#9642) | 2024-10-28 04:07:00 +00:00
    Signed-off-by: wangshuai09 <391746016@qq.com>

3cb07a36a2 | [Misc] Upgrade to pytorch 2.5 (#9588) | 2024-10-27 09:44:24 +00:00
    Signed-off-by: Bill Nell <bill@neuralmagic.com>
    Signed-off-by: youkaichao <youkaichao@gmail.com>
    Co-authored-by: youkaichao <youkaichao@gmail.com>

6650e6a930 | [Model] Add classification Task with Qwen2ForSequenceClassification (#9704) | 2024-10-26 17:53:35 +00:00
    Signed-off-by: Kevin-Yang <ykcha9@gmail.com>
    Co-authored-by: Kevin-Yang <ykcha9@gmail.com>

9f7b4ba865 | [ci/Build] Skip Chameleon for transformers 4.46.0 on broadcast test #9675 (#9676) | 2024-10-24 20:59:00 -07:00

722d46edb9 | [Model] Compute Llava Next Max Tokens / Dummy Data From Gridpoints (#9650) | 2024-10-24 10:42:24 -07:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

c866e0079d | [CI/Build] Fix VLM test failures when using transformers v4.46 (#9666) | 2024-10-25 01:40:40 +08:00

31a08f5bd2 | [Model] Add min_pixels / max_pixels to Qwen2VL as mm_processor_kwargs (#9612) | 2024-10-23 14:05:18 +00:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>

3ff57ebfca | [Model] Initialize Florence-2 language backbone support (#9555) | 2024-10-23 10:42:47 +00:00

831540cf04 | [Model] Support E5-V (#9576) | 2024-10-23 11:35:29 +08:00

bb392ea2d2 | [Model][VLM] Initialize support for Mono-InternVL model (#9528) | 2024-10-22 16:01:46 +00:00

3ddbe25502 | [Hardware][CPU] using current_platform.is_cpu (#9536) | 2024-10-22 00:50:43 -07:00

f6b97293aa | [Model] FalconMamba Support (#9325) | 2024-10-21 12:50:16 -04:00

696b01af8f | [CI/Build] Split up decoder-only LM tests (#9488) | 2024-10-20 21:27:50 -07:00
    Co-authored-by: Nick Hill <nickhill@us.ibm.com>

d11bf435a0 | [MISC] Consolidate cleanup() and refactor offline_inference_with_prefix.py (#9510) | 2024-10-18 14:30:55 -07:00

051eaf6db3 | [Model] Add user-configurable task for models that support both generation and embedding (#9424) | 2024-10-18 11:31:58 -07:00

343f8e0905 | Support BERTModel (first encoder-only embedding model) (#9056) | 2024-10-17 23:21:01 +00:00
    Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
    Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
    Co-authored-by: Andrew Feldman <afeldman@neuralmagic.com>
    Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
    Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
    Co-authored-by: laishzh <laishengzhang@gmail.com>
    Co-authored-by: Max de Bayser <maxdebayser@gmail.com>
    Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
    Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

fb60ae9b91 | [Kernel][Model] Improve continuous batching for Jamba and Mamba (#9189) | 2024-10-16 12:12:43 -04:00

cee711fdbb | [Core] Rename input data types (#8688) | 2024-10-16 10:49:37 +00:00

7abba39ee6 | [Model] VLM2Vec, the first multimodal embedding model in vLLM (#9303) | 2024-10-16 14:31:00 +08:00

f0fe4fe86d | [Model] Make llama3.2 support multiple and interleaved images (#9095) | 2024-10-14 15:24:26 -07:00

6cf1167c1a | [Model] Add GLM-4v support and meet vllm==0.6.2 (#9242) | 2024-10-11 17:36:13 +00:00

7342a7d7f8 | [Model] Support Mamba (#6484) | 2024-10-11 15:40:06 +00:00

f19da64871 | [Core] Refactor GGUF parameters packing and forwarding (#8859) | 2024-10-07 10:01:46 +00:00

4f95ffee6f | [Hardware][CPU] Cross-attention and Encoder-Decoder models support on CPU backend (#9089) | 2024-10-07 06:50:35 +00:00

8c6de96ea1 | [Model] Explicit interface for vLLM models and support OOT embedding models (#9108) | 2024-10-07 06:10:35 +00:00

cfadb9c687 | [Bugfix] Deprecate registration of custom configs to huggingface (#9083) | 2024-10-05 21:56:40 +08:00

15986f598c | [Model] Support Gemma2 embedding model (#9004) | 2024-10-05 06:57:05 +00:00

26aa325f4f | [Core][VLM] Test registration for OOT multimodal models (#8717) | 2024-10-04 10:38:25 -07:00
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>

0e36fd4909 | [Misc] Move registry to its own file (#9064) | 2024-10-04 10:01:37 +00:00

0f6d7a9a34 | [Models] Add remaining model PP support (#7168) | 2024-10-04 10:56:58 +08:00
    Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
    Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>

19f0d25796 | [Model] Adding Granite MoE. (#8206) | 2024-10-03 09:33:57 +08:00
    Co-authored-by: Nick Hill <nickhill@us.ibm.com>

f13a07b1f8 | [Kernel][Model] Varlen prefill + Prefill chunking support for mamba kernels and Jamba model (#8533) | 2024-09-29 17:35:58 -04:00

26a68d5d7e | [CI/Build] Add test decorator for minimum GPU memory (#8925) | 2024-09-29 02:50:51 +00:00

e1a3f5e831 | [CI/Build] Update models tests & examples (#8874) | 2024-09-28 09:54:35 -07:00
    Co-authored-by: Roger Wang <ywang@roblox.com>

6d792d2f31 | [Bugfix][VLM] Fix Fuyu batching inference with max_num_seqs>1 (#8892) | 2024-09-27 01:15:58 -07:00

4b377d6feb | [BugFix] Fix test breakages from transformers 4.45 upgrade (#8829) | 2024-09-26 16:46:43 -07:00

770ec6024f | [Model] Add support for the multi-modal Llama 3.2 model (#8811) | 2024-09-25 13:29:32 -07:00
    Co-authored-by: simon-mo <xmo@berkeley.edu>
    Co-authored-by: Chang Su <chang.s.su@oracle.com>
    Co-authored-by: Simon Mo <simon.mo@hey.com>
    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
    Co-authored-by: Roger Wang <ywang@roblox.com>

8ff7ced996 | [Model] Expose Phi3v num_crops as a mm_processor_kwarg (#8658) | 2024-09-24 07:36:46 +00:00
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>