|
|
51d7c6a2b2
|
[Model] Support Mistral3 in the HF Transformers format (#15505)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-01 06:10:05 -07:00 |
|
|
|
d330558bab
|
[Docs] Fix small error in link text (#15868)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-01 10:05:14 +00:00 |
|
|
|
a76f547e11
|
Rename fallback model and refactor supported models section (#15829)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-31 22:49:41 -07:00 |
|
|
|
e5ef4fa99a
|
Upgrade transformers to v4.50.3 (#13905)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-31 08:59:37 -07:00 |
|
|
|
3aa2b6a637
|
[Model] Update support for NemotronNAS models (#15008)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
|
2025-03-31 20:35:14 +08:00 |
|
|
|
de1cb38769
|
[Model] Support Skywork-R1V (#15397)
Signed-off-by: jiacai.liu <932997367@qq.com>
Co-authored-by: jiacai.liu <932997367@qq.com>
|
2025-03-28 20:39:21 -07:00 |
|
|
|
2914006fe0
|
[doc] add missing imports (#15699)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-03-28 15:56:48 +00:00 |
|
|
|
ac5bc615b0
|
[Model] MiniCPM-V/O supports V1 (#15487)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-27 06:07:29 -07:00 |
|
|
|
cf5c8f1686
|
Separate base model from TransformersModel (#15467)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-03-26 18:13:38 +08:00 |
|
|
|
997c8811d6
|
[Model] Support multi-image for Molmo (#15438)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-26 11:26:33 +08:00 |
|
|
|
97cfa65df7
|
Add pipeline parallel support to TransformersModel (#12832)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-03-25 10:41:45 +08:00 |
|
|
|
761702fd19
|
[Core] Integrate fastsafetensors loader for loading model weights (#10647)
Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com>
|
2025-03-24 08:08:02 -07:00 |
|
|
|
9c5c81b0da
|
[Misc][Doc] Add note regarding loading generation_config by default (#15281)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-03-23 14:00:55 -07:00 |
|
|
|
2f4bd358f1
|
[Model] Support Tele-FLM Model (#15023)
Signed-off-by: Naitong Yu <ntyu@baai.ac.cn>
Signed-off-by: jiangxin <horizon94@outlook.com>
Co-authored-by: Jason Fang <jasonfang3900@gmail.com>
Co-authored-by: jiangxin <horizon94@outlook.com>
|
2025-03-22 02:04:44 -07:00 |
|
|
|
61f412187d
|
[Bugfix] Re-enable Gemma3 for V1 (#14980)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-18 23:58:22 -07:00 |
|
|
|
452e8fd968
|
[MODEL] Add support for Zamba2 models (#13185)
Signed-off-by: Yury Tokpanov <yury@zyphra.com>
Signed-off-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-03-18 08:56:21 -07:00 |
|
|
|
f863ffc965
|
[Mistral-Small 3.1] Update docs and tests (#14977)
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-03-18 03:29:42 -07:00 |
|
|
|
37e3806132
|
[Bugfix] Make Gemma3 MM V0 only for now (#14971)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-03-17 10:04:21 -07:00 |
|
|
|
60c872d4b6
|
[Doc] Fix small typo in Transformers fallback (#14791)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-03-13 20:33:12 -07:00 |
|
|
|
b1cc4dfef5
|
[VLM] Support loading InternVideo2.5 models as original InternVLChatModel (#14738)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-03-13 03:10:02 -07:00 |
|
|
|
382403921f
|
[VLM] Support pan-and-scan for Gemma3 multi-modal processor (#14672)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-03-13 02:23:12 -07:00 |
|
|
|
c0c25e25fa
|
[Model] Add support for Gemma 3 (#14660)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-12 08:36:33 -07:00 |
|
|
|
af295e9b01
|
[Bugfix] Update --hf-overrides for Alibaba-NLP/gte-Qwen2 (#14609)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-11 07:59:43 -07:00 |
|
|
|
001a9c7b0d
|
[Doc] Update PaliGemma note to a warning (#14565)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-03-10 15:02:28 +00:00 |
|
|
|
60a98b2de5
|
[Docs] Mention model_impl arg when explaining Transformers fallback (#14552)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-10 12:13:10 +00:00 |
|
|
|
4f27044aab
|
[Doc] Correct beam_search using in generative_models.md (#14363)
|
2025-03-06 15:37:10 +00:00 |
|
|
|
5d802522a7
|
[V1][VLM][Pixtral-HF] Support Pixtral-HF on V1 (#14275)
Signed-off-by: Linkun Chen <github@lkchen.net>
|
2025-03-06 08:58:41 +00:00 |
|
|
|
1769928079
|
[Model] Update Paligemma multimodal processing with PromptUpdate (#14015)
Signed-off-by: Kyle Huang <kylhuang@nvidia.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-03-06 08:31:38 +00:00 |
|
|
|
0a995d5434
|
[Model] New model support for Phi-4-multimodal-instruct (#14119)
|
2025-03-04 20:57:01 -08:00 |
|
|
|
c060b71408
|
[Model] Add support for GraniteMoeShared models (#13313)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-03-04 08:04:52 +08:00 |
|
|
|
98175b2816
|
Improve the docs for TransformersModel (#14147)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-03 17:03:05 +00:00 |
|
|
|
cc5e8f6db8
|
[Model] Add LoRA support for TransformersModel (#13770)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-03-02 09:17:34 +08:00 |
|
|
|
edf309ebbe
|
[VLM] Support multimodal inputs for Florence-2 models (#13320)
|
2025-02-27 02:06:41 -08:00 |
|
|
|
07c4353057
|
[Model] Support Grok1 (#13795)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-02-26 01:07:12 +00:00 |
|
|
|
1c3c975766
|
[FEATURE] Enables /score endpoint for embedding models (#12846)
|
2025-02-20 22:09:47 -08:00 |
|
|
|
992e5c3d34
|
Merge similar examples in offline_inference into single basic example (#12737)
|
2025-02-20 04:53:51 -08:00 |
|
|
|
512368e34a
|
[Misc] Qwen2.5 VL support LoRA (#13261)
|
2025-02-19 18:37:55 -08:00 |
|
|
|
fd84857f64
|
[Doc] Add clarification note regarding paligemma (#13511)
|
2025-02-18 22:24:03 -08:00 |
|
|
|
2358ca527b
|
[Doc]: Improve feature tables (#13224)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-02-18 18:52:39 +08:00 |
|
|
|
67ef8f666a
|
[Model] Enable quantization support for transformers backend (#12960)
|
2025-02-17 19:52:47 -08:00 |
|
|
|
b7d309860e
|
[V1] Update doc and examples for H2O-VL (#13349)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-02-16 10:35:54 +00:00 |
|
|
|
579d7a63b2
|
[Bugfix][Docs] Fix offline Whisper (#13274)
|
2025-02-14 21:32:37 -08:00 |
|
|
|
1bc3b5e71b
|
[VLM] Separate text-only and vision variants of the same model architecture (#13157)
|
2025-02-13 06:19:15 -08:00 |
|
|
|
c9d3ecf016
|
[VLM] Merged multi-modal processor for Molmo (#12966)
|
2025-02-13 04:34:00 -08:00 |
|
|
|
08b2d845d6
|
[Model] Ultravox Model: Support v0.5 Release (#12912)
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
|
2025-02-10 22:02:48 +00:00 |
|
|
|
86222a3dab
|
[VLM] Merged multi-modal processor for GLM4V (#12449)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-02-08 20:32:16 +00:00 |
|
|
|
256a2d29dc
|
[Doc] Correct HF repository for TeleChat2 models (#12949)
|
2025-02-08 01:42:15 -08:00 |
|
|
|
d88506dda4
|
[Model] LoRA Support for Ultravox model (#11253)
|
2025-02-05 19:54:13 -08:00 |
|
|
|
75404d041b
|
[VLM] Update compatibility with transformers 4.49
|
2025-02-05 19:09:45 -08:00 |
|
|
|
bf3b79efb8
|
[VLM] Qwen2.5-VL
|
2025-02-05 13:31:38 -08:00 |
|