|
|
fda819837e
|
Update to Transformers v4.56.2 (#24638)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
d437ba32fd
|
[Model] MTP fallback to eager for DeepSeek v32 (#25982)
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
fd56f2e644
|
[gpt-oss] use vLLM instead of openai types for streaming (#25186)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
b3e1846da6
|
Add explicit pooling classes for the Transformers backend (#25322)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
8328d39d40
|
[V1] [P/D] Add Support for KV Load Failure Recovery (#19330)
Signed-off-by: David Ben-David <davidb@pliops.com>
Co-authored-by: David Ben-David <davidb@pliops.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
493acdb7e2
|
[Doc] Improve MM Pooling model documentation (#25966)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
8914d52869
|
[CI] Move applicable tests to CPU (#24080)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
bf8bb7e250
|
[NIXL] Add support for MLA caches with different latent dim (#25902)
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
a1898466a6
|
[Model] Move vision_feature_select_strategy into resolve_visual_encoder_outputs (#25938)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:57 -07:00 |
|
|
|
0230cd0afb
|
[New Model] DeepSeek-V3.2 (Rebased to Main) (#25896)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Lucia Fang <fanglu@meta.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Co-authored-by: Lucia Fang <fanglu@meta.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Siyuan Fu <siyuanf@nvidia.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Xiaozhu Meng <mxz297@gmail.com>
Co-authored-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
da71651386
|
[Bugfix]: Clean up chunked prefill logging when using whisper (#25075)
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
6941d53c0c
|
Test Prompt Embeds/LoRA compatibility and Enable LoRA Support for OPT Models (#25717)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
97f1312f8c
|
[V0 Deprecation] Remove vllm.worker and update according imports (#25901)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
c692506e10
|
[BugFix][torch.compile] KV scale calculation issues with FP8 quantization (#25513)
Signed-off-by: adabeyta <aabeyta@redhat.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
2405817748
|
[Model] Remove MotifForCausalLM (#25866)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
c33992154a
|
[Bugfix][Speculative Decoding] Fix Eagle3 quantization config issue (#25883)
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
f84b2a0dd0
|
[Nixl][P/D] Add cuda2cpu support (HD->DH transfer) (#24690)
Signed-off-by: Chenxi Yang <cxyang@fb.com>
Co-authored-by: Chenxi Yang <cxyang@fb.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
00eba10dd1
|
[XPU]Fix xpu spec decoding UTs, avoid using cuda graph (#25847)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
e40c12696a
|
Update GLM-4.5 Doc transformers version (#25830)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
04384cb9da
|
[Core] GC Debug callback (#24829)
Signed-off-by: Jialin Ouyang <jialino@meta.com>
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
Co-authored-by: Jialin Ouyang <jialino@meta.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
c7ae7edb33
|
Fix GPTQ model loading in Transformers backend (#25770)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
1cb6005627
|
Add filtering for chat template kwargs (#25794)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
0f97a2e1db
|
[CI/Build] Reorganize root-level V1 tests (#25767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
0a4674c871
|
[CI/Build] Consolidate model loader tests and requirements (#25765)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
38c2df831a
|
[Multimodal][Speculative Decoding]Eagle Eagle3 mm support, enablement on qwen2.5vl (#22872)
Signed-off-by: Junhong <liujunhong11@huawei.com>
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
dbb7782d5b
|
Add option to restrict media domains (#25783)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
93ba7648d0
|
[Spec decode] automatically disable mm for text-only draft models (#25667)
Signed-off-by: Jonas Kuebler <kuebj@amazon.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
c4b9864e22
|
Kernel-override Determinism [1/n] (#25603)
Signed-off-by: Bram Wasti <bwasti@meta.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
1356ae0aa8
|
[spec decode] Consolidate speculative decode method name for MTP (#25232)
Signed-off-by: zixi-qi <qizixi@meta.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
ceb346015c
|
[V1] address post issues related to #20059 (part 1) (#23046)
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
b6f16d37b0
|
[CI] Add E2E Blackwell Quantized MoE Test (#25723)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
f16c440c9f
|
[Bugfix] Improve GLM4 MoE Reasoning Parser's is_reasoning_end Condition (#25355)
Signed-off-by: frankwang28 <frank.wbb@hotmail.com>
Signed-off-by: Frank Wang <41319051+frankwang28@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:56 -07:00 |
|
|
|
c72298213d
|
[Misc] fix unique_filepath (#25732)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
41174e2803
|
[ray][metrics] Replace ':' with '_' for OpenTelemetry compatibility in Ray (#25439)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
Co-authored-by: Rui Qiao <161574667+ruisearch42@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
129a643b4c
|
[CI/Build] Fix some V1 tests not being run (#25569)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
d3c732e985
|
[CI/Build] Split up Distributed Tests (#25572)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
fb0eece290
|
[Bugfix] Properly abort pooling request. (#25734)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
515e30b023
|
[CI] Fix test_shared_storage_connector_hashes (#25748)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
62ae26c870
|
[Model] Mamba2 varlen refactor (#21467)
Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>
Co-authored-by: RishiAstra <40644327+RishiAstra@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
9e6628ccfc
|
EVS Support (Video tokens pruning) (#22980)
Signed-off-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com>
Signed-off-by: Eugene Khvedchenya <ekhvedchenya@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
1d1436c3f7
|
[Model] rename NemotronH_Nano_VL -> NemotronH_Nano_VL_V2 (#25708)
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
f3a478b55e
|
[Spec Decode] Add Batch Parallel Ngram. Upto 8x lower overhead. (#24986)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
b0e9f04bbd
|
[Misc] Simplify test_argsort_mm_positions (#25690)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
a355561291
|
[V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
34e6a31e40
|
[Model] Define merge_by_field_config MM interface (#25676)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
91d4299774
|
[Misc] Remove cruft file in repo (#25678)
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
f7f76a8668
|
[Bugfix] Fix InternS1 video processing after Transformers v4.56 (#25644)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
af10a37c6c
|
[mypy] Fix wrong type annotations related to tuple (#25660)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
222411313d
|
[CI/Build] Fix flaky entrypoints test (#25663)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|
|
|
f3d9099b44
|
[V0 deprecation] Remove unreachable model_config.supported_tasks (#25642)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 13:35:55 -07:00 |
|