|
|
4b91c927f6
|
[Misc] refactor example series (#16972)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-22 11:44:21 +00:00 |
|
|
|
205d84aaa9
|
[VLM] Clean up models (#16873)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-19 12:13:06 +00:00 |
|
|
|
83f3c3bd91
|
[Model] Refactor Phi-4-multimodal to use merged processor and support V1 (#15477)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-19 02:26:11 -07:00 |
|
|
|
2ef0dc53b8
|
[Frontend] Add sampling params to v1/audio/transcriptions endpoint (#16591)
Signed-off-by: Jannis Schönleber <joennlae@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Jannis Schönleber <joennlae@gmail.com>
|
2025-04-19 07:03:54 +00:00 |
|
|
|
2c1bd848a6
|
[Model][VLM] Add Qwen2.5-Omni model support (thinker only) (#15130)
Signed-off-by: fyabc <suyang.fy@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Xiong Wang <wangxiongts@163.com>
|
2025-04-18 23:14:36 -07:00 |
|
|
|
5a5e29de88
|
[Misc] refactor examples series - Chat Completion Client With Tools (#16829)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-18 23:24:42 +00:00 |
|
|
|
aadb656562
|
[Misc] Clean up Kimi-VL (#16833)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-18 05:15:09 -07:00 |
|
|
|
e78587a64c
|
Improve-mm-and-pooler-and-decoding-configs (#16789)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-17 22:13:32 -07:00 |
|
|
|
7a4a5de729
|
[Misc] Update outdated note: LMCache now supports chunked prefill (#16697)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-18 05:12:42 +00:00 |
|
|
|
3408e47159
|
[P/D][V1] KV Connector API V1 (#15960)
Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: remi <remi@mistral.ai>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Rémi Delacourt <54138269+Flechman@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
|
2025-04-17 13:22:40 -07:00 |
|
|
|
11c3b98491
|
[Doc] Document Matryoshka Representation Learning support (#16770)
|
2025-04-17 13:37:37 +00:00 |
|
|
|
99ed526101
|
[Misc] refactor examples series - lmcache (#16758)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-17 11:02:35 +00:00 |
|
|
|
8cac35ba43
|
[Ray] Improve documentation on batch inference (#16609)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
|
2025-04-16 22:19:26 -07:00 |
|
|
|
cb072ce93b
|
[Bugfix] Update Florence-2 tokenizer to make grounding tasks work (#16734)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-17 04:17:39 +00:00 |
|
|
|
7168920491
|
[Misc] refactor examples series (#16708)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-16 10:16:36 +00:00 |
|
|
|
6ae996a873
|
[Misc] refactor argument parsing in examples (#16635)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-15 08:05:30 +00:00 |
|
|
|
b1308b84a3
|
[Model][VLM] Add Kimi-VL model support (#16387)
Signed-off-by: courage17340 <courage17340@163.com>
|
2025-04-14 21:41:48 +00:00 |
|
|
|
7cbfc10943
|
[Misc] refactor examples (#16563)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-14 09:59:15 +00:00 |
|
|
|
3cdc57669f
|
[Misc] Delete redundant code (#16530)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-04-12 11:21:37 +00:00 |
|
|
|
d9fc8cd9da
|
[V1] Enable multi-input by default (#15799)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-12 08:52:39 +00:00 |
|
|
|
f069f3ea74
|
[Misc] Openai transcription client example use same Whisper model (#16487)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-04-12 07:27:03 +00:00 |
|
|
|
fbf722c6e6
|
[Frontend] support matryoshka representation / support embedding API dimensions (#16331)
|
2025-04-11 23:23:10 -07:00 |
|
|
|
16eda8c43a
|
[Frontend] Added chat templates for LLaMa4 pythonic tool calling (#16463)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Co-authored-by: Kai Wu <kaiwu@meta.com>
|
2025-04-12 06:26:17 +08:00 |
|
|
|
35e076b3a8
|
[Misc] update api_client example (#16459)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-11 10:05:40 +00:00 |
|
|
|
93195146ea
|
[Bugfix][VLM] Fix failing Phi-4-MM multi-images tests and add vision-speech test (#16424)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-11 04:57:16 +00:00 |
|
|
|
e8224f3dca
|
[V1][Spec Decode] Eagle Model loading (#16035)
Signed-off-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
|
2025-04-10 11:21:48 -07:00 |
|
|
|
61de3ef74b
|
[Model] Remove image mm limit for LLaMa4 (#16365)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-04-10 09:36:27 +00:00 |
|
|
|
1bff42c4b7
|
[Misc] refactor Structured Outputs example (#16322)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-09 23:32:42 +00:00 |
|
|
|
a25866ac8d
|
[Bugfix] Fix profiling.py (#16202)
Signed-off-by: zh Wang <rekind133@outlook.com>
|
2025-04-09 17:03:34 +00:00 |
|
|
|
102bf967f0
|
[Model] Add smolvlm support (#16017)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-08 19:12:17 -07:00 |
|
|
|
2755c34a8f
|
[V1] Update structured output offline inference example (#15721)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-04-08 22:34:09 +00:00 |
|
|
|
4ebc0b9640
|
[Bugfix] Proper input validation for multi-modal encoder-decoder models (#16156)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-08 09:45:21 -07:00 |
|
|
|
1f5d13ab9f
|
[New Model]: jinaai/jina-embeddings-v3 (#16120)
|
2025-04-08 08:39:12 -07:00 |
|
|
|
7f00899ff7
|
[Misc] format and refactor some examples (#16252)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-08 10:42:32 +00:00 |
|
|
|
55dcce91df
|
Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-07 08:06:27 -07:00 |
|
|
|
dc3529dbf6
|
[Misc] improve example mlpspeculator and llm_engine_example (#16175)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-07 11:53:52 +00:00 |
|
|
|
7c80368710
|
[VLM] Florence-2 supports online serving (#16164)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-07 04:04:02 -07:00 |
|
|
|
0a57386721
|
[Misc] Update Mistral-3.1 example (#16147)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-07 03:57:37 +00:00 |
|
|
|
b6c502a150
|
[Misc] refactor example eagle (#16100)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-06 09:42:48 +00:00 |
|
|
|
eb07c8cb5b
|
[Frontend] Fix typo in tool chat templates for llama3.2 and toolace (#14501)
Signed-off-by: Ben Jackson <ben@ben.com>
|
2025-04-06 07:44:36 +00:00 |
|
|
|
d8f094a92a
|
[Misc] format output for encoder_decoder.py (#16095)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-05 19:57:18 -07:00 |
|
|
|
af51d80fa1
|
Revert "[V1] Scatter and gather placeholders in the model runner" (#16075)
|
2025-04-04 14:50:57 -07:00 |
|
|
|
f5722a5052
|
[V1] Scatter and gather placeholders in the model runner (#15712)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-04-04 21:26:44 +00:00 |
|
|
|
463bbb1835
|
[Bugfix][V1] Fix bug from putting llm_engine.model_executor in a background process (#15367)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-04-03 07:32:10 +00:00 |
|
|
|
8b664706aa
|
[bugfix] add seed in torchrun_example.py (#15980)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-04-03 12:25:01 +08:00 |
|
|
|
cefb9e5a28
|
[Frontend] Implement Tool Calling with tool_choice='required' (#13483)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
Signed-off-by: Matt, Matthias <matthias.matt@tuwien.ac.at>
Co-authored-by: Liangfu Chen <liangfc@amazon.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
|
2025-04-02 07:45:45 -07:00 |
|
|
|
38327cf454
|
[Model] Aya Vision (#15441)
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-04-01 16:30:43 +00:00 |
|
|
|
51d7c6a2b2
|
[Model] Support Mistral3 in the HF Transformers format (#15505)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-01 06:10:05 -07:00 |
|
|
|
a164aea35d
|
[Frontend] Add Phi-4-mini function calling support (#14886)
Signed-off-by: Kinfey <kinfeylo@microsoft.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-03-31 22:50:05 -07:00 |
|
|
|
239b7befdd
|
[V1][Spec Decode] Remove deprecated spec decode config params (#15466)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
|
2025-03-31 09:19:35 -07:00 |
|