|
|
3f52738dce
|
[Doc] Add max_lora_rank configuration guide (#22782)
Signed-off-by: chiliu <cliu_whu@yeah.net>
|
2025-08-13 04:10:07 -07:00 |
|
|
|
3a7e3bbdd2
|
[Doc] Added unmentioned required option "method" in the usage of EAGLE-3 based models (#21737)
Signed-off-by: Dilute-l <dilu2333@163.com>
Co-authored-by: Dilute-l <dilu2333@163.com>
|
2025-08-12 00:14:51 -07:00 |
|
|
|
7be7f3824a
|
[Docs] Improve API docs (+small tweaks) (#22459)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-08 03:02:51 -07:00 |
|
|
|
099c046463
|
[Doc] Sleep mode documentation (#22310)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Hong Hanh <hanh.usth@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-08-08 12:25:18 +08:00 |
|
|
|
289b18e670
|
[Docs] Update features/disagg_prefill, add v1 examples and development (#22165)
Signed-off-by: David Chen <530634352@qq.com>
|
2025-08-07 00:59:23 -07:00 |
|
|
|
5e8398805e
|
[Doc] Fix link to prefix caching design (#22384)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-08-07 00:28:15 -07:00 |
|
|
|
0a6d305e0f
|
feat(multimodal): Add customizable background color for RGBA to RGB conversion (#22052)
Signed-off-by: Jinheng Li <ahengljh@gmail.com>
Co-authored-by: Jinheng Li <ahengljh@gmail.com>
|
2025-08-01 06:07:33 -07:00 |
|
|
|
4931486988
|
[Doc] Added warning of speculating with draft model (#22047)
Signed-off-by: Dilute-l <dilu2333@163.com>
Co-authored-by: Dilute-l <dilu2333@163.com>
|
2025-08-01 02:11:56 -07:00 |
|
|
|
79731a79f0
|
[Doc] Fix a syntax error of example code in structured_outputs.md (#22045)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
|
2025-08-01 00:01:22 -07:00 |
|
|
|
5c8fe389d6
|
[Docs] Fix the example code of streaming chat completions in reasoning (#21825)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: Zi Wang <66560864+BruceW-07@users.noreply.github.com>
|
2025-07-30 12:11:58 +00:00 |
|
|
|
5bbaf492a6
|
[Doc] Update partial support (#21916)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-30 01:32:39 -07:00 |
|
|
|
ba5c5e5404
|
[Docs] Switch to better markdown linting pre-commit hook (#21851)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-29 19:45:08 -07:00 |
|
|
|
ab714131e4
|
[Doc] Update compatibility matrix for pooling and multimodal models (#21831)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-29 06:29:51 -07:00 |
|
|
|
947e982ede
|
[Docs] Minimize spacing for supported_hardware.md table (#21779)
|
2025-07-28 18:46:39 -07:00 |
|
|
|
86ae693f20
|
[Deprecation][2/N] Replace --task with --runner and --convert (#21470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-27 19:42:40 -07:00 |
|
|
|
97349fe2bc
|
[Docs] add offline serving multi-modal video input expamle Qwen2.5-VL (#21530)
Signed-off-by: David Chen <530634352@qq.com>
|
2025-07-25 18:37:32 -07:00 |
|
|
|
5ac3168ee3
|
[Docs] add auto-round quantization readme (#21600)
Signed-off-by: Wenhua Cheng <wenhua.cheng@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-25 08:52:42 -07:00 |
|
|
|
6eca337ce0
|
Replace --expand-tools-even-if-tool-choice-none with --exclude-tools-when-tool-choice-none for v0.10.0 (#20544)
Signed-off-by: okada <kokuzen@gmail.com>
Signed-off-by: okada shintarou <okada@preferred.jp>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-24 02:56:36 -07:00 |
|
|
|
82ec66f514
|
[V0 Deprecation] Remove Prompt Adapters (#20588)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-23 16:36:48 -07:00 |
|
|
|
23637dcdef
|
[Docs] Fix bullets and grammars in tool_calling.md (#21440)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
|
2025-07-23 01:23:20 -07:00 |
|
|
|
d97841078b
|
[Misc] unify variable for LLM instance (#20996)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-07-21 12:18:33 +01:00 |
|
|
|
be54a951a3
|
[Docs] Fix hardcoded links in docs (#21287)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-21 02:23:57 -07:00 |
|
|
|
5a7fb3ab9e
|
[Model] Add ToolParser and MoE Config for Hunyuan A13B (#20820)
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
|
2025-07-17 09:10:09 +00:00 |
|
|
|
01513a334a
|
Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) (#12010)
Signed-off-by: Nir David <ndavid@habana.ai>
Signed-off-by: Uri Livne <ulivne@habana.ai>
Co-authored-by: Uri Livne <ulivne@habana.ai>
|
2025-07-16 15:33:41 -04:00 |
|
|
|
313ae8c16a
|
[Deprecation] Remove everything scheduled for removal in v0.10.0 (#20979)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-15 15:57:53 +00:00 |
|
|
|
5f0af36af5
|
Update kimi-k2 tool calling docs, enable unit tests (#20821)
Signed-off-by: wangzhengtao <wangzhengtao@moonshot.cn>
Co-authored-by: wangzhengtao <wangzhengtao@moonshot.cn>
Co-authored-by: wangzhengtao <wangzhengtao@msh.team>
|
2025-07-11 20:16:14 +00:00 |
|
|
|
6fb162447b
|
[doc] fix ordered list issue (#20819)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-11 06:49:46 -07:00 |
|
|
|
6a9e6b2abf
|
[doc] fold long code block (#20795)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-10 23:16:41 -07:00 |
|
|
|
41060c6e08
|
[Core] Add Support for Default Modality Specific LoRAs [generate / chat completions] (#19126)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-07-10 21:09:37 +01:00 |
|
|
|
332d4cb17b
|
[Feature][Quantization] MXFP4 support for MOE models (#17888)
Signed-off-by: Felix Marty <felmarty@amd.com>
Signed-off-by: Bowen Bao <bowenbao@amd.com>
Signed-off-by: Felix Marty <Felix.Marty@amd.com>
Co-authored-by: Bowen Bao <bowenbao@amd.com>
|
2025-07-09 13:19:02 -07:00 |
|
|
|
70ca5484f5
|
[Doc] Update notes (#20668)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-09 03:46:36 -07:00 |
|
|
|
f95570a52d
|
[Docs] fix minimax tool_calling docs error (#20667)
Signed-off-by: qingjun <qingjun@minimaxi.com>
|
2025-07-09 00:37:07 -07:00 |
|
|
|
b942c094e3
|
Stop using title frontmatter and fix doc that can only be reached by search (#20623)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-08 03:27:40 -07:00 |
|
|
|
b4bab81660
|
Remove unnecessary explicit title anchors and use relative links instead (#20620)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-08 02:49:13 -07:00 |
|
|
|
af107d5a0e
|
Make distinct code and console admonitions so readers are less likely to miss them (#20585)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-07 19:55:28 -07:00 |
|
|
|
923147b5e8
|
[Doc] Fix internal links so they don't always point to latest (#20563)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-07 04:15:50 -07:00 |
|
|
|
45877ef740
|
[Doc] Use gh-pr and gh-issue everywhere we can in the docs (#20564)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-07 03:54:22 -07:00 |
|
|
|
fe1e924811
|
[Frontend] Support image object in llm.chat (#19635)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Flora Feng <4florafeng@gmail.com>
|
2025-07-06 06:47:13 +00:00 |
|
|
|
d3f05c9248
|
[Doc] fix mutltimodal_inputs.md gh examples link (#20497)
Signed-off-by: Guy Stone <guys@spotify.com>
|
2025-07-04 16:41:35 -07:00 |
|
|
|
1819fbda63
|
[Quantization] Bump to use latest bitsandbytes (#20424)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-03 21:58:46 +08:00 |
|
|
|
363528de27
|
[Feature] Support MiniMax-M1 function calls features (#20297)
Signed-off-by: QscQ <qscqesze@gmail.com>
Signed-off-by: qingjun <qingjun@minimaxi.com>
|
2025-07-03 06:48:27 +00:00 |
|
|
|
3dd359147d
|
[Docs] Update EAGLE example (#20375)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-07-02 17:13:51 -07:00 |
|
|
|
b95877509b
|
Documentation update tool_calling: mapping back to function from response (#20373)
|
2025-07-02 05:55:49 -07:00 |
|
|
|
b205e8467d
|
[Doc][TPU] Add models and features supporting matrix. (#20230)
Signed-off-by: Qiliang Cui <cuiq@google.com>
|
2025-07-02 06:33:20 +00:00 |
|
|
|
be0cfb2b68
|
fix[Docs]: link anchor is incorrect #20309 (#20315)
Signed-off-by: zxw <1020938856@qq.com>
|
2025-07-02 06:32:34 +00:00 |
|
|
|
3d19d47d91
|
[Frontend] Expand tools even if tool_choice="none" (#17177)
Signed-off-by: okada shintarou <okada@preferred.jp>
|
2025-07-01 12:47:38 -04:00 |
|
|
|
c3649e4fee
|
[Docs] Fix syntax highlighting of shell commands (#19870)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-06-23 17:59:09 +00:00 |
|
|
|
f17aec0d63
|
[doc] Fold long code blocks to improve readability (#19926)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-23 05:24:23 +00:00 |
|
|
|
1d0ae26c85
|
Add xLAM tool parser support (#17148)
|
2025-06-19 14:26:41 +08:00 |
|
|
|
7b3c9ff91d
|
[Doc] uses absolute links for structured outputs (#19582)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-06-13 03:35:17 +00:00 |
|