|
|
b81fe83b2c
|
[doc] add alibaba cloud as sponsor (#22597)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-08-10 23:13:47 +08:00 |
|
|
|
0757551c96
|
[doc] add beijing meetup links (#22596)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-08-10 22:51:36 +08:00 |
|
|
|
00976db0c3
|
[Docs] Fix warnings in docs build (#22588)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-10 05:49:51 -07:00 |
|
|
|
010e0e39ea
|
[Doc] Fix API doc link in side navigation (#22585)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-08-10 01:35:22 -07:00 |
|
|
|
c49848396d
|
Refactor sliding window configuration to Transformers best practice (#21927)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-09 20:50:48 -07:00 |
|
|
|
5a16fa614c
|
[Model] Gemma3n MM (#20495)
Signed-off-by: ShriKode <shrikode@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: ShriKode <shrikode@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
|
2025-08-09 09:56:25 -07:00 |
|
|
|
56186474f6
|
[Docs] Reduce noise in docs and --help from the JSON tip (#22567)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-09 08:31:32 -07:00 |
|
|
|
a6022e6fbc
|
GLM-4.5V with new class name at transformers (#22520)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-09 00:50:21 -07:00 |
|
|
|
2be07a0db1
|
Update docs for Minimax-Text support (#22562)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-09 00:18:18 -07:00 |
|
|
|
8a0ffd6285
|
Remove mamba_ssm from vLLM requirements; install inside test container using --no-build-isolation (#22541)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-08-08 23:05:32 -07:00 |
|
|
|
23472ff51c
|
[Doc] Add usage of implicit text-only mode (#22561)
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Flora Feng <4florafeng@gmail.com>
|
2025-08-08 23:04:19 -07:00 |
|
|
|
baece8c3d2
|
[Frontend] Add unix domain socket support (#18097)
Signed-off-by: <yyweiss@gmail.com>
Signed-off-by: yyw <yyweiss@gmail.com>
|
2025-08-08 16:23:44 -07:00 |
|
|
|
2fcf6b27b6
|
[Docs] fix broken links in metrics.md (#22315)
Signed-off-by: Guy Stone <guys@spotify.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-08 16:22:35 -07:00 |
|
|
|
e290594072
|
[Docs] Rename “Distributed inference and serving” to “Parallelism & Scaling” (#22466)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-08-08 19:26:21 +00:00 |
|
|
|
7be7f3824a
|
[Docs] Improve API docs (+small tweaks) (#22459)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-08 03:02:51 -07:00 |
|
|
|
099c046463
|
[Doc] Sleep mode documentation (#22310)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Hong Hanh <hanh.usth@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-08-08 12:25:18 +08:00 |
|
|
|
139d155781
|
[Frontend] Use engine argument to control MM cache size (#22441)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-07 09:47:10 -07:00 |
|
|
|
766bc8162c
|
[Core] Store only the keys for multi-modal data in P0 (#22198)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-07 01:45:04 -07:00 |
|
|
|
289b18e670
|
[Docs] Update features/disagg_prefill, add v1 examples and development (#22165)
Signed-off-by: David Chen <530634352@qq.com>
|
2025-08-07 00:59:23 -07:00 |
|
|
|
a2c6696bfe
|
[Docs] Factor out troubleshooting to its own guide; add section for Ray Observability (#21578)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-08-07 00:29:13 -07:00 |
|
|
|
5e8398805e
|
[Doc] Fix link to prefix caching design (#22384)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-08-07 00:28:15 -07:00 |
|
|
|
609b533cb6
|
[Bugfix] Add proper comparison for package versions (#22314)
Signed-off-by: Syed Muhammad Bin Asif <syedmba7@connect.hku.hk>
|
2025-08-06 20:31:03 -07:00 |
|
|
|
41b67f4263
|
[model] Support MiniCPM-V 4.0 (#22166)
Co-authored-by: imning3 <hbning@pku.edu.cn>
|
2025-08-06 18:35:46 -07:00 |
|
|
|
46a13949d5
|
[v1] - Mamba1 Attention Metadata (#21249)
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
|
2025-08-06 17:03:42 -07:00 |
|
|
|
54991c548a
|
[gpt-oss] add model to supported models doc (#22336)
Signed-off-by: Roger Wang <hey@rogerw.me>
|
2025-08-06 01:49:44 -07:00 |
|
|
|
d1bf1b9711
|
[Docs][TPU] Highlight TPU Software version selection (#22242)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-08-05 02:33:46 -07:00 |
|
|
|
6fa41e0c32
|
self.gate dtype update for GLM-4.5 (#22203)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
|
2025-08-04 19:12:38 -07:00 |
|
|
|
1539ced93a
|
[Doc] Update pooling model docs (#22186)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-08-04 03:37:06 -07:00 |
|
|
|
a7b8788d2c
|
[Misc] Modify the organization of GLM series (#22171)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-08-03 23:51:20 -07:00 |
|
|
|
83f7bbb318
|
Add chat doc in quick start (#21213)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-08-03 07:47:55 -07:00 |
|
|
|
25373b6c6c
|
for glm-4.1V update (#22000)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-08-02 01:46:57 -07:00 |
|
|
|
067c34a155
|
docs: remove deprecated disable-log-requests flag (#22113)
Signed-off-by: Roger Wang <hey@rogerw.me>
|
2025-08-02 00:19:48 -07:00 |
|
|
|
97608dc276
|
[Docs] use uv in CPU installation docs (#22089)
Signed-off-by: David Xia <david@davidxia.com>
|
2025-08-01 07:55:55 -07:00 |
|
|
|
0a6d305e0f
|
feat(multimodal): Add customizable background color for RGBA to RGB conversion (#22052)
Signed-off-by: Jinheng Li <ahengljh@gmail.com>
Co-authored-by: Jinheng Li <ahengljh@gmail.com>
|
2025-08-01 06:07:33 -07:00 |
|
|
|
4931486988
|
[Doc] Added warning of speculating with draft model (#22047)
Signed-off-by: Dilute-l <dilu2333@163.com>
Co-authored-by: Dilute-l <dilu2333@163.com>
|
2025-08-01 02:11:56 -07:00 |
|
|
|
79731a79f0
|
[Doc] Fix a syntax error of example code in structured_outputs.md (#22045)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
|
2025-08-01 00:01:22 -07:00 |
|
|
|
61dcc280fa
|
[Doc] Add Voxtral to Supported Models page (#22059)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-31 23:10:56 -07:00 |
|
|
|
9484641616
|
[Model] Add step3 vl (#21998)
Signed-off-by: oliveryuan <yuansong@step.ai>
Co-authored-by: oliveryuan <yuansong@step.ai>
|
2025-07-31 23:19:06 +08:00 |
|
|
|
d2aab336ad
|
[CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES (#21599)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
|
2025-07-31 15:00:08 +08:00 |
|
|
|
bf668b5bf5
|
[Feature] Support multiple api keys in server (#18548)
Signed-off-by: Yan Pashkovsky <yanp.bugz@gmail.com>
|
2025-07-30 07:03:23 -07:00 |
|
|
|
fcfd1eb9c5
|
[Doc] Remove vLLM prefix and add citation for PagedAttention (#21910)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-30 06:36:34 -07:00 |
|
|
|
5c8fe389d6
|
[Docs] Fix the example code of streaming chat completions in reasoning (#21825)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: Zi Wang <66560864+BruceW-07@users.noreply.github.com>
|
2025-07-30 12:11:58 +00:00 |
|
|
|
5bbaf492a6
|
[Doc] Update partial support (#21916)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-30 01:32:39 -07:00 |
|
|
|
02f82fe438
|
[Doc] Update Intern-S1 info (#21908)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-29 23:58:57 -07:00 |
|
|
|
4cd7fe6cea
|
[Docs] Expand introduction to Ray in Multi-node deployment section (#21584)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
|
2025-07-29 22:07:28 -07:00 |
|
|
|
16f3250527
|
[CI/Build] Fix pre-commit failure in docs (#21897)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-29 21:53:08 -07:00 |
|
|
|
65f311ce59
|
[Frontend] Add LLM.reward specific to reward models (#21720)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-07-29 20:56:03 -07:00 |
|
|
|
b917da442b
|
Expose PyTorch profiler configuration to environment variables (#21803)
Signed-off-by: Csrayz <33659823+Csrayz@users.noreply.github.com>
|
2025-07-29 19:46:31 -07:00 |
|
|
|
fb58e3a651
|
[Docs] Update docker.md with HF_TOKEN, new model, and podman fix (#21856)
|
2025-07-29 19:45:41 -07:00 |
|
|
|
76080cff79
|
[DOC] Fix path of v1 related figures (#21868)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-07-29 19:45:18 -07:00 |
|