youngkingdom/vllm - vllm

Author	SHA1	Message	Date
CSWYF3634076	644d57d531	[Model] Add Ernie4.5 VL Model Support (#22514) Signed-off-by: wangyafeng <wangyafeng@baidu.com>	2025-08-26 21:02:55 -07:00
Harry Mellor	6dab89b8ec	[Docs] Fix math rendering in docs (#23676) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 18:47:08 -07:00
Harry Mellor	6421b66bf4	[Docs] Move quant supported hardware table to README (#23663) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 22:26:46 +00:00
Chen Zhang	d696f86e7b	[doc] Hybrid KV Cache Manager design doc (#22688) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 20:19:05 +00:00
Isotr0py	9816b81f5f	[Model] Enable video support for InternVL3.5 models (#23658) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-08-26 19:46:52 +00:00
Thomas Parnell	227e231b55	[Docs] [V1] [Hybrid] Update docs to remove FlashInfer constraint for hybrid models (#23665) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>	2025-08-26 18:33:16 +00:00
Harry Mellor	379f828fba	[Docs] Reduce requirements for docs build (#23651) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 15:43:28 +00:00
Didier Durand	7c04779afa	[Doc]: fix various spelling issues in multiple files (#23636) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-08-26 14:05:29 +00:00
Harry Mellor	164b2273c8	[Docs] Fix broken links to `docs/api/summary.md` (#23637) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 13:00:18 +00:00
Raghavan	ff77764f86	Fix CLI parameter documentation inconsistency in pooling_models.md (#23630)	2025-08-26 01:05:37 -07:00
Harry Mellor	bfc1edc9f5	[Docs] Fix titles for multi-file examples that are rendered in the docs (#23573) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 00:16:44 -07:00
Terrence Zhao	7b6a837275	[Docs] Update Documentation of Cohere Command-A Models (#23584) Signed-off-by: Terrencezzj <terrence@cohere.ai> Signed-off-by: Abatom <abzhonghua@gmail.com> Co-authored-by: Zhonghua Deng <abzhonghua@gmail.com>	2025-08-25 21:53:52 +00:00
Cyrus Leung	e269be2ba2	[Doc] Add caution for API server scale-out (#23550) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-25 06:14:15 -07:00
youkaichao	d0a4a3f645	[misc] add shanghai meetup (#23535) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-08-25 17:00:03 +08:00
Didier Durand	47455c424f	[Doc: ]fix various typos in multiple files (#23487) Signed-off-by: Didier Durand <durand.didier@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-25 00:04:04 +00:00
Cyrus Leung	e2db1164a1	[Model] Enable BLOOM on V1 (#23488) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-24 13:30:47 +00:00
汪志鹏	416f05929a	[New Model]Donut model (#23229) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-08-24 12:52:24 +00:00
Xu Wenqing	b8f17f5d98	Support DeepSeek-V3.1 tool call (#23454) Signed-off-by: Xu Wenqing <xuwq1993@qq.com>	2025-08-23 05:50:16 +00:00
WeiQing Chen	23c939fd30	[Model] Support DP for ViT on MiniCPM-V-4 (#23327) Signed-off-by: ycyaw66 <497410282@qq.com> Co-authored-by: ycyaw66 <497410282@qq.com>	2025-08-23 02:14:41 +00:00
Ilya Markov	0313cf854d	[PERF] PyTorch Symmetric Memory All-Reduce (#20759) Signed-off-by: ilmarkov <imarkov@redhat.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: ilmarkov <imarkov@redhat.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-08-22 15:39:08 -06:00
Chen Zhang	a073be6d87	[Doc] Update the doc for log probs + prefix caching (#23399) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-08-22 13:20:39 +00:00
Bin Jia	5964069367	[New Model] Add Seed-Oss model (#23241) Signed-off-by: jiabin.00 <jiabin.00@bytedance.com> Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-22 04:58:10 +00:00
Cyrus Leung	8896eb72eb	[Deprecation] Remove `prompt_token_ids` arg fallback in `LLM.generate` and `LLM.embed` (#18800) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-22 10:56:57 +08:00
Cyrus Leung	5cc54f7c5b	[Doc] Fix batch-level DP example (#23325) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2025-08-21 06:16:38 -07:00
Paul Pak	2e2000f352	[Model] Add LFM2 architecture (#22845) Signed-off-by: Paul Pak <paulpak58@gmail.com>	2025-08-21 09:35:07 +02:00
22quinn	f571ff8eb6	[Sampler] Support returning final logprobs (#22387) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-20 21:28:32 -07:00
杨奇(yann qi)	655a09f653	[Model][VLM] Support R-4B Model (#23246) Signed-off-by: yannqi <yannqi@qq.com> Signed-off-by: 杨奇(yann qi) <51905299+yannqi@users.noreply.github.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: yannqiyang <yannqiyang@tencent.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-08-21 04:08:52 +00:00
Asaf Joseph Gardin	3663870c72	[V1][Mamba1] - Full CUDA and Piecewise CUDA Graphs Support (#23035) Signed-off-by: asafg <asafg@ai21.com> Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com> Co-authored-by: asafg <asafg@ai21.com>	2025-08-20 20:08:51 -07:00
Cyrus Leung	5efd6905bc	[CLI][Doc] Formalize `--mm-encoder-tp-mode` (#23190) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-20 23:42:28 +08:00
Jee Jee Li	c6d80a7a96	[Model] Improve olmo and olmo2 (#23228) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-20 12:47:05 +00:00
Shiming Zhang	3aa8c10038	Fix missing quotes (#23242) Signed-off-by: Shiming Zhang <wzshiming@hotmail.com>	2025-08-20 10:46:59 +00:00
Cyrus Leung	64ab3c7253	[Doc] Update V1 status of various pooling models (#23189) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-20 10:33:41 +08:00
Michael Goin	21dce80ea9	[CI/Build] Add support for Python 3.13 (#13164) Signed-off-by: mgoin <michael@neuralmagic.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-08-19 13:49:34 -07:00
myselvess	b87cb97a53	[Model] support new model ovis2.5 (#23084) Signed-off-by: myselvess <244285088@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-08-19 13:12:59 +00:00
Tialo	2c3f557f08	[Doc] use power of 2 (#23172)	2025-08-19 03:16:23 -07:00
Jiangyun Zhu	fda9537c5e	[Model] Support Pipeline Parallelism for moonshotai/Kimi-VL-A3B-Thinking-2506 (#23114) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-08-19 14:24:31 +08:00
Cyrus Leung	27e8d1ea3e	[Refactor] Define MultiModalKwargsItems separate from MultiModalKwargs (#23053) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-18 09:52:00 +00:00
Kevinzz	16bff144be	[Misc] fix typo in the multimodal doc (#23051)	2025-08-17 01:56:20 -07:00
Michael Goin	4fc722eca4	[Kernel/Quant] Remove AQLM (#22943) Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-08-16 19:38:21 +00:00
汪志鹏	829bbd7882	[New Model]mBART model (#22883) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-08-16 12:16:58 +00:00
bnellnm	8ad7285ea2	[Kernels] Clean up FusedMoeMethodBase and modular kernel setup. Remove extra arguments from modular kernel methods. (#22035) Signed-off-by: Bill Nell <bnell@redhat.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-08-15 14:46:00 -04:00
Csrayz	a0632a3e03	[Frontend] Expose do_log_stats interval to env (#22905) Signed-off-by: Csrayz <jover@cmbchina.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-15 13:00:20 +00:00
Nir	637093ae26	docs: update fastsafetensors usage instructions (#22891) Signed-off-by: Nir Levy <bhr166@gmail.com>	2025-08-14 19:56:54 +00:00
Jee Jee Li	92ff41abea	[Model] Modify the gate implementation of glm4_moe (#22832) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-14 05:28:50 -07:00
Daniele	0783f13960	[Doc] fix dead link (#22898) Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>	2025-08-14 04:06:13 -07:00
Louie Tsai	00e3f9da46	vLLM Benchmark suite improvement (#22119) Signed-off-by: Tsai, Louie <louie.tsai@intel.com> Signed-off-by: Louie Tsai <louie.tsai@intel.com> Co-authored-by: Li, Jiang <bigpyj64@gmail.com>	2025-08-14 07:12:17 +00:00
633WHU	3f52738dce	[Doc] Add max_lora_rank configuration guide (#22782) Signed-off-by: chiliu <cliu_whu@yeah.net>	2025-08-13 04:10:07 -07:00
Michael Goin	e18859298d	Add hardware plugins to installation doc (#22732) Signed-off-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-12 17:14:46 -07:00
Jee Jee Li	fde0b611a3	[Model] Decouple glm4v (#22751) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-12 17:13:17 -07:00
Harry Mellor	45c3936e94	[Docs] Hide the navigation and toc sidebars on home page (#22749) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-12 17:12:26 -07:00

1364 Commits