youngkingdom/vllm - vllm

Author	SHA1	Message	Date
Navanit Dubey	3e7506975c	[DOC] Add reasoning capability to vLLM streamlit code (#19557)	2025-06-16 07:09:12 -04:00
Aaron Pham	7b3c9ff91d	[Doc] uses absolute links for structured outputs (#19582) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>	2025-06-13 03:35:17 +00:00
Aaron Pham	dba68f9159	[Doc] Unify structured outputs examples (#18196) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>	2025-06-12 22:50:31 +00:00
Ekagra Ranjan	017ef648e9	[Spec Decode][Benchmark] Generalize spec decode offline benchmark to more methods and datasets (#18847)	2025-06-12 10:30:56 -07:00
niu_he	dff680001d	Fix typo (#19525) Signed-off-by: 2niuhe <carlton2tang@gmail.com>	2025-06-12 09:24:45 +00:00
runzhen	943ffa5703	[Bugfix] Update the example code, make it work with the latest lmcache (#19453) Signed-off-by: Runzhen Wang <wangrunzhen@gmail.com>	2025-06-11 12:42:20 +00:00
wang.yuqi	3952731e8f	[New Model]: Support Qwen3 Embedding & Reranker (#19260)	2025-06-10 20:07:30 -07:00
Reid	6b1391ca7e	[Misc] refactor neuron_multimodal and profiling (#19397) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-06-10 06:12:42 +00:00
Reid	122cdca5f6	[Misc] refactor context extension (#19246) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-06-07 05:13:21 +00:00
jmswen	c8dcc15921	Allow AsyncLLMEngine.generate to target a specific DP rank (#19102) Signed-off-by: Jon Swenson <jmswen@gmail.com>	2025-06-04 08:26:47 -07:00
Xu Wenqing	02658c2dfe	Add DeepSeek-R1-0528 function call chat template (#18874) Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>	2025-06-04 13:24:18 +00:00
汪志鹏	3336c8cfbe	Fix #19130 (#19132) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-06-04 01:42:06 -07:00
Calvin Chen	8d646c2e53	[Cleanup][v1]:remote guided-decoding-backend for example (#19059) Signed-off-by: calvin chen <120380290@qq.com>	2025-06-04 04:23:26 +00:00
Jiaxin Shan	abd7df2fca	[Misc] Fix path and python alias errors in disagg_prefill exmaples (#18919)	2025-06-03 17:15:18 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
汪志鹏	1282bd812e	Add tarsier model support (#18985) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-06-03 13:13:13 +08:00
Siyuan Liu	9112b443a0	[Hardware][TPU] Initial support of model parallelism with single worker using SPMD (#18011) Signed-off-by: Siyuan Liu <lsiyuan@google.com> Co-authored-by: Hossein Sarshar <hossein.sarshar@gmail.com> Co-authored-by: Chengji Yao <chengjiyao@google.com>	2025-06-03 00:06:20 +00:00
Calvin Chen	c57d577e8d	add an absolute path for run.sh (#18258) Signed-off-by: calvin chen <120380290@qq.com>	2025-06-02 19:38:23 +00:00
Nick Hill	9a1b9b99d7	[BugFix] Fix multi-node offline data-parallel (#18981) Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>	2025-05-31 08:34:52 -07:00
Satyajith Chilappagari	2a50ef5760	[Neuron] Add Multi-Modal model support for Neuron (#18921) Signed-off-by: Satyajith Chilappagari <satchill@amazon.com> Co-authored-by: Ashraf Mahgoub <ashymahg@amazon.com> Co-authored-by: Rohith Nallamaddi <nalrohit@amazon.com> Co-authored-by: FeliciaLuo <luof@amazon.com> Co-authored-by: Elaine Zhao <elaineyz@amazon.com>	2025-05-31 10:39:11 +00:00
Mark McLoughlin	0e98964e94	[V1][Metrics] Remove metrics that were deprecated in 0.8 (#18837) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-28 18:54:12 +00:00
Reid	435fa95444	[Frontend] add run batch to CLI (#18804) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-28 07:08:57 -07:00
wang.yuqi	3e9ce609bd	[Bugfix] Fix nomic max_model_len (#18755)	2025-05-27 20:29:53 -07:00
Mark McLoughlin	06a0338015	[V1][Metrics] Add API for accessing in-memory Prometheus metrics (#17010) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-27 09:37:06 +00:00
Reid	fc6d0c290f	[Misc] improve docs (#18734) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-27 07:07:01 +00:00
Cyrus Leung	753944fa9b	[Doc] Update reproducibility doc and example (#18741) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-27 07:03:13 +00:00
Harry Mellor	27bebcd897	Convert `examples` to `ruff-format` (#18400) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-26 16:57:54 +00:00
Cyrus Leung	82e2339b06	[Doc] Move examples and further reorganize user guide (#18666) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-26 07:38:04 -07:00
AlexZhao	8820821b59	[Misc] Fixed the abnormally high TTFT issue in the PD disaggregation example (#18644) Signed-off-by: zhaohaidao <zhaohaidao2008@hotmail.com> Signed-off-by: zhaohaiyuan <zhaohaiyuan@xiaohongshu.com> Co-authored-by: zhaohaiyuan <zhaohaiyuan@xiaohongshu.com>	2025-05-26 13:51:27 +08:00
Isotr0py	75f81750f3	[VLM] Initialize video input support for InternVL models (#18499) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-05-25 04:51:25 +00:00
Feng XiaoLong	4fc1bf813a	[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454) Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com> Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>	2025-05-23 16:16:26 -07:00
Chenheli Hua	04eb88dc80	Re-submit: Fix: Proper RGBA -> RGB conversion for PIL images. (#18569) Signed-off-by: Chenheli Hua <huachenheli@outlook.com>	2025-05-23 01:59:18 +00:00
Sanger Steel	c32e249a23	[Frontend] [Core] Add Tensorizer support for V1, LoRA adapter serialization and deserialization (#17926) Signed-off-by: Sanger Steel <sangersteel@gmail.com>	2025-05-22 18:44:18 -07:00
Kai Wu	c91fe7b1b9	[Frontend][Bug Fix] Update llama4 pythonic jinja template and llama4_pythonic parser (#17917) Signed-off-by: Kai Wu <kaiwu@meta.com>	2025-05-22 16:44:08 -07:00
Reid	cb506ecb5a	[Misc] improve Automatic Prefix Caching example (#18554) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-22 14:50:46 +00:00
Calvin Chen	3f505233fd	[Doc] Add stream flag for chat completion example (#18524) Signed-off-by: calvin chen <120380290@qq.com>	2025-05-22 14:07:10 +00:00
CYJiang	71075029f2	[Doc] Support --stream arg in openai_completion_client.py script (#18388) Signed-off-by: googs1025 <googs1025@gmail.com>	2025-05-22 13:20:17 +00:00
Reid	107f5fc4cb	[Misc] refactor disaggregated-prefill-v1 example (#18474) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-21 11:10:14 +00:00
Reid	8f55962a7f	[Misc] refactor prompt embedding examples (#18405) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-20 15:26:12 +00:00
Gong Shufan	8171221834	[Misc] Fix typo (#18330)	2025-05-19 09:51:01 -07:00
Reid	27d0952600	[Misc] extract parser.parse_args() (#18323) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-19 04:06:26 +00:00
David Xia	5c04bb8b86	[doc] fix multimodal example script (#18089) Signed-off-by: David Xia <david@davidxia.com>	2025-05-16 06:05:34 +00:00
Lucia Fang	3d2779c29a	[Feature] Support Pipeline Parallism in torchrun SPMD offline inference for V1 (#17827) Signed-off-by: Lucia Fang <fanglu@fb.com>	2025-05-15 22:28:27 -07:00
Harry Mellor	51ff154639	Improve examples rendering in docs and GitHub (#18203) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-15 15:57:49 +00:00
omahs	a9944aabfa	fix: typos (#18151) Signed-off-by: omahs <73983677+omahs@users.noreply.github.com>	2025-05-15 02:16:15 -07:00
bnellnm	f9c069c85e	Modularize fused experts and integrate PPLX kernels (#15956)	2025-05-14 13:11:54 -07:00
Ekagra Ranjan	418d2f8bfb	[V1][Spec Decode] Share input embedding of target model with EAGLE draft model to free ~1GB for llama 3 model (#17326) Co-authored-by: root <root@ekagra-8xh100.us-east5-a.c.serving-efficiency-poc.internal> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-05-14 12:31:46 -07:00
majianpeng	e7ef61c1f0	[Bugfix][Example] make lmcache v0 work. (#18051) Signed-off-by: Ma, Jianpeng <jianpeng.ma@intel.com>	2025-05-13 23:43:44 -07:00
Ecthlion_zyy	33011318c2	Fix broken example: examples/offline_inference/profiling at scheduler_config (#18117)	2025-05-13 23:19:14 -07:00
Tao He	60f7624334	Implements dual-chunk-flash-attn backend for dual chunk attention with sparse attention support (#11844)	2025-05-12 19:52:47 -07:00

464 Commits