youngkingdom/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Jee Jee Li	46c759c165	[Bugfix] Fix LoRA extra vocab size (#15047 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-03-18 09:40:29 -07:00
Patrick von Platen	f863ffc965	[Mistral-Small 3.1] Update docs and tests (#14977 ) Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2025-03-18 03:29:42 -07:00
Cyrus Leung	6eaf1e5c52	[Misc] Add `--seed` option to offline multi-modal examples (#14934 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-17 03:00:17 -07:00
Nick Hill	b82662d952	[BugFix] Fix torch distributed stateless PG backend init (#14870 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-03-15 20:26:19 -07:00
Rémi Delacourt	61c6a5a796	[VLM] Merged multi-modal processor for Pixtral (#12211 ) Signed-off-by: remi <remi@mistral.ai> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-15 06:28:27 -07:00
Bryan Lu	9ed6ee92d6	[Bugfix] EAGLE output norm bug (#14464 ) Signed-off-by: Bryan Lu <yuzhelu@amazon.com>	2025-03-15 06:50:33 +00:00
WeiCheng	54cc46f3eb	[Bugfix] Fix small typo in the example of Streaming delimiter (#14793 )	2025-03-14 08:05:17 +00:00
yasu52	3fb17d26c8	[Doc] Fix typo in documentation (#14783 ) Signed-off-by: yasu52 <tsuguro4649@gmail.com>	2025-03-13 20:33:09 -07:00
Cyrus Leung	382403921f	[VLM] Support pan-and-scan for Gemma3 multi-modal processor (#14672 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by: Roger Wang <ywang@roblox.com>	2025-03-13 02:23:12 -07:00
Woosuk Kwon	c0c25e25fa	[Model] Add support for Gemma 3 (#14660 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Roger Wang <ywang@roblox.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-12 08:36:33 -07:00
Isotr0py	63d635d179	[Misc] Correct deepseek-vl2 chat template (#14558 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-03-11 04:37:11 +00:00
Harry Mellor	3b352a2f92	Correct capitalisation: `VLLM` -> `vLLM` (#14562 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-03-10 16:36:21 +00:00
Chengji Yao	212007b168	[Hardware][TPU] Fix the recompiling issue in logits processor after warmup (#14510 ) Signed-off-by: Chengji Yao <chengjiyao@google.com>	2025-03-09 05:44:39 -04:00
Isotr0py	03fe18ae0f	[VLM] Add TP support for Phi-4-MM (#14453 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-03-08 05:57:14 -08:00
Jee Jee Li	952a074980	[Misc] Add Phi4-MM example (#14343 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-03-07 17:28:52 +00:00
Tyler Michael Smith	cc2f9b32c8	[Distributed] Add enable_expert_parallel arg (#14305 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-03-06 18:54:45 +00:00
youkaichao	151b08e0fe	[RLHF] use worker_extension_cls for compatibility with V0 and V1 (#14185 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-03-07 00:32:46 +08:00
Yanyi Liu	0ddc991f5c	[Doc] Update reasoning with stream example to use OpenAI library (#14077 ) Signed-off-by: liuyanyi <wolfsonliu@163.com>	2025-03-06 13:20:37 +00:00
Nicolò Lucchesi	fa82b93853	[Frontend][Docs] Transcription API streaming (#13301 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-03-06 10:39:35 +00:00
Ce Gao	f5f7f00cd9	[Bugfix][Structured Output] Support outlines engine with reasoning outputs for DeepSeek R1 (#14114 )	2025-03-06 03:49:20 +00:00
Vincent	a4f1ee35d6	Deprecate `best_of` Sampling Parameter in anticipation for vLLM V1 (#13997 ) Signed-off-by: vincent-4 <vincentzhongy+githubvincent4@gmail.com> Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-03-05 20:22:43 +00:00
Isotr0py	f71b00a19e	[Bugfix] Fix broken vision language example (#14292 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-03-05 15:57:10 +00:00
Tyler Michael Smith	72c62eae5f	[V1] EP/TP MoE + DP Attention (#13931 )	2025-03-04 21:27:26 -08:00
lkchen	b3cf368d79	[V1][Molmo] Fix get_multimodal_embeddings() in molmo.py (#14161 )	2025-03-04 15:43:59 +00:00
Harry Mellor	cf069aa8aa	Update deprecated Python 3.8 typing (#13971 )	2025-03-02 17:34:51 -08:00
Ce Gao	bf33700ecd	[v0][structured output] Support reasoning output (#12955 ) Signed-off-by: Ce Gao <cegao@tensorchord.ai>	2025-03-02 14:49:42 -05:00
Isotr0py	fdcc405346	[Doc] Consolidate `whisper` and `florence2` examples (#14050 )	2025-02-28 22:49:15 -08:00
Isotr0py	edf309ebbe	[VLM] Support multimodal inputs for Florence-2 models (#13320 )	2025-02-27 02:06:41 -08:00
Chauncey	10c3b8c1cf	[Misc] fixed 'required' is an invalid argument for positionals (#13948 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-02-27 09:06:49 +00:00
Chauncey	d08b285adf	[Misc] fixed qwen_vl_utils parameter error (#13906 )	2025-02-26 08:31:53 -08:00
Albert	e656f638de	[Doc] fix the incorrect module path of tensorize_vllm_model (#13863 )	2025-02-25 22:56:19 -08:00
Jiayi Yao	2f42a4888c	[Feature] Support KV cache offloading and disagg prefill with LMCache connector. (#12953 )	2025-02-25 00:38:42 -08:00
Roger Meier	7940d8a6a7	[CI/Build] add python-json-logger to requirements-common (#12842 )	2025-02-24 06:10:33 -08:00
youkaichao	2382ad29d1	[ci] fix linter (#13701 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-22 20:28:59 +08:00
youkaichao	3e472d882a	[core] set up data parallel communication (#13591 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-22 19:28:59 +08:00
John Zheng	900edbfa48	fix typo of grafana dashboard, with correct datasource (#13668 ) Signed-off-by: John Zheng <john.zheng@hp.com>	2025-02-21 18:21:05 +00:00
Edwin Hernandez	981f3c831e	[Misc] Adding script to setup ray for multi-node vllm deployments (#12913 )	2025-02-20 21:16:40 -08:00
Joe Runde	bfbc0b32c6	[Frontend] Add backend-specific options for guided decoding (#13505 ) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>	2025-02-20 15:07:58 -05:00
Harry Mellor	992e5c3d34	Merge similar examples in `offline_inference` into single `basic` example (#12737 )	2025-02-20 04:53:51 -08:00
Cyrus Leung	377d10bd14	[VLM][Bugfix] Pass processor kwargs properly on init (#13516 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-02-19 13:13:50 +00:00
Roger Wang	b7d309860e	[V1] Update doc and examples for H2O-VL (#13349 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2025-02-16 10:35:54 +00:00
XiaobingZhang	84683fa271	[Bugfix] Offline example of disaggregated prefill (#13214 )	2025-02-13 20:20:47 -08:00
Nicolò Lucchesi	d84cef76eb	[Frontend] Add `/v1/audio/transcriptions` OpenAI API endpoint (#12909 )	2025-02-13 07:23:45 -08:00
Cyrus Leung	1bc3b5e71b	[VLM] Separate text-only and vision variants of the same model architecture (#13157 )	2025-02-13 06:19:15 -08:00
Michael Goin	d88c8666a1	[Bugfix][Example] Fix GCed profiling server for TPU (#12792 ) Signed-off-by: mgoin <michael@neuralmagic.com>	2025-02-13 11:52:11 +08:00
Christian Pinto	974dfd4971	[Model] IBM/NASA Prithvi Geospatial model (#12830 )	2025-02-11 20:34:30 -08:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟	6c4dbe23eb	[BugFix] Pop instead of del CUDA_VISIBLE_DEVICES (#12962 ) Signed-off-by: Hollow Man <hollowman@opensuse.org>	2025-02-12 00:21:50 +08:00
Ce Gao	fc6485d277	[Bugfix]: Reasoning output bug according to the chat template change (#13025 ) Signed-off-by: Ce Gao <cegao@tensorchord.ai>	2025-02-11 15:49:03 +08:00
Farzad Abdolhosseini	08b2d845d6	[Model] Ultravox Model: Support v0.5 Release (#12912 ) Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>	2025-02-10 22:02:48 +00:00
youkaichao	aa0ca5ebb7	[core][rlhf] add colocate example for RLHF (#12984 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-10 10:28:59 +08:00

302 Commits