youngkingdom/vllm - vllm

Author	SHA1	Message	Date
Maximilien de Bayser	05a4324f8e	Initialize the delta tool call fields explicitly (#17340 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: igmainc <igmainc@icloud.com>	2025-05-12 13:28:58 +00:00
Jee Jee Li	7ea6cb28b2	[Misc] Improve modelscope import error (#17983 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-05-12 10:46:45 +00:00
Aaruni Aggarwal	9fbf2bfbd5	Correcting testcases in buildkite job for IBM Power (#17675 ) Signed-off-by: Aaruni Aggarwal <aaruniagg@gmail.com>	2025-05-12 08:11:55 +00:00
Xu Wenqing	3a5ea75129	[Feature] Support DeepSeekV3 Function Call (#17784 ) Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com> Signed-off-by: Xu Wenqing <xuwq1993@qq.com>	2025-05-12 00:45:21 -07:00
Brayden Zhong	891b9d33de	[Fix] Benchmark `"EngineClient" has no attribute "model_config"` (#17976 ) Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-11 22:55:53 -07:00
Siyuan Liu	430783018c	[Bugfix][TPU] Use np array when updating cache slot_mapping (#17971 ) Signed-off-by: Siyuan Liu <lsiyuan@google.com>	2025-05-12 12:58:33 +08:00
Li Wang	19a3c78d1f	[Bugfix] Fix pydantic.errors.PydanticUserError (#17962 ) Signed-off-by: wangli <wangli858794774@gmail.com>	2025-05-12 12:58:23 +08:00
Reid	ada50aa295	[bugfix] fix the wrong parser (#17958 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-12 04:58:02 +00:00
Cheng Kuan Yong Jason	08bf784078	[Bugfix] validate grammar and throw 400 error instead of crashing the engine when xgrammar validation fails (#17623 ) Signed-off-by: Jason Cheng <jasoncky96@gmail.com> Co-authored-by: Russell Bryant <rbryant@redhat.com>	2025-05-12 09:06:10 +08:00
youkaichao	d45fe333fb	[misc] add instructions on how to install nvshmem/pplx/deepep (#17964 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-05-11 18:02:39 -07:00
Isotr0py	021c16c7ca	[Model] Broadcast Ovis2 implementation to fit Ovis1.6 (#17861 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-11 17:56:30 -07:00
TJian	7de18d541b	[BUG] [ROCm] [MLA] Fix variable name bug due to change in variable name in PR #17483 (#17961 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-05-11 09:14:30 -07:00
TJian	a810b5b088	[BugFix] [ROCm]: Bugfix and handle addition case of input for `rocm_aiter_rms_norm` (#17857 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-05-11 04:17:11 -07:00
Reid	009b3d5382	[Misc] not show --model in vllm serve --help (#16691 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-11 08:47:58 +00:00
wang.yuqi	e4b8713380	[New Model]: nomic-embed-text-v2-moe (#17785 )	2025-05-11 00:59:43 -07:00
Gregory Shtrasberg	06c0922a69	[FP8][ROCm][Attention] Enable FP8 KV cache on ROCm for V1 (#17870 ) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>	2025-05-11 15:58:45 +08:00
Dipika Sikka	cd3edfc908	[Misc] Add compressed-tensors NVFP4A16 emulation support (#17914 ) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com> Signed-off-by: Dipika <dipikasikka1@gmail.com>	2025-05-11 15:58:38 +08:00
Frieda Huang	9cea90eab4	[Frontend] Add /classify endpoint (#17032 ) Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>	2025-05-11 07:57:07 +00:00
Reid	d1110f5b5a	[doc] update lora doc (#17936 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-11 15:56:21 +08:00
Ben Browning	8132365b74	[Bugfix]: v1 engine - consider lora adapters in allowed_token_ids (#17855 ) Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-05-11 00:53:58 -07:00
Shiyan Deng	eea22a56ab	fix amd triton mla path (#17871 )	2025-05-11 07:53:31 +00:00
Kuntai Du	9112155283	[Perf] Use small max_num_batched_tokens for A100 (#17885 ) Signed-off-by: KuntaiDu <kuntai@uchicago.edu>	2025-05-11 07:53:23 +00:00
xinli-centml	90d0a74b60	[Bugfix] Add revision to `transformers.Auto*.from_pretrained` processors (#17948 ) Signed-off-by: Xin Li <xin@centml.ai>	2025-05-11 07:52:44 +00:00
Jinzhen Lin	d74e5f37bc	[Kernel] fp4 marlin kernel (#17687 ) Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>	2025-05-10 19:58:49 -07:00
Chen Zhang	ca66a1674c	[v1] Rename specialized_manager.py to single_type_kv_cache_manager.py (#17946 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-05-10 16:14:12 -07:00
Chen Zhang	950751a987	[v1] Pass BlockTable and KVCacheSpec to AttentionMetadataBuilders (#17483 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-05-10 16:12:04 -07:00
Reid	4c31218f80	[Misc] remove --model from vllm serve usage (#17944 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-10 13:23:31 +00:00
Harry Mellor	68311891f5	Don't default construct `ModelConfig` when default constructing `VllmConfig` (#17943 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-10 13:23:00 +00:00
Ximo Guanter	fc4441a4ee	Add missing content type headers to /ping and /health (#17036 ) (#17786 ) Signed-off-by: Ximo Guanter <ximo.guanter@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-10 07:13:32 +01:00
tracelogfb	246e3e0a36	fix broken test vllm:test_kernels - test_attention_selector.py::test_flash_attn (#17873 ) Co-authored-by: Stephen Chen <tracelog@meta.com>	2025-05-10 10:46:54 +08:00
Mark McLoughlin	7042cc96b0	[V1][Spec Decoding] Log accumulated metrics after system goes idle (#17913 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-09 18:23:07 -07:00
Pavani Majety	0c0fdae84f	[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362 )	2025-05-09 16:24:41 -07:00
Alexei-V-Ivanov-AMD	3b602cdea7	AMD conditional all test execution // new test groups (#17556 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com> Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>	2025-05-09 15:35:58 -07:00
Harry Mellor	4b2ed7926a	Improve configs - the rest! (#17562 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 15:18:44 -07:00
Mark McLoughlin	7e3571134f	[V1][Spec Decoding] Include bonus tokens in mean acceptance length (#17908 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-09 13:32:36 -07:00
Richard Zou	ea2236bf95	Add option to use torch._inductor.standalone_compile (#17057 ) Signed-off-by: rzou <zou3519@gmail.com>	2025-05-09 12:59:04 -07:00
Harry Mellor	7d4aedae7c	Handle error when `str` passed to `/v1/audio/transcriptions` (#17909 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 19:23:59 +00:00
Michael Goin	22481fbfa3	Update CT WNA16MarlinMoE integration (#16666 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-09 13:19:45 -04:00
Isotr0py	5c4c08f6f1	[Misc] Auto fallback to float16 for pre-Ampere GPUs when detected bfloat16 config (#17265 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-09 17:16:12 +00:00
Rui Qiao	c44c384b1c	[Misc] Add references in ray_serve_deepseek example (#17907 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-05-09 16:59:36 +00:00
Michael Goin	85b72cb7b1	Revert "[BugFix][AMD] Compatible patch for latest AITER(05/07/2025)" (#17910 )	2025-05-09 08:58:18 -07:00
Cyrus Leung	6e5595ca39	[CI/Build] Automatically retry flaky tests (#17856 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-09 09:55:17 -06:00
Chen Zhang	200da9a517	[v1] Move block management logic from KVCacheManager to SpecializedManager (#17474 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-05-09 15:25:34 +00:00
qli88	9f64e93415	[BugFix][AMD] Compatible patch for latest AITER(05/07/2025) (#17864 ) Signed-off-by: Qiang Li <qiang.li2@amd.com>	2025-05-09 08:59:36 -06:00
Reid	ec61ea20a8	[Misc] add dify integration (#17895 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-09 03:42:39 -07:00
Harry Mellor	c6798baa9c	Change `top_k` to be disabled with `0` (still accept `-1` for now) (#17773 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 10:01:49 +00:00
inkcherry	5b2dcbf0b8	Fix Whisper crash caused by invalid `max_num_batched_tokens` config (#17853 ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>	2025-05-09 09:16:26 +00:00
Isotr0py	6e4a93e3f7	[Bugfix][CPU] Fix broken AVX2 CPU TP support (#17252 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-09 08:55:14 +00:00
vllmellm	217db4baa6	[Bugfix][ROCm] Fix AITER MLA V1 (#17880 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-05-09 08:38:21 +00:00
Yan Ma	ff8c400502	[Doc] remove visible token in doc (#17884 ) Signed-off-by: yan <yanma1@habana.ai>	2025-05-09 01:21:31 -07:00

6768 Commits