youngkingdom/vllm - vllm

Author	SHA1	Message	Date
Aaron Pham	c29fb540ff	[gpt-oss] tool parser supports for /chat/completions [1/n] (#22386 ) Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: Simon Mo <simon.mo@hey.com>	2025-09-04 20:39:12 -07:00
mgazz	51d5e9be7d	[Core][Model] Terratorch backend integration (#23513 ) Signed-off-by: Michele Gazzetti <michele.gazzetti1@ibm.com> Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Co-authored-by: Christian Pinto <christian.pinto@ibm.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-09-04 00:22:41 -07:00
Flora Feng	712b273f65	[Refactor] Introduce basic Renderer for completion-style request (#24010 ) Signed-off-by: sfeng33 <4florafeng@gmail.com>	2025-09-04 05:21:12 +00:00
wuhang	a38f8bd54c	[Feature][Responses API]Support MCP tools with streaming mode + background mode (#23927 ) Signed-off-by: wuhang <wuhang6@huawei.com>	2025-09-04 04:05:10 +00:00
wang.yuqi	51383bd472	[CI] Accelerate mteb test by setting SentenceTransformers mteb score to a constant (#24088 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-09-03 17:23:56 +08:00
dsinghvi	70549c1245	[CI/Build] Serve images used by multimodal tests through local HTTP Server (#23907 ) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-09-03 16:13:11 +08:00
Didier Durand	d7e1e59972	[Doc]: fix typos in Python comments (#24093 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-09-02 21:05:45 -07:00
Mark McLoughlin	2417798471	[Metrics] Deprecate TPOT in favor of ITL (#24110 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-09-02 18:10:10 +00:00
Chenheli Hua	f399182e8c	Run ruff format on a few files. (#24075 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com>	2025-09-02 17:55:32 +00:00
Aziz	ce30dca5c4	[CI]: reduce HTTP calls inside entrypoints openai tests (#23646 ) Signed-off-by: AzizCode92 <azizbenothman76@gmail.com> Signed-off-by: Aziz <azizbenothman76@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-02 10:49:32 +00:00
Nicolò Lucchesi	d46934b229	[Frontend] Gemma3n audio `transcriptions`/`translations` endpoint (#23735 ) Signed-off-by: NickLucche <nlucches@redhat.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-09-01 18:07:46 +08:00
Jee Jee Li	628d00cd7b	[Bugfix] Fix test_lora_resolvers.py (#23984 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-30 11:16:11 +00:00
dubejf	5b31cb1781	[Bugfix] Fix --config arg expansion called from api_server.py (#23944 ) Signed-off-by: Jean-Francois Dube <dubejf+gh@gmail.com> Co-authored-by: Jean-Francois Dube <dubejf+gh@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-08-29 21:36:39 -07:00
22quinn	4d7fe40fc0	[RL][BugFix] Fix missing tokenizer error for token-in-token-out (#23904 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-08-30 01:09:55 +08:00
wang.yuqi	d9e00dbd1f	[Performance] V1 Classify Models E2E Performance Optimization (#23541 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-29 03:12:32 -07:00
Maximilien de Bayser	2554b27baa	[V0 Deprecation] Remove pooling model support in V0 (#23434 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-29 00:04:02 -07:00
Jee Jee Li	b4f9e9631c	[CI/Build] Clean up LoRA test (#23890 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-28 23:28:35 -07:00
Russell Bryant	c8b3b299c9	[tests] Improve speed and reliability of test_transcription_api_correctness (#23854 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-08-29 04:25:33 +00:00
Chen Zhang	eb1995167e	[gpt-oss] Enable unit test for response API harmony integration (#23533 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-26 18:23:26 -07:00
Guillaume Calmettes	ebd5a77bb5	feat: add usage to TranscriptionResponse (text and json response_format) (#23576 ) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>	2025-08-26 05:26:26 -07:00
Cyrus Leung	ce0e9dbd43	[CI/Build] Fix typo in #23561 (#23616 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-25 23:13:03 -07:00
Cyrus Leung	6fd45e7b8a	[CI/Build] Use vLLM client's user agent to fetch images (#23561 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-25 19:34:12 -07:00
TeeKen Lau	5e021b4981	(Misc): add missing test for zero truncation size. (#23457 ) Signed-off-by: teekenl <teekenlau@gmail.com>	2025-08-24 18:12:47 +08:00
Aziz	d9a55204ba	fix(tests): Correct unreachable assertion in truncation test (#23425 ) Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>	2025-08-23 05:23:54 +00:00
Cyrus Leung	8896eb72eb	[Deprecation] Remove `prompt_token_ids` arg fallback in `LLM.generate` and `LLM.embed` (#18800 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-22 10:56:57 +08:00
bigmoyan	582bbe6bd7	[Fix] correct tool_id for kimi-k2 when use tool_choice=required (#21259 ) Co-authored-by: wangzhengtao <wangzhengtao@msh.team>	2025-08-20 12:59:54 -07:00
rongfu.leng	38217877aa	[Fix] fix offline env use local mode path (#22526 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-08-20 13:34:49 +00:00
Nick Hill	8fd920924c	[BugFix] Fix stuck stats/metrics after requests are aborted (#22995 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-20 13:50:29 +08:00
Marko Rosenmueller	80141bbf2f	fix: use cache_salt for gpt-oss (#23186 ) Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>	2025-08-19 18:12:25 +00:00
22quinn	f7cf5b512e	[Frontend] Add `/collective_rpc` API endpoint (#23075 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-08-19 17:29:32 +00:00
Yuge Zhang	24f4d1a224	Add return_token_ids parameter to OpenAI API endpoints (#22587 ) Signed-off-by: Yuge Zhang <scottyugochang@gmail.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Simon Mo <simon.mo@hey.com>	2025-08-19 09:48:31 -07:00
Michael Goin	3253ae765e	[Flaky CI] Increase timeout tolerance for test_mp_crash_detection+test_default_mm_lora_chat_completions (#23028 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-08-16 18:33:08 +00:00
Woonggi Min	68373d3126	[Frontend] Added support for HermesToolParser for models without special tokens (#16890 ) Signed-off-by: minpeter <kali2005611@gmail.com>	2025-08-16 17:38:42 +00:00
Andrew Sansom	78863f8c5c	[BugFix] Add support for loading prompt embeds tensors serialized on unavailable devices and sparse tensors (#22962 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-08-16 06:25:10 +00:00
Michael Goin	8a87cd27d9	[CI] Speed up Whisper tests by reusing server (#22859 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-08-15 16:56:31 -04:00
Nicolò Lucchesi	540d54ca8d	[CI] Re-enable transcriptions `test_long_audio_request` (#22890 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-08-14 11:34:34 +00:00
Robert Shaw	a353bd083d	[CI] remove flaky v0 test (#22864 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <robshaw@redhat.com>	2025-08-13 21:41:51 -07:00
Will Eaton	b6af24fba7	[CI][Entrypoints]: add filter to generation to filter out invalid tool calls (#22826 ) Signed-off-by: Will Eaton <weaton@redhat.com>	2025-08-13 20:09:07 -07:00
Kdump	653124bd46	[Frontend] Add chunked processing to handle long inputs in embedding models (#22280 ) Signed-off-by: x22x22 <wadeking@qq.com> Signed-off-by: Kdump <rootshellexp@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-13 04:14:24 -07:00
Woosuk Kwon	71683ca6f6	[V0 Deprecation] Remove multi-step scheduling (#22138 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-08-12 20:18:39 -07:00
Michael Goin	ea1292ad3e	[CI Failure] Use float32 for tests/entrypoints/openai/test_audio.py (#22686 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-08-11 20:20:42 -07:00
Harry Mellor	839ab00349	Re-enable Xet on TPU tests now that `hf_xet` has been updated (#22666 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-11 19:54:40 -07:00
Chen Zhang	1891a265d3	[gpt-oss] Add test for response API + harmony (but skipped) (#22554 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-11 17:47:24 -07:00
wang.yuqi	84cf78acee	[Model] Pooling models default to using chunked prefill & prefix caching if supported. (#20930 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-11 09:41:37 -07:00
Maximilien de Bayser	39052dbca8	Support token_type_ids in V1 with less code changes (#21985 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2025-08-10 22:54:59 -07:00
22quinn	b799f4b9ea	[CI/Build] Fix tensorizer test for load_format change (#22583 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-08-10 19:30:00 -07:00
Russell Bryant	311d875614	Drop flaky test_healthcheck_response_time (#22539 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-08-08 16:56:47 -07:00
yyweiss	baece8c3d2	[Frontend] Add unix domain socket support (#18097 ) Signed-off-by: <yyweiss@gmail.com> Signed-off-by: yyw <yyweiss@gmail.com>	2025-08-08 16:23:44 -07:00
Moritz Sanft	370661856b	[Frontend] Update OpenAI error response to upstream format (#22099 ) Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>	2025-08-06 23:06:00 -07:00
wang.yuqi	586f286789	[Model] Pooling model activation supports per request control by PoolingParams (#20538 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-05 00:37:00 -07:00

421 Commits