|
|
910abdbd08
|
[Bugfix] fixed top_logprobs: -1 does not appear to work as intended (#26470)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-11 00:41:17 +08:00 |
|
|
|
1e6848a65d
|
[CI] fix test_run_batch.py::test_completions - AssertionError (#26578)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-10 22:16:28 +08:00 |
|
|
|
720d3cd0f0
|
[CI] fix ruff format (#26579)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-10 03:02:12 -07:00 |
|
|
|
ab196edefb
|
Remove LoRA bias support (#25807)
Signed-off-by: Ashwin Phadke <ashwinphadke12@rediffmail.com>
Signed-off-by: Ashwin Phadke <23502062+ashwin-phadke@users.noreply.github.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-10 09:50:33 +00:00 |
|
|
|
3ee202ea1e
|
[GPT-OSS] Add support for arrays at tool message content (#25593)
Signed-off-by: Luis Tomas Bolivar <ltomasbo@redhat.com>
|
2025-10-10 09:00:45 +00:00 |
|
|
|
ad430a67ca
|
[Metrics] Log multi-modal cache stats and fix reset (#26285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-10 01:45:55 -07:00 |
|
|
|
da4455609d
|
[Chore]: One pythonic tool parser test uses the wrong parser (#26515)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
|
2025-10-10 04:03:55 +00:00 |
|
|
|
4bdf7ac593
|
[Bugfix] Fix SHM cache initialization (#26427)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 02:48:04 -07:00 |
|
|
|
dc7976dd9f
|
[Misc] Upgrade more code to Python 3.10 (#26463)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 10:43:53 +01:00 |
|
|
|
1e4ecca1d0
|
[V0 Deprecation] Remove VLLM_USE_V1 from tests (#26341)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-07 15:42:31 +00:00 |
|
|
|
185d8ed44f
|
[responsesAPI][bugfix] serialize harmony messages (#26185)
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-10-07 07:07:53 +00:00 |
|
|
|
6c04638214
|
Fix per file ruff ignores related to line length (#26262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-06 05:12:40 +00:00 |
|
|
|
91ac7f764d
|
[CI][gpt-oss] Enable python tool tests in CI (#24315)
Signed-off-by: wuhang <wuhang6@huawei.com>
|
2025-10-06 04:20:06 +00:00 |
|
|
|
1c0c68202c
|
Fix per file ruff ignores related to typing (#26254)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 16:37:55 +00:00 |
|
|
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
|
|
a964e5e6c3
|
[Bugfix] Allow --skip-tokenizer-init with echo and return_token_ids (#26238)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-05 05:38:53 +00:00 |
|
|
|
119f00630b
|
[Renderer] Clean up renderer code (#26216)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-04 17:05:29 +00:00 |
|
|
|
ea25a76c05
|
[BugFix] Use async Mistral Tokenizer in Chat Completions (#26134)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-10-04 09:42:08 +08:00 |
|
|
|
831b124151
|
[responsesAPI] add better error messaging for long prompts (#25724)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-10-03 14:33:13 -07:00 |
|
|
|
812b7f54a8
|
[Renderer] Move Processor out of AsyncLLM (#24138)
Signed-off-by: Yang <lymailforjob@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-03 11:29:45 +00:00 |
|
|
|
2ed3f20dba
|
[openai] Fix missing tool usage check (system message) (#24768)
Signed-off-by: kyt <eluban4532@gmail.com>
|
2025-10-03 18:55:44 +08:00 |
|
|
|
e5017cd6d6
|
[gpt-oss] disable tool server initialization if no tool in request (#25790)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-10-03 05:08:35 +00:00 |
|
|
|
5db1870bb9
|
[gpt-oss] use vLLM instead of openai types for streaming (#25186)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-09-30 22:47:07 +00:00 |
|
|
|
78a47f87ce
|
Test Prompt Embeds/LoRA compatibility and Enable LoRA Support for OPT Models (#25717)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
|
2025-09-30 08:10:58 +08:00 |
|
|
|
3958b96bf5
|
Add option to restrict media domains (#25783)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
|
2025-09-27 01:23:52 +00:00 |
|
|
|
3468f17ebe
|
[V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
|
2025-09-25 17:37:50 +00:00 |
|
|
|
0bcc3a160d
|
[CI/Build] Fix flaky entrypoints test (#25663)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-25 12:19:40 +00:00 |
|
|
|
5caaeb714c
|
[Bugfix] [Frontend] Cleanup gpt-oss non-streaming chat tool calls (#25514)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
|
2025-09-24 03:20:38 +00:00 |
|
|
|
95bc60e4cb
|
[gpt-oss][bugfix] remove logic to require resp_ in ResponseAPI (#25428)
Signed-off-by: Andrew Xia <axia@meta.com>
|
2025-09-23 15:46:46 -07:00 |
|
|
|
4322c553a6
|
[Test]: Hermes tool parser stream output error in Qwen3 case (#25203)
Signed-off-by: Andreas Hartel <andreas.hartel@aleph-alpha.com>
|
2025-09-23 17:56:31 +08:00 |
|
|
|
45d7d852d3
|
[Frontend] Responses API MCP tools for built in tools and to pass through headers (#24628)
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com>
Co-authored-by: Alec Solder <alecs@fb.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-09-22 23:38:19 +00:00 |
|
|
|
0eecb31663
|
[Bugfix] Fix hermes tool parser handling of non-string argument types (#22002)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Signed-off-by: David Chen <530634352@qq.com>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-09-22 11:35:39 +08:00 |
|
|
|
04d3752329
|
[Bugfix][V0 Deprecation][CI] use async mock and await for async method (#25325)
Signed-off-by: Yang <lymailforjob@gmail.com>
|
2025-09-22 07:06:16 +08:00 |
|
|
|
72dd1595b4
|
[CI] Skip tests failing on main (#25326)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-20 19:57:46 -07:00 |
|
|
|
52c2a8d4ad
|
[V0 Deprecation] Remove LLMEngine (#25033)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-20 17:56:30 -07:00 |
|
|
|
f91480b2d4
|
[Bugfix] fix tool call arguments is empty (#25223)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: xin.li <xin.li@daocloud.io>
|
2025-09-20 13:29:54 +08:00 |
|
|
|
c7e713616a
|
test: Remove vestigial skip for prompt embeds tests after landing v1 Prompt Embeds support (#25291)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
|
2025-09-19 17:33:40 -07:00 |
|
|
|
e69e0b8b5f
|
[Frontend] Responses API messages out, just harmony for now (#24985)
Signed-off-by: Alec Solder <alecs@fb.com>
Co-authored-by: Alec Solder <alecs@fb.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-09-19 21:40:16 +00:00 |
|
|
|
47fd08aaf9
|
[CI/Build] fix test function_calling (#25072)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-09-19 12:16:32 -06:00 |
|
|
|
6d8246aaff
|
[gpt-oss] Add ResponseReasoningPartAddedEvent, ResponseReasoningPartDoneEvent for streaming (#24938)
Signed-off-by: Andrew Xia <axia@meta.com>
|
2025-09-18 19:11:59 -07:00 |
|
|
|
9a4600e4dc
|
[CORE] Prompt Embeddings Support for v1 Engine (#24278)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
Signed-off-by: Andrew Sansom <qthequartermasterman@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-09-19 08:03:09 +08:00 |
|
|
|
e19bce40a1
|
[V0 Deprecation] Remove AsyncLLMEngine (#25025)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-18 11:07:42 -07:00 |
|
|
|
cc935fdd7e
|
[Frontend] Support setting logprobs to -1 (#25031)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-09-18 10:34:42 +00:00 |
|
|
|
29283e8976
|
[Chore] Cleanup guided namespace, move to structured outputs config (#22772)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-18 09:20:27 +00:00 |
|
|
|
5c65a72bb1
|
[V0 Deprecation] Remove more V0 tests (#25117)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-17 22:05:25 -07:00 |
|
|
|
bec060fd99
|
Mark prompt logprobs as incompatible with prompt embeds at API level (#25077)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
|
2025-09-17 21:25:07 -07:00 |
|
|
|
544fe76b95
|
[Frontend] Support returning all prompt logprobs (#24956)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-09-17 09:03:52 +00:00 |
|
|
|
6c47f6bfa4
|
[Core] Remove tokenizer group in vLLM (#24078)
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
|
2025-09-17 08:42:59 +00:00 |
|
|
|
5801e49776
|
[V0 Deprecation] Remove MQLLMEngine (#25019)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-16 21:29:27 -07:00 |
|
|
|
86daa875fe
|
[gpt-oss][1][bugfix] fix streaming final output (#24466)
Signed-off-by: Andrew Xia <axia@meta.com>
|
2025-09-16 13:56:16 -06:00 |
|