|
|
8f2720def9
|
[Frontend] Support Tool Calling with both tool_choice='required' and $defs. (#20629)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-07-10 13:56:35 +08:00 |
|
|
|
2155e95ef1
|
[Bugfix] Fix the issue where reasoning_content is None when Thinkng is enabled and tool_choice is set to 'required'. (#20662)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-07-09 07:39:58 +00:00 |
|
|
|
baed180aa0
|
[tech debt] Revisit lora request model checker (#20636)
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
|
2025-07-09 09:42:41 +08:00 |
|
|
|
72d14d0eed
|
[Frontend] [Core] Integrate Tensorizer in to S3 loading machinery, allow passing arbitrary arguments during save/load (#19619)
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Co-authored-by: Eta <esyra@coreweave.com>
|
2025-07-07 22:47:43 -07:00 |
|
|
|
e601efcb10
|
[Misc] Add fully interleaved support for multimodal 'string' content format (#14047)
Signed-off-by: drobyshev.anton <drobyshev.anton@wb.ru>
Co-authored-by: drobyshev.anton <drobyshev.anton@wb.ru>
|
2025-07-07 19:43:08 +00:00 |
|
|
|
a37d75bbec
|
[Front-end] microbatch tokenization (#19334)
Signed-off-by: zt2370 <ztang2370@gmail.com>
|
2025-07-07 17:54:10 +01:00 |
|
|
|
462b269280
|
Implement OpenAI Responses API [1/N] (#20504)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-07-06 18:32:13 -07:00 |
|
|
|
fe1e924811
|
[Frontend] Support image object in llm.chat (#19635)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Flora Feng <4florafeng@gmail.com>
|
2025-07-06 06:47:13 +00:00 |
|
|
|
9e5452ee34
|
[Bug][Frontend] Fix structure of transcription's decoder_prompt (#18809)
Signed-off-by: sangbumlikeagod <oironese@naver.com>
|
2025-07-04 11:28:07 +00:00 |
|
|
|
6f1229f91d
|
[Model][2/N] Automatic conversion of CrossEncoding model (#19978)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-07-03 13:59:23 +00:00 |
|
|
|
b024a42e93
|
[Core] Move multimodal placeholder from chat utils to model definition (#20355)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-03 08:18:30 +00:00 |
|
|
|
2e7cbf2d7d
|
[Frontend] Support configurable mm placeholder strings & flexible video sampling policies via CLI flags. (#20105)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
|
2025-07-01 23:34:03 -07:00 |
|
|
|
ed70f3c64f
|
Add GLM4.1V model (Draft) (#19331)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-01 12:48:26 +00:00 |
|
|
|
e28533a16f
|
[Bugfix] Fix include prompt in stream response when echo=true (#15233)
Signed-off-by: Yuan Fang <yuanfang@alauda.io>
|
2025-07-01 01:30:14 +00:00 |
|
|
|
6e244ae091
|
[Perf][Frontend] eliminate api_key and x_request_id headers middleware overhead (#19946)
Signed-off-by: Yazan-Sharaya <yazan.sharaya.yes@gmail.com>
|
2025-06-27 00:44:14 -04:00 |
|
|
|
e795d723ed
|
[Frontend] Add /v1/audio/translations OpenAI API endpoint (#19615)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2025-06-25 17:54:14 +00:00 |
|
|
|
ead2110297
|
[Core][Bugfix] Fix Online MM Beam Search (#19688)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-06-19 17:18:07 +00:00 |
|
|
|
799397ee4f
|
Support embedding models in V1 (#16188)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-06-18 21:36:33 -07:00 |
|
|
|
ede5c4ebdf
|
[Frontend] add chunking audio for > 30s audio (#19597)
Signed-off-by: nguyenhoangthuan99 <thuanhppro12@gmail.com>
|
2025-06-17 11:34:00 +08:00 |
|
|
|
f40f763f12
|
[CI] Add mteb testing for rerank models (#19344)
|
2025-06-16 01:36:43 -07:00 |
|
|
|
2f1c19b245
|
[CI] change spell checker from codespell to typos (#18711)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-06-11 19:57:10 -07:00 |
|
|
|
2b1e2111b0
|
Fix test_max_model_len in tests/entrypoints/llm/test_generate.py (#19451)
Signed-off-by: Lu Fang <lufang@fb.com>
|
2025-06-11 12:54:59 +08:00 |
|
|
|
c1c7dbbeeb
|
[Bugfix][Core] Prevent token lengths exceeding max_model_len in V0 (#19348)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-06-09 23:01:29 +08:00 |
|
|
|
6e0cd10f72
|
[Easy][Test] Simplify test_function_tool_use with multiple parametrizes (#19269)
Signed-off-by: Lu Fang <lufang@fb.com>
|
2025-06-07 09:19:09 +08:00 |
|
|
|
8fc57501d3
|
[Bugfix]: Fix the incompatibility issue with stream when Thinking is disabled (#19135)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-06-05 06:24:24 +00:00 |
|
|
|
4de790fcad
|
[Bugfix]: Fix the incompatibility issue with tool_choice 'required' when Thinking is enabled (#19075)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-06-03 23:27:24 +00:00 |
|
|
|
02f0c7b220
|
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-06-03 11:20:17 -07:00 |
|
|
|
20079c6e36
|
[Misc] add return token strs for tokenize (#18941)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-31 18:00:11 +00:00 |
|
|
|
dff80b0e42
|
[Frontend] Add rerank support to run_batch endpoint (#16278)
Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>
|
2025-05-31 07:40:01 +00:00 |
|
|
|
1dab4d5718
|
Tool parser regex timeout handling (#18960)
Signed-off-by: Will Eaton <weaton@redhat.com>
|
2025-05-30 21:02:54 +00:00 |
|
|
|
2dbe8c0774
|
[Perf] API-server scaleout with many-to-many server-engine comms (#17546)
|
2025-05-30 08:17:00 -07:00 |
|
|
|
c29034037d
|
[Deprecation] Disallow pos-args other than model when initializing LLM (#18802)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-29 09:36:58 -07:00 |
|
|
|
0e98964e94
|
[V1][Metrics] Remove metrics that were deprecated in 0.8 (#18837)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-05-28 18:54:12 +00:00 |
|
|
|
321331b8ae
|
[Core] Add Lora Support to Beam Search (#18346)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-05-28 08:58:24 -07:00 |
|
|
|
435fa95444
|
[Frontend] add run batch to CLI (#18804)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-28 07:08:57 -07:00 |
|
|
|
de65fc8e1e
|
[CI] improve embed testing (#18747)
|
2025-05-28 00:16:35 -07:00 |
|
|
|
38b13dfe78
|
[CI/Build] Replace math.isclose with pytest.approx (#18703)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 02:05:17 -07:00 |
|
|
|
4fc1bf813a
|
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454)
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
|
2025-05-23 16:16:26 -07:00 |
|
|
|
c32e249a23
|
[Frontend] [Core] Add Tensorizer support for V1, LoRA adapter serialization and deserialization (#17926)
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
|
2025-05-22 18:44:18 -07:00 |
|
|
|
c91fe7b1b9
|
[Frontend][Bug Fix] Update llama4 pythonic jinja template and llama4_pythonic parser (#17917)
Signed-off-by: Kai Wu <kaiwu@meta.com>
|
2025-05-22 16:44:08 -07:00 |
|
|
|
1f3a1200e4
|
[Bugfix] make test_openai_schema.py pass (#18224)
Signed-off-by: David Xia <david@davidxia.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-22 18:34:06 +00:00 |
|
|
|
86847700d7
|
[CI] Add mteb testing to test the accuracy of the embedding model (#17175)
|
2025-05-20 06:51:12 -07:00 |
|
|
|
221cfc2fea
|
Feature/vllm/input embedding completion api (#17590)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
Signed-off-by: Nan2018 <nan@protopia.ai>
Co-authored-by: 临景 <linjing.yx@alibaba-inc.com>
Co-authored-by: Bryce1010 <bryceyx@gmail.com>
Co-authored-by: Andrew Sansom <andrew@protopia.ai>
Co-authored-by: Andrew Sansom <qthequartermasterman@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-18 20:18:05 -07:00 |
|
|
|
451da4bcbd
|
add tools into TokenizeChatRequest (#18187)
Signed-off-by: yangxia <yangxiast@gmail.com>
|
2025-05-15 04:01:49 -07:00 |
|
|
|
d066e52013
|
[Bugfix] Fix chat utils tests (#18139)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-14 05:38:21 -07:00 |
|
|
|
8f5dc41481
|
[Bugfix] Fix entrypoints audio test failure (#18111)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-14 09:08:07 +00:00 |
|
|
|
6685890d11
|
[Fix] Move "model_config" as keyword args in chat_utils.py (#18098)
Signed-off-by: Linkun <github@lkchen.net>
|
2025-05-13 23:27:26 -07:00 |
|
|
|
dc1a821768
|
[Feature][V1] Support tool_choice: required when using Xgrammar as the StructuredOutputBackend. (#17845)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-05-12 23:01:31 -07:00 |
|
|
|
05a4324f8e
|
Initialize the delta tool call fields explicitly (#17340)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: igmainc <igmainc@icloud.com>
|
2025-05-12 13:28:58 +00:00 |
|
|
|
9cea90eab4
|
[Frontend] Add /classify endpoint (#17032)
Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>
|
2025-05-11 07:57:07 +00:00 |
|