|
|
9d2b4a70f4
|
[V1][Metrics] Updated list of deprecated metrics in v0.8 (#14695)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-03-15 00:45:25 +08:00 |
|
|
|
3fb17d26c8
|
[Doc] Fix typo in documentation (#14783)
Signed-off-by: yasu52 <tsuguro4649@gmail.com>
|
2025-03-13 20:33:09 -07:00 |
|
|
|
b0746fae3d
|
[Frontend] support image embeds (#13955)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-03-10 12:36:03 +00:00 |
|
|
|
fa82b93853
|
[Frontend][Docs] Transcription API streaming (#13301)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-03-06 10:39:35 +00:00 |
|
|
|
abcc61e0af
|
[misc] Mention ray list nodes command to troubleshoot ray issues (#14318)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
|
2025-03-06 02:00:36 +00:00 |
|
|
|
1088f06242
|
[Doc] Move multimodal Embedding API example to Online Serving page (#14017)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-02-28 07:12:04 +00:00 |
|
|
|
2cb8c1540e
|
[Metrics] Add --show-hidden-metrics-for-version CLI arg (#13295)
|
2025-02-22 00:20:45 -08:00 |
|
|
|
1c3c975766
|
[FEATURE] Enables /score endpoint for embedding models (#12846)
|
2025-02-20 22:09:47 -08:00 |
|
|
|
ad5a35c21b
|
[doc] clarify multi-node serving doc (#13558)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-02-19 22:32:17 +08:00 |
|
|
|
7b623fca0b
|
[VLM] Check required fields before initializing field config in DictEmbeddingItems (#13380)
|
2025-02-17 01:36:07 -08:00 |
|
|
|
d84cef76eb
|
[Frontend] Add /v1/audio/transcriptions OpenAI API endpoint (#12909)
|
2025-02-13 07:23:45 -08:00 |
|
|
|
08b2d845d6
|
[Model] Ultravox Model: Support v0.5 Release (#12912)
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
|
2025-02-10 22:02:48 +00:00 |
|
|
|
8a69e0e20e
|
[CI/Build] Auto-fix Markdown files (#12941)
|
2025-02-08 04:25:15 -08:00 |
|
|
|
e64330910b
|
[doc][misc] clarify VLLM_HOST_IP for multi-node inference (#12667)
As more and more people are trying deepseek models with multi-node
inference, https://github.com/vllm-project/vllm/issues/7815 becomes more
frequent. Let's give clear message to users.
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-02-03 09:32:18 +08:00 |
|
|
|
dd6a3a02cb
|
[Doc] Convert docs to use colon fences (#12471)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-29 11:38:29 +08:00 |
|
|
|
0034b09ceb
|
[Frontend] Rerank API (Jina- and Cohere-compatible API) (#12376)
Signed-off-by: Kyle Mistele <kyle@mistele.com>
|
2025-01-26 19:58:45 -07:00 |
|
|
|
d07efb31c5
|
[Doc] Troubleshooting errors during model inspection (#12351)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-23 22:46:58 +08:00 |
|
|
|
f8ef146f03
|
[Doc] Add documentation for specifying model architecture (#12105)
|
2025-01-16 15:53:43 +08:00 |
|
|
|
43f3d9e699
|
[CI/Build] Add markdown linter (#11857)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2025-01-12 00:17:13 -08:00 |
|
|
|
482cdc494e
|
[Doc] Rename offline inference examples (#11927)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 23:50:29 +08:00 |
|
|
|
12664ddda5
|
[Doc] [1/N] Initial guide for merged multi-modal processor (#11925)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-10 14:30:25 +00:00 |
|
|
|
d85c47d6ad
|
Replace "online inference" with "online serving" (#11923)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 12:05:56 +00:00 |
|
|
|
6cd40a5bfe
|
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 21:34:44 +08:00 |
|
|
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
|
|
|
8ceffbf315
|
[Doc][3/N] Reorganize Serving section (#11766)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-07 11:20:01 +08:00 |
|
|
|
ee77fdb5de
|
[Doc][2/N] Reorganize Models and Usage sections (#11755)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-06 21:40:31 +08:00 |
|
|
|
2a622d704a
|
k8s-config: Update the secret to use stringData (#11679)
Signed-off-by: Suraj Deshmukh <surajd.service@gmail.com>
|
2025-01-06 08:01:22 +00:00 |
|
|
|
402d378360
|
[Doc] [1/N] Reorganize Getting Started section (#11645)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-06 02:18:33 +00:00 |
|
|
|
84c35c374a
|
According to vllm.EngineArgs, the name should be distributed_executor_backend (#11689)
|
2025-01-02 18:14:16 +00:00 |
|
|
|
32b4c63f02
|
[Doc] Convert list tables to MyST (#11594)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-29 15:56:22 +08:00 |
|
|
|
d427e5cfda
|
[Doc] Minor documentation fixes (#11580)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-28 21:53:59 +08:00 |
|
|
|
d003f3ea39
|
Update deploying_with_k8s.md with AMD ROCm GPU example (#11465)
Signed-off-by: Alex He <alehe@amd.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2024-12-27 10:00:04 +00:00 |
|
|
|
0c0c2015c5
|
Update openai_compatible_server.md (#11536)
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2024-12-26 16:26:18 -08:00 |
|
|
|
6ad909fdda
|
[Doc] Improve GitHub links (#11491)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-25 14:49:26 -08:00 |
|
|
|
9edca6bf8f
|
[Frontend] Online Pooling API (#11457)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-24 17:54:30 +08:00 |
|
|
|
32aa2059ad
|
[Docs] Convert rST to MyST (Markdown) (#11145)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-12-23 22:35:38 +00:00 |
|
|
|
995f56236b
|
[Core] Loading model from S3 using RunAI Model Streamer as optional loader (#10192)
Signed-off-by: OmerD <omer@run.ai>
|
2024-12-20 16:46:24 +00:00 |
|
|
|
7801f56ed7
|
[ci][gh200] dockerfile clean up (#11351)
Signed-off-by: drikster80 <ed.sealing@gmail.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: drikster80 <ed.sealing@gmail.com>
Co-authored-by: cenzhiyao <2523403608@qq.com>
|
2024-12-19 18:13:06 -08:00 |
|
|
|
66d4b16724
|
[Frontend] Add OpenAI API support for input_audio (#11027)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-16 22:09:58 -08:00 |
|
|
|
35bae114a8
|
fix gh200 tests on main (#11246)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-16 17:22:38 -08:00 |
|
|
|
35ffa682b1
|
[Docs] hint to enable use of GPU performance counters in profiling tools for multi-node distributed serving (#11235)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2024-12-16 22:20:39 +00:00 |
|
|
|
b3b1526f03
|
WIP: [CI/Build] simplify Dockerfile build for ARM64 / GH200 (#11212)
Signed-off-by: drikster80 <ed.sealing@gmail.com>
Co-authored-by: drikster80 <ed.sealing@gmail.com>
|
2024-12-16 09:20:49 +00:00 |
|
|
|
da6f409246
|
Update deploying_with_k8s.rst (#10922)
|
2024-12-15 16:33:58 -08:00 |
|
|
|
0920ab9131
|
[Doc] Reorganize online pooling APIs (#11172)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-14 00:22:22 +08:00 |
|
|
|
d4d5291cc2
|
fix(docs): typo in helm install instructions (#11141)
Signed-off-by: Ramon Ziai <ramon.ziai@bettermarks.com>
|
2024-12-12 17:36:32 +00:00 |
|
|
|
24a36d6d5f
|
Update link to LlamaStack remote vLLM guide in serving_with_llamastack.rst (#11112)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2024-12-12 02:39:21 +00:00 |
|
|
|
fe2e10c71b
|
Add example of helm chart for vllm deployment on k8s (#9199)
Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
|
2024-12-10 09:19:27 +00:00 |
|
|
|
6d525288c1
|
[Docs] Add dedicated tool calling page to docs (#10554)
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2024-12-09 20:15:34 -05:00 |
|
|
|
7406274041
|
[Doc] add KubeAI to serving integrations (#10837)
Signed-off-by: Sam Stoelinga <sammiestoel@gmail.com>
|
2024-12-06 17:03:56 +00:00 |
|
|
|
aa39a8e175
|
[Doc] Create a new "Usage" section (#10827)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-05 11:19:35 +08:00 |
|