|
|
444b0f0f62
|
[Misc][Docs] Raise error when flashinfer is not installed and VLLM_ATTENTION_BACKEND is set (#12513)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-02-24 10:43:21 -05:00 |
|
|
|
992e5c3d34
|
Merge similar examples in offline_inference into single basic example (#12737)
|
2025-02-20 04:53:51 -08:00 |
|
|
|
7b203b7694
|
[misc] fix debugging code (#13487)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-02-18 09:37:11 -08:00 |
|
|
|
da833b0aee
|
[Docs] Change myenv to vllm. Update python_env_setup.inc.md (#13325)
|
2025-02-16 16:04:21 +00:00 |
|
|
|
60c68df6d1
|
[Build] Automatically use the wheel of the base commit with Python-only build (#13178)
|
2025-02-12 23:10:28 -08:00 |
|
|
|
deb6c1c6b4
|
[Doc] Improve OpenVINO installation doc (#13102)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-02-11 18:02:46 +00:00 |
|
|
|
8a69e0e20e
|
[CI/Build] Auto-fix Markdown files (#12941)
|
2025-02-08 04:25:15 -08:00 |
|
|
|
eaa92d4437
|
[ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing (#12501)
|
2025-02-07 08:13:43 -08:00 |
|
|
|
afe74f7a96
|
[Doc] double quote cmake package in build.inc.md (#12840)
|
2025-02-06 09:17:55 -08:00 |
|
|
|
f256ebe4df
|
[Hardware][Intel GPU] add XPU bf16 support (#12392)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-02-02 10:17:26 +00:00 |
|
|
|
60808bd4c7
|
[Doc] Improve installation signposting (#12575)
- Make device tab names more explicit
- Add comprehensive list of devices to https://docs.vllm.ai/en/latest/getting_started/installation/index.html
- Add `attention` blocks to the intro of all devices that don't have pre-built wheels/images
---------
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-31 15:38:35 -08:00 |
|
|
|
dd6a3a02cb
|
[Doc] Convert docs to use colon fences (#12471)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-29 11:38:29 +08:00 |
|
|
|
925d2f1908
|
[Doc] Fix typo for x86 CPU installation (#12514)
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
|
2025-01-28 16:37:10 +00:00 |
|
|
|
9a0f3bdbe5
|
[Hardware][Gaudi][Doc] Add missing step in setup instructions (#12382)
|
2025-01-24 09:43:49 +00:00 |
|
|
|
d07efb31c5
|
[Doc] Troubleshooting errors during model inspection (#12351)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-23 22:46:58 +08:00 |
|
|
|
511627445e
|
[doc] explain common errors around torch.compile (#12340)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-23 14:56:02 +08:00 |
|
|
|
09ccc9c8f7
|
[Documentation][AMD] Add information about prebuilt ROCm vLLM docker for perf validation purpose (#12281)
Signed-off-by: Hongxia Yang <hongxyan@amd.com>
|
2025-01-22 07:49:22 +08:00 |
|
|
|
d4b62d4641
|
[AMD][Build] Porting dockerfiles from the ROCm/vllm fork (#11777)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-01-21 12:22:23 +08:00 |
|
|
|
c09503ddd6
|
[AMD][CI/Build][Bugfix] use pytorch stale wheel (#12172)
Signed-off-by: hongxyan <hongxyan@amd.com>
|
2025-01-18 11:15:53 +08:00 |
|
|
|
e8c23ff989
|
[Doc] Organise installation documentation into categories and tabs (#11935)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-13 12:27:36 +00:00 |
|
|
|
43f3d9e699
|
[CI/Build] Add markdown linter (#11857)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2025-01-12 00:17:13 -08:00 |
|
|
|
aa1e77a19c
|
[Hardware][CPU] Support MOE models on x86 CPU (#11831)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-01-10 11:07:58 -05:00 |
|
|
|
482cdc494e
|
[Doc] Rename offline inference examples (#11927)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 23:50:29 +08:00 |
|
|
|
d85c47d6ad
|
Replace "online inference" with "online serving" (#11923)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 12:05:56 +00:00 |
|
|
|
730e9592e9
|
[Doc] Recommend uv and python 3.12 for quickstart guide (#11849)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2025-01-09 11:37:48 +08:00 |
|
|
|
6cd40a5bfe
|
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 21:34:44 +08:00 |
|
|
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
|
|
|
cfd3219f58
|
[Hardware][Apple] Native support for macOS Apple Silicon (#11696)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2025-01-08 16:35:49 +08:00 |
|
|
|
ad9f1aa679
|
[doc] update wheels url (#11830)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 14:36:49 +08:00 |
|
|
|
5950f555a1
|
[Doc] Group examples into categories (#11782)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 09:20:12 +08:00 |
|
|
|
d9fa1c05ad
|
[doc] update how pip can install nightly wheels (#11806)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-07 21:42:58 +08:00 |
|
|
|
869e829b85
|
[doc] add doc to explain how to use uv (#11773)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-01-07 18:41:17 +08:00 |
|
|
|
8ceffbf315
|
[Doc][3/N] Reorganize Serving section (#11766)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-07 11:20:01 +08:00 |
|
|
|
402d378360
|
[Doc] [1/N] Reorganize Getting Started section (#11645)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-06 02:18:33 +00:00 |
|
|
|
32b4c63f02
|
[Doc] Convert list tables to MyST (#11594)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-29 15:56:22 +08:00 |
|
|
|
d427e5cfda
|
[Doc] Minor documentation fixes (#11580)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-28 21:53:59 +08:00 |
|
|
|
6ad909fdda
|
[Doc] Improve GitHub links (#11491)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-25 14:49:26 -08:00 |
|
|
|
32aa2059ad
|
[Docs] Convert rST to MyST (Markdown) (#11145)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-12-23 22:35:38 +00:00 |
|
|
|
2e726680b3
|
[Bugfix] torch nightly version in ROCm installation guide (#11423)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2024-12-23 17:20:22 +00:00 |
|
|
|
5d2248d81a
|
[doc] explain nccl requirements for rlhf (#11381)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-20 13:00:56 -08:00 |
|
|
|
1ecc645b8f
|
[doc] backward compatibility for 0.6.4 (#11359)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-19 21:33:53 -08:00 |
|
|
|
4863e5fba5
|
[Core] V1: Use multiprocessing by default (#11074)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-12-13 16:27:32 -08:00 |
|
|
|
e4c34c23de
|
[CI/Build] improve python-only dev setup (#9621)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-12-04 21:48:13 +00:00 |
|
|
|
c92acb9693
|
[ci/build] Update vLLM postmerge ECR repo (#10887)
|
2024-12-04 09:01:20 +00:00 |
|
|
|
7e4bbda573
|
[doc] format fix (#10789)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2024-11-30 11:38:40 +00:00 |
|
|
|
9a88f89799
|
custom allreduce + torch.compile (#10121)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-11-25 22:00:16 -08:00 |
|
|
|
a6760f6456
|
[Feature] vLLM ARM Enablement for AARCH64 CPUs (#9228)
Signed-off-by: Sanket Kale <sanketk.kale@fujitsu.com>
Co-authored-by: Sanket Kale <sanketk.kale@fujitsu.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
|
2024-11-25 18:32:39 -08:00 |
|
|
|
63f1fde277
|
[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU (#10355)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-20 10:57:39 +00:00 |
|
|
|
7629a9c6e5
|
[CI/Build] Support compilation with local cutlass path (#10423) (#10424)
|
2024-11-19 21:35:50 -08:00 |
|
|
|
4f168f69a3
|
[Docs] Misc updates to TPU installation instructions (#10165)
|
2024-11-15 13:26:17 -08:00 |
|