|
|
76144adf76
|
ci: Add CUDA + arm64 release builds (#21201)
Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
|
2025-08-15 23:16:23 +00:00 |
|
|
|
e8b40c7fa2
|
[CI] Remove duplicated docs build from buildkite (#22924)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-15 05:58:06 -07:00 |
|
|
|
ba81acbdc1
|
[Bugfix] Bump DeepGEMM Version to Fix SMXX Layout Issues (#22606)
Signed-off-by: frankwang28 <frank.wbb@hotmail.com>
|
2025-08-12 15:43:06 -07:00 |
|
|
|
dc5e4a653c
|
Upgrade FlashInfer to v0.2.11 (#22613)
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-08-11 19:58:41 -07:00 |
|
|
|
d1af8b7be9
|
enable Docker-aware precompiled wheel setup (#22106)
Signed-off-by: dougbtv <dosmith@redhat.com>
|
2025-08-10 16:29:02 -07:00 |
|
|
|
e8961e963a
|
Update flashinfer-python==0.2.10 (#22389)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-08-06 18:10:24 -07:00 |
|
|
|
a7cb6101ca
|
[CI/Build] Update flashinfer to 0.2.9 (#22233)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-08-05 09:39:38 -07:00 |
|
|
|
c494f96fbc
|
Use UV_LINK_MODE=copy in Dockerfile to avoid hardlink fail (#22128)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-08-05 06:57:10 -07:00 |
|
|
|
da31f6ad3d
|
Revert precompile wheel changes (#22055)
|
2025-08-01 08:26:24 +00:00 |
|
|
|
e360316ab9
|
Add DeepGEMM to Dockerfile in vllm-base image (#21533)
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-07-31 18:01:55 -07:00 |
|
|
|
58bb902186
|
fix(setup): improve precompiled wheel setup for Docker builds (#22025)
Signed-off-by: dougbtv <dosmith@redhat.com>
|
2025-07-31 09:52:48 -07:00 |
|
|
|
d2aab336ad
|
[CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES (#21599)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
|
2025-07-31 15:00:08 +08:00 |
|
|
|
a1873db23d
|
docker: docker-aware precompiled wheel support (#21127)
Signed-off-by: dougbtv <dosmith@redhat.com>
|
2025-07-29 14:45:19 -07:00 |
|
|
|
a33ea28b1b
|
Add flashinfer_python to CUDA wheel requirements (#21389)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-29 12:51:58 -07:00 |
|
|
|
01c753ed98
|
update flashinfer to v0.2.9rc2 (#21701)
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
|
2025-07-28 19:31:47 +00:00 |
|
|
|
2dd72d23d9
|
update flashinfer to v0.2.9rc1 (#21485)
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
|
2025-07-24 14:06:11 -07:00 |
|
|
|
5a19a6c670
|
[Fix] Update mamba_ssm to 2.2.5 (#21421)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-07-24 03:25:41 -07:00 |
|
|
|
526078a96c
|
bump flashinfer to v0.2.8 (#21385)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
|
2025-07-24 03:20:38 -07:00 |
|
|
|
aa08a954f9
|
[Bugfix] Fix casing warning (#21468)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-07-23 20:41:23 -07:00 |
|
|
|
8188196a1c
|
[CI] Cleanup modelscope version constraint in Dockerfile (#21243)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
|
2025-07-20 20:13:02 -07:00 |
|
|
|
a50d918225
|
[Docker] Allow FlashInfer to be built in the ARM CUDA Dockerfile (#21013)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-16 19:37:13 -07:00 |
|
|
|
1eb2b9c102
|
[CI] update typos config for CI pre-commit and fix some spells (#20919)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
|
2025-07-15 21:12:40 -07:00 |
|
|
|
7976446015
|
Add Dockerfile argument for VLLM_USE_PRECOMPILED environment (#20943)
Signed-off-by: dougbtv <dosmith@redhat.com>
|
2025-07-15 19:53:57 -07:00 |
|
|
|
cf75cd2098
|
[CI Bugfix] Specify same TORCH_CUDA_ARCH_LIST for flashinfer aot and install (#20772)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-11 01:16:01 +00:00 |
|
|
|
4b9a9435bb
|
Update Dockerfile FlashInfer to v0.2.8rc1 (#20718)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-10 08:09:02 -07:00 |
|
|
|
b7d9e9416f
|
[CI/Build] Fix FlashInfer double build in Dockerfile (#20651)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-07-09 17:41:56 -06:00 |
|
|
|
5561681d04
|
[CI] add kvcache-connector dependency definition and add into CI build (#18193)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
|
2025-07-04 06:49:18 -07:00 |
|
|
|
1819fbda63
|
[Quantization] Bump to use latest bitsandbytes (#20424)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-03 21:58:46 +08:00 |
|
|
|
bdb84e26b0
|
[Bugfix] Fixes for FlashInfer's TORCH_CUDA_ARCH_LIST (#20136)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
|
2025-07-02 17:15:11 -07:00 |
|
|
|
3c545c0c3b
|
[CI/Build] Allow hermetic builds (#18064)
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Fabien Dupont <fabiendupont@pm.me>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Elias Levy <eliaslevy@google.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-06-27 09:04:39 -07:00 |
|
|
|
296ce95d8e
|
[CI] Add SM120 to the Dockerfile (#19794)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-06-25 16:23:56 -07:00 |
|
|
|
497a91e9f7
|
[CI] Update FlashInfer to 0.2.6.post1 (#19297)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-06-11 22:57:28 +08:00 |
|
|
|
7d9216495c
|
[Doc] Update references to doc files (#18637)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-23 15:49:21 -07:00 |
|
|
|
1645b60196
|
Use prebuilt FlashInfer x86_64 PyTorch 2.7 CUDA 12.8 wheel for CI (#18537)
Signed-off-by: Huy Do <huydhn@gmail.com>
|
2025-05-23 21:17:16 +00:00 |
|
|
|
a1fe24d961
|
Migrate docs from Sphinx to MkDocs (#18145)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 02:09:53 -07:00 |
|
|
|
6e588da0f4
|
[Build/CI] Fix CUDA 11.8 build (#17679)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
|
2025-05-22 12:13:54 -07:00 |
|
|
|
371376f996
|
[Build] fix Dockerfile shell (#18402)
|
2025-05-21 07:32:06 -07:00 |
|
|
|
47fda6d089
|
[Build] Supports CUDA 12.6 and 11.8 after Blackwell Update (#18316)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-05-18 23:19:33 -07:00 |
|
|
|
dcfe95234c
|
Update Dockerfile to build for Blackwell (#18095)
|
2025-05-17 00:23:25 -07:00 |
|
|
|
7fdfa01530
|
[Sampler] Adapt to FlashInfer 0.2.3 sampler API (#15777)
Signed-off-by: Bowen Wang <abmfy@icloud.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-05-16 15:14:03 -07:00 |
|
|
|
2c4f59afc3
|
Update PyTorch to 2.7.0 (#16859)
|
2025-04-29 19:08:04 -07:00 |
|
|
|
08e15defa9
|
[CI/Build] Add retry mechanism for add-apt-repository (#17107)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-29 10:40:52 -07:00 |
|
|
|
d1aeea7553
|
[Bugfix] Fix missing ARG in Dockerfile for arm64 platforms (#17261)
Signed-off-by: lkm-schulz <44176356+lkm-schulz@users.noreply.github.com>
|
2025-04-27 19:38:14 -07:00 |
|
|
|
b07d741661
|
[CI/Build] workaround for CI build failure (#17070)
Signed-off-by: csy1204 <josang1204@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-04-23 16:14:18 -07:00 |
|
|
|
7bdfd29a35
|
[Misc] add collect_env to cli and docker image (#16759)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-04-17 22:13:35 -07:00 |
|
|
|
96bb8aa68b
|
[Bugfix] fix gpu docker image mis benchmarks dir (#16628)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-04-15 21:21:14 -07:00 |
|
|
|
e6e3c55ef2
|
Move dockerfiles into their own directory (#14549)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-03-31 13:47:32 -07:00 |
|