Commit Graph

93 Commits

Author SHA1 Message Date
76144adf76 ci: Add CUDA + arm64 release builds (#21201)
Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
2025-08-15 23:16:23 +00:00
e8b40c7fa2 [CI] Remove duplicated docs build from buildkite (#22924)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-15 05:58:06 -07:00
ba81acbdc1 [Bugfix] Bump DeepGEMM Version to Fix SMXX Layout Issues (#22606)
Signed-off-by: frankwang28 <frank.wbb@hotmail.com>
2025-08-12 15:43:06 -07:00
dc5e4a653c Upgrade FlashInfer to v0.2.11 (#22613)
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-08-11 19:58:41 -07:00
d1af8b7be9 enable Docker-aware precompiled wheel setup (#22106)
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-08-10 16:29:02 -07:00
81c57f60a2 [XPU] upgrade torch 2.8 on for XPU (#22300)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-08-08 17:03:45 -07:00
e8961e963a Update flashinfer-python==0.2.10 (#22389)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-06 18:10:24 -07:00
a7cb6101ca [CI/Build] Update flashinfer to 0.2.9 (#22233)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-05 09:39:38 -07:00
c494f96fbc Use UV_LINK_MODE=copy in Dockerfile to avoid hardlink fail (#22128)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-05 06:57:10 -07:00
da31f6ad3d Revert precompile wheel changes (#22055) 2025-08-01 08:26:24 +00:00
e360316ab9 Add DeepGEMM to Dockerfile in vllm-base image (#21533)
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-07-31 18:01:55 -07:00
53c21e492e Update torch_xla pin to 20250730 (#21956)
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
2025-07-31 17:26:43 +00:00
58bb902186 fix(setup): improve precompiled wheel setup for Docker builds (#22025)
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-31 09:52:48 -07:00
d2aab336ad [CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES (#21599)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2025-07-31 15:00:08 +08:00
a1873db23d docker: docker-aware precompiled wheel support (#21127)
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-29 14:45:19 -07:00
a33ea28b1b Add flashinfer_python to CUDA wheel requirements (#21389)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-29 12:51:58 -07:00
01c753ed98 update flashinfer to v0.2.9rc2 (#21701)
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
2025-07-28 19:31:47 +00:00
65e8466c37 [Bugfix] Fix environment variable setting in CPU Dockerfile (#21730)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-28 11:02:39 +00:00
f1b286b2fb [TPU] Update ptxla nightly version to 20250724 (#21555)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-07-25 17:09:00 -07:00
396ee94180 [CI] Unifying Dockerfiles for ARM and X86 Builds (#21343)
Signed-off-by: Kebe <mail@kebe7jun.com>
2025-07-25 07:33:56 -07:00
2dd72d23d9 update flashinfer to v0.2.9rc1 (#21485)
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
2025-07-24 14:06:11 -07:00
5a19a6c670 [Fix] Update mamba_ssm to 2.2.5 (#21421)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2025-07-24 03:25:41 -07:00
526078a96c bump flashinfer to v0.2.8 (#21385)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2025-07-24 03:20:38 -07:00
aa08a954f9 [Bugfix] Fix casing warning (#21468)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-07-23 20:41:23 -07:00
13e4ee1dc3 [XPU][UT] increase intel xpu CI test scope (#21492)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
2025-07-23 20:24:04 -07:00
8188196a1c [CI] Cleanup modelscope version constraint in Dockerfile (#21243)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-07-20 20:13:02 -07:00
4de7146351 [V0 deprecation] Remove V0 HPU backend (#21131)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-17 16:37:36 -07:00
4ef00b5cac [VLM] Add Nemotron-Nano-VL-8B-V1 support (#20349)
Signed-off-by: Kyle Huang <kylhuang@nvidia.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-17 03:07:55 -07:00
58760e12b1 [TPU] Start using python 3.12 (#21000)
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
2025-07-16 19:37:44 -07:00
a50d918225 [Docker] Allow FlashInfer to be built in the ARM CUDA Dockerfile (#21013)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-16 19:37:13 -07:00
1eb2b9c102 [CI] update typos config for CI pre-commit and fix some spells (#20919)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-15 21:12:40 -07:00
7976446015 Add Dockerfile argument for VLLM_USE_PRECOMPILED environment (#20943)
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-15 19:53:57 -07:00
80d38b8ac8 [V1] [ROCm] [AITER] Upgrade AITER to commit 916bf3c and bugfix APIs (#20880)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-07-13 15:19:32 +00:00
cf75cd2098 [CI Bugfix] Specify same TORCH_CUDA_ARCH_LIST for flashinfer aot and install (#20772)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-11 01:16:01 +00:00
4b9a9435bb Update Dockerfile FlashInfer to v0.2.8rc1 (#20718)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-10 08:09:02 -07:00
b7d9e9416f [CI/Build] Fix FlashInfer double build in Dockerfile (#20651)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-09 17:41:56 -06:00
2c5ebec064 [XPU][CI] add v1/core test in xpu hardware ci (#20537)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
2025-07-07 01:16:40 -07:00
5561681d04 [CI] add kvcache-connector dependency definition and add into CI build (#18193)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-04 06:49:18 -07:00
1819fbda63 [Quantization] Bump to use latest bitsandbytes (#20424)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-03 21:58:46 +08:00
7f0367109e [CI/Build][CPU] Enable cross compilation in CPU release pipeline (#20423)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-03 05:26:12 -07:00
9965c47d0d Enable CPU nightly performance benchmark and its Markdown report (#18444)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-07-02 17:50:25 -07:00
bdb84e26b0 [Bugfix] Fixes for FlashInfer's TORCH_CUDA_ARCH_LIST (#20136)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
2025-07-02 17:15:11 -07:00
3c545c0c3b [CI/Build] Allow hermetic builds (#18064)
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Fabien Dupont <fabiendupont@pm.me>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Elias Levy <eliaslevy@google.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-06-27 09:04:39 -07:00
b69781f107 [Hardware][Intel GPU] Add v1 Intel GPU support with Flash attention backend. (#19560)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-06-26 09:27:18 -07:00
0567c8249f [CPU] Fix torch version in x86 CPU backend (#19258)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-06-26 03:34:47 -07:00
296ce95d8e [CI] Add SM120 to the Dockerfile (#19794)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-25 16:23:56 -07:00
2de12be428 [ROCm] [AITER] [Bugfix] Patch for AITER commit 648764942e552a8bb5fe16026703716a81f05374 (#18990)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-06-18 22:56:31 -07:00
497a91e9f7 [CI] Update FlashInfer to 0.2.6.post1 (#19297)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-11 22:57:28 +08:00
a45b979d9f [BugFix] Fix docker build cpu-dev image error (#19394)
Signed-off-by: niu_he <carlton2tang@gmail.com>
2025-06-10 20:56:40 -07:00
88be823d57 [AMD] Update compatible packaging version (#19309)
Signed-off-by: pramkuma <Pramendra.Kumar@amd.com>
2025-06-07 20:55:09 +08:00