youngkingdom/vllm
dc372b9c8aa97b5da5d4049cdccdaccef950d499
vllm/csrc/moe
Jinzhen Lin d4154c35a2 [Bugfix] fix moe marlin topk_weight loading (#18080)
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-05-13 23:31:57 -07:00
marlin_moe_wna16
[Bugfix] fix moe marlin topk_weight loading (#18080)
2025-05-13 23:31:57 -07:00
permute_unpermute_kernels
permute/unpermute kernel for moe optimization (#14568)
2025-05-02 11:31:55 -07:00
moe_align_sum_kernels.cu
Optimize moe_align_block_size for deepseek_v3 (#12850)
2025-02-13 18:43:37 -05:00
moe_ops.h
[ROCm][Bugfix] Ensure that the moe_wna16_gemm kernel is not built on ROCm platforms. (#14629)
2025-03-12 08:00:28 -04:00
moe_permute_unpermute_op.cu
permute/unpermute kernel for moe optimization (#14568)
2025-05-02 11:31:55 -07:00
moe_wna16_utils.h
pre-commit autoupdate (#17380)
2025-04-29 06:46:55 -07:00
moe_wna16.cu
[BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801)
2025-04-17 22:13:29 -07:00
topk_softmax_kernels.cu
[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)
2024-06-09 16:23:30 -04:00
torch_bindings.cpp
[Kernel] fp4 marlin kernel (#17687)
2025-05-10 19:58:49 -07:00