Logo
Explore Help
Sign In
youngkingdom/vllm
1
0
Fork 0
You've already forked vllm
Code Issues Pull Requests Actions Packages Projects Releases Wiki Activity
3,476 Commits 157 Branches 93 Tags
2298e69b5f1dc77f00aee687a3843a4dae12cb91
Commit Graph

8 Commits

Author SHA1 Message Date
Michael Goin
235366fe2e [CI] Prune back the number of tests in tests/kernels/* (#9932)
Signed-off-by: mgoin <michael@neuralmagic.com>
2024-11-05 16:02:32 -05:00
wangshuai09
622b7ab955 [Hardware] using current_platform.seed_everything (#9785)
Signed-off-by: wangshuai09 <391746016@qq.com>
2024-10-29 14:47:44 +00:00
wangshuai09
4e2d95e372 [Hardware][ROCM] using current_platform.is_rocm (#9642)
Signed-off-by: wangshuai09 <391746016@qq.com>
2024-10-28 04:07:00 +00:00
Cyrus Leung
6ffa3f314c [CI/Build] Avoid CUDA initialization (#8534) 2024-09-18 10:38:11 +00:00
jon-chuang
50b8d08dbd [Misc/Testing] Use torch.testing.assert_close (#7324) 2024-08-16 04:24:04 +00:00
Michael Goin
978aed5300 [Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081) 2024-07-16 15:31:32 -07:00
Cyrus Leung
0e9164b40a [mypy] Enable type checking for test directory (#5017) 2024-06-15 04:45:31 +00:00
Eric Xihui Lin
8e192ff967 [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)
Co-authored-by: beagleski <yunanzhang@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
Co-authored-by: Barun Patra <codedecde@users.noreply.github.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-05-24 22:00:52 -07:00
Powered by Gitea Version: 1.24.2 Page: 203ms Template: 10ms
English
Bahasa Indonesia Deutsch English Español Français Gaeilge Italiano Latviešu Magyar nyelv Nederlands Polski Português de Portugal Português do Brasil Suomi Svenska Türkçe Čeština Ελληνικά Български Русский Українська فارسی മലയാളം 日本語 简体中文 繁體中文(台灣) 繁體中文(香港) 한국어
Licenses API