Logo
Explore Help
Sign In
youngkingdom/vllm
1
0
Fork 0
You've already forked vllm
Code Issues Pull Requests Actions Packages Projects Releases Wiki Activity
3,226 Commits 157 Branches 93 Tags
d6459b4516dbac4f346ce29fe90d43ebfafa1114
Commit Graph

5 Commits

Author SHA1 Message Date
sroy745
a78dd3303e [Encoder Decoder] Add flash_attn kernel support for encoder-decoder models (#9559) 2024-11-01 23:22:49 -07:00
wangshuai09
4e2d95e372 [Hardware][ROCM] using current_platform.is_rocm (#9642)
Signed-off-by: wangshuai09 <391746016@qq.com>
2024-10-28 04:07:00 +00:00
youkaichao
a9b15c606f [torch.compile] use empty tensor instead of None for profiling (#8875) 2024-09-27 08:11:32 -07:00
afeldman-nm
fd95e026e0 [Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-08-06 16:51:47 -04:00
afeldman-nm
543aa48573 [Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-07-08 17:12:15 +00:00
Powered by Gitea Version: 1.24.2 Page: 190ms Template: 8ms
English
Bahasa Indonesia Deutsch English Español Français Gaeilge Italiano Latviešu Magyar nyelv Nederlands Polski Português de Portugal Português do Brasil Suomi Svenska Türkçe Čeština Ελληνικά Български Русский Українська فارسی മലയാളം 日本語 简体中文 繁體中文(台灣) 繁體中文(香港) 한국어
Licenses API