youngkingdom/vllm
1,005 Commits 157 Branches 93 Tags
9c82a1bec3a177ff2d611c092c19d25cabd90bb0
11 Commits

Author SHA1 Message Date
youkaichao
756b30a5f3 [Core][Test] move local_rank to the last arg with default value to keep api compatible (#3711)
2024-03-28 21:19:45 -07:00
SangBin Cho
26422e477b [Test] Make model tests run again and remove --forked from pytest (#3631)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-03-28 21:06:40 -07:00
Roy
515386ef3c [Core] Support multi-node inference(eager and cuda graph) (#3686) 2024-03-28 15:01:55 -07:00
youkaichao
8f44facddd [Core] remove cupy dependency (#3625) 2024-03-27 00:33:26 -07:00
SangBin Cho
01bfb22b41 [CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
Hanzhi Zhou
380170038e Implement custom all reduce kernels (#2192) 2024-01-27 12:46:35 -08:00
Zhuohan Li
ef9b636e2d Simplify broadcast logic for control messages (#2501) 2024-01-19 11:23:30 -08:00
Simon Mo
6e01e8c1c8 [CI] Add Buildkite (#2355) 2024-01-14 12:37:58 -08:00
Zhuohan Li
358c328d69 [BUGFIX] Fix communication test (#2285) 2023-12-27 17:18:11 -05:00
Zhuohan Li
20d0699d49 [Fix] Fix comm test (#1691) 2023-11-16 16:28:39 -08:00
Zhuohan Li
ba0bfd40e2 TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00