Commit Graph

36 Commits

SHA1 Message Date
a283ec2eec Add contributing guideline and mypy config (#122) 2023-05-23 17:58:51 -07:00
c3442c1f6f Refactor system architecture (#109) 2023-05-20 13:06:59 -07:00
f756799b84 Use runtime profiling to replace manual memory analyzers (#81) 2023-05-19 11:35:44 -06:00
b322fd1607 Add docstrings to some modules and classes (#100) 2023-05-14 22:32:38 -07:00
55f8b0a5de Implement presence and frequency penalties (#95) 2023-05-10 23:39:12 -07:00
e331957784 Log system stats (#90) 2023-05-10 01:06:53 -07:00
8d66a7b6d7 Rename variables and methods (#91) 2023-05-10 00:58:31 -07:00
7c041ab578 Refactor system architecture (#82) 2023-05-09 15:30:12 -07:00
c9d5b6d4a8 Replace FlashAttention with xformers (#70) 2023-05-05 02:01:08 -07:00
27f1410d06 New weight loader without np copy (#52) 2023-05-03 15:32:04 +08:00
4858f3bb45 Add an option to launch cacheflow without ray (#51) 2023-04-30 15:42:17 +08:00
ee88a7e5f3 Add an option to use dummy model weights (#33) 2023-04-08 23:36:12 -07:00
0f40557af6 Implement block copy kernel to optimize beam search (#32) 2023-04-07 17:45:07 -07:00
12659a0bd7 Add CUDA graph-based all reduce launcher (#26) 2023-04-05 11:16:57 -07:00
897cb2ae28 Optimize data movement (#20) 2023-04-02 00:30:17 -07:00
2f49f15585 Support tensor parallel (#2) 2023-03-21 13:45:42 -07:00
cfae35b861 Add miscellaneous updates (#8) 2023-03-13 13:48:38 -07:00
1a7eb7da61 Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
0deacbce6e Implement single_query_cached_kv_attention kernel (#3) 2023-03-01 15:02:19 -08:00
1ce1333573 Set default dtype to half 2023-02-23 21:31:39 +00:00
fdd0f2f472 Minor 2023-02-23 20:23:47 +00:00
1f6c7ef437 Add controller 2023-02-23 09:32:19 +00:00
343cea3dbc Add seq_ids to input metadata 2023-02-23 09:25:01 +00:00
4b1ac23f53 Fix slot mapping 2023-02-23 00:10:07 +00:00
8290fce47d Add Worker class 2023-02-22 19:01:38 +00:00
709a69176e Move worker/models -> models 2023-02-22 18:03:48 +00:00
6f058c7ba8 Implement cache ops 2023-02-16 07:47:03 +00:00
a1c67e6db8 Minor 2023-02-16 01:42:53 +00:00
9e68a6827e Fix return type error 2023-02-16 01:33:03 +00:00
8edcabc737 Add warning 2023-02-16 01:28:17 +00:00
2f4887de77 Fix KVCache shape 2023-02-16 01:24:45 +00:00
ee9442518d Fix get_model 2023-02-13 22:51:03 +00:00
fffa2e1f4b Add model_utils 2023-02-13 09:36:12 +00:00
bb59a3e730 Fix cache engine 2023-02-13 09:35:48 +00:00
e7bee2aa81 Add cache engine 2023-02-09 11:28:02 +00:00
39161c98a0 Add OPT 2023-02-09 11:25:37 +00:00