Commit Graph

22 Commits

SHA1 Message Date
a283ec2eec Add contributing guideline and mypy config (#122) 2023-05-23 17:58:51 -07:00
c3442c1f6f Refactor system architecture (#109) 2023-05-20 13:06:59 -07:00
f756799b84 Use runtime profiling to replace manual memory analyzers (#81) 2023-05-19 11:35:44 -06:00
b322fd1607 Add docstrings to some modules and classes (#100) 2023-05-14 22:32:38 -07:00
55f8b0a5de Implement presence and frequency penalties (#95) 2023-05-10 23:39:12 -07:00
e331957784 Log system stats (#90) 2023-05-10 01:06:53 -07:00
8d66a7b6d7 Rename variables and methods (#91) 2023-05-10 00:58:31 -07:00
7c041ab578 Refactor system architecture (#82) 2023-05-09 15:30:12 -07:00
c9d5b6d4a8 Replace FlashAttention with xformers (#70) 2023-05-05 02:01:08 -07:00
27f1410d06 New weight loader without np copy (#52) 2023-05-03 15:32:04 +08:00
ee88a7e5f3 Add an option to use dummy model weights (#33) 2023-04-08 23:36:12 -07:00
12659a0bd7 Add CUDA graph-based all reduce launcher (#26) 2023-04-05 11:16:57 -07:00
897cb2ae28 Optimize data movement (#20) 2023-04-02 00:30:17 -07:00
2f49f15585 Support tensor parallel (#2) 2023-03-21 13:45:42 -07:00
cfae35b861 Add miscellaneous updates (#8) 2023-03-13 13:48:38 -07:00
1a7eb7da61 Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
0deacbce6e Implement single_query_cached_kv_attention kernel (#3) 2023-03-01 15:02:19 -08:00
1ce1333573 Set default dtype to half 2023-02-23 21:31:39 +00:00
fdd0f2f472 Minor 2023-02-23 20:23:47 +00:00
343cea3dbc Add seq_ids to input metadata 2023-02-23 09:25:01 +00:00
4b1ac23f53 Fix slot mapping 2023-02-23 00:10:07 +00:00
8290fce47d Add Worker class 2023-02-22 19:01:38 +00:00