Commit Graph

15 Commits

Author SHA1 Message Date
01bfb22b41 [CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
7e9bd08f60 Add batched RoPE kernel (#3095) 2024-03-13 13:45:26 -07:00
56f738ae9b [ROCm] Fix some kernels failed unit tests (#2498) 2024-02-05 14:25:36 -08:00
96b6f475dd Remove hardcoded device="cuda" to support more devices (#2503)
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
2024-02-01 15:46:39 -08:00
77af974b40 [FIX] Support non-zero CUDA devices in custom kernels (#1959) 2024-01-02 19:09:59 -08:00
9b294976a2 Add PyTorch-native implementation of custom layers (#1898) 2023-12-02 21:18:40 -08:00
e0c6f556e8 [Build] Avoid building too many extensions (#1624) 2023-11-23 16:31:19 -08:00
ba0bfd40e2 TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00
e67b4f2c2a Use FP32 in RoPE initialization (#1004)
Co-authored-by: One <imone@tuta.io>
2023-09-11 00:26:35 -07:00
320a622ec4 [BugFix] Implement RoPE for GPT-J (#941) 2023-09-06 11:54:33 +09:00
fbd80ad409 Clean up kernel unit tests (#938) 2023-09-05 16:57:38 -07:00
d6fa1be3a8 [Quality] Add code formatter and linter (#326) 2023-07-03 11:31:55 -07:00
0b98ba15c7 Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
a283ec2eec Add contributing guideline and mypy config (#122) 2023-05-23 17:58:51 -07:00
825d8892b5 Use pytest format for unit tests (#107) 2023-05-17 17:11:23 -07:00