Commit Graph

11 Commits

Author SHA1 Message Date
df5dafaa5b [Misc] Remove deprecated code (#12383)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-24 14:45:20 -05:00
f842a7aff1 [misc] remove engine_use_ray (#8126) 2024-09-11 18:23:36 -07:00
70b746efcf [Misc] Deprecation Warning when setting --engine-use-ray (#7424)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: youkaichao <youkaichao@126.com>
2024-08-14 09:44:27 -07:00
4e7ee664e2 [Core] Fix engine-use-ray broken (#4105) 2024-04-16 05:24:53 +00:00
fb96c1e98c Asynchronous tokenization (#2879) 2024-03-15 23:37:01 +00:00
6e01e8c1c8 [CI] Add Buildkite (#2355) 2024-01-14 12:37:58 -08:00
fd4ea8ef5c Use NCCL instead of ray for control-plane communication to remove serialization overhead (#2221) 2024-01-03 11:30:22 -08:00
66b108d142 [BUGFIX] Fix API server test (#2270) 2023-12-26 10:37:06 -08:00
5ffc0d13a2 Migrate linter from pylint to ruff (#1665) 2023-11-20 11:58:01 -08:00
ba0bfd40e2 TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00
c07ece5ca4 Make AsyncLLMEngine more robust & fix batched abort (#969)
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Avnish Narayan <38871737+avnishn@users.noreply.github.com>
2023-09-07 13:43:45 -07:00