youngkingdom/vllm - vllm

Author	SHA1	Message	Date
iongpt	ac8d36f3e5	Refactor LLMEngine demo script for clarity and modularity (#1413) Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-10-30 09:14:37 -07:00
Zhuohan Li	9d9072a069	Implement prompt logprobs & Batched topk for computing logprobs (#1328) Co-authored-by: Yunmo Chen <16273544+wanmok@users.noreply.github.com>	2023-10-16 10:56:50 -07:00
Yunfeng Bai	09ff7f106a	API server support ipv4 / ipv6 dualstack (#1288) Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-10-07 15:15:54 -07:00
Woosuk Kwon	55fe8a81ec	Refactor scheduler (#658)	2023-08-02 16:42:01 -07:00
Zhuohan Li	1b0bd0fe8a	Add Falcon support (new) (#592)	2023-08-02 14:04:39 -07:00
Zhuohan Li	82ad323dee	[Fix] Add chat completion Example and simplify dependencies (#576)	2023-07-25 23:45:48 -07:00
Zhuohan Li	d6fa1be3a8	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
Woosuk Kwon	14f0b39cda	[Bugfix] Fix a bug in RequestOutput.finished (#202)	2023-06-22 00:17:24 -07:00
Woosuk Kwon	0b98ba15c7	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
Zhuohan Li	e5464ee484	Rename servers to engines (#152)	2023-06-17 17:25:21 +08:00
Zhuohan Li	eedb46bf03	Rename servers and change port numbers to reduce confusion (#149)	2023-06-17 00:13:02 +08:00
Woosuk Kwon	311490a720	Add script for benchmarking serving throughput (#145)	2023-06-14 19:55:38 -07:00
Zhuohan Li	5020e1e80c	Non-streaming simple fastapi server (#144)	2023-06-10 10:43:07 -07:00
Zhuohan Li	4298374265	Add docstrings for LLMServer and related classes and examples (#142)	2023-06-07 18:25:20 +08:00
Woosuk Kwon	211318d44a	Add throughput benchmarking script (#133)	2023-05-28 03:20:05 -07:00
Zhuohan Li	057daef778	OpenAI Compatible Frontend (#116)	2023-05-23 21:39:50 -07:00
Woosuk Kwon	655a5e48df	Introduce LLM class for offline inference (#115)	2023-05-21 17:04:18 -07:00
Woosuk Kwon	f746ced08d	Implement stop strings and best_of (#114)	2023-05-21 11:18:00 -07:00
Woosuk Kwon	c3442c1f6f	Refactor system architecture (#109)	2023-05-20 13:06:59 -07:00

19 Commits