Commit Graph

196 Commits

SHA1 Message Date
0b7db411b5 [Bug] Fix the OOM condition for CPU cache (#260) 2023-06-26 11:16:13 -07:00
471a7a4566 Compatible with Decapoda Research llama hf version (#251) 2023-06-26 09:23:57 -07:00
6214dd6ce9 Update README.md (#236) 2023-06-25 16:58:06 -07:00
0603379863 fix wrong using getattr to get dict value (#232) 2023-06-24 22:00:24 -07:00
665c48963b [Docs] Add GPTBigCode to supported models (#213) 2023-06-22 15:05:11 -07:00
298695b766 GPTBigCode (StarCoder, SantaCoder Support) (#209) 2023-06-23 01:49:27 +08:00
83658c8ace Bump up version to 0.1.1 (#204) v0.1.1 2023-06-22 15:33:32 +08:00
1d24ccb96c [Fix] Better error message when there is OOM during cache initialization (#203) 2023-06-22 15:30:06 +08:00
14f0b39cda [Bugfix] Fix a bug in RequestOutput.finished (#202) 2023-06-22 00:17:24 -07:00
2e0d314384 fix-ray (#193) 2023-06-22 00:21:41 +08:00
67d96c29fb Use slow tokenizer for open llama models (#168) v0.1.0 2023-06-20 14:19:47 +08:00
033f5c78f5 Remove e.g. in README (#167) 2023-06-20 14:00:28 +08:00
794e578de0 [Minor] Fix URLs (#166) 2023-06-19 22:57:14 -07:00
caddfc14c1 [Minor] Fix icons in doc (#165) 2023-06-19 20:35:38 -07:00
fc72e39de3 Change image urls (#164) 2023-06-20 11:15:15 +08:00
b7e62d3454 Fix repo & documentation URLs (#163) 2023-06-19 20:03:40 -07:00
364536acd1 [Docs] Minor fix (#162) 2023-06-19 19:58:23 -07:00
0b32a987dd Add and list supported models in README (#161) 2023-06-20 10:57:46 +08:00
570fb2e9cc [PyPI] Fix package info in setup.py (#158) 2023-06-19 18:05:01 -07:00
a255885f83 Add logo and polish readme (#156) 2023-06-19 16:31:13 +08:00
5822ede66e Add performance figures for dark mode (#160) 2023-06-18 23:46:24 -07:00
0370afa2e5 Remove benchmark_async_llm_server.py (#155) 2023-06-19 11:12:37 +08:00
7e2a913c64 [Minor] Fix CompletionOutput.__repr__ (#157) 2023-06-18 19:58:25 -07:00
3f92038b99 Add comments on swap space (#154) 2023-06-18 11:39:35 -07:00
dcda03b4cb Write README and front page of doc (#147) 2023-06-18 03:19:38 -07:00
bf5f121c02 Reduce GPU memory utilization to make sure OOM doesn't happen (#153) 2023-06-18 17:33:50 +08:00
bec7b2dc26 Add quickstart guide (#148) 2023-06-18 01:26:12 +08:00
0b98ba15c7 Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
e5464ee484 Rename servers to engines (#152) 2023-06-17 17:25:21 +08:00
bab8f3dd0d [Minor] Fix benchmark_throughput.py (#151) 2023-06-16 21:00:52 -07:00
eedb46bf03 Rename servers and change port numbers to reduce confusion (#149) 2023-06-17 00:13:02 +08:00
311490a720 Add script for benchmarking serving throughput (#145) 2023-06-14 19:55:38 -07:00
da5ddcd544 Remove redundant code in ColumnParallelLinear (#146) 2023-06-10 21:25:11 -07:00
5020e1e80c Non-streaming simple fastapi server (#144) 2023-06-10 10:43:07 -07:00
4298374265 Add docstrings for LLMServer and related classes and examples (#142) 2023-06-07 18:25:20 +08:00
e38074b1e6 Support FP32 (#141) 2023-06-07 00:40:21 -07:00
376725ce74 [PyPI] Packaging for PyPI distribution (#140) 2023-06-05 20:03:14 -07:00
456941cfe4 [Docs] Write the Adding a New Model section (#138) 2023-06-05 20:01:26 -07:00
1a956e136b Fix various issues of async servers (#135) 2023-06-05 23:44:50 +08:00
8274ca23ac Add docstrings for LLM (#137) 2023-06-04 12:52:41 -07:00
62ec38ea41 Document supported models (#127) 2023-06-02 22:35:17 -07:00
0eda2e0953 Add .readthedocs.yaml (#136) 2023-06-02 22:27:44 -07:00
211318d44a Add throughput benchmarking script (#133) 2023-05-28 03:20:05 -07:00
337871c6fd Enable LLaMA fast tokenizer (#132) 2023-05-28 02:51:42 -07:00
56b7f0efa4 Add a doc for installation (#128) 2023-05-27 01:13:06 -07:00
d721168449 Improve setup script & Add a guard for bfloat16 kernels (#130) 2023-05-27 00:59:32 -07:00
4a151dd453 Add activation registry (#126) 2023-05-25 00:09:07 -07:00
057daef778 OpenAI Compatible Frontend (#116) 2023-05-23 21:39:50 -07:00
e86717833d Incrementally decode output tokens (#121) 2023-05-23 20:46:32 -07:00
aedba6d5ec Print warnings/errors for large swap space (#123) 2023-05-23 18:22:26 -07:00