youngkingdom/vllm - vllm

Author	SHA1	Message	Date
Cody Yu	d11bf435a0	[MISC] Consolidate cleanup() and refactor offline_inference_with_prefix.py (#9510)	2024-10-18 14:30:55 -07:00
Tyler Michael Smith	ae8b633ba3	[Bugfix] Fix offline_inference_with_prefix.py (#9505)	2024-10-18 16:59:19 +00:00
Andy Dai	05c531be47	[Misc] Improved prefix cache example (#9077)	2024-10-04 21:38:42 +00:00
Zhuohan Li	bd0e7802e0	[Bugfix] Add warmup for prefix caching example (#5235)	2024-06-03 19:36:41 -07:00
Daniil Arapov	c2d6d2f960	[Bugfix]: Fix issues related to prefix caching example (#5177) (#5180)	2024-06-01 15:53:52 -07:00
Woosuk Kwon	c0935c96d3	[Bugfix] Set enable_prefix_caching=True in prefix caching example (#3703)	2024-03-28 16:26:30 -07:00
Simon Mo	8e67598aa6	[Misc] fix line length for entire codebase (#3444)	2024-03-16 00:36:29 -07:00
Sage Moore	ce4f5a29fb	Add Automatic Prefix Caching (#2762) Co-authored-by: ElizaWszola <eliza@neuralmagic.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-03-02 00:50:01 -08:00
Jason Zhu	5d80a9178b	Minor fix in prefill cache example (#2494)	2024-01-18 09:40:34 -08:00
shiyi.c_98	d10f8e1d43	[Experimental] Prefix Caching Support (#1669) Co-authored-by: DouHappy <2278958187@qq.com> Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2024-01-17 16:32:10 -08:00