youngkingdom/vllm
ba262c4e5aa9fa753c8cedfaea5c42941184a0db
vllm/tests/core
Cody Yu  e3580537a4  [Performance] Enable chunked prefill and prefix caching together (#7753)  2024-08-28 00:36:31 -07:00
block                              [Performance][BlockManagerV2] Mark prefix cache block as computed after schedule (#7822)  2024-08-26 11:24:53 -07:00
__init__.py                        [Tests] Add block manager and scheduler tests (#3108)  2024-03-05 18:23:34 -08:00
test_block_manager.py              [Performance] Enable chunked prefill and prefix caching together (#7753)  2024-08-28 00:36:31 -07:00
test_chunked_prefill_scheduler.py  [Performance] Enable chunked prefill and prefix caching together (#7753)  2024-08-28 00:36:31 -07:00
test_scheduler_encoder_decoder.py  [Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)  2024-08-06 16:51:47 -04:00
test_scheduler.py                  [Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)  2024-08-06 16:51:47 -04:00
test_serialization.py              [Core] Optimize SPMD architecture with delta + serialization optimization (#7109)  2024-08-18 17:57:20 -07:00
utils.py                           [Core] Asynchronous Output Processor (#7049)  2024-08-26 20:53:20 -07:00