vllm/core at 4238bc82f24d5887784b04a353ed93e2360623b4 - vllm

youngkingdom/vllm

afeldman-nm 4238bc82f2 [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

..

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

__init__.py

[Tests] Add block manager and scheduler tests (#3108)

2024-03-05 18:23:34 -08:00

test_block_manager.py

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

test_chunked_prefill_scheduler.py

[Core][Optimization] Change Python dict to PyTorch tensor for blocks to swap (#4659)

2024-05-08 12:07:05 -07:00

test_scheduler.py

[Scheduler] Warning upon preemption and Swapping (#4647)

2024-05-13 23:50:44 +09:00

utils.py

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00