vllm/core at 0df7ec0b2d890799ca71e2f862fdff5fcc52cdc0 - vllm

youngkingdom/vllm

Cody Yu 3ac50b47d0 [MISC] Add prefix cache hit rate to metrics (#7606)

2024-08-19 11:52:07 -07:00

..

[MISC] Add prefix cache hit rate to metrics (#7606)

2024-08-19 11:52:07 -07:00

__init__.py

[Tests] Add block manager and scheduler tests (#3108)

2024-03-05 18:23:34 -08:00

test_block_manager.py

[Core] Avoid the need to pass None values to Sequence.inputs (#5099)

2024-05-29 16:05:01 -07:00

test_chunked_prefill_scheduler.py

[mypy] Enable type checking for test directory (#5017)

2024-06-15 04:45:31 +00:00

test_scheduler_encoder_decoder.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00

test_scheduler.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00

test_serialization.py

[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)

2024-08-18 17:57:20 -07:00

utils.py

[Bugfix][fast] Fix the get_num_blocks_touched logic (#6849)

2024-08-08 10:43:30 -07:00