vllm/core at fce10dbed5441b4f918b23a2b63aae72bc00a2f6 - vllm

Roger Wang b5d34af328 [Bugfix] Fix scheduling when repeated images in one request (#23544)

Signed-off-by: Roger Wang <hey@rogerw.me>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.me>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>

2025-08-26 09:46:28 +00:00

__init__.py

Implement Async Scheduling (#19970)

2025-07-14 23:01:46 -07:00

test_async_scheduler.py

[Spec Decode] Make propose_draft_token_ids non-blocking for lower TTFT (#23041)

2025-08-18 17:20:38 -07:00

test_encoder_cache_manager.py

[Bugfix] Fix scheduling when repeated images in one request (#23544)

2025-08-26 09:46:28 +00:00

test_kv_cache_utils.py

[Refactor] Allow optional MultiModalKwargsItem in IPC (#23022)

2025-08-16 11:30:49 +00:00

test_prefix_caching.py

[Refactor] Allow optional MultiModalKwargsItem in IPC (#23022)

2025-08-16 11:30:49 +00:00

test_scheduler_e2e.py

[Misc] unify variable for LLM instance (#20996)

2025-07-21 12:18:33 +01:00

test_scheduler.py

[Core][Multimodal] Track encode cache entries by mm_hash and enable embedding sharing between requests (#22711)

2025-08-25 00:41:17 -07:00

test_single_type_kv_cache_manager.py

[v1] Move block_hashes from KVCacheManager to Request.block_hashes (#19728)

2025-08-15 16:52:52 -07:00

utils.py

[Core][Multimodal] Track encode cache entries by mm_hash and enable embedding sharing between requests (#22711)

2025-08-25 00:41:17 -07:00