vllm/e2e at 3e36fcbee642f41278a4881c9e2bfbbd7c28e607 - vllm

Files

Yong Hoon Shin ad510309ee Override attention metadata for fast prefill in some KV sharing setups (#21590)

Signed-off-by: Yong Hoon Shin <yhshin@meta.com>

2025-07-30 08:54:15 -07:00

__init__.py                          2025-01-01 21:56:46 +09:00
test_cascade_attention.py            2025-07-09 00:34:28 -07:00
test_correctness_sliding_window.py   2025-07-29 19:58:29 +08:00
test_kv_sharing_fast_prefill.py      2025-07-30 08:54:15 -07:00
test_spec_decode.py                  2025-07-15 21:14:15 -07:00