vllm/piecewise at 231c2c63e4decd0cbf863690dfffe88e1d97a003

Files

Daisy-Ma-coder cfbee3d0e7 [CLI env var] Add VLLM_FLASH_ATTN_MAX_NUM_SPLITS_FOR_CUDA_GRAPH in env variables (#25274)

Signed-off-by: qqma <qqma@amazon.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: qqma <qqma@amazon.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

2025-09-22 10:37:43 -07:00

__init__.py

[torch.compile] rework compile control with piecewise cudagraph (#9715)

2024-10-29 23:03:49 -07:00

test_full_cudagraph.py

[CLI env var] Add VLLM_FLASH_ATTN_MAX_NUM_SPLITS_FOR_CUDA_GRAPH in env variables (#25274)

2025-09-22 10:37:43 -07:00

test_multiple_graphs.py

[CI] execute all piecewise compilation tests together (#24502)

2025-09-09 11:05:25 -07:00

test_simple.py

[torch.compile] CUDAGraph Inductor partition integration (#24281)

2025-09-20 01:02:15 +00:00

test_toy_llama.py

[CI] execute all piecewise compilation tests together (#24502)

2025-09-09 11:05:25 -07:00