vllm/piecewise at c8bde93367fb252eca1e9a6ae78650caa4a9a951 - vllm

Files

Daisy-Ma-coder cfbee3d0e7 [CLI env var] Add VLLM_FLASH_ATTN_MAX_NUM_SPLITS_FOR_CUDA_GRAPH in env variables (#25274)

Signed-off-by: qqma <qqma@amazon.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: qqma <qqma@amazon.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

2025-09-22 10:37:43 -07:00

__init__.py

[torch.compile] rework compile control with piecewise cudagraph (#9715)

2024-10-29 23:03:49 -07:00

test_full_cudagraph.py

[CLI env var] Add VLLM_FLASH_ATTN_MAX_NUM_SPLITS_FOR_CUDA_GRAPH in env variables (#25274)

2025-09-22 10:37:43 -07:00

test_multiple_graphs.py

[CI] execute all piecewise compilation tests together (#24502)

2025-09-09 11:05:25 -07:00

test_simple.py

[torch.compile] CUDAGraph Inductor partition integration (#24281)

2025-09-20 01:02:15 +00:00

test_toy_llama.py

[CI] execute all piecewise compilation tests together (#24502)

2025-09-09 11:05:25 -07:00