vllm/distributed at 54814fd85b5182fc140febfebbb2560420d2ed2a - vllm

Files

Lily Liu 7041de4384 [Kernel] Flashinfer for prefill & decode, with Cudagraph support for decode (#4628)

Co-authored-by: LiuXiaoxuanPKU <llilyliupku@gmail.com>, bong-furiosa <bongwon.jang@furiosa.ai>

2024-06-28 15:28:49 -07:00

__init__.py                              2024-05-13 23:50:09 +09:00
test_basic_distributed_correctness.py    2024-06-28 15:28:49 -07:00
test_chunked_prefill_distributed.py      2024-06-08 08:59:20 +00:00
test_comm_ops.py                         2024-06-23 14:42:28 -07:00
test_custom_all_reduce.py                2024-06-23 14:42:28 -07:00
test_parallel_state.py                   2024-06-28 15:20:22 +00:00
test_pynccl.py                           2024-06-23 14:42:28 -07:00
test_same_node.py                        2024-06-11 10:53:59 -07:00
test_shm_broadcast.py                    2024-06-25 21:56:02 -07:00
test_utils.py                            2024-06-25 15:56:15 -07:00