vllm/kernels at 0ee535b2945d042cbb1fc6e63fd3fddd94d491f2

youngkingdom/vllm

youkaichao 230c4b38c1 [CI/Test] fix swap test for multi gpu (#4689)

2024-05-08 13:14:02 -07:00

allclose_default.py

[ROCm] Fix some kernels failed unit tests (#2498)

2024-02-05 14:25:36 -08:00

conftest.py

[Kernel] Use flashinfer for decoding (#4353)

2024-05-03 15:51:27 -07:00

test_activation.py

[CI] Try introducing isort. (#3495)

2024-03-25 07:59:47 -07:00

test_attention.py

[Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518)

2024-05-03 10:20:12 -07:00

test_cache.py

[CI/Test] fix swap test for multi gpu (#4689)

2024-05-08 13:14:02 -07:00

test_layernorm.py

[Kernel] Layernorm performance optimization (#3662)

2024-03-30 14:26:38 -07:00

test_moe.py

[Kernel] Support MoE Fp8 Checkpoints for Mixtral (Static Weights with Dynamic/Static Activations) (#4527)

2024-05-04 11:45:16 -07:00

test_pos_encoding.py

[CI] Try introducing isort. (#3495)

2024-03-25 07:59:47 -07:00

test_prefix_prefill.py

[Bugfix][Kernel] allow non-power-of-2 for prefix prefill with alibi (#4573)

2024-05-08 09:19:58 -07:00

test_rand.py

[CI] Try introducing isort. (#3495)

2024-03-25 07:59:47 -07:00

test_sampler.py

[CI] Try introducing isort. (#3495)

2024-03-25 07:59:47 -07:00