vllm/csrc at 21d5daa4aca6e16c0c42dbfdf704fdfd0006ba4c - vllm

youngkingdom/vllm

kliuae 1b7c791d60 [ROCm] Fixes for GPTQ on ROCm (#2180)

2023-12-18 10:41:04 -08:00

Replace head_mapping params with num_kv_heads to attention kernel. (#1997)

2023-12-10 10:12:53 -08:00

[ROCm] Fixes for GPTQ on ROCm (#2180)

2023-12-18 10:41:04 -08:00

activation_kernels.cu

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00

cache_kernels.cu

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00

cache.h

Avoid multiple redefinition (#1817)

2023-12-14 09:35:58 -08:00

cuda_compat.h

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00

cuda_utils_kernels.cu

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00

cuda_utils.h

Avoid multiple redefinition (#1817)

2023-12-14 09:35:58 -08:00

dispatch_utils.h

Avoid multiple redefinition (#1817)

2023-12-14 09:35:58 -08:00

layernorm_kernels.cu

[Optimization] Implement fused add rmsnorm (#1667)

2023-11-18 18:18:02 -08:00

ops.h

Add GPTQ support (#916)

2023-12-15 03:04:22 -08:00

pos_encoding_kernels.cu

[BugFix] Fix RoPE kernel on long sequences (#2164)

2023-12-17 17:09:10 -08:00

pybind.cpp

Add GPTQ support (#916)

2023-12-15 03:04:22 -08:00

reduction_utils.cuh

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00