vllm/gptq at efffb63f584c1ce4fdcf4e7b7fd0bfc8b33a733a

youngkingdom/vllm

Antoni Baum a10d3056da [Core] Set linear_weights directly on the layer (#3977)

2024-04-11 16:35:51 -04:00

File             Last commit                                                 Date
compat.cuh       Add GPTQ support (#916)                                     2023-12-15 03:04:22 -08:00
matrix_view.cuh  Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)  2024-02-28 21:52:23 -08:00
q_gemm.cu        [Core] Set linear_weights directly on the layer (#3977)     2024-04-11 16:35:51 -04:00
qdq_2.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)  2024-02-28 21:52:23 -08:00
qdq_3.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)  2024-02-28 21:52:23 -08:00
qdq_4.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)  2024-02-28 21:52:23 -08:00
qdq_8.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)  2024-02-28 21:52:23 -08:00
qdq_util.cuh     Add GPTQ support (#916)                                     2023-12-15 03:04:22 -08:00