vllm/quantization at c2637a613b6140dc16fecd5a1b0f5a9e1d0932ff - vllm

Files

Dipika Sikka c2637a613b [Kernel] w4a16 support for compressed-tensors (#5385)

Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>

2024-06-13 10:19:56 -04:00

__init__.py

2024-05-13 23:50:09 +09:00

test_bitsandbytes.py

2024-06-12 10:03:24 -07:00

test_compressed_tensors.py

2024-06-13 10:19:56 -04:00

test_configs.py

2024-04-29 09:35:34 -07:00

test_fp8.py

2024-06-12 14:07:26 -07:00