[Misc] Fused MoE Marlin support for GPTQ (#8217)

Dipika Sikka
2024-09-09 23:02:52 -04:00
committed by GitHub
parent c7cb5c3335
commit 6cd5e5b07e
19 changed files with 912 additions and 204 deletions

@@ -0,0 +1,3 @@
compressed-tensors, nm-testing/Mixtral-8x7B-Instruct-v0.1-W4A16-quantized, main
compressed-tensors, nm-testing/Mixtral-8x7B-Instruct-v0.1-W4A16-channel-quantized, main
gptq_marlin, TheBloke/Mixtral-8x7B-v0.1-GPTQ, main