youngkingdom/vllm
vllm/docs/source/quantization at commit 214efc2c3cb568e8eb3f7d234f3bd8f5bbe24795
Latest commit: 6b2d25efc7 [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107)
Author: Yan Ma (Signed-off-by: yan ma <yan.ma@intel.com>)
Date: 2024-11-18 11:18:05 -07:00
File                   | Last commit                                                                                    | Date
auto_awq.rst           | [Doc] fix the autoAWQ example (#7937)                                                          | 2024-08-28 12:12:32 +00:00
bnb.rst                | [Misc] Upgrade bitsandbytes to the latest version 0.44.0 (#8768)                               | 2024-09-24 17:08:55 -07:00
fp8_e4m3_kvcache.rst   | [Core/Bugfix] Add FP8 K/V Scale and dtype conversion for prefix/prefill Triton Kernel (#7208)  | 2024-08-12 22:47:41 +00:00
fp8_e5m2_kvcache.rst   | [Core/Bugfix] Add FP8 K/V Scale and dtype conversion for prefix/prefill Triton Kernel (#7208)  | 2024-08-12 22:47:41 +00:00
fp8.rst                | Add lm-eval directly to requirements-test.txt (#9161)                                          | 2024-10-08 18:22:31 -07:00
gguf.rst               | [Doc] Add documentation for GGUF quantization (#8618)                                          | 2024-09-19 13:15:55 -06:00
int8.rst               | [Doc] Add docs for llmcompressor INT8 and FP8 checkpoints (#7444)                              | 2024-08-16 13:59:16 -07:00
supported_hardware.rst | [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107)                                      | 2024-11-18 11:18:05 -07:00