vllm/quantization at b4bab81660a184693543ca9261ced745db1fc2a7 - vllm

Files

Harry Mellor b4bab81660 Remove unnecessary explicit title anchors and use relative links instead (#20620)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2025-07-08 02:49:13 -07:00

| File | Last commit | Date |
| --- | --- | --- |
| auto_awq.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| bitblas.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| bnb.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| fp8.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| gguf.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| gptqmodel.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| int4.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| int8.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| modelopt.md | Make distinct code and console admonitions so readers are less likely to miss them (#20585) | 2025-07-07 19:55:28 -07:00 |
| quantized_kvcache.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| quark.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| README.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| supported_hardware.md | Remove unnecessary explicit title anchors and use relative links instead (#20620) | 2025-07-08 02:49:13 -07:00 |
| torchao.md | Make distinct code and console admonitions so readers are less likely to miss them (#20585) | 2025-07-07 19:55:28 -07:00 |

README.md

Quantization

Quantization trades model precision for a smaller memory footprint, allowing large models to run on a wider range of devices.
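To make the memory side of this trade-off concrete, the sketch below estimates how much space a model's weights need at different bit widths. It is a back-of-the-envelope illustration, not part of vLLM: the helper `weight_memory_gib` and the 7-billion-parameter model size are hypothetical, and the estimate ignores quantization scales and zero-points, activations, and the KV cache.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB.

    Hypothetical helper for illustration; real quantized checkpoints also
    store per-group scales (and sometimes zero-points), which add overhead.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    num_params = 7_000_000_000  # assumed model size, for illustration only

    # Compare a 16-bit baseline with 8-bit and 4-bit weight formats.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        gib = weight_memory_gib(num_params, bits)
        print(f"{label:>6}: {gib:.1f} GiB")
```

Halving the bits per weight roughly halves the weight memory, which is what lets a model that does not fit on a given device at 16 bits fit at 8 or 4 bits, at some cost in precision.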

Contents: