vllm/quantization at 182f40ea8b5981864b23e08bb2a5aafc5800e976 - vllm

youngkingdom/vllm

Zhiyu 182f40ea8b Add NVIDIA TensorRT Model Optimizer in vLLM documentation (#17561)

2025-05-02 11:36:46 -07:00

auto_awq.md

[doc] update wrong hf model links (#17184)

2025-04-25 16:40:54 +00:00

bitblas.md

[doc] update wrong hf model links (#17184)

2025-04-25 16:40:54 +00:00

bnb.md

[doc] update wrong hf model links (#17184)

2025-04-25 16:40:54 +00:00

fp8.md

[doc] miss result (#17589)

2025-05-02 07:04:49 -07:00

gguf.md

doc: fix some typos in doc (#16154)

2025-04-07 05:32:06 +00:00

gptqmodel.md

[doc] update wrong model id (#17287)

2025-04-28 04:20:51 -07:00

index.md

Add NVIDIA TensorRT Model Optimizer in vLLM documentation (#17561)

2025-05-02 11:36:46 -07:00

int4.md

[doc] add install tips (#17373)

2025-04-30 17:02:41 +00:00

int8.md

[doc] add install tips (#17373)

2025-04-30 17:02:41 +00:00

modelopt.md

Add NVIDIA TensorRT Model Optimizer in vLLM documentation (#17561)

2025-05-02 11:36:46 -07:00

quantized_kvcache.md

[doc] add install tips (#17373)

2025-04-30 17:02:41 +00:00

quark.md

[doc] add install tips (#17373)

2025-04-30 17:02:41 +00:00

supported_hardware.md

Add NVIDIA TensorRT Model Optimizer in vLLM documentation (#17561)

2025-05-02 11:36:46 -07:00

torchao.md

[doc] update wrong hf model links (#17184)

2025-04-25 16:40:54 +00:00