vllm/fp8 at fec347dee130a79d8a56390ffcb2dde2e480f6ca - vllm

Files

elvischenv dbeee3844c [Perf] Use NVIDIA hardware-accelerated instruction for float to fp8_e4m3 quantization (#24757)

Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>

2025-09-13 00:16:24 -07:00

2025-06-15 20:05:28 -07:00

2025-09-13 00:16:24 -07:00

common.cu    2025-08-05 02:36:43 -07:00

common.cuh    2025-09-13 00:16:24 -07:00

per_token_group_quant.cu    2025-07-29 21:50:46 -06:00