vllm/quantization at bb2fc08072db2d96e547407b4301fb6ba141d9d6

youngkingdom/vllm

Tyler Michael Smith fea59c7712 [Bugfix][Kernel] Use int64_t for indices in fp8 quant kernels (#6649)

2024-07-22 14:08:30 -06:00

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

compressed_tensors

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

[Kernel] Turn off CUTLASS scaled_mm for Ada Lovelace (#6384)

2024-07-14 13:37:19 +00:00

[Bugfix][Kernel] Use int64_t for indices in fp8 quant kernels (#6649)

2024-07-22 14:08:30 -06:00

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

[Kernel][Core] Add AWQ support to the Marlin kernel (#6612)

2024-07-21 19:41:42 -04:00

[Kernel][Core] Add AWQ support to the Marlin kernel (#6612)

2024-07-21 19:41:42 -04:00

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00