48e925fab5 | [Misc] Clean up test docstrings and names (#17521) | 2025-05-01 05:19:32 -07:00
    Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

726efc6a32 | [Quantization][V1] BitsAndBytes support V1 (#15611) | 2025-03-28 10:12:47 +08:00
    Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>

e42389f9d7 | Transformers backend already supports V1 (#15463) | 2025-03-25 20:26:16 -07:00
    Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

cf069aa8aa | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00

67ef8f666a | [Model] Enable quantization support for transformers backend (#12960) | 2025-02-17 19:52:47 -08:00

33e0602e59 | [Misc] Fix improper placement of SPDX header in scripts (#12694) | 2025-02-03 11:16:59 -08:00
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

a1a2aaadb9 | [Model]: Add transformers backend support (#11330) | 2025-02-03 21:30:38 +08:00
    Adds support for `transformers` as a backend. Following
    https://github.com/huggingface/transformers/pull/35235, a number of
    models are already supported, and support for more models is being
    ramped up.
    Thanks @Isotr0py for the TP support, and @hmellor for his help as well!
    This includes:
    - `trust_remote_code=True` support: any model on the Hub that
      implements attention the correct way can be natively supported!
    - tensor parallel support

    Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
    Signed-off-by: Isotr0py <2037008807@qq.com>
    Co-authored-by: Isotr0py <41363108+Isotr0py@users.noreply.github.com>
    Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
    Co-authored-by: Isotr0py <2037008807@qq.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: Michael Goin <mgoin64@gmail.com>
    Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>