Commit | Date | Message
cc276443b5 | 2024-09-28 17:48:41 -07:00 | [doc] organize installation doc and expose per-commit docker (#8931)
d86f6b2afb | 2024-09-27 22:10:44 -07:00 | [misc] fix wheel name (#8919)
3b00b9c26c | 2024-09-26 20:35:15 -07:00 | [Core] rename PromptInputs and inputs (#8876)
344cd2b6f4 | 2024-09-26 17:01:42 -07:00 | [Feature] Add support for Llama 3.1 and 3.2 tool use (#8343)
    Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
70de39f6b4 | 2024-09-26 13:19:04 -07:00 | [misc][installation] build from source without compilation (#8818)
4bb98f2190 | 2024-09-26 07:45:30 -07:00 | [Misc] Update config loading for Qwen2-VL and remove Granite (#8837)
e2c6e0a829 | 2024-09-25 13:29:48 -07:00 | [Doc] Update doc for Transformers 4.45 (#8817)
770ec6024f | 2024-09-25 13:29:32 -07:00 | [Model] Add support for the multi-modal Llama 3.2 model (#8811)
    Co-authored-by: simon-mo <xmo@berkeley.edu>
    Co-authored-by: Chang Su <chang.s.su@oracle.com>
    Co-authored-by: Simon Mo <simon.mo@hey.com>
    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
    Co-authored-by: Roger Wang <ywang@roblox.com>
4f1ba0844b | 2024-09-25 10:36:26 -07:00 | Revert "rename PromptInputs and inputs with backward compatibility (#8760)" (#8810)
28e1299e60 | 2024-09-25 09:36:47 -07:00 | rename PromptInputs and inputs with backward compatibility (#8760)
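Commit 28e1299e60 renamed PromptInputs and inputs while keeping the old names working. A generic sketch of such a backward-compatible keyword rename in Python (`renamed_kwarg` and `generate` are illustrative names, not vLLM's actual code):

```python
import functools
import warnings


def renamed_kwarg(old: str, new: str):
    """Decorator: accept a deprecated keyword argument name and
    forward it to the new name, emitting a DeprecationWarning."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                if new in kwargs:
                    raise TypeError(f"cannot pass both {old!r} and {new!r}")
                warnings.warn(f"{old!r} is deprecated, use {new!r} instead",
                              DeprecationWarning, stacklevel=2)
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@renamed_kwarg("inputs", "prompt")
def generate(prompt):
    # Callers using the old keyword `inputs=` still work, with a warning.
    return f"generated from {prompt}"
```

Old call sites (`generate(inputs=...)`) keep working through a deprecation window before the alias is removed.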
1c046447a6 | 2024-09-25 22:26:37 +08:00 | [CI/Build][Bugfix][Doc][ROCm] CI fix and doc update after ROCm 6.2 upgrade (#8777)
13f9f7a3d0 | 2024-09-24 17:08:55 -07:00 | [Misc] Upgrade bitsandbytes to the latest version 0.44.0 (#8768)
3185fb0cca | 2024-09-24 05:45:20 +00:00 | Revert "[Core] Rename PromptInputs to PromptType, and inputs to prompt" (#8750)
530821d00c | 2024-09-23 18:52:39 -07:00 | [Hardware][AMD] ROCm 6.2 upgrade (#8674)
ee5f34b1c2 | 2024-09-23 09:44:26 -07:00 | [CI/Build] use setuptools-scm to set __version__ (#4738)
    Co-authored-by: youkaichao <youkaichao@126.com>
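The setuptools-scm change above derives `__version__` from git tags at build time instead of hard-coding it. A minimal `pyproject.toml` sketch of that mechanism (a generic example, not vLLM's actual build configuration):

```toml
[build-system]
requires = ["setuptools>=64", "setuptools-scm>=8"]
build-backend = "setuptools.build_meta"

[project]
name = "example-package"
# version is computed by setuptools-scm, not declared statically
dynamic = ["version"]

[tool.setuptools_scm]
# with no options, the version comes from the latest git tag,
# plus a dev suffix for commits after the tag
```

At runtime the installed version is then available via `importlib.metadata.version("example-package")`.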
d23679eb99 | 2024-09-22 22:54:18 -07:00 | [Bugfix] fix docker build for xpu (#8652)
d4a2ac8302 | 2024-09-22 12:47:54 -07:00 | [build] enable existing pytorch (for GH200, aarch64, nightly) (#8713)
5b59532760 | 2024-09-22 10:51:44 -07:00 | [Model][VLM] Add LLaVA-Onevision model support (#8486)
    Co-authored-by: litianjian <litianjian@bytedance.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: Roger Wang <ywang@roblox.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
4dfdf43196 | 2024-09-21 00:24:12 -07:00 | [Doc] Fix typo in AMD installation guide (#8689)
0057894ef7 | 2024-09-20 19:00:54 -07:00 | [Core] Rename PromptInputs and inputs (#8673)
7c8566aa4f | 2024-09-20 15:04:37 -07:00 | [Doc] neuron documentation update (#8671)
    Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
3b63de9353 | 2024-09-20 09:31:41 -07:00 | [Model] Add OLMoE (#7922)
260d40b5ea | 2024-09-20 06:20:56 +00:00 | [Core] Support LoRA lineage and base model metadata management (#6315)
ea4647b7d7 | 2024-09-19 13:15:55 -06:00 | [Doc] Add documentation for GGUF quantization (#8618)
e18749ff09 | 2024-09-18 11:04:00 -06:00 | [Model] Support Solar Model (#8386)
    Co-authored-by: Michael Goin <michael@neuralmagic.com>
7c7714d856 | 2024-09-18 13:56:58 +00:00 | [Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio overhead (#8157)
    Co-authored-by: Nick Hill <nickhill@us.ibm.com>
    Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
    Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
    Co-authored-by: Simon Mo <simon.mo@hey.com>
fa0c114fad | 2024-09-17 16:24:06 -07:00 | [doc] improve installation doc (#8550)
    Co-authored-by: Andy Dai <76841985+Imss27@users.noreply.github.com>
2759a43a26 | 2024-09-16 12:10:23 -07:00 | [doc] update doc on testing and debugging (#8514)
8a0cf1ddc3 | 2024-09-14 14:50:26 +00:00 | [Model] support minicpm3 (#8297)
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
f57092c00b | 2024-09-13 18:06:30 +00:00 | [Doc] Add oneDNN installation to CPU backend documentation (#8467)
a84e598e21 | 2024-09-13 10:20:06 -07:00 | [CI/Build] Reorganize models tests (#7820)
cab69a15e4 | 2024-09-12 23:52:41 -07:00 | [doc] recommend pip instead of conda (#8446)
c6202daeed | 2024-09-12 10:10:54 -07:00 | [Model] Support multiple images for qwen-vl (#8247)
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
d394787e52 | 2024-09-11 14:41:55 -07:00 | Pixtral (#8377)
    Co-authored-by: Roger Wang <ywang@roblox.com>
3b7fea770f | 2024-09-11 09:31:19 -07:00 | [Model][VLM] Add Qwen2-VL model support (#7905)
    Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
6a512a00df | 2024-09-10 22:21:36 -07:00 | [model] Support for Llava-Next-Video model (#7559)
    Co-authored-by: Roger Wang <ywang@roblox.com>
    Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
a1d874224d | 2024-09-09 23:21:00 -07:00 | Add NVIDIA Meetup slides, announce AMD meetup, and add contact info (#8319)
e807125936 | 2024-09-07 16:38:23 +08:00 | [Model][VLM] Support multi-images inputs for InternVL2 models (#8201)
2f707fcb35 | 2024-09-07 02:57:24 +00:00 | [Model] Multi-input support for LLaVA (#8238)
12dd715807 | 2024-09-06 17:48:48 -07:00 | [misc] [doc] [frontend] LLM torch profiler support (#7943)
23f322297f | 2024-09-06 16:29:03 -06:00 | [Misc] Remove SqueezeLLM (#8220)
db3bf7c991 | 2024-09-05 18:10:33 -07:00 | [Core] Support load and unload LoRA in api server (#6566)
    Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2febcf2777 | 2024-09-05 16:25:29 -04:00 | [Documentation][Spec Decode] Add documentation about lossless guarantees in Speculative Decoding in vLLM (#7962)
9da25a88aa | 2024-09-05 12:48:10 +00:00 | [MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) (#8029)
    Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
    Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
288a938872 | 2024-09-05 10:51:53 +00:00 | [Doc] Indicate more information about supported modalities (#8181)
e02ce498be | 2024-09-04 13:18:13 -07:00 | [Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models (#5649)
    Co-authored-by: constellate <constellate@1-ai-appserver-staging.codereach.com>
    Co-authored-by: Kyle Mistele <kyle@constellate.ai>
61f4a93d14 | 2024-09-03 18:35:33 -07:00 | [TPU][Bugfix] Use XLA rank for persistent cache path (#8137)
1248e8506a | 2024-08-30 13:42:57 -06:00 | [Model] Adding support for MSFT Phi-3.5-MoE (#7729)
    Co-authored-by: Your Name <you@example.com>
    Co-authored-by: Zeqi Lin <zelin@microsoft.com>
    Co-authored-by: Zeqi Lin <Zeqi.Lin@microsoft.com>
058344f89a | 2024-08-30 08:21:02 -07:00 | [Frontend] config-cli-args (#7737)
    Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
    Co-authored-by: Kaunil Dhruv <kaunil_dhruv@intuit.com>
dc13e99348 | 2024-08-29 23:34:20 -07:00 | [MODEL] add Exaone model support (#7819)