youngkingdom/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
TimWang	93cf74a8a7	[Doc]: Add deploying_with_k8s guide (#8451 )	2024-10-07 13:31:45 -07:00
Cyrus Leung	151ef4efd2	[Model] Support NVLM-D and fix QK Norm in InternViT (#9045 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2024-10-07 11:55:12 +00:00
Cyrus Leung	b22b798471	[Model] PP support for embedding models and update docs (#9090 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-10-06 16:35:27 +08:00
Cyrus Leung	f22619fe96	[Misc] Remove user-facing error for removed VLM args (#9104 )	2024-10-06 01:33:52 -07:00
Andy Dai	5df1834895	[Bugfix] Fix order of arguments matters in config.yaml (#8960 )	2024-10-05 17:35:11 +00:00
Roger Wang	26aa325f4f	[Core][VLM] Test registration for OOT multimodal models (#8717 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:38:25 -07:00
Cyrus Leung	0e36fd4909	[Misc] Move registry to its own file (#9064 )	2024-10-04 10:01:37 +00:00
Murali Andoorveedu	0f6d7a9a34	[Models] Add remaining model PP support (#7168 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:56:58 +08:00
代君	3dbb215b38	[Frontend][Feature] support tool calling for internlm/internlm2_5-7b-chat model (#8405 )	2024-10-04 10:36:39 +08:00
Nick Hill	18c2e30c57	[Doc] Update Granite model docs (#9025 )	2024-10-03 02:42:24 +00:00
Sergey Shlyapnikov	f58d4fccc9	[OpenVINO] Enable GPU support for OpenVINO vLLM backend (#8192 )	2024-10-02 17:50:01 -04:00
Cyrus Leung	4f341bd4bf	[Doc] Update list of supported models (#8987 )	2024-10-02 00:35:39 +08:00
whyiug	e01ab595d8	[Model] support input embeddings for qwen2vl (#8856 )	2024-09-30 03:16:10 +00:00
youkaichao	cc276443b5	[doc] organize installation doc and expose per-commit docker (#8931 )	2024-09-28 17:48:41 -07:00
youkaichao	d86f6b2afb	[misc] fix wheel name (#8919 )	2024-09-27 22:10:44 -07:00
Cyrus Leung	3b00b9c26c	[Core] Rename `PromptInputs` and `inputs` (#8876 )	2024-09-26 20:35:15 -07:00
Maximilien de Bayser	344cd2b6f4	[Feature] Add support for Llama 3.1 and 3.2 tool use (#8343 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2024-09-26 17:01:42 -07:00
youkaichao	70de39f6b4	[misc][installation] build from source without compilation (#8818 )	2024-09-26 13:19:04 -07:00
Roger Wang	4bb98f2190	[Misc] Update config loading for Qwen2-VL and remove Granite (#8837 )	2024-09-26 07:45:30 -07:00
Roger Wang	e2c6e0a829	[Doc] Update doc for Transformers 4.45 (#8817 )	2024-09-25 13:29:48 -07:00
Chen Zhang	770ec6024f	[Model] Add support for the multi-modal Llama 3.2 model (#8811 ) Co-authored-by: simon-mo <xmo@berkeley.edu> Co-authored-by: Chang Su <chang.s.su@oracle.com> Co-authored-by: Simon Mo <simon.mo@hey.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-25 13:29:32 -07:00
Simon Mo	4f1ba0844b	Revert "rename PromptInputs and inputs with backward compatibility (#8760 )" (#8810 )	2024-09-25 10:36:26 -07:00
Cyrus Leung	28e1299e60	rename PromptInputs and inputs with backward compatibility (#8760 )	2024-09-25 09:36:47 -07:00
Hongxia Yang	1c046447a6	[CI/Build][Bugfix][Doc][ROCm] CI fix and doc update after ROCm 6.2 upgrade (#8777 )	2024-09-25 22:26:37 +08:00
Jee Jee Li	13f9f7a3d0	[Misc] Upgrade bitsandbytes to the latest version 0.44.0 (#8768 )	2024-09-24 17:08:55 -07:00
Simon Mo	3185fb0cca	Revert "[Core] Rename `PromptInputs` to `PromptType`, and `inputs` to `prompt`" (#8750 )	2024-09-24 05:45:20 +00:00
Hongxia Yang	530821d00c	[Hardware][AMD] ROCm6.2 upgrade (#8674 )	2024-09-23 18:52:39 -07:00
Daniele	ee5f34b1c2	[CI/Build] use setuptools-scm to set __version__ (#4738 ) Co-authored-by: youkaichao <youkaichao@126.com>	2024-09-23 09:44:26 -07:00
Yan Ma	d23679eb99	[Bugfix] fix docker build for xpu (#8652 )	2024-09-22 22:54:18 -07:00
youkaichao	d4a2ac8302	[build] enable existing pytorch (for GH200, aarch64, nightly) (#8713 )	2024-09-22 12:47:54 -07:00
litianjian	5b59532760	[Model][VLM] Add LLaVA-Onevision model support (#8486 ) Co-authored-by: litianjian <litianjian@bytedance.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-22 10:51:44 -07:00
Andy Dai	4dfdf43196	[Doc] Fix typo in AMD installation guide (#8689 )	2024-09-21 00:24:12 -07:00
Cyrus Leung	0057894ef7	[Core] Rename `PromptInputs` and `inputs` (#8673 )	2024-09-20 19:00:54 -07:00
omrishiv	7c8566aa4f	[Doc] neuron documentation update (#8671 ) Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>	2024-09-20 15:04:37 -07:00
Niklas Muennighoff	3b63de9353	[Model] Add OLMoE (#7922 )	2024-09-20 09:31:41 -07:00
Jiaxin Shan	260d40b5ea	[Core] Support Lora lineage and base model metadata management (#6315 )	2024-09-20 06:20:56 +00:00
Isotr0py	ea4647b7d7	[Doc] Add documentation for GGUF quantization (#8618 )	2024-09-19 13:15:55 -06:00
Geun, Lim	e18749ff09	[Model] Support Solar Model (#8386 ) Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-09-18 11:04:00 -06:00
Alexander Matveev	7c7714d856	[Core][Bugfix][Perf] Introduce `MQLLMEngine` to avoid `asyncio` OH (#8157 ) Co-authored-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com> Co-authored-by: Simon Mo <simon.mo@hey.com>	2024-09-18 13:56:58 +00:00
youkaichao	fa0c114fad	[doc] improve installation doc (#8550 ) Co-authored-by: Andy Dai <76841985+Imss27@users.noreply.github.com>	2024-09-17 16:24:06 -07:00
youkaichao	2759a43a26	[doc] update doc on testing and debugging (#8514 )	2024-09-16 12:10:23 -07:00
ywfang	8a0cf1ddc3	[Model] support minicpm3 (#8297 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-14 14:50:26 +00:00
Isotr0py	f57092c00b	[Doc] Add oneDNN installation to CPU backend documentation (#8467 )	2024-09-13 18:06:30 +00:00
Cyrus Leung	a84e598e21	[CI/Build] Reorganize models tests (#7820 )	2024-09-13 10:20:06 -07:00
youkaichao	cab69a15e4	[doc] recommend pip instead of conda (#8446 )	2024-09-12 23:52:41 -07:00
Alex Brooks	c6202daeed	[Model] Support multiple images for qwen-vl (#8247 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-12 10:10:54 -07:00
Patrick von Platen	d394787e52	Pixtral (#8377 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-11 14:41:55 -07:00
Yang Fan	3b7fea770f	[Model][VLM] Add Qwen2-VL model support (#7905 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-11 09:31:19 -07:00
Yangshen⚡Deng	6a512a00df	[model] Support for Llava-Next-Video model (#7559 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-10 22:21:36 -07:00
Simon Mo	a1d874224d	Add NVIDIA Meetup slides, announce AMD meetup, and add contact info (#8319 )	2024-09-09 23:21:00 -07:00

380 Commits