youngkingdom/vllm - vllm - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
sixgod	6cf1167c1a	[Model] Add GLM-4v support and meet vllm==0.6.2 (#9242 )	2024-10-11 17:36:13 +00:00
Alex Brooks	a3691b6b5e	[Core][Frontend] Add Support for Inference Time mm_processor_kwargs (#9131 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2024-10-08 14:12:56 +00:00
Cyrus Leung	151ef4efd2	[Model] Support NVLM-D and fix QK Norm in InternViT (#9045 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2024-10-07 11:55:12 +00:00
youkaichao	18b296fdb2	[core] remove beam search from the core (#9105 )	2024-10-07 05:47:04 +00:00
Andy Dai	05c531be47	[Misc] Improved prefix cache example (#9077 )	2024-10-04 21:38:42 +00:00
代君	3dbb215b38	[Frontend][Feature] support tool calling for internlm/internlm2_5-7b-chat model (#8405 )	2024-10-04 10:36:39 +08:00
Travis Johnson	19a4dd0990	[Bugfix] example template should not add parallel_tool_prompt if tools is none (#9007 )	2024-10-03 03:04:17 +00:00
Isotr0py	2ae25f79cf	[Model] Expose InternVL2 max_dynamic_patch as a mm_processor_kwarg (#8946 )	2024-09-30 13:01:20 +08:00
Cyrus Leung	e1a3f5e831	[CI/Build] Update models tests & examples (#8874 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-28 09:54:35 -07:00
Maximilien de Bayser	344cd2b6f4	[Feature] Add support for Llama 3.1 and 3.2 tool use (#8343 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2024-09-26 17:01:42 -07:00
Chen Zhang	770ec6024f	[Model] Add support for the multi-modal Llama 3.2 model (#8811 ) Co-authored-by: simon-mo <xmo@berkeley.edu> Co-authored-by: Chang Su <chang.s.su@oracle.com> Co-authored-by: Simon Mo <simon.mo@hey.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-25 13:29:32 -07:00
Jee Jee Li	13f9f7a3d0	[[Misc]Upgrade bitsandbytes to the latest version 0.44.0 (#8768 )	2024-09-24 17:08:55 -07:00
Andy	2529d09b5a	[Frontend] Batch inference for llm.chat() API (#8648 ) Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-09-24 09:44:11 -07:00
Alex Brooks	8ff7ced996	[Model] Expose Phi3v num_crops as a mm_processor_kwarg (#8658 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-24 07:36:46 +00:00
litianjian	5b59532760	[Model][VLM] Add LLaVA-Onevision model support (#8486 ) Co-authored-by: litianjian <litianjian@bytedance.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-22 10:51:44 -07:00
Alex Brooks	8ca5051b9a	[Misc] Use NamedTuple in Multi-image example (#8705 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2024-09-22 20:56:20 +08:00
Patrick von Platen	a54ed80249	[Model] Add mistral function calling format to all models loaded with "mistral" format (#8515 ) Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-17 17:50:37 +00:00
William Lin	ba77527955	[bugfix] torch profiler bug for single gpu with GPUExecutor (#8354 )	2024-09-12 21:30:00 -07:00
Roger Wang	360ddbd37e	[Misc] Update Pixtral example (#8431 )	2024-09-12 17:31:18 -07:00
Alex Brooks	c6202daeed	[Model] Support multiple images for qwen-vl (#8247 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-12 10:10:54 -07:00
Patrick von Platen	d394787e52	Pixtral (#8377 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-11 14:41:55 -07:00
Yang Fan	3b7fea770f	[Model][VLM] Add Qwen2-VL model support (#7905 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-11 09:31:19 -07:00
Yangshen⚡Deng	6a512a00df	[model] Support for Llava-Next-Video model (#7559 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-10 22:21:36 -07:00
Pavani Majety	efcf946a15	[Hardware][NV] Add support for ModelOpt static scaling checkpoints. (#6112 )	2024-09-11 00:38:40 -04:00
Isotr0py	e807125936	[Model][VLM] Support multi-images inputs for InternVL2 models (#8201 )	2024-09-07 16:38:23 +08:00
Kyle Mistele	41e95c5247	[Bugfix] Fix Hermes tool call chat template bug (#8256 ) Co-authored-by: Kyle Mistele <kyle@constellate.ai>	2024-09-07 10:49:01 +08:00
William Lin	12dd715807	[misc] [doc] [frontend] LLM torch profiler support (#7943 )	2024-09-06 17:48:48 -07:00
Dipika Sikka	23f322297f	[Misc] Remove `SqueezeLLM` (#8220 )	2024-09-06 16:29:03 -06:00
Alex Brooks	9da25a88aa	[MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) (#8029 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-09-05 12:48:10 +00:00
Cyrus Leung	288a938872	[Doc] Indicate more information about supported modalities (#8181 )	2024-09-05 10:51:53 +00:00
Harsha vardhan manoj Bikki	008cf886c9	[Neuron] Adding support for adding/ overriding neuron configuration a… (#8062 ) Co-authored-by: Harsha Bikki <harbikh@amazon.com>	2024-09-04 16:33:43 -07:00
Kyle Mistele	e02ce498be	[Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models (#5649 ) Co-authored-by: constellate <constellate@1-ai-appserver-staging.codereach.com> Co-authored-by: Kyle Mistele <kyle@constellate.ai>	2024-09-04 13:18:13 -07:00
Peter Salas	2be8ec6e71	[Model] Add Ultravox support for multiple audio chunks (#7963 )	2024-09-04 04:38:21 +00:00
Roger Wang	5231f0898e	[Frontend][VLM] Add support for multiple multi-modal items (#8049 )	2024-08-31 16:35:53 -07:00
Harsha vardhan manoj Bikki	257afc37c5	[Neuron] Adding support for context-lenght, token-gen buckets. (#7885 ) Co-authored-by: Harsha Bikki <harbikh@amazon.com>	2024-08-29 13:58:14 -07:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟	0b769992ec	[Bugfix]: Use float32 for base64 embedding (#7855 ) Signed-off-by: Hollow Man <hollowman@opensuse.org>	2024-08-26 03:16:38 +00:00
Peter Salas	57792ed469	[Doc] Fix incorrect docs from #7615 (#7788 )	2024-08-22 10:02:06 -07:00
Peter Salas	1ca0d4f86b	[Model] Add UltravoxModel and UltravoxConfig (#7615 )	2024-08-21 22:49:39 +00:00
Ronen Schaffer	2aa00d59ad	[CI/Build] Pin OpenTelemetry versions and make errors clearer (#7266 ) [CI/Build] Pin OpenTelemetry versions and make a availability errors clearer (#7266)	2024-08-20 10:02:21 -07:00
nunjunj	3b19e39dc5	Chat method for offline llm (#5049 ) Co-authored-by: nunjunj <ray@g-3ff9f30f2ed650001.c.vllm-405802.internal> Co-authored-by: nunjunj <ray@g-1df6075697c3f0001.c.vllm-405802.internal> Co-authored-by: nunjunj <ray@g-c5a2c23abc49e0001.c.vllm-405802.internal> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-08-15 19:41:34 -07:00
Pooya Davoodi	249b88228d	[Frontend] Support embeddings in the run_batch API (#7132 ) Co-authored-by: Simon Mo <simon.mo@hey.com>	2024-08-09 09:48:21 -07:00
Isotr0py	67abdbb42f	[VLM][Doc] Add `stop_token_ids` to InternVL example (#7354 )	2024-08-09 14:51:04 +00:00
Cyrus Leung	7eb4a51c5f	[Core] Support serving encoder/decoder models (#7258 )	2024-08-09 10:39:41 +08:00
Jee Jee Li	757ac70a64	[Model] Rename MiniCPMVQwen2 to MiniCPMV2.6 (#7273 )	2024-08-08 14:02:41 +00:00
afeldman-nm	fd95e026e0	[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942 ) Co-authored-by: Andrew Feldman <afeld2012@gmail.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com>	2024-08-06 16:51:47 -04:00
Isotr0py	360bd67cf0	[Core] Support loading GGUF model (#5191 ) Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-08-05 17:54:23 -06:00
Jungho Christopher Cho	c0d8f1636c	[Model] SiglipVisionModel ported from transformers (#6942 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-08-05 06:22:12 +00:00
Isotr0py	7cbd9ec7a9	[Model] Initialize support for InternVL2 series models (#6514 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-07-29 10:16:30 +00:00
Cyrus Leung	1ad86acf17	[Model] Initial support for BLIP-2 (#5920 ) Co-authored-by: ywang96 <ywang@roblox.com>	2024-07-27 11:53:07 +00:00
Wang Ran (汪然)	a57d75821c	[bugfix] make args.stream work (#6831 )	2024-07-27 09:07:02 +00:00

1 2 3 4

152 Commits