youngkingdom/vllm - vllm

Author	SHA1	Message	Date
youkaichao	3610fb4930	[doc] add "Failed to infer device type" to faq (#14200) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-03-04 20:47:06 +08:00
Travis Johnson	c060b71408	[Model] Add support for GraniteMoeShared models (#13313) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-03-04 08:04:52 +08:00
Qubitium-ModelCloud	cd1d3c3df8	[Docs] Add GPTQModel (#14056) Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-03-03 21:59:09 +00:00
Harry Mellor	98175b2816	Improve the docs for `TransformersModel` (#14147) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-03-03 17:03:05 +00:00
Harry Mellor	cf069aa8aa	Update deprecated Python 3.8 typing (#13971)	2025-03-02 17:34:51 -08:00
Ce Gao	bf33700ecd	[v0][structured output] Support reasoning output (#12955) Signed-off-by: Ce Gao <cegao@tensorchord.ai>	2025-03-02 14:49:42 -05:00
qux-bbb	bc6ccb9878	[Doc] Source building add clone step (#14086) Signed-off-by: qux-bbb <1147635419@qq.com>	2025-03-02 10:59:50 +00:00
Jee Jee Li	cc5e8f6db8	[Model] Add LoRA support for TransformersModel (#13770) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-03-02 09:17:34 +08:00
Kuntai Du	8994dabc22	[Documentation] Add more deployment guide for Kubernetes deployment (#13841) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>	2025-03-01 06:44:24 +00:00
Brayden Zhong	f64ffa8c25	[Docs] Add `pipeline_parallel_size` to optimization docs (#14059) Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-03-01 05:43:54 +00:00
Brayden Zhong	2aed2c9fa7	[Doc] Fix ROCm documentation (#14041) Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-02-28 16:42:07 +00:00
Harry Mellor	f58f8b5c96	Update AutoAWQ docs (#14042) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-28 15:20:29 +00:00
Cyrus Leung	1088f06242	[Doc] Move multimodal Embedding API example to Online Serving page (#14017) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-02-28 07:12:04 +00:00
Cyrus Leung	f1579b229d	[VLM] Generalized prompt updates for multi-modal processor (#13964) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-02-27 17:44:25 +00:00
王博伟	512d77d582	Update quickstart.md (#13958)	2025-02-27 16:05:11 +00:00
Szymon Ożóg	7f0be2aa24	[Model] Deepseek GGUF support (#13167)	2025-02-27 02:08:35 -08:00
Isotr0py	edf309ebbe	[VLM] Support multimodal inputs for Florence-2 models (#13320)	2025-02-27 02:06:41 -08:00
Michael Goin	ca377cf1b9	Use CUDA 12.4 as default for release and nightly wheels (#12098)	2025-02-26 19:06:37 -08:00
Jee Jee Li	5157338ed9	[Misc] Improve LoRA spelling (#13831)	2025-02-25 23:43:01 -08:00
Michael Goin	07c4353057	[Model] Support Grok1 (#13795) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-02-26 01:07:12 +00:00
Harry Mellor	cdc1fa12eb	Remove unused kwargs from model definitions (#13555)	2025-02-24 17:13:52 -08:00
Nicolò Lucchesi	444b0f0f62	[Misc][Docs] Raise error when flashinfer is not installed and `VLLM_ATTENTION_BACKEND` is set (#12513) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-02-24 10:43:21 -05:00
Cyrus Leung	8354f6640c	[Doc] Dockerfile instructions for optional dependencies and dev transformers (#13699)	2025-02-22 06:04:31 -08:00
Mark McLoughlin	2cb8c1540e	[Metrics] Add `--show-hidden-metrics-for-version` CLI arg (#13295)	2025-02-22 00:20:45 -08:00
Yuan Tang	8c0dd3d4df	docs: Add a note on full CI run in contributing guide (#13646)	2025-02-21 21:53:59 -08:00
Gabriel Marinho	1c3c975766	[FEATURE] Enables /score endpoint for embedding models (#12846)	2025-02-20 22:09:47 -08:00
Kante Yin	44c33f01f3	Add llmaz as another integration (#13643) Signed-off-by: kerthcet <kerthcet@gmail.com>	2025-02-21 03:52:40 +00:00
Joe Runde	bfbc0b32c6	[Frontend] Add backend-specific options for guided decoding (#13505) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>	2025-02-20 15:07:58 -05:00
Harry Mellor	992e5c3d34	Merge similar examples in `offline_inference` into single `basic` example (#12737)	2025-02-20 04:53:51 -08:00
Jee Jee Li	512368e34a	[Misc] Qwen2.5 VL support LoRA (#13261)	2025-02-19 18:37:55 -08:00
Wilson Wu	01c184b8f3	Fix copyright year to auto get current year (#13561)	2025-02-19 16:55:34 +00:00
youkaichao	ad5a35c21b	[doc] clarify multi-node serving doc (#13558) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-19 22:32:17 +08:00
youkaichao	52ce14d31f	[doc] clarify profiling is only for developers (#13554) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-19 20:55:58 +08:00
Roger Wang	fd84857f64	[Doc] Add clarification note regarding paligemma (#13511)	2025-02-18 22:24:03 -08:00
Harry Mellor	00b69c2d27	[Misc] Remove dangling references to `--use-v2-block-manager` (#13492) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-19 03:37:26 +00:00
youkaichao	7b203b7694	[misc] fix debugging code (#13487) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-18 09:37:11 -08:00
Harry Mellor	2358ca527b	[Doc]: Improve feature tables (#13224) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-18 18:52:39 +08:00
Isotr0py	67ef8f666a	[Model] Enable quantization support for `transformers` backend (#12960)	2025-02-17 19:52:47 -08:00
Cyrus Leung	7b623fca0b	[VLM] Check required fields before initializing field config in `DictEmbeddingItems` (#13380)	2025-02-17 01:36:07 -08:00
yankooo	f857311d13	Fix spelling error in index.md (#13369)	2025-02-17 06:53:20 +00:00
shangmingc	46cdd59577	[Feature][Spec Decode] Simplify the use of Eagle Spec Decode (#12304) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-02-16 19:32:26 -08:00
凌	da833b0aee	[Docs] Change myenv to vllm. Update python_env_setup.inc.md (#13325)	2025-02-16 16:04:21 +00:00
Roger Wang	b7d309860e	[V1] Update doc and examples for H2O-VL (#13349) Signed-off-by: Roger Wang <ywang@roblox.com>	2025-02-16 10:35:54 +00:00
Cyrus Leung	367cb8ce8c	[Doc] [2/N] Add Fuyu E2E example for multimodal processor (#13331)	2025-02-15 07:06:23 -08:00
Nicolò Lucchesi	579d7a63b2	[Bugfix][Docs] Fix offline Whisper (#13274)	2025-02-14 21:32:37 -08:00
Nicolò Lucchesi	d84cef76eb	[Frontend] Add `/v1/audio/transcriptions` OpenAI API endpoint (#12909)	2025-02-13 07:23:45 -08:00
Cyrus Leung	1bc3b5e71b	[VLM] Separate text-only and vision variants of the same model architecture (#13157)	2025-02-13 06:19:15 -08:00
Cyrus Leung	c9d3ecf016	[VLM] Merged multi-modal processor for Molmo (#12966)	2025-02-13 04:34:00 -08:00
Russell Bryant	d46d490c27	[Frontend] Move CLI code into vllm.cmd package (#12971)	2025-02-12 23:12:21 -08:00
Cody Yu	60c68df6d1	[Build] Automatically use the wheel of the base commit with Python-only build (#13178)	2025-02-12 23:10:28 -08:00

717 Commits