|
|
439c84581a
|
[Doc] Update description of vLLM support for CPUs (#6003)
|
2024-07-10 21:15:29 -07:00 |
|
|
|
8a924d2248
|
[Doc] Guide for adding multi-modal plugins (#6205)
|
2024-07-10 14:55:34 +08:00 |
|
|
|
673dd4cae9
|
[Docs] Docs update for Pipeline Parallel (#6222)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2024-07-09 16:24:58 -07:00 |
|
|
|
6206dcb29e
|
[Model] Add PaliGemma (#5189)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-07-07 09:25:50 +08:00 |
|
|
|
9389380015
|
[Doc] Move guide for multimodal model and other improvements (#6168)
|
2024-07-06 17:18:59 +08:00 |
|
|
|
175c43eca4
|
[Doc] Reorganize Supported Models by Type (#6167)
|
2024-07-06 05:59:36 +00:00 |
|
|
|
79d406e918
|
[Docs] Fix readthedocs for tag build (#6158)
|
2024-07-05 12:44:40 -07:00 |
|
|
|
ae96ef8fbd
|
[VLM] Calculate maximum number of multi-modal tokens by model (#6121)
|
2024-07-04 16:37:23 -07:00 |
|
|
|
27902d42be
|
[misc][doc] try to add warning for latest html (#5979)
|
2024-07-04 09:57:09 -07:00 |
|
|
|
966fe72141
|
[doc][misc] bump up py version in installation doc (#6119)
|
2024-07-03 15:52:04 -07:00 |
|
|
|
d9e98f42e4
|
[vlm] Remove vision language config. (#6089)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-07-03 22:14:16 +00:00 |
|
|
|
47f0954af0
|
[Kernel] Expand FP8 support to Ampere GPUs using FP8 Marlin (#5975)
|
2024-07-03 17:38:00 +00:00 |
|
|
|
f1c78138aa
|
[Doc] Fix Mock Import (#6094)
|
2024-07-03 00:13:56 -07:00 |
|
|
|
9831aec49f
|
[Core] Dynamic image size support for VLMs (#5276)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: ywang96 <ywang@roblox.com>
Co-authored-by: xwjiang2010 <87673679+xwjiang2010@users.noreply.github.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2024-07-02 20:34:00 -07:00 |
|
|
|
9d6a8daa87
|
[Model] Jamba support (#4115)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Erez Schwartz <erezs@ai21.com>
Co-authored-by: Mor Zusman <morz@ai21.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Tomer Asida <tomera@ai21.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
|
2024-07-02 23:11:29 +00:00 |
|
|
|
98d6682cd1
|
[VLM] Remove image_input_type from VLM config (#5852)
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-07-02 07:57:09 +00:00 |
|
|
|
8e0817c262
|
[Bugfix][Doc] Fix Doc Formatting (#6048)
|
2024-07-01 15:09:11 -07:00 |
|
|
|
83bdcb6ac3
|
add FAQ doc under 'serving' (#5946)
|
2024-07-01 14:11:36 -07:00 |
|
|
|
4050d646e5
|
[doc][misc] remove deprecated api server in doc (#6037)
|
2024-07-01 12:52:43 -04:00 |
|
|
|
57f09a419c
|
[Hardware][Intel] OpenVINO vLLM backend (#5379)
|
2024-06-28 13:50:16 +00:00 |
|
|
|
5cbe8d155c
|
[Core] Registry for processing model inputs (#5214)
Co-authored-by: ywang96 <ywang@roblox.com>
|
2024-06-28 12:09:56 +00:00 |
|
|
|
79c92c7c8a
|
[Model] Add Gemma 2 (#5908)
|
2024-06-27 13:33:56 -07:00 |
|
|
|
3fd02bda51
|
[doc][misc] add note for Kubernetes users (#5916)
|
2024-06-27 10:07:07 -07:00 |
|
|
|
96354d6a29
|
[Model] Add base class for LoRA-supported models (#5018)
|
2024-06-27 16:03:04 +08:00 |
|
|
|
294104c3f9
|
[doc] update usage of env var to avoid conflict (#5873)
|
2024-06-26 17:57:12 -04:00 |
|
|
|
3aa7b6cf66
|
[Misc][Doc] Add Example of using OpenAI Server with VLM (#5832)
|
2024-06-25 20:34:25 -07:00 |
|
|
|
dd793d1de5
|
[Hardware][AMD][CI/Build][Doc] Upgrade to ROCm 6.1, Dockerfile improvements, test fixes (#5422)
|
2024-06-25 15:56:15 -07:00 |
|
|
|
c18ebfdd71
|
[doc][distributed] add both gloo and nccl tests (#5834)
|
2024-06-25 15:10:28 -04:00 |
|
|
|
f23871e9ee
|
[Doc] Add notice about breaking changes to VLMs (#5818)
|
2024-06-25 01:25:03 -07:00 |
|
|
|
1744cc99ba
|
[Doc] Add Phi-3-medium to list of supported models (#5788)
|
2024-06-24 10:48:55 -07:00 |
|
|
|
e72dc6cb35
|
[Doc] Add "Suggest edit" button to doc pages (#5789)
|
2024-06-24 10:26:17 -07:00 |
|
|
|
c246212952
|
[doc][faq] add warning to download models for every node (#5783)
|
2024-06-24 15:37:42 +08:00 |
|
|
|
8c00f9c15d
|
[Docs][TPU] Add installation tip for TPU (#5761)
|
2024-06-21 23:09:40 -07:00 |
|
|
|
5b15bde539
|
[Doc] Documentation on supported hardware for quantization methods (#5745)
|
2024-06-21 12:44:29 -04:00 |
|
|
|
1b2eaac316
|
[Bugfix][Doc] Fix Duplicate Explicit Target Name Errors (#5703)
|
2024-06-19 23:10:47 -07:00 |
|
|
|
e83db9e7e3
|
[Doc] Update docker references (#5614)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-06-19 15:01:45 -07:00 |
|
|
|
2bd231a7b7
|
[Doc] Added cerebrium as Integration option (#5553)
|
2024-06-18 15:56:59 -07:00 |
|
|
|
daef218b55
|
[Model] Initialize Phi-3-vision support (#4986)
|
2024-06-17 19:34:33 -07:00 |
|
|
|
728c4c8a06
|
[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814)
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
|
2024-06-17 11:01:25 -07:00 |
|
|
|
845a3f26f9
|
[Doc] add debugging tips for crash and multi-node debugging (#5581)
|
2024-06-17 10:08:01 +08:00 |
|
|
|
6e2527a7cb
|
[Doc] Update documentation on Tensorizer (#5471)
|
2024-06-14 11:27:57 -07:00 |
|
|
|
cdab68dcdb
|
[Docs] Add ZhenFund as a Sponsor (#5548)
|
2024-06-14 11:17:21 -07:00 |
|
|
|
0ce7b952f8
|
[Doc] Update LLaVA docs (#5437)
Co-authored-by: Roger Wang <ywang@roblox.com>
|
2024-06-13 11:22:07 -07:00 |
|
|
|
a65634d3ae
|
[Docs] Add 4th meetup slides (#5509)
|
2024-06-13 10:18:26 -07:00 |
|
|
|
80aa7e91fc
|
[Hardware][Intel] Optimize CPU backend and add more performance tips (#4971)
Co-authored-by: Jianan Gu <jianan.gu@intel.com>
|
2024-06-13 09:33:14 -07:00 |
|
|
|
b8d4dfff9c
|
[Doc] Update debug docs (#5438)
|
2024-06-12 14:49:31 -07:00 |
|
|
|
1a8bfd92d5
|
[Hardware] Initial TPU integration (#5292)
|
2024-06-12 11:53:03 -07:00 |
|
|
|
8f89d72090
|
[Doc] add common case for long waiting time (#5430)
|
2024-06-11 11:12:13 -07:00 |
|
|
|
99dac099ab
|
[Core][Doc] Default to multiprocessing for single-node distributed case (#5230)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
|
2024-06-11 11:10:41 -07:00 |
|
|
|
89ec06c33b
|
[Docs] [Spec decode] Fix docs error in code example (#5427)
|
2024-06-11 10:31:56 -07:00 |
|