119f683949
[doc] add project flag to gcloud TPU command (#19664)
Signed-off-by: David Xia <david@davidxia.com>
2025-06-17 01:00:09 +00:00

387bdf0ab9
[Model] Add support for MiniMaxM1ForCausalLM (shares architecture with MiniMaxText01ForCausalLM) (#19677)
Signed-off-by: QscQ <qscqesze@gmail.com>
2025-06-16 09:47:14 -07:00

8d120701fd
[Docs] Move multiproc doc to v1 dir (#19651)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-06-16 09:10:12 +00:00

0f0874515a
[Doc] Add troubleshooting section to k8s deployment (#19377)
Signed-off-by: Anna Pendleton <pendleton@google.com>
2025-06-13 21:47:51 +00:00

1015296b79
[doc][mkdocs] fix the duplicate Supported features sections in GPU docs (#19606)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-13 16:25:08 +00:00

c707cfc12e
[doc] fix incorrect link (#19586)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-13 04:26:09 +00:00

7b3c9ff91d
[Doc] uses absolute links for structured outputs (#19582)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-06-13 03:35:17 +00:00

dba68f9159
[Doc] Unify structured outputs examples (#18196)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-06-12 22:50:31 +00:00

c742438f8b
[Doc] Add V1 column to supported models list (#19523)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-12 19:16:44 +08:00

b2d9be6f7d
[Docs] Remove WIP features in V1 guide (#19498)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-06-11 09:15:03 -07:00

89b0f84e17
[doc] fix "Other AI accelerators" getting started page (#19457)
Signed-off-by: David Xia <david@davidxia.com>
2025-06-11 16:11:17 +00:00

29a38f0352
[Doc] Support "important" and "announcement" admonitions (#19479)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-11 01:39:58 -07:00

a5115f4ff5
[Doc] Fix quantization link titles (#19478)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-11 01:27:22 -07:00

68b4a26149
[Doc] Update V1 User Guide for Hardware and Models (#19474)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-11 00:49:06 -07:00

3952731e8f
[New Model]: Support Qwen3 Embedding & Reranker (#19260)
2025-06-10 20:07:30 -07:00

da9b523ce1
[Docs] Note that alternative structured output backends are supported (#19426)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-06-10 16:20:00 +00:00

9368cc90b2
Automatically bind CPU OMP Threads of a rank to CPU ids of a NUMA node. (#17930)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Co-authored-by: Li, Jiang <bigpyj64@gmail.com>
2025-06-10 06:22:05 +00:00

32b3946bb4
Add clear documentation around the impact of debugging flag (#19369)
Signed-off-by: Anna Pendleton <pendleton@google.com>
2025-06-10 06:16:09 +00:00

c016047ed7
Fix docs/mkdocs/hooks/remove_announcement.py (#19382)
2025-06-09 21:36:54 -07:00

c57c9415b1
[Docs] Fix a bullet list in usage/security.md (#19358)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-06-09 13:28:51 +00:00

0eca5eacd0
[Doc] Fix description in the Automatic Prefix Caching design doc (#19333)
Signed-off-by: cr7258 <chengzw258@163.com>
2025-06-09 17:30:02 +08:00

12e5829221
[doc] improve ci doc (#19307)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-09 07:26:12 +00:00

cb6d572e85
[Model] NemotronH support (#18863)
Signed-off-by: Luis Vega <2478335+vegaluisjose@users.noreply.github.com>
Co-authored-by: Luis Vega <2478335+vegaluisjose@users.noreply.github.com>
2025-06-05 21:29:28 +00:00

78dcf56cb3
[doc] small fix (#19167)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-05 09:13:50 +08:00

8f4ffbd373
[Doc] Update V1 Guide for embedding models (#19141)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-04 22:57:55 +08:00

02658c2dfe
Add DeepSeek-R1-0528 function call chat template (#18874)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
2025-06-04 13:24:18 +00:00

8711bc5e68
[Misc] Add packages for benchmark as extra dependency (#19089)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-04 04:18:48 -07:00

4555143ea7
[CPU] V1 support for the CPU backend (#16441)
2025-06-03 18:43:01 -07:00

52dceb172d
[Docs] Add developer doc about CI failures (#18782)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-06-04 01:09:13 +00:00

01eee40536
[doc] update docker version (#19074)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-03 19:08:21 +00:00

02f0c7b220
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00

4e88723f32
[doc] clarify windows support (#19088)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-06-03 21:42:17 +08:00

118ff92111
[Doc] Update V1 user guide for embedding and enc-dec models (#19060)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-03 02:29:41 -07:00

42243fbda0
[Doc] Add InternVL LoRA support (#19055)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-06-03 09:08:03 +00:00

6d18ed2a2e
Update docker docs with ARM CUDA cross-compile (#19037)
Signed-off-by: mgoin <michael@neuralmagic.com>
2025-06-03 08:21:53 +00:00

f32fcd9444
[v1][KVCacheManager] Rename BlockHashType to BlockHash (#19015)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-06-03 08:01:48 +00:00

d32aa2e670
[Bugfix] Use cmake 3.26.1 instead of 3.26 to avoid build failure (#19019)
Signed-off-by: Lu Fang <lufang@fb.com>
2025-06-03 00:16:17 -07:00

1282bd812e
Add tarsier model support (#18985)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-06-03 13:13:13 +08:00

9e6f61e8c3
[ROCm][Build] Clean up the ROCm build (#19040)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-06-02 20:47:47 -07:00

5bc1ad6cee
[Doc] Remove duplicate TOCs during MkDocs migration (#19021)
Signed-off-by: Zerohertz <ohg3417@gmail.com>
2025-06-02 19:49:48 -07:00

5b168b6d7a
[doc] add pytest tips (#19010)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-02 11:07:26 +00:00

432ec9926e
[doc] wrong output (#19000)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-01 11:26:14 +00:00

c594cbf565
[doc] small fix - mkdocs (#18996)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-31 20:23:43 -07:00

749f5bdd38
[doc] fix the list rendering issue - security.md (#18982)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-31 10:39:21 +00:00

0f71e24034
[Docs] Correct multiprocessing design doc (#18964)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-05-31 01:30:15 +00:00

5a8641638a
[VLM] Add PP support and fix GPTQ inference for Ovis models (#18958)
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-05-30 17:11:44 +00:00

ec6833c5e9
[doc] show the count for fork and watch (#18950)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-30 06:45:59 -07:00

8f8900cee9
[doc] add mkdocs doc (#18930)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-30 07:58:44 +00:00

4f4a6b844a
[Deprecation] Remove mean pooling default for Qwen2EmbeddingModel (#18913)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-30 06:53:37 +00:00

5acf828d99
[docs] fix: fix markdown syntax (#18927)
2025-05-30 05:20:48 +00:00