51 Commits

Author SHA1 Message Date
4464723f22 [Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-30 12:13:05 +00:00
b038d9c40c [Data-parallel] Allow DP>1 for world_size > num_gpus on node (8) (#26367)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Rui Qiao <ruisearch42@gmail.com>
2025-10-17 08:24:42 -07:00
483ea64611 [Docs] Replace all explicit anchors with real links (#27087)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:22:06 -07:00
4ffd6e8942 [Docs] Reduce custom syntax used in docs (#27009)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-16 20:05:34 -07:00
650b51f9f9 [doc] add Context Parallel Deployment doc (#26877)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-10-15 16:33:52 +08:00
6256697997 [Doc] ruff format remaining Python examples (#26795)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-15 01:25:49 -07:00
3e051bda82 [UX] Replace VLLM_ALL2ALL_BACKEND with --all2all-backend (#26732)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-10-13 18:12:52 -07:00
9fc983c707 [NIXL][non-cuda] Add install script for nixl with non-cuda ucx (#25959)
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
2025-10-08 14:19:53 +00:00
2f652e6cdf [Doc] Improve MM Pooling model documentation (#25966)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-30 18:58:29 +00:00
ecb37e276a [docs] transcriptions API audio upload (#25446)
Signed-off-by: zxw <1020938856@qq.com>
2025-09-27 15:00:35 +00:00
b1068903fd [docs] fix nixl kv_connector_extra_config.backends key (#25565)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <peter.pan@daocloud.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-24 11:00:27 +00:00
5774b0a1da [NIXL][OOT platform] support nixl_connector with oot platform and other nixl_backend (#25121)
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
2025-09-23 04:17:42 +00:00
eaffe4486c [Docs] Fix pooling-params doc references in openai_compatible_server.md (#24939) 2025-09-18 04:36:47 -07:00
29283e8976 [Chore] Cleanup guided namespace, move to structured outputs config (#22772)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-18 09:20:27 +00:00
64ad551878 Removes source compilation of nixl dependency (#24874)
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Daniele <36171005+dtrifiro@users.noreply.github.com>
2025-09-17 01:33:18 +00:00
bf214ca226 [Misc] Fix examples openai_pooling_client.py (#24853)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-15 11:57:30 +00:00
51d41265ad [Docs] Fix typos in EP deployment doc (#24669)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-11 09:07:23 -07:00
8b83b93739 [Docs] Document the extra memory footprint overhead when using EPLB (#24537)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-09-10 06:09:49 -07:00
c44797a4d6 [Docs]add eplb_config param use docs (#24213)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-09-08 09:36:57 -07:00
55be93baf5 [Doc]: fix 2 hyperlinks leading to Ray site after they changed Ray's doc structure (#24438)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-08 09:36:54 -07:00
e290594072 [Docs] Rename “Distributed inference and serving” to “Parallelism & Scaling” (#22466)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-08-08 19:26:21 +00:00
a2c6696bfe [Docs] Factor out troubleshooting to its own guide; add section for Ray Observability (#21578)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-08-07 00:29:13 -07:00
4cd7fe6cea [Docs] Expand introduction to Ray in Multi-node deployment section (#21584)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-07-29 22:07:28 -07:00
ba5c5e5404 [Docs] Switch to better markdown linting pre-commit hook (#21851)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 19:45:08 -07:00
86ae693f20 [Deprecation][2/N] Replace --task with --runner and --convert (#21470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-27 19:42:40 -07:00
a6c7fb8cff [Docs] Add Expert Parallelism Initial Documentation (#21373)
Signed-off-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 12:36:06 -07:00
a7272c23d0 [Docs][minor] Fix broken gh-file link in distributed serving docs (#21543)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-07-24 10:36:56 -07:00
684174115d [Docs] Rewrite Distributed Inference and Serving guide (#20593)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 08:13:05 -07:00
63d92abb7c [Frontend] Set MAX_AUDIO_CLIP_FILESIZE_MB via env var instead of hardcoding (#21374)
Signed-off-by: Deven Labovitch <deven@videa.ai>
2025-07-23 20:22:19 -07:00
c4e3b12524 [Docs] Add minimal demo of Ray Data API usage (#21080)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-07-17 20:09:19 -07:00
d91278181d [doc] Add more details for Ray-based DP (#20948)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2025-07-15 05:37:12 -07:00
9907fc4494 [Docs] Data Parallel deployment documentation (#20768)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-07-11 09:42:10 -07:00
4bed167768 [Model][VLM] Support JinaVL Reranker (#20260)
Signed-off-by: shineran96 <shinewang96@gmail.com>
2025-07-10 10:43:43 -07:00
b942c094e3 Stop using title frontmatter and fix doc that can only be reached by search (#20623)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-08 03:27:40 -07:00
b4bab81660 Remove unnecessary explicit title anchors and use relative links instead (#20620)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-08 02:49:13 -07:00
90a2769f20 [Docs] Add Ray Serve LLM section to openai compatible server guide (#20595)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-07-07 20:08:05 -07:00
0d914c81a2 [Docs] Rewrite offline inference guide (#20594)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-07-07 20:06:02 -07:00
6e428cdd7a [Doc] Syntax highlight request responses as JSON instead of bash (#20582)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-07 20:02:45 -07:00
af107d5a0e Make distinct code and console admonitions so readers are less likely to miss them (#20585)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-07 19:55:28 -07:00
6f1229f91d [Model][2/N] Automatic conversion of CrossEncoding model (#19978)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-03 13:59:23 +00:00
be0cfb2b68 fix[Docs]: link anchor is incorrect #20309 (#20315)
Signed-off-by: zxw <1020938856@qq.com>
2025-07-02 06:32:34 +00:00
bd5038af07 [Doc] add config and troubleshooting guide for NCCL & GPUDirect RDMA (#15897)
Signed-off-by: Ernest Wong <chwong719@gmail.com>
2025-06-30 21:44:39 -07:00
6e244ae091 [Perf][Frontend] eliminate api_key and x_request_id headers middleware overhead (#19946)
Signed-off-by: Yazan-Sharaya <yazan.sharaya.yes@gmail.com>
2025-06-27 00:44:14 -04:00
e795d723ed [Frontend] Add /v1/audio/translations OpenAI API endpoint (#19615)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-06-25 17:54:14 +00:00
c3649e4fee [Docs] Fix syntax highlighting of shell commands (#19870)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-06-23 17:59:09 +00:00
f17aec0d63 [doc] Fold long code blocks to improve readability (#19926)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-23 05:24:23 +00:00
29a38f0352 [Doc] Support "important" and "announcement" admonitions (#19479)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-11 01:39:58 -07:00
279f854519 [doc] improve readability (#18675)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-25 01:40:31 -07:00
1cb194a018 [Doc] Reorganize user guide (#18661)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-24 07:25:33 -07:00
371f7e4ca2 [Doc] Fix broken links and unlinked docs, add shortcuts to home sidebar (#18627)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-23 10:22:40 -07:00