13 Commits

Author SHA1 Message Date
c2cd1a2142 [doc] update pp support (#9853)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-10-30 13:36:51 -07:00
fc912e0886 [Models] Support Qwen model with PP (#6974)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
2024-08-01 12:40:43 -07:00
f3ff63c3f4 [doc][distributed] improve multinode serving doc (#6804) 2024-07-25 15:38:32 -07:00
71950af726 [doc][distributed] fix doc argument order (#6691) 2024-07-23 08:55:33 -07:00
c051bfe4eb [doc][distributed] doc for setting up multi-node environment (#6529)
[doc][distributed] add more doc for setting up multi-node environment (#6529)
2024-07-22 21:22:09 -07:00
45ceb85a0c [Docs] Update PP docs (#6598) 2024-07-19 16:38:21 -07:00
5bf35a91e4 [Doc][CI/Build] Update docs and tests to use vllm serve (#6431) 2024-07-17 07:43:21 +00:00
94b82e8c18 [doc][distributed] add suggestion for distributed inference (#6418) 2024-07-15 09:45:51 -07:00
673dd4cae9 [Docs] Docs update for Pipeline Parallel (#6222)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-07-09 16:24:58 -07:00
4050d646e5 [doc][misc] remove deprecated api server in doc (#6037) 2024-07-01 12:52:43 -04:00
c246212952 [doc][faq] add warning to download models for every nodes (#5783) 2024-06-24 15:37:42 +08:00
99dac099ab [Core][Doc] Default to multiprocessing for single-node distributed case (#5230)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2024-06-11 11:10:41 -07:00
2cf1a333b6 [Doc] Documentation for distributed inference (#261) 2023-06-26 11:34:23 -07:00