Commit Graph

34 Commits

Author SHA1 Message Date
f3d0bf7589 [Doc][Installation] delete python setup.py develop (#3989) 2024-04-11 03:33:02 +00:00
0e3f06fe9c [Hardware][Intel] Add CPU inference backend (#3634)
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
2024-04-01 22:07:30 -07:00
9c82a1bec3 [Doc] Update installation doc (#3746)
[Doc] Update installation doc for build from source and explain the dependency on torch/cuda version (#3746)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2024-03-30 16:34:38 -07:00
42bc386129 [CI/Build] respect the common environment variable MAX_JOBS (#3600) 2024-03-24 17:04:00 -07:00
63e8b28a99 [Doc] minor fix of spelling in amd-installation.rst (#3506) 2024-03-19 20:32:30 +00:00
2a60c9bd17 [Doc] minor fix to neuron-installation.rst (#3505) 2024-03-19 13:21:35 -07:00
d0fae88114 [DOC] add setup document to support neuron backend (#2777) 2024-03-04 01:03:51 +00:00
0580aab02f [ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (#2768) 2024-02-10 23:14:37 -08:00
931746bc6d Add documentation on how to do incremental builds (#2796) 2024-02-07 14:42:02 -08:00
6b7de1a030 [ROCm] add support to ROCm 6.0 and MI300 (#2274) 2024-01-26 12:41:10 -08:00
9c1352eb57 [Feature] Simple API token authentication and pluggable middlewares (#1106) 2024-01-23 15:13:00 -08:00
827cbcd37c Update quickstart.rst (#2369) 2024-01-12 12:56:18 -08:00
f745847ef7 [Minor] Fix the format in quick start guide related to Model Scope (#2425) 2024-01-11 19:44:01 -08:00
1db83e31a2 [Docs] Update installation instructions to include CUDA 11.8 xFormers (#2246) 2023-12-22 23:20:02 -08:00
1b7c791d60 [ROCm] Fixes for GPTQ on ROCm (#2180) 2023-12-18 10:41:04 -08:00
6565d9e33e Update installation instruction for vLLM + CUDA 11.8 (#2086) 2023-12-13 09:25:59 -08:00
f375ec8440 [ROCm] Upgrade xformers version for ROCm & update doc (#2079)
Co-authored-by: miloice <jeffaw99@hotmail.com>
2023-12-13 00:56:05 -08:00
6ccc0bfffb Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Amir Balwel <amoooori04@gmail.com>
Co-authored-by: root <kuanfu.liu@akirakan.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com>
Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>
2023-12-07 23:16:52 -08:00
42c02f5892 Fix quickstart.rst typo jinja (#1964) 2023-12-07 08:34:44 -08:00
c07a442854 chore(examples-docs): upgrade to OpenAI V1 (#1785) 2023-12-03 01:11:22 -08:00
66785cc05c Support chat template and echo for chat API (#1756) 2023-11-30 16:43:13 -08:00
06e9ebebd5 Add instructions to install vLLM+cu118 (#1717) 2023-11-18 23:48:58 -08:00
edb305584b Support download models from www.modelscope.cn (#1588) 2023-11-17 20:38:31 -08:00
4ee52bb169 Docs: Fix broken link to openai example (#1145)
Link to `openai_client.py` is no longer valid - updated to `openai_completion_client.py`
2023-09-22 11:36:09 -07:00
7d7e3b78a3 Use --ipc=host in docker run for distributed inference (#1125) 2023-09-21 18:26:47 -07:00
b9cecc2635 [Docs] Update installation page (#1005) 2023-09-10 14:23:31 -07:00
b7e62d3454 Fix repo & documentation URLs (#163) 2023-06-19 20:03:40 -07:00
dcda03b4cb Write README and front page of doc (#147) 2023-06-18 03:19:38 -07:00
bec7b2dc26 Add quickstart guide (#148) 2023-06-18 01:26:12 +08:00
0b98ba15c7 Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
e38074b1e6 Support FP32 (#141) 2023-06-07 00:40:21 -07:00
376725ce74 [PyPI] Packaging for PyPI distribution (#140) 2023-06-05 20:03:14 -07:00
56b7f0efa4 Add a doc for installation (#128) 2023-05-27 01:13:06 -07:00
19d2899439 Add initial sphinx docs (#120) 2023-05-22 17:02:44 -07:00