[Feature] vLLM CLI (#5090)

Co-authored-by: simon-mo <simon.mo@hey.com>
2024-07-14 15:36:43 -07:00
parent 73030b7dae
commit dbfe254eda
7 changed files with 223 additions and 36 deletions
--- a/benchmarks/benchmark_serving.py
+++ b/benchmarks/benchmark_serving.py
@@ -2,8 +2,8 @@

 On the server side, run one of the following commands:
    vLLM OpenAI API server
-    python -m vllm.entrypoints.openai.api_server \
-        --model <your_model> --swap-space 16 \
+    vllm serve <your_model> \
+        --swap-space 16 \
        --disable-log-requests

    (TGI backend)