Update deploying_with_k8s.rst (#10922)

This commit is contained in:
AlexHe99
2024-12-16 08:33:58 +08:00
committed by GitHub
parent 25ebed2f8c
commit da6f409246

@@ -162,7 +162,7 @@ To test the deployment, run the following ``curl`` command:
   curl http://mistral-7b.default.svc.cluster.local/v1/completions \
     -H "Content-Type: application/json" \
     -d '{
-      "model": "facebook/opt-125m",
+      "model": "mistralai/Mistral-7B-Instruct-v0.3",
       "prompt": "San Francisco is a",
       "max_tokens": 7,
       "temperature": 0
@@ -172,4 +172,4 @@ If the service is correctly deployed, you should receive a response from the vLLM
Conclusion
----------
Deploying vLLM with Kubernetes allows for efficient scaling and management of ML models leveraging GPU resources. By following the steps outlined above, you should be able to set up and test a vLLM deployment within your Kubernetes cluster. If you encounter any issues or have suggestions, please feel free to contribute to the documentation.