Add Production Metrics in Prometheus format (#1890)

Simon Mo
2023-12-02 16:37:44 -08:00
committed by GitHub
parent 5f09cbdb63
commit 5313c2cb8b
6 changed files with 89 additions and 2 deletions


@ -67,6 +67,7 @@ Documentation
   serving/deploying_with_triton
   serving/deploying_with_docker
   serving/serving_with_langchain
   serving/metrics

.. toctree::
   :maxdepth: 1


@ -0,0 +1,13 @@
Production Metrics
==================

vLLM exposes a number of metrics that can be used to monitor the health of the
system. These metrics are served in Prometheus format at the ``/metrics``
endpoint of the vLLM OpenAI-compatible API server.

The following metrics are exposed:

.. literalinclude:: ../../../vllm/engine/metrics.py
    :language: python
    :start-after: begin-metrics-definitions
    :end-before: end-metrics-definitions
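
As a rough illustration of consuming the endpoint, the sketch below parses
Prometheus text-format output into a dictionary using only the standard
library. The helper names (``parse_prometheus_text``, ``fetch_metrics``), the
``http://localhost:8000`` address, and the ``demo_*`` metric names in the
sample are assumptions made for this example, not names defined by vLLM. The
metrics actually exposed are the ones listed in the definitions above.

```python
import urllib.request


def parse_prometheus_text(text):
    """Parse Prometheus text-format lines into {sample_name: value}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped. Labels stay
    part of the key. Optional trailing timestamps are not handled here.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The sample value is the last space-separated field on the line.
        name, _, value = line.rpartition(" ")
        samples[name.strip()] = float(value)
    return samples


def fetch_metrics(base_url="http://localhost:8000"):
    """Fetch and parse /metrics from a running server.

    The default base_url is an assumption; use your server's host and port.
    """
    with urllib.request.urlopen(base_url + "/metrics", timeout=5) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))


# Illustrative input only: these metric names are hypothetical placeholders,
# not actual vLLM server output.
SAMPLE = """\
# HELP demo_requests_running Number of requests currently running.
# TYPE demo_requests_running gauge
demo_requests_running{model_name="demo"} 2.0
demo_cache_usage_perc{model_name="demo"} 0.25
"""

parsed = parse_prometheus_text(SAMPLE)
for name, value in parsed.items():
    print(name, value)
```

With a server running, ``fetch_metrics()`` would return the same kind of
dictionary built from live data. In production, a Prometheus server would
normally scrape the endpoint directly rather than going through a script like
this.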