Add Production Metrics in Prometheus format (#1890)
@@ -67,6 +67,7 @@ Documentation
    serving/deploying_with_triton
    serving/deploying_with_docker
    serving/serving_with_langchain
+   serving/metrics

 .. toctree::
    :maxdepth: 1
13
docs/source/serving/metrics.rst
Normal file
@@ -0,0 +1,13 @@
+Production Metrics
+==================
+
+vLLM exposes a number of metrics that can be used to monitor the health of the
+system. These metrics are exposed via the `/metrics` endpoint on the vLLM
+OpenAI compatible API server.
+
+The following metrics are exposed:
+
+.. literalinclude:: ../../../vllm/engine/metrics.py
+   :language: python
+   :start-after: begin-metrics-definitions
+   :end-before: end-metrics-definitions