Add Production Metrics in Prometheus format (#1890)

2023-12-02 16:37:44 -08:00
parent 5f09cbdb63
commit 5313c2cb8b
6 changed files with 89 additions and 2 deletions
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -67,6 +67,7 @@ Documentation
   serving/deploying_with_triton
   serving/deploying_with_docker
   serving/serving_with_langchain
+   serving/metrics

 .. toctree::
   :maxdepth: 1
--- a/docs/source/serving/metrics.rst
+++ b/docs/source/serving/metrics.rst
@@ -0,0 +1,13 @@
+Production Metrics
+==================
+
+vLLM exposes a number of metrics that can be used to monitor the health of the
+system. These metrics are exposed via the ``/metrics`` endpoint on the vLLM
+OpenAI-compatible API server.
+
+The following metrics are exposed:
+
+.. literalinclude:: ../../../vllm/engine/metrics.py
+    :language: python
+    :start-after: begin-metrics-definitions
+    :end-before: end-metrics-definitions
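
To illustrate how the ``/metrics`` endpoint described above might be consumed, here is a minimal sketch that fetches the Prometheus text output and turns it into a dictionary. It is not part of this commit. The helper names ``parse_prometheus_text`` and ``fetch_metrics`` are hypothetical, the base URL ``http://localhost:8000`` is an assumption about where the API server is listening, and the parser assumes sample lines carry no trailing timestamp.

```python
from urllib.request import urlopen


def parse_prometheus_text(text):
    """Parse Prometheus text exposition into {sample_name_with_labels: float}.

    Hypothetical helper: assumes each sample line is "<name>[{labels}] <value>"
    with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field; everything before it
        # is the metric name plus its optional label set.
        name, _, value = line.rpartition(" ")
        samples[name.strip()] = float(value)
    return samples


def fetch_metrics(base_url="http://localhost:8000"):
    """Fetch and parse <base_url>/metrics (the address is an assumption)."""
    with urlopen(f"{base_url}/metrics", timeout=5) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))
```

With a server running, ``fetch_metrics()`` would return one entry per exposed sample, keyed by the metric name together with its labels. In a real deployment a Prometheus server would normally scrape the endpoint directly; a script like this is mainly useful for quick manual checks.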