Stop using title frontmatter and fix doc that can only be reached by search (#20623)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Harry Mellor authored on 2025-07-08 11:27:40 +01:00, committed by GitHub
parent b4bab81660
commit b942c094e3
81 changed files with 82 additions and 238 deletions

@@ -1,6 +1,4 @@
----
-title: Loading models with Run:ai Model Streamer
----
+# Loading models with Run:ai Model Streamer
 Run:ai Model Streamer is a library for reading tensors concurrently and streaming them to GPU memory.
 Further reading can be found in [Run:ai Model Streamer Documentation](https://github.com/run-ai/runai-model-streamer/blob/master/docs/README.md).
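
For context on the Run:ai Model Streamer page touched above, a minimal offline-loading sketch; the model name is only an example, and it assumes the `runai_streamer` load format and the `concurrency` extra-config key are available in the installed vLLM:

```python
from vllm import LLM

# Stream weights with the Run:ai Model Streamer instead of the default loader.
# The model name is a placeholder; `concurrency` (assumed extra-config key)
# controls how many concurrent readers stream tensors to GPU memory.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",
    load_format="runai_streamer",
    model_loader_extra_config={"concurrency": 16},
)
```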

@@ -1,6 +1,4 @@
----
-title: Loading models with CoreWeave's Tensorizer
----
+# Loading models with CoreWeave's Tensorizer
 vLLM supports loading models with [CoreWeave's Tensorizer](https://docs.coreweave.com/coreweave-machine-learning-and-ai/inference/tensorizer).
 vLLM model tensors that have been serialized to disk, an HTTP/HTTPS endpoint, or an S3 endpoint can be deserialized
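
Similarly, for the Tensorizer page, a hedged sketch of deserializing pre-serialized tensors at load time; the model name and S3 URI are placeholders, and it assumes the `tensorizer` load format plus the `tensorizer_uri` extra-config key:

```python
from vllm import LLM

# Deserialize model tensors that were previously serialized with Tensorizer.
# Both the model name and the S3 URI below are placeholders.
llm = LLM(
    model="facebook/opt-125m",
    load_format="tensorizer",
    model_loader_extra_config={
        "tensorizer_uri": "s3://my-bucket/opt-125m/model.tensors",
    },
)
```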

@@ -1,6 +1,4 @@
----
-title: Generative Models
----
+# Generative Models
 vLLM provides first-class support for generative models, which covers most LLMs.
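
As a quick illustration of the first-class generation support mentioned in this hunk, a minimal offline-inference sketch (the model name is only an example):

```python
from vllm import LLM, SamplingParams

# Load any generative model and sample a short completion.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```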

@@ -1,6 +1,4 @@
----
-title: TPU
----
+# TPU
 # TPU Supported Models
 ## Text-only Language Models

@@ -1,6 +1,4 @@
----
-title: Pooling Models
----
+# Pooling Models
 vLLM also supports pooling models, including embedding, reranking and reward models.
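
To make the pooling-model statement above concrete, a hedged sketch of running an embedding model; it assumes the `embed` task and `LLM.embed()` exist in the installed version, and the model name is only an example:

```python
from vllm import LLM

# Run an embedding (pooling) model; the model choice and task name are assumptions.
llm = LLM(model="intfloat/e5-mistral-7b-instruct", task="embed")

outputs = llm.embed(["Hello, my name is"])
print(len(outputs[0].outputs.embedding))  # embedding dimensionality
```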

@@ -1,6 +1,4 @@
----
-title: Supported Models
----
+# Supported Models
 vLLM supports [generative](./generative_models.md) and [pooling](./pooling_models.md) models across various tasks.
 If a model supports more than one task, you can set the task via the `--task` argument.
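
Since the hunk above mentions selecting the task with `--task`, a sketch of the Python-API equivalent; the task name and model are assumed from the surrounding docs rather than verified against this exact revision:

```python
from vllm import LLM

# Explicitly pick a task for a model that supports more than one;
# the constructor argument mirrors the `--task` CLI flag.
llm = LLM(model="BAAI/bge-base-en-v1.5", task="embed")
```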