Remove unnecessary explicit title anchors and use relative links instead (#20620)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
This commit is contained in:
Harry Mellor
2025-07-08 10:49:13 +01:00
committed by GitHub
parent b91cb3fa5c
commit b4bab81660
86 changed files with 75 additions and 147 deletions

View File

@ -1,12 +1,11 @@
---
title: Summary
---
[](){ #new-model }
!!! important
Many decoder language models can now be automatically loaded using the [Transformers backend][transformers-backend] without having to implement them in vLLM. See if `vllm serve <model>` works first!
vLLM models are specialized [PyTorch](https://pytorch.org/) models that take advantage of various [features][compatibility-matrix] to optimize their performance.
vLLM models are specialized [PyTorch](https://pytorch.org/) models that take advantage of various [features](../../features/compatibility_matrix.md) to optimize their performance.
The complexity of integrating a model into vLLM depends heavily on the model's architecture.
The process is considerably straightforward if the model shares a similar architecture with an existing model in vLLM.

View File

@ -1,7 +1,6 @@
---
title: Basic Model
---
[](){ #new-model-basic }
This guide walks you through the steps to implement a basic vLLM model.
@ -108,7 +107,7 @@ This method should load the weights from the HuggingFace's checkpoint file and a
## 5. Register your model
See [this page][new-model-registration] for instructions on how to register your new model to be used by vLLM.
See [this page](registration.md) for instructions on how to register your new model to be used by vLLM.
## Frequently Asked Questions

View File

@ -1,13 +1,12 @@
---
title: Multi-Modal Support
---
[](){ #supports-multimodal }
This document walks you through the steps to extend a basic model so that it accepts [multi-modal inputs][multimodal-inputs].
This document walks you through the steps to extend a basic model so that it accepts [multi-modal inputs](../../features/multimodal_inputs.md).
## 1. Update the base vLLM model
It is assumed that you have already implemented the model in vLLM according to [these steps][new-model-basic].
It is assumed that you have already implemented the model in vLLM according to [these steps](basic.md).
Further update the model as follows:
- Implement [get_placeholder_str][vllm.model_executor.models.interfaces.SupportsMultiModal.get_placeholder_str] to define the placeholder string which is used to represent the multi-modal item in the text prompt. This should be consistent with the chat template of the model.
@ -483,7 +482,7 @@ Afterwards, create a subclass of [BaseMultiModalProcessor][vllm.multimodal.proce
to fill in the missing details about HF processing.
!!! info
[Multi-Modal Data Processing][mm-processing]
[Multi-Modal Data Processing](../../design/mm_processing.md)
### Multi-modal fields
@ -846,7 +845,7 @@ Examples:
### Handling prompt updates unrelated to multi-modal data
[_get_prompt_updates][vllm.multimodal.processing.BaseMultiModalProcessor._get_prompt_updates] assumes that each application of prompt update corresponds to one multi-modal item. If the HF processor performs additional processing regardless of how many multi-modal items there are, you should override [_apply_hf_processor_tokens_only][vllm.multimodal.processing.BaseMultiModalProcessor._apply_hf_processor_tokens_only] so that the processed token inputs are consistent with the result of applying the HF processor on text inputs. This is because token inputs bypass the HF processor according to [our design][mm-processing].
[_get_prompt_updates][vllm.multimodal.processing.BaseMultiModalProcessor._get_prompt_updates] assumes that each application of prompt update corresponds to one multi-modal item. If the HF processor performs additional processing regardless of how many multi-modal items there are, you should override [_apply_hf_processor_tokens_only][vllm.multimodal.processing.BaseMultiModalProcessor._apply_hf_processor_tokens_only] so that the processed token inputs are consistent with the result of applying the HF processor on text inputs. This is because token inputs bypass the HF processor according to [our design](../../design/mm_processing.md).
Examples:

View File

@ -1,10 +1,9 @@
---
title: Registering a Model
---
[](){ #new-model-registration }
vLLM relies on a model registry to determine how to run each model.
A list of pre-registered architectures can be found [here][supported-models].
A list of pre-registered architectures can be found [here](../../models/supported_models.md).
If your model is not on this list, you must register it to vLLM.
This page provides detailed instructions on how to do so.
@ -14,16 +13,16 @@ This page provides detailed instructions on how to do so.
To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source][build-from-source].
This gives you the ability to modify the codebase and test your model.
After you have implemented your model (see [tutorial][new-model-basic]), put it into the <gh-dir:vllm/model_executor/models> directory.
After you have implemented your model (see [tutorial](basic.md)), put it into the <gh-dir:vllm/model_executor/models> directory.
Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
Finally, update our [list of supported models][supported-models] to promote your model!
Finally, update our [list of supported models](../../models/supported_models.md) to promote your model!
!!! important
The list of models in each section should be maintained in alphabetical order.
## Out-of-tree models
You can load an external model [using a plugin][plugin-system] without modifying the vLLM codebase.
You can load an external model [using a plugin](../../design/plugin_system.md) without modifying the vLLM codebase.
To register the model, use the following code:
@ -51,4 +50,4 @@ def register():
!!! important
If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
Read more about that [here][supports-multimodal].
Read more about that [here](multimodal.md).

View File

@ -1,7 +1,6 @@
---
title: Unit Testing
---
[](){ #new-model-tests }
This page explains how to write unit tests to verify the implementation of your model.