Remove unnecessary explicit title anchors and use relative links instead (#20620)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -1,12 +1,11 @@
 ---
 title: Summary
 ---
-[](){ #new-model }
 
 !!! important
     Many decoder language models can now be automatically loaded using the [Transformers backend][transformers-backend] without having to implement them in vLLM. See if `vllm serve <model>` works first!
 
-vLLM models are specialized [PyTorch](https://pytorch.org/) models that take advantage of various [features][compatibility-matrix] to optimize their performance.
+vLLM models are specialized [PyTorch](https://pytorch.org/) models that take advantage of various [features](../../features/compatibility_matrix.md) to optimize their performance.
 
 The complexity of integrating a model into vLLM depends heavily on the model's architecture.
 The process is considerably straightforward if the model shares a similar architecture with an existing model in vLLM.
@@ -1,7 +1,6 @@
 ---
 title: Basic Model
 ---
-[](){ #new-model-basic }
 
 This guide walks you through the steps to implement a basic vLLM model.
 
@@ -108,7 +107,7 @@ This method should load the weights from the HuggingFace's checkpoint file and a
 
 ## 5. Register your model
 
-See [this page][new-model-registration] for instructions on how to register your new model to be used by vLLM.
+See [this page](registration.md) for instructions on how to register your new model to be used by vLLM.
 
 ## Frequently Asked Questions
 
@@ -1,13 +1,12 @@
 ---
 title: Multi-Modal Support
 ---
-[](){ #supports-multimodal }
 
-This document walks you through the steps to extend a basic model so that it accepts [multi-modal inputs][multimodal-inputs].
+This document walks you through the steps to extend a basic model so that it accepts [multi-modal inputs](../../features/multimodal_inputs.md).
 
 ## 1. Update the base vLLM model
 
-It is assumed that you have already implemented the model in vLLM according to [these steps][new-model-basic].
+It is assumed that you have already implemented the model in vLLM according to [these steps](basic.md).
 Further update the model as follows:
 
 - Implement [get_placeholder_str][vllm.model_executor.models.interfaces.SupportsMultiModal.get_placeholder_str] to define the placeholder string which is used to represent the multi-modal item in the text prompt. This should be consistent with the chat template of the model.
@@ -483,7 +482,7 @@ Afterwards, create a subclass of [BaseMultiModalProcessor][vllm.multimodal.proce
 to fill in the missing details about HF processing.
 
 !!! info
-    [Multi-Modal Data Processing][mm-processing]
+    [Multi-Modal Data Processing](../../design/mm_processing.md)
 
 ### Multi-modal fields
 
@@ -846,7 +845,7 @@ Examples:
 
 ### Handling prompt updates unrelated to multi-modal data
 
-[_get_prompt_updates][vllm.multimodal.processing.BaseMultiModalProcessor._get_prompt_updates] assumes that each application of prompt update corresponds to one multi-modal item. If the HF processor performs additional processing regardless of how many multi-modal items there are, you should override [_apply_hf_processor_tokens_only][vllm.multimodal.processing.BaseMultiModalProcessor._apply_hf_processor_tokens_only] so that the processed token inputs are consistent with the result of applying the HF processor on text inputs. This is because token inputs bypass the HF processor according to [our design][mm-processing].
+[_get_prompt_updates][vllm.multimodal.processing.BaseMultiModalProcessor._get_prompt_updates] assumes that each application of prompt update corresponds to one multi-modal item. If the HF processor performs additional processing regardless of how many multi-modal items there are, you should override [_apply_hf_processor_tokens_only][vllm.multimodal.processing.BaseMultiModalProcessor._apply_hf_processor_tokens_only] so that the processed token inputs are consistent with the result of applying the HF processor on text inputs. This is because token inputs bypass the HF processor according to [our design](../../design/mm_processing.md).
 
 Examples:
 
@@ -1,10 +1,9 @@
 ---
 title: Registering a Model
 ---
-[](){ #new-model-registration }
 
 vLLM relies on a model registry to determine how to run each model.
-A list of pre-registered architectures can be found [here][supported-models].
+A list of pre-registered architectures can be found [here](../../models/supported_models.md).
 
 If your model is not on this list, you must register it to vLLM.
 This page provides detailed instructions on how to do so.
@@ -14,16 +13,16 @@ This page provides detailed instructions on how to do so.
 To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source][build-from-source].
 This gives you the ability to modify the codebase and test your model.
 
-After you have implemented your model (see [tutorial][new-model-basic]), put it into the <gh-dir:vllm/model_executor/models> directory.
+After you have implemented your model (see [tutorial](basic.md)), put it into the <gh-dir:vllm/model_executor/models> directory.
 Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
-Finally, update our [list of supported models][supported-models] to promote your model!
+Finally, update our [list of supported models](../../models/supported_models.md) to promote your model!
 
 !!! important
     The list of models in each section should be maintained in alphabetical order.
 
 ## Out-of-tree models
 
-You can load an external model [using a plugin][plugin-system] without modifying the vLLM codebase.
+You can load an external model [using a plugin](../../design/plugin_system.md) without modifying the vLLM codebase.
 
 To register the model, use the following code:
 
@@ -51,4 +50,4 @@ def register():
 
 !!! important
     If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
-    Read more about that [here][supports-multimodal].
+    Read more about that [here](multimodal.md).
@@ -1,7 +1,6 @@
 ---
 title: Unit Testing
 ---
-[](){ #new-model-tests }
 
 This page explains how to write unit tests to verify the implementation of your model.
 
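
For readers who want to apply the same kind of cleanup elsewhere, the two mechanical changes in this commit (dropping `[](){ #anchor }` title anchors and turning `[text][anchor]` references into relative links) can be sketched in a few lines of Python. This is only an illustrative sketch, not the tooling used for this commit: the `ANCHOR_TO_PAGE` mapping, the `docs/...` file paths, and both function names are assumptions made up for the example.

```python
import posixpath
import re

# Hypothetical mapping from anchor id to the page that used to define it.
# These paths are assumptions for illustration, not read from the repository.
ANCHOR_TO_PAGE = {
    "new-model-basic": "docs/contributing/model/basic.md",
    "new-model-registration": "docs/contributing/model/registration.md",
    "compatibility-matrix": "docs/features/compatibility_matrix.md",
}

# A standalone title anchor line such as: [](){ #new-model }
TITLE_ANCHOR = re.compile(r"^\[\]\(\)\{ #[\w-]+ \}\n", re.MULTILINE)

# A reference-style link such as: [this page][new-model-registration]
REF_LINK = re.compile(r"\[([^\]]+)\]\[([^\]]+)\]")


def strip_title_anchors(markdown: str) -> str:
    """Remove explicit title anchor lines, keeping everything else."""
    return TITLE_ANCHOR.sub("", markdown)


def to_relative_links(markdown: str, current_page: str) -> str:
    """Rewrite [text][anchor] as [text](relative/path.md) for known anchors."""
    base_dir = posixpath.dirname(current_page)

    def replace(match: re.Match) -> str:
        text, anchor = match.group(1), match.group(2)
        target = ANCHOR_TO_PAGE.get(anchor)
        if target is None:
            # Unknown anchors (e.g. API references) are left untouched.
            return match.group(0)
        return f"[{text}]({posixpath.relpath(target, base_dir)})"

    return REF_LINK.sub(replace, markdown)


if __name__ == "__main__":
    page = "docs/contributing/model/basic.md"
    line = "See [this page][new-model-registration] for instructions."
    print(to_relative_links(line, page))
```

Anchors that are not in the mapping, such as the `[get_placeholder_str][vllm.model_executor...]` API references that this commit leaves unchanged, pass through unmodified.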