Remove unnecessary explicit title anchors and use relative links instead (#20620)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -1,7 +1,6 @@
---
title: Frequently Asked Questions
---
-[](){ #faq }

> Q: How can I serve multiple models on a single port using the OpenAI API?

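A rough sketch of one workaround for the question above: run one OpenAI-compatible server per model on separate ports and put a router or reverse proxy in front of them (the model names and ports below are examples only).

```bash
# One vLLM OpenAI-compatible server per model, each on its own port.
# A reverse proxy or router in front can expose both behind a single address.
vllm serve unsloth/Llama-3.2-1B-Instruct --port 8000 &
vllm serve intfloat/e5-mistral-7b-instruct --port 8001 &
```
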
@@ -12,7 +11,7 @@ A: Assuming that you're referring to using OpenAI compatible server to serve mul
> Q: Which model to use for offline inference embedding?

A: You can try [e5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct) and [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5);
-more are listed [here][supported-models].
+more are listed [here](../models/supported_models.md).

By extracting hidden states, vLLM can automatically convert text generation models like [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B),
[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) into embedding models,

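A minimal offline-embedding sketch using the first model mentioned above; the `task` argument and output fields may differ slightly between vLLM versions.

```python
# Minimal sketch: offline embedding with vLLM's Python API.
from vllm import LLM

llm = LLM(model="intfloat/e5-mistral-7b-instruct", task="embed")
outputs = llm.embed(["The capital of France is Paris."])

for output in outputs:
    vector = output.outputs.embedding  # one list[float] per input prompt
    print(len(vector))
```
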
@@ -4,7 +4,7 @@ vLLM exposes a number of metrics that can be used to monitor the health of the
system. These metrics are exposed via the `/metrics` endpoint on the vLLM
OpenAI compatible API server.

-You can start the server using Python, or using [Docker][deployment-docker]:
+You can start the server using Python, or using [Docker](../deployment/docker.md):

```bash
vllm serve unsloth/Llama-3.2-1B-Instruct
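# A quick check once the server above is running (assumption: the default
# port 8000): the metrics are exposed as plain Prometheus text, so a simple
# HTTP request is enough to inspect them.
curl http://localhost:8000/metrics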
@@ -1,7 +1,6 @@
---
title: Troubleshooting
---
-[](){ #troubleshooting }

This document outlines some troubleshooting strategies you can consider. If you think you've discovered a bug, please [search existing issues](https://github.com/vllm-project/vllm/issues?q=is%3Aissue) first to see if it has already been reported. If not, please [file a new issue](https://github.com/vllm-project/vllm/issues/new/choose), providing as much relevant information as possible.

@@ -267,7 +266,7 @@ or:
ValueError: Model architectures ['<arch>'] are not supported for now. Supported architectures: [...]
```

-But you are sure that the model is in the [list of supported models][supported-models], there may be some issue with vLLM's model resolution. In that case, please follow [these steps](../configuration/model_resolution.md) to explicitly specify the vLLM implementation for the model.
+But you are sure that the model is in the [list of supported models](../models/supported_models.md), there may be some issue with vLLM's model resolution. In that case, please follow [these steps](../configuration/model_resolution.md) to explicitly specify the vLLM implementation for the model.

## Failed to infer device type

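A minimal sketch of what explicitly specifying the implementation can look like in Python; the model ID and architecture name below are placeholders, see the linked model resolution guide for the values that apply to your model.

```python
# Sketch: point vLLM at a specific model implementation by overriding the
# `architectures` field of the Hugging Face config.
# Both identifiers below are placeholders, not real model or class names.
from vllm import LLM

llm = LLM(
    model="org/custom-model",
    hf_overrides={"architectures": ["CustomModelForCausalLM"]},
)
```
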
@@ -90,7 +90,7 @@ vLLM V1 currently excludes model architectures with the `SupportsV0Only` protoco

!!! tip

-    This corresponds to the V1 column in our [list of supported models][supported-models].
+    This corresponds to the V1 column in our [list of supported models](../models/supported_models.md).

See below for the status of models that are not yet supported or have more features planned in V1.

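For a model that is still V0-only, a common stopgap (assuming your vLLM version still honours the opt-out switch) is to force the V0 engine explicitly; the model name here is only an example.

```bash
# Sketch: fall back to the V0 engine for a model not yet supported on V1.
VLLM_USE_V1=0 vllm serve unsloth/Llama-3.2-1B-Instruct
```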