[FEATURE] Enables /score endpoint for embedding models (#12846)

Gabriel Marinho
2025-02-21 03:09:47 -03:00
committed by GitHub
parent 1cdc88614a
commit 1c3c975766
11 changed files with 590 additions and 513 deletions
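
For context on what scoring with an embedding model involves: the documentation change in this commit states that embedding models use cosine similarity. The sketch below illustrates that computation in plain Python. It does not call vLLM; the function names `cosine_similarity` and `score_pairs` and the toy vectors are hypothetical, chosen only for illustration, and a real deployment would obtain the vectors from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def score_pairs(query_vec, doc_vecs):
    """Score one query embedding against each candidate document embedding."""
    return [cosine_similarity(query_vec, d) for d in doc_vecs]


# Toy embeddings (made up for illustration, not produced by any model).
query = [1.0, 0.0]
docs = [[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]]
scores = score_pairs(query, docs)
```

Because cosine similarity normalizes by vector length, the first toy document scores the same as the query itself despite having twice the magnitude. A cross-encoder, by contrast, takes both texts as a single input and produces the score directly, without comparing separate embeddings.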


@@ -108,8 +108,7 @@ A code example can be found here: <gh-file:examples/offline_inference/basic/clas
### `LLM.score`
The {class}`~vllm.LLM.score` method outputs similarity scores between sentence pairs.
-It is primarily designed for [cross-encoder models](https://www.sbert.net/examples/applications/cross-encoder/README.html).
-These types of models serve as rerankers between candidate query-document pairs in RAG systems.
+It is designed for embedding models and cross encoder models. Embedding models use cosine similarity, and [cross-encoder models](https://www.sbert.net/examples/applications/cross-encoder/README.html) serve as rerankers between candidate query-document pairs in RAG systems.
:::{note}
vLLM can only perform the model inference component (e.g. embedding, reranking) of RAG.