[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1 (#9944)
Signed-off-by: FurtherAI <austin.veselka@lighton.ai>
Co-authored-by: FurtherAI <austin.veselka@lighton.ai>
@@ -584,6 +584,12 @@ Multimodal Embedding
     - :code:`TIGER-Lab/VLM2Vec-Full`
     - 🚧
     - ✅︎
+  * - :code:`Qwen2VLForConditionalGeneration`
+    - Qwen2-VL-based
+    - T + I
+    - :code:`MrLight/dse-qwen2-2b-mrl-v1`
+    -
+    - ✅︎
 
 .. important::
 
     Some model architectures support both generation and embedding tasks.
@@ -310,4 +310,21 @@ Since the request schema is not defined by OpenAI client, we post a request to t
     response_json = response.json()
     print("Embedding output:", response_json["data"][0]["embedding"])
 
+Here is an example for serving the ``MrLight/dse-qwen2-2b-mrl-v1`` model.
+
+.. code-block:: bash
+
+    vllm serve MrLight/dse-qwen2-2b-mrl-v1 --task embedding \
+        --trust-remote-code --max-model-len 8192 --chat-template examples/template_dse_qwen2_vl.jinja
+
+.. important::
+
+    Like with VLM2Vec, we have to explicitly pass ``--task embedding``. Additionally, ``MrLight/dse-qwen2-2b-mrl-v1`` requires an EOS token for embeddings,
+    which is handled by the Jinja template.
+
+.. important::
+
+    ``MrLight/dse-qwen2-2b-mrl-v1`` requires a placeholder image of the minimum image size for text query embeddings. See the full code
+    example below for details.
+
 A full code example can be found in `examples/openai_chat_embedding_client_for_multimodal.py <https://github.com/vllm-project/vllm/blob/main/examples/openai_chat_embedding_client_for_multimodal.py>`_.
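To illustrate the placeholder-image requirement described above, here is a minimal sketch of how a client could build a chat-style embedding request for a text-only query. The endpoint path, payload shape, and the use of a 1x1 PNG as the placeholder are assumptions for illustration; the linked full example is the authoritative version.

```python
# Sketch: building a text-query embedding request for a server started with
# ``vllm serve MrLight/dse-qwen2-2b-mrl-v1 --task embedding ...``.
# The payload shape and /v1/embeddings endpoint are assumptions here.

# 1x1 transparent PNG (base64) used as the placeholder image that the model
# expects even for text-only queries.
PLACEHOLDER_PNG_B64 = (
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJ"
    "AAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg=="
)


def build_text_query_payload(
    text: str,
    model: str = "MrLight/dse-qwen2-2b-mrl-v1",
) -> dict:
    """Build the JSON body for a text-query embedding request,
    attaching the placeholder image before the text content."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{PLACEHOLDER_PNG_B64}"
                        },
                    },
                    {"type": "text", "text": f"Query: {text}"},
                ],
            }
        ],
        "encoding_format": "float",
    }


payload = build_text_query_payload("What is the capital of France?")
# A client would then POST this, e.g. (hypothetical local server):
#   requests.post("http://localhost:8000/v1/embeddings", json=payload)
print(payload["messages"][0]["content"][1]["text"])
# → Query: What is the capital of France?
```

Keeping the payload construction in a small helper makes it easy to reuse the same code path for image queries by swapping the placeholder for a real image URL.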