mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-21 00:36:43 +08:00
## Summary - When a model is registered as `chat` in `tenant_llm` but has the `IMAGE2TEXT` tag in `llm_factories.json`, requesting it as `image2text` (e.g. PDF parser) fails with `Tenant Model with name <model> and type image2text not found`. - After resolution via the new fallback, the returned `config_dict["model_type"]` was still `"chat"`, causing `tenant_llm_service.model_instance()` to instantiate `ChatModel` instead of `CvModel` — breaking `describe_with_prompt` at ingestion time. ## What problem does this PR solve? RAGFlow already has a `CHAT→IMAGE2TEXT` fallback: when a chat model is not found, it retries with `image2text`. The symmetric fallback (`IMAGE2TEXT→CHAT`) was missing. This matters for multimodal models declared as `model_type: "chat"` with an `IMAGE2TEXT` tag in `llm_factories.json` (e.g. models added after tenant creation, or providers where a single model serves both purposes). The frontend PDF parser selector correctly surfaces these models via the `IMAGE2TEXT` tag, but the backend fails to resolve them at runtime. ## Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ## Changes **`api/db/joint_services/tenant_model_service.py`** 1. Add `IMAGE2TEXT→CHAT` fallback in `get_model_config_by_type_and_name`: when an `image2text` model is not found in `tenant_llm`, retry with `chat` — but only if the `llm` table confirms `IMAGE2TEXT` capability via the `tags` field. This mirrors the philosophy of the existing `CHAT→IMAGE2TEXT` fallback: substitution is only allowed when the model has declared the required capability. 2. Normalize `config_dict["model_type"]` to `image2text` after the fallback, so the caller (`model_instance`) correctly routes to `CvModel` instead of `ChatModel`. 3. Extend the type validation guard to allow `(requested=image2text, found=chat)` alongside the existing `(requested=chat, found=image2text)` exception. ## Test plan - [ ] Add a model with `model_type=chat` and `tags` containing `IMAGE2TEXT` to a tenant - [ ] Select it as PDF parser in a knowledge base - [ ] Verify ingestion succeeds without `image2text not found` or `describe_with_prompt` errors - [ ] Verify the same model still works correctly in chat context 🤖 Generated with [Claude Code](https://claude.ai/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>