ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-05-23 17:38:04 +08:00

Author	SHA1	Message	Date
bitloi	6499bce2a6	fix: Langfuse chat observation (#15026 ) ### What problem does this PR solve? Closes #15025 Langfuse-enabled `dialog_service.async_chat()` regressed to `langfuse_tracer.start_generation(...)` after the earlier Langfuse v4 migration. Langfuse v4 uses `start_observation(as_type="generation")`, so the remaining `start_generation` call can fail when chat tracing is enabled. This restores the migrated `start_observation(as_type="generation")` call for chat observations while preserving the existing trace context, model, input payload, and update/end flow. It also adds a regression test with a fake Langfuse v4-style client that exposes `start_observation()` but not `start_generation()`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Tests - `.venv/bin/pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -q` - `.venv/bin/ruff check api/db/services/dialog_service.py test/unit_test/api/db/services/test_dialog_service_final_answer.py`	2026-05-20 15:01:19 +08:00
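The regression test described in the commit above can be sketched with a fake client that exposes `start_observation()` but not `start_generation()`. This is a minimal illustration, not RAGFlow's test: `FakeObservation`, `FakeLangfuseV4Client`, and `start_chat_observation` are hypothetical names, and the keyword arguments other than `as_type="generation"` are assumptions.

```python
class FakeObservation:
    """Records updates so a test can inspect them afterwards."""

    def __init__(self, **fields):
        self.fields = dict(fields)
        self.ended = False

    def update(self, **fields):
        self.fields.update(fields)

    def end(self):
        self.ended = True


class FakeLangfuseV4Client:
    """Hypothetical stand-in: has start_observation() but no start_generation()."""

    def __init__(self):
        self.calls = []

    def start_observation(self, **kwargs):
        self.calls.append(kwargs)
        return FakeObservation(**kwargs)


def start_chat_observation(tracer, model, payload):
    # Hypothetical helper using the v4-style call named in the PR description.
    return tracer.start_observation(
        as_type="generation", name="chat", model=model, input=payload
    )


tracer = FakeLangfuseV4Client()
obs = start_chat_observation(tracer, "demo-model", {"question": "Hi"})
obs.update(output="Hello!")
obs.end()
```

Code that still called `tracer.start_generation(...)` would raise `AttributeError` against this fake, which is the failure mode the regression test is meant to catch.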
balibabu	1ed8a118cf	Fix: The folder tree menu for moving folders cannot be scrolled. (#15037 ) ### What problem does this PR solve? Fix: The folder tree menu for moving folders cannot be scrolled. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-20 14:59:36 +08:00
bitloi	d69518ea42	fix(go): guard custom base URL driver creation (#15030 ) ### What problem does this PR solve? Closes #15029. Some custom `base_url` paths in `ModelProviderService` call `NewInstance(newURL)` and then immediately invoke methods on the returned driver. Several real Go model drivers still return `nil` from `NewInstance`, so those paths can panic instead of returning a normal error. This PR: - centralizes custom base URL driver creation in `model_service.go` - skips request-local driver creation when `base_url` is blank or whitespace - preserves the existing region key behavior when building the request-local base URL map - returns a clear error when the provider driver is missing or `NewInstance` returns `nil` - routes list/check/task and active model paths through the guarded helper - adds focused unit coverage for empty-region preservation, regional base URLs, blank base URLs, nil drivers, and nil `NewInstance` results ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Test plan - [x] `git diff --check upstream/main...HEAD` - [x] `/root/go/bin/gofmt -w internal/service/model_service.go internal/service/model_service_test.go` - [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go test ./internal/service -run TestNewModelDriverForBaseURL -count=1 -vet=off` - [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go build ./internal/service/... ./internal/entity/models/...` Note: the same targeted `go test` command without `-vet=off` is currently blocked by an existing unrelated vet finding in `internal/service/llm.go:355` (`non-constant format string in call to fmt.Errorf`).	2026-05-20 14:58:20 +08:00
Idriss Sbaaoui	aea90f4e39	Feat: add new tests and test cases for RESTful API suite (#15038) ### What problem does this PR solve? Extend the RESTful API test suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Other (please describe): test	2026-05-20 14:56:55 +08:00
Haruko386	2836a934b5	Go: implement provider: 302.AI and JieKou-AI (#15034) ### What problem does this PR solve? This PR implements the 302.AI and JieKouAI providers. The following functionalities are now supported: 302.ai - [x] chat / think chat / stream chat / stream think chat - [x] Embedding - [x] ASR - [x] ListModels - [x] Provider connection checking - [x] Balance - [x] Rerank - [x] OCR - [x] Doc Parse - [x] Show task - [ ] ~~List Tasks!~~ - [ ] TTS JieKouAI - [x] chat / think chat / stream chat / stream think chat - [x] Embedding - [x] Rerank - [x] ListModels Verified examples from the CLI: ```plaintext # jiekouAI RAGFlow(user)> stream think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi' Thinking: Let me think about how to respond to this simple greeting. The user just said "Hi", which is a basic and friendly way to start a conversation. I should respond in a similarly warm and welcoming manner.First, I need to acknowledge their greeting and reciprocate with enthusiasm. Something like "Hello!" or "Hi there!" would work well to create a positive atmosphere right from the start.Next, I should make it clear that I'm ready to help. Since they haven't asked anything specific yet, I'll keep it open-ended and inviting. Perhaps offering assistance with a question or task would encourage them to engage further.I should also maintain a professional yet approachable tone. Being an AI assistant, I want to convey that I'm knowledgeable and capable, but also friendly and easy to talk to.Let me put this all together into a concise response. I'll start with a cheerful greeting, express my readiness to help, and finish with an open invitation for them to share what's on their mind. This should create a welcoming environment for whatever they want to discuss next. Answer: ! I'm Claude, an AI assistant created by Anthropic. I'm here to help you with information, answer questions, or assist you with tasks. What can I help you with today? 
RAGFlow(user)> think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi' Thinking: Let me consider how to respond to this greeting. The user initiated with a simple "Hi," so a friendly and open response would be most appropriate to encourage further conversation. I should maintain a welcoming tone while offering assistance. The response should accomplish a few key things: return the greeting warmly, show openness to conversation, and offer specific ways I can help. This approach demonstrates both approachability and usefulness. I'll start with a greeting in return, then express my availability to help, and finish by suggesting some areas where I can provide assistance. This creates a natural flow from acknowledgment to support. It's important to keep the response concise but inviting. Since the user hasn't specified their needs yet, I'll present a few broad categories of assistance to spark their thinking about what they might want to discuss or ask about. The response should end with an encouraging note that prompts them to share what's on their mind, keeping the conversational ball in their court while making it clear I'm ready to engage with whatever they need. Answer: Hello! How can I help you today? Whether you have questions, need information, or just want to chat, I'm here to assist. 
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test@jiekouai' dimension 16 +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 3072 \| 0 \| \| 3072 \| 1 \| +-----------+-------+ RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'baai/bge-reranker-v2-m3@test@jiekouai' top 3 +-------+-----------------+ \| index \| relevance_score \| +-------+-----------------+ \| 0 \| 0.9830034 \| \| 2 \| 0.06399203 \| \| 1 \| 0.04665664 \| +-------+-----------------+ # 302.ai RAGFlow(user)> think chat with 'kimi-k2.6@test@302.ai' message 'who r u' Thinking: The user is asking "who r u" which is a casual way of asking "who are you." I need to identify myself as an AI assistant created by Moonshot AI. I should be friendly, concise, and helpful. Key points to include: - I am Kimi, an AI assistant made by Moonshot AI - I can help with various tasks like answering questions, writing, analysis, coding, etc. - Keep it casual but informative since the user used "r u" (text speak) I should not: - Pretend to be human - Claim to have personal experiences or emotions - Be overly formal or robotic Simple, friendly response is best. Answer: I'm Kimi, an AI assistant made by Moonshot AI. I can help you with answering questions, writing, coding, analysis, or just chatting. What can I do for you? Time: 17.687750 RAGFlow(user)> stream think chat with 'kimi-k2.6@test@302.ai' message 'who r u' Thinking: user asked "who r u" which is a casual way of asking "who are you." I should introduce myself as Kimi, an AI assistant developed by Moonshot AI. I need to be friendly, concise, and accurate. I should mention my capabilities briefly and keep the tone helpful. 
Since the user used casual text speak ("r u"), I can match that energy with a friendly but still informative tone.Key points:- I'm Kimi, an AI assistant made by Moonshot AI- I can help with various tasks like answering questions, writing, coding, analysis, etc.- Keep it brief but warm- Don't claim to be human- Don't over-explainDraft:"I'm Kimi, an AI assistant created by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other tasks. What can I do for you?"This is good - direct, accurate, and inviting. Answer: Kimi, an AI assistant made by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other stuff. What can I do for you? Time: 14.912576 RAGFlow(user)> asr with 'whisper-v3-turbo@test@302.ai' audio './internal/test.wav' param '' +---------------------------------------------------------------------------------------------------------------------+ \| text \| +---------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired \| +---------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> ocr with 'mistral-ocr-latest@test@302.ai' file './internal/test.pdf' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| # Repurposing Diffusion-Based Image 
Generators for Monocular Depth Estimation Bingxin Ke Nando Metzger Anton Obukhov Rodrigo Caye Daudt Shengyu Huang Konrad Schindler Photogrammetry and Remote Sensing, ETH Zürich ![img-0.jpeg](img-0.jpeg) Figur... \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> parse with 'vlm@test@302.ai' file 'https://arxiv.org/pdf/2505.09358' +--------------------------------------+ \| task_id \| +--------------------------------------+ \| 6de6eae6-c122-4b67-91e8-b061a0b8c087 \| +--------------------------------------+ RAGFlow(user)> show 'test@302.ai' task '6de6eae6-c122-4b67-91e8-b061a0b8c087' +----------------------------------------------------------------------------+-------+ \| content \| index \| +----------------------------------------------------------------------------+-------+ \| https://file.302.ai/gpt/imgs/20260519/b340fdff4774699c287fe4ee4658b317.zip \| 0 \| +----------------------------------------------------------------------------+-------+ RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'jina-embeddings-v3@test@302.ai' dimension 16 +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 1024 \| 0 \| \| 1024 \| 1 \| +-----------+-------+ RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'jina-reranker-v2-base-multilingual@test@302.ai' top 3; +-------+-----------------+ \| index \| relevance_score \| +-------+-----------------+ \| 0 \| 0.74167407 \| \| 2 \| 0.18832397 \| \| 1 \| 0.15713684 \| +-------+-----------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-20 14:10:15 +08:00
qinling0210	77834870fc	Refactor functions in engine in Go (#14981) ### What problem does this PR solve? Refactor functions in the Go engine. ### Type of change - [x] Refactoring	2026-05-19 17:34:59 +08:00
Idriss Sbaaoui	6b2fcb4116	Feat: add new tests and test cases for RESTful API suite (#14996) ### What problem does this PR solve? Extend the RESTful API test suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Other (please describe): test	2026-05-19 17:17:31 +08:00
plind	7edabdf7c3	feat(sdk): make Begin inputs discoverable on Session.ask (#14842) ### What problem does this PR solve? Closes #14751. The user reported that after adding a variable (e.g. `key1`) to an agent's Begin component, the Python SDK gave them no way to pass it: their call `session.ask(question=user_question, stream=False)` had no parameter for `key1`, and the `ask()` signature was just `(question, stream, **kwargs)` with a docstring that only described streaming behavior. The functionality already works — `_ask_agent` does `json_data.update(kwargs)` and the server reads `inputs` from the request body at `agent_api.py:902`. The canonical shape is also in the public API docs (`docs/references/python_api_reference.md:1817-1840`): ```python session.ask( "", stream=False, inputs={"line_var": {"type": "line", "value": "I am line_var"}}, return_trace=True, ) ``` But because `inputs`, `release`, and `return_trace` were hidden behind `**kwargs`, they did not appear in IDE signature help, and the docstring did not mention them. Users had no path from "I added a key in the UI" to "I need to pass `inputs=...` with this exact shape." This PR promotes the three most relevant Begin-related arguments to named parameters and rewrites the docstring with a worked example. ### What this PR changes - `sdk/python/ragflow_sdk/modules/session.py`: - `Session.ask()` signature becomes `ask(question="", stream=False, inputs=None, release=None, return_trace=None, **kwargs)`. - These three new named params are forwarded into the existing `kwargs` dict before dispatch, so the wire format and downstream behavior are unchanged. - Docstring rewritten in numpy style, including the structured `{"type": ..., "value": ...}` shape that the Begin component requires (see `agent/component/begin.py:45-60`). No backend changes. `**kwargs` is preserved for forward compatibility with other body fields (`session_id`, `files`, `user_id`, `custom_header`, …). 
### Test plan - [ ] `session.ask(question="hi", stream=False)` — existing call still works - [ ] `session.ask("", stream=False, inputs={"key1": {"type": "line", "value": "v"}})` — Begin component receives `key1 = "v"` - [ ] `session.ask("", stream=True, return_trace=True)` — streaming response includes trace events - [ ] IDE / `help(Session.ask)` now shows `inputs`, `release`, `return_trace` with descriptions ### Type of change - [x] Refactoring - [x] Documentation Update	2026-05-19 16:14:57 +08:00
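The forwarding described in the commit above can be sketched as follows. This is a standalone illustration, not the SDK's code: the free function `ask` here only builds the request body, and forwarding a named parameter only when it is not `None` is an assumption.

```python
def ask(question="", stream=False, inputs=None, release=None,
        return_trace=None, **kwargs):
    """Hypothetical stand-in for Session.ask that only builds the request body."""
    # Fold the named parameters back into kwargs so the wire format is unchanged.
    # Assumption: unset (None) parameters are left out of the body.
    for name, value in (("inputs", inputs),
                        ("release", release),
                        ("return_trace", return_trace)):
        if value is not None:
            kwargs[name] = value

    body = {"question": question, "stream": stream}
    body.update(kwargs)
    return body


plain = ask("hi")
with_inputs = ask("", inputs={"key1": {"type": "line", "value": "v"}},
                  return_trace=True, session_id="abc")
```

Because the named parameters end up in the same dict that `**kwargs` already fed, existing callers that pass `inputs=` as a keyword argument keep working unchanged.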
Rene Arredondo	f58e0b3eca	Feat: VLM image descriptions in MinerU parser (#14869) (#14946) ## Summary Closes #14869. Adds VLM-based semantic descriptions to image chunks produced by the MinerU parser, closing a long-standing parity gap with the deepdoc parser's `VisionFigureParser`. A maintainer flagged this in #13342 ("We may add the VLM enhancement to MinerU parser as well") and an earlier proposal exists in #13824; this PR lands the change end-to-end inside the existing parser plumbing. ## Why Today the MinerU parser returns image chunks containing only the native `image_caption` and `image_footnote` strings from MinerU's JSON. When neither is present (or when both are sparse), the chunk carries effectively no searchable content for the figure and retrieval misses it entirely. Users who configured a local VLM (reporter's case: Gemma-4-31B) had to post-process MinerU's `tmp/.json` themselves. The deepdoc parser already solves this via [`VisionFigureParser`](deepdoc/parser/figure_parser.py): when the tenant has an `IMAGE2TEXT` model configured, each figure gets a semantic description merged into its chunk. This PR brings the same behavior to MinerU. ## What changed ### `deepdoc/parser/mineru_parser.py` - New method `_enhance_images_with_vlm(outputs, vision_model, callback=None)` — collects every `IMAGE` block with a readable `img_path`, runs `rag.app.picture.vision_llm_chunk` in a 10-worker `ThreadPoolExecutor` using the existing `vision_llm_figure_describe_prompt`, and writes the result back as `vlm_description`. Per-image failures are logged and skipped — they never abort the run. - `_transfer_to_sections` (IMAGE branch) — folds `vlm_description` into the section text alongside caption + footnote, so the description becomes part of the chunk and is searchable / retrievable. - `parse_pdf` — after `_read_output`, calls `_enhance_images_with_vlm(outputs, vision_model, callback=callback)` when a `vision_model` kwarg is supplied. 
Wrapped in `try / except` so a VLM outage cannot break parsing. ### `rag/app/naive.py` (`by_mineru`) After successfully resolving the MinerU OCR parser, also resolves the tenant's default `LLMType.IMAGE2TEXT` model via `get_tenant_default_model_by_type`, wraps it in an `LLMBundle`, and injects it as `kwargs["vision_model"]` before delegating to `parse_pdf`. ## Behavior \| Tenant config \| Behavior \| \|---\|---\| \| `IMAGE2TEXT` model configured \| MinerU image chunks contain `caption + footnote + VLM description`. Retrieval against figures now actually works. \| \| No `IMAGE2TEXT` model configured \| Exact same output as today (caption + footnote only). Lookup fails silently with an info log; no error, no regression. \| \| VLM call fails for a single image \| That image silently falls back to caption + footnote; other images proceed. \| \| Caller already passes `vision_model` in kwargs \| We don't override it — `if "vision_model" not in kwargs` guards the lookup. \| ## Files - `deepdoc/parser/mineru_parser.py` (+56) - `rag/app/naive.py` (+13)	2026-05-19 16:08:10 +08:00
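The per-image isolation described in the commit above (a 10-worker pool where a failed image is logged and skipped) can be sketched as follows. This is a minimal illustration, not the parser's code: `enhance_images`, `fake_describe`, and the block dictionary layout are assumptions.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


def enhance_images(blocks, describe, max_workers=10):
    """Attach 'vlm_description' to image blocks; skip images whose call fails."""
    images = [b for b in blocks
              if b.get("type") == "image" and b.get("img_path")]

    def safe_describe(block):
        try:
            return describe(block["img_path"])
        except Exception:
            # Log and skip: one failing image must not abort the whole run.
            logging.exception("VLM description failed for %s", block["img_path"])
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        for block, text in zip(images, pool.map(safe_describe, images)):
            if text:
                block["vlm_description"] = text
    return blocks


def fake_describe(path):
    """Hypothetical vision model: fails for paths containing 'bad'."""
    if "bad" in path:
        raise RuntimeError("model unavailable")
    return f"description of {path}"


blocks = [
    {"type": "image", "img_path": "a.png"},
    {"type": "image", "img_path": "bad.png"},
    {"type": "text", "text": "hello"},
]
enhance_images(blocks, fake_describe)
```

Catching the exception inside the worker function, instead of around `pool.map`, is what keeps the remaining images flowing after one failure.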
Idriss Sbaaoui	95b56e73f2	Feat: add new tests and test cases for RESTful API suite (#14993) ### What problem does this PR solve? Extend the RESTful API test suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Other (please describe): test	2026-05-19 15:43:15 +08:00
Rene Arredondo	ce3402cbb9	Fix: restore saved api_key fallback in add_llm (#14921 ) (#14941 ) ## Summary Closes #14921. Reconfiguring an existing LLM provider to enable tool call or vision fails with `Your API key is invalid. Fail to access model.` even when the saved API key is correct. The most visible report is VLLM ("Cannot add vllm model" once `--enable-auto-tool-choice` / vision is toggled on), but the bug applies to every provider whose api_key field stays blank in edit mode. ## Root cause PR #14885 ("Fix: llm add api key overridden") removed the existing-key lookup in `api/apps/llm_app.py::add_llm`. The intent was correct — stop the saved key from clobbering a user-provided new one — but the removal was unconditional, so the edit path now has no fallback at all: 1. `web/src/pages/user-setting/setting-model/hooks.tsx:230` sets the initial `api_key` form value to `''` in edit mode (the real key is never returned to the browser). 2. The user toggles `is_tools` / `vision` without retyping the key. 3. `hooks.tsx:183-185` strips the empty `api_key` from the payload. 4. `add_llm` defaults to the placeholder `"x"` (`api/apps/llm_app.py:182`). 5. The upstream provider rejects `"x"` with `Your API key is invalid`. ## Fix Restore the fallback narrowly, before any factory-specific handler runs: - If `req.get("api_key") is None`, look up the tenant's existing record (using the correctly suffixed `llm_name` for VLLM / OpenAI-API-Compatible / LocalAI / HuggingFace). - Decode the saved blob with `_decode_api_key_config` and write only the decoded `api_key` string back into `req["api_key"]`. Never use the raw JSON payload — that was the exact thing PR #14885 was trying to avoid. - When the user does type a new key, `req.get("api_key")` is not `None` and the fallback is skipped, so PR #14885's fix is preserved. 
\| Scenario \| Before this PR \| After this PR \| \|---\|---\|---\| \| Plain factory (VLLM, Ollama, …), retype key \| OK \| OK \| \| Plain factory, blank key in edit (the bug) \| Fails with "API key is invalid" \| Recovers saved key, validates against the real one \| \| OpenRouter / Bedrock, change `provider_order` only \| Fails \| `apikey_json([...])` rebuilds the JSON with saved `api_key` + new field \| \| User clears the form and types a brand-new key \| OK (key replaced) \| OK (key replaced — fallback skipped) \| ## Files changed - `api/apps/llm_app.py` — restored fallback in `add_llm` (no other call sites touched). ## Test plan - [ ] Add a VLLM chat model with a valid api_key, no toggles → save succeeds. - [ ] Edit the same model, toggle tool call on, leave api_key blank → save succeeds, validation runs against the saved key. - [ ] Edit again, toggle vision on (model_type → `image2text`), leave api_key blank → save succeeds. - [ ] Edit again and type a new api_key → the new key replaces the saved one (`is None` check skips the fallback). Verify via the DB row or by deliberately typing a wrong key and observing the validation failure. - [ ] Repeat the blank-key edit with OpenRouter, changing only `provider_order` → resulting api_key JSON contains the saved `api_key` and the new `provider_order`. - [ ] First-time add of a new model name → no existing record, fallback no-ops, behaves as before. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-05-19 15:32:09 +08:00
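The narrow fallback described in the commit above can be sketched as follows. This is a simplified illustration, not the code in `add_llm`: `resolve_api_key`, `decode_api_key`, and the dictionary used as the saved-record store are hypothetical, and the real lookup also handles suffixed `llm_name` values.

```python
import json


def decode_api_key(blob):
    """Hypothetical decoder: return the 'api_key' field of a JSON blob,
    or the raw string when the blob is not a JSON object."""
    try:
        data = json.loads(blob)
    except ValueError:
        return blob
    return data.get("api_key", "") if isinstance(data, dict) else blob


def resolve_api_key(req, saved_records):
    """Fill req['api_key'] from the saved record only when the request omits it."""
    if req.get("api_key") is None:
        saved = saved_records.get(req["llm_name"])
        if saved is not None:
            # Write back only the decoded key string, never the raw JSON payload.
            req["api_key"] = decode_api_key(saved)
    return req


saved_records = {
    "plain-model": "sk-saved",
    "router-model": '{"api_key": "sk-router", "provider_order": "a,b"}',
}
blank_edit = resolve_api_key({"llm_name": "plain-model"}, saved_records)
json_edit = resolve_api_key({"llm_name": "router-model"}, saved_records)
retyped = resolve_api_key({"llm_name": "plain-model", "api_key": "sk-new"}, saved_records)
first_add = resolve_api_key({"llm_name": "unknown"}, saved_records)
```

The `is None` check is the part that keeps a retyped key from being overwritten, which is the behavior PR #14885 set out to guarantee.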
tmimmanuel	243d9ed281	Add TogetherAI chat provider (#14957 ) ## What - Add TogetherAI as a chat provider backed by its OpenAI-compatible `/v1/chat/completions` API - Register TogetherAI in the Go model factory and provider config - Support non-streaming chat, SSE streaming chat, model listing, and connection checks ## Notes - Uses the current TogetherAI OpenAI-compatible base URL `https://api.together.ai/v1` - Forwards documented chat parameters from `ChatConfig`: `max_tokens`, `temperature`, `top_p`, `stop`, and GPT-OSS `reasoning_effort` - Routes Together reasoning traces from `reasoning` / `reasoning_content` into `ReasonContent` ## Tests - `go test -vet=off -run TestTogetherAI -count=1 ./internal/entity/models` - `go test -vet=off -count=1 ./internal/entity/models` Refs #14736	2026-05-19 15:10:42 +08:00
tmimmanuel	09a06f1b00	Go: implement provider: Xinference (#14938 ) ### What problem does this PR solve? Closes #14808. Adds a Go model driver for Xinference so self-hosted Xinference chat models can be used through the Go provider layer instead of falling through to the dummy driver. Xinference exposes an OpenAI-compatible API under `/v1`; the driver accepts either a root endpoint such as `http://127.0.0.1:9997` or an OpenAI-compatible endpoint such as `http://127.0.0.1:9997/v1` and normalizes it before calling chat or model-listing routes. ### What is changed? - Add `internal/entity/models/xinference.go` implementing `ModelDriver` for Xinference chat. - Route provider name `xinference` in `internal/entity/models/factory.go`. - Add `conf/models/xinference.json` as a local provider config. - Add focused unit tests in `internal/entity/models/xinference_test.go`. Initial method coverage: - `ChatWithMessages`: POST `/v1/chat/completions`. - `ChatStreamlyWithSender`: SSE streaming from `/v1/chat/completions`. - `ListModels`: GET `/v1/models`. - `CheckConnection`: lightweight `ListModels` probe. - Optional auth: send `Authorization: Bearer <api_key>` only when a non-empty key is configured, matching Xinference no-auth and auth-enabled deployments. - `Balance`, `Embed`, `Rerank`, ASR, TTS, and OCR return `no such method` for this initial chat-provider PR. 
### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Bug Fix (non-breaking change which fixes an issue) ### Tests - `go test -vet=off -run TestXinference -count=1 ./internal/entity/models/...` - `go test -vet=off -count=1 ./internal/entity/models/...` ### References - Xinference docs: https://inference.readthedocs.io/zh-cn/latest/index.html - OpenAI-compatible chat usage: https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html - API key auth: https://inference.readthedocs.io/zh-cn/latest/user_guide/auth_system.html --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-19 15:10:13 +08:00
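The endpoint normalization and optional auth header described in the commit above can be sketched as follows. The actual driver is written in Go; this Python version only illustrates the logic, and `normalize_base_url` and `auth_headers` are hypothetical names.

```python
def normalize_base_url(base_url):
    """Return a base URL ending in /v1, accepting either a root or a /v1 endpoint."""
    url = base_url.strip().rstrip("/")
    if not url.endswith("/v1"):
        url += "/v1"
    return url


def auth_headers(api_key):
    """Send the Authorization header only when a non-empty key is configured."""
    key = (api_key or "").strip()
    return {"Authorization": f"Bearer {key}"} if key else {}


root = normalize_base_url("http://127.0.0.1:9997")
versioned = normalize_base_url("http://127.0.0.1:9997/v1/")
chat_url = root + "/chat/completions"
```

Normalizing once up front means the chat and model-listing routes can both be built by appending a fixed suffix to the same base.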
plind	7edabdf7c3	fix(retrieval): keep manual metadata filter reusable inside Iteration (#14849 ) ## What problem does this PR solve? Closes #12582. When a Retrieval component sits inside an Iteration with a manual metadata filter that references the iteration variable (e.g. `{IterationItem:abc@item}`), every iteration reuses the value resolved on the first pass. Root cause: [`_resolve_manual_filter` in `agent/tools/retrieval.py`](https://github.com/infiniflow/ragflow/blob/main/agent/tools/retrieval.py#L144-L171) mutated `flt["value"]` in place. The `filters` list passed in is the live `self._param.meta_data_filter["manual"]` (see [`apply_meta_data_filter` in `common/metadata_utils.py:257-261`](https://github.com/infiniflow/ragflow/blob/main/common/metadata_utils.py#L257-L261)), so after the first iteration the param dict permanently held the resolved string instead of the original variable reference. ```text iter #1: flt["value"] = "{IterationItem:abc@item}" → resolved to "AI" after mutation: flt["value"] = "AI" ← written back into _param iter #2: flt["value"] = "AI" ← no {…} matches retrieval keeps filtering by "AI" forever ``` This PR returns a shallow copy with the resolved value instead, leaving the original filter (and its variable reference) intact for the next iteration. ## Type of change - [x] Bug fix (non-breaking change which fixes an issue) ## Test plan - [ ] Build an agent: `Agent (structured output → list of areas) → Iteration → Retrieval (manual filter: Area = {IterationItem/Item}) → Message`. Run with a multi-area query and confirm each iteration's Retrieval result matches its own item, not the first item. - [ ] Regression: Retrieval with a manual metadata filter outside an Iteration still resolves the variable correctly on each request. - [ ] Regression: Retrieval with no metadata filter and with `auto` / `semi_auto` filters behave unchanged.	2026-05-19 15:08:31 +08:00
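The fix described in the commit above, returning a copy instead of mutating the live filter, can be sketched as follows. This is a simplified illustration, not the code in `agent/tools/retrieval.py`: `resolve_manual_filter` here takes a plain dictionary of variables, and the regular expression is an assumption about the reference syntax.

```python
import re

# Assumption: variable references look like {name} with no nested braces.
VAR_PATTERN = re.compile(r"\{([^{}]+)\}")


def resolve_manual_filter(filters, variables):
    """Return new filter dicts with references resolved; the inputs stay untouched."""
    resolved = []
    for flt in filters:
        value = flt.get("value")
        if isinstance(value, str):
            value = VAR_PATTERN.sub(
                lambda m: str(variables.get(m.group(1), m.group(0))), value
            )
        # Shallow copy with the resolved value; the original keeps its reference.
        resolved.append({**flt, "value": value})
    return resolved


filters = [{"key": "Area", "op": "=", "value": "{IterationItem:abc@item}"}]
first = resolve_manual_filter(filters, {"IterationItem:abc@item": "AI"})
second = resolve_manual_filter(filters, {"IterationItem:abc@item": "Robotics"})
```

Because `filters` still holds the variable reference after the first call, the second iteration resolves against its own item instead of reusing "AI".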
plind	f169ab4b39	feat(tts): cache synthesized speech in Redis to avoid redundant calls (#14851 ) ## What problem does this PR solve? Closes #12017. TTS output is deterministic for a given `(model, text)` pair, so re-running the same text through the same TTS model produces the same bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on every request. That's slow and wastes provider quota whenever the same assistant response is replayed, shared across users, or repeated within a session. ### Change New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl, cleaned_text)`: - Key: `tts:cache:{model_id}:{sha256(text)}` — separate namespace per model, identical cleaned text reuses a single entry across both call sites. - Value: the hex-encoded audio blob both call sites already returned. No format change for downstream consumers. - TTL: 7 days by default, configurable via `RAGFLOW_TTS_CACHE_TTL_SECONDS`. - Failure modes: a Redis hiccup falls back to direct synthesis; a failed synthesis still returns `None` (existing contract preserved). [`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724) and [`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380) now route through the helper; the per-file bytes-accumulation/hex-encode loop has been removed in favor of one shared implementation. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Test plan - [ ] Cache hit, chat path: Configure a dialog with TTS enabled, ask the same question twice with `stream=false`. Verify the second response returns the same `audio_binary` and that the second invocation doesn't hit the TTS provider (e.g., observe provider-side logs / usage counters; check no `LLMBundle.tts can't update token usage` log line on the second run). 
- [ ] Cache hit, agent path: Same exercise via a Conversational Agent that includes a Message component playing back the answer. - [ ] Cache isolation per model: Switch tenant's `tts_id` between two models, run the same text against each — confirm the second model's first synthesis still happens (no cross-model hits). - [ ] TTL override: Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm the entry expires after 2 minutes. - [ ] Redis unavailable: Stop Redis (or break the connection). Verify the TTS endpoint still works — synthesis falls back to direct calls, with a `TTS cache lookup failed` / `TTS cache store failed` warning logged. - [ ] Failure path: Configure a TTS model with an invalid API key, ensure the response still returns successfully with `audio_binary=None` (no regression vs. current behavior).	2026-05-19 14:20:40 +08:00
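The caching scheme described in the commit above can be sketched as follows. This is a minimal illustration, not the helper in `rag/utils/tts_cache.py`: `DictCache` is an in-memory stand-in for Redis that ignores expiry, and `cached_synthesize` and its signature are hypothetical. The key format, the hex-encoded value, the 7-day default, and the environment variable name come from the PR description.

```python
import hashlib
import os

DEFAULT_TTL_SECONDS = 7 * 24 * 3600  # 7 days


class DictCache:
    """In-memory stand-in for Redis; stores values and ignores the TTL."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl):
        self.data[key] = value


def cache_key(model_id, text):
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"tts:cache:{model_id}:{digest}"


def cached_synthesize(cache, model_id, text, synthesize):
    """Return hex-encoded audio, reusing a cached entry when one exists."""
    key = cache_key(model_id, text)
    try:
        hit = cache.get(key)
        if hit is not None:
            return hit
    except Exception:
        pass  # cache lookup failed: fall back to direct synthesis

    audio = synthesize(text)
    if audio is None:
        return None  # failed synthesis keeps returning None

    hex_audio = audio.hex()
    ttl = int(os.environ.get("RAGFLOW_TTS_CACHE_TTL_SECONDS", DEFAULT_TTL_SECONDS))
    try:
        cache.set(key, hex_audio, ttl)
    except Exception:
        pass  # cache store failed: still return the fresh audio
    return hex_audio


calls = []


def fake_tts(text):
    """Hypothetical TTS call that records each invocation."""
    calls.append(text)
    return b"audio:" + text.encode("utf-8")


cache = DictCache()
first = cached_synthesize(cache, "model-a", "hello", fake_tts)
second = cached_synthesize(cache, "model-a", "hello", fake_tts)
other_model = cached_synthesize(cache, "model-b", "hello", fake_tts)
```

Putting the model ID in the key is what gives each model its own namespace: the same text under `model-b` misses the cache and triggers a fresh synthesis.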
OrbisAI Security	f17a66d4f0	fix: the opencc c library uses fgets() to read dicti... in text.c (#13970 ) ## Summary Fix critical severity security issue in `internal/cpp/opencc/dictionary/text.c`. ## Vulnerability \| Field \| Value \| \|-------\|-------\| \| ID \| V-001 \| \| Severity \| CRITICAL \| \| Scanner \| multi_agent_ai \| \| Rule \| `V-001` \| \| File \| `internal/cpp/opencc/dictionary/text.c:107` \| Description: The OpenCC C library uses fgets() to read dictionary and configuration files without proper bounds validation on subsequent buffer operations. While fgets() itself is bounds-checked, the sprintf() call at config_reader.c:174 constructs file paths by concatenating home_path and filename without verifying the result fits in pkg_filename buffer. An attacker providing malformed OpenCC configuration files with excessively long path components can overflow the fixed-size buffer, overwriting adjacent memory including return addresses and function pointers. ## Changes - `internal/cpp/opencc/config_reader.c` - `internal/cpp/opencc/dictionary/text.c` - `internal/cpp/opencc/utils.c` ## Verification - [x] Build passes - [x] Scanner re-scan confirms fix - [x] LLM code review passed --- Automated security fix by [OrbisAI Security](https://orbisappsec.com) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Improved error detection and handling for malformed configuration and dictionary entries during file parsing. * Enhanced memory cleanup in error recovery paths to prevent potential issues. * Strengthened robustness of string operations and buffer handling throughout the library. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-15.us-west-2.compute.internal>	2026-05-19 13:55:33 +08:00
刘康伟	c6e3a2e713	Fix: MinerU vlm-http-client backend output file detection (#14240 ) ## Problem When using MinerU with `vlm-http-client` backend, the parser fails to find the output files because they are located in a `vlm/` subdirectory, but the `_read_output` method doesn't check this location. ## Error Message [ERROR]MinerU not found. [MinerU] Missing output file, tried: ... ## Root Cause The MinerU API with `vlm-http-client` backend returns output files in the following structure: output_dir/ vlm/ filename_content_list.json filename.md images/ However, the `_read_output` method in `mineru_parser.py` only checks: 1. `output_dir/filename_content_list.json` 2. `output_dir/sanitized_filename_content_list.json` 3. `output_dir/sanitized_filename/sanitized_filename_content_list.json` It doesn't check the `vlm/` subdirectory. ## Solution Added two additional fallback paths to check the `vlm/` subdirectory: - `output_dir/vlm/filename_content_list.json` - `output_dir/vlm/sanitized_filename_content_list.json` ## Testing Tested with MinerU API using `vlm-http-client` backend. The parser now successfully finds and processes the output files. ## Related This issue occurs specifically when using: - MinerU backend: `vlm-http-client` - MinerU server URL configured for remote vLLM inference	2026-05-19 12:28:31 +08:00
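The fallback search order described in the commit above can be sketched as follows. This is an illustration of the lookup order only, not the code in `_read_output`: `candidate_paths` and `find_content_list` are hypothetical names, and the sanitized filename is passed in instead of being computed.

```python
import tempfile
from pathlib import Path


def candidate_paths(output_dir, filename, sanitized):
    """List the locations to try, original paths first, then the vlm/ fallbacks."""
    out = Path(output_dir)
    return [
        out / f"{filename}_content_list.json",
        out / f"{sanitized}_content_list.json",
        out / sanitized / f"{sanitized}_content_list.json",
        out / "vlm" / f"{filename}_content_list.json",
        out / "vlm" / f"{sanitized}_content_list.json",
    ]


def find_content_list(output_dir, filename, sanitized):
    """Return the first candidate that exists, or None when all are missing."""
    for path in candidate_paths(output_dir, filename, sanitized):
        if path.is_file():
            return path
    return None


# Simulate the vlm-http-client layout: output_dir/vlm/<name>_content_list.json
with tempfile.TemporaryDirectory() as tmp:
    vlm_dir = Path(tmp) / "vlm"
    vlm_dir.mkdir()
    (vlm_dir / "paper_content_list.json").write_text("[]")
    found = find_content_list(tmp, "paper", "paper")
    found_name = found.relative_to(tmp).as_posix()
    missing = find_content_list(tmp, "other", "other")
```

Appending the `vlm/` candidates after the existing three keeps the lookup order unchanged for backends that already worked.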
buua436	87d22a4415	Fix: agent session log message (#14991 ) ### What problem does this PR solve? agent session log message ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-19 12:00:02 +08:00
tmimmanuel	4c9529ef36	Add Replicate chat provider (#14958 ) ## What - Add Replicate as a chat provider backed by the documented predictions API - Register Replicate in the Go model factory and provider config - Support non-streaming chat through sync predictions, polling fallback, streaming through `urls.stream`, model listing, and connection checks ## Notes - Uses `POST /v1/predictions` with Replicate model identifiers in `version`, which supports official and community model identifiers - Maps RAGFlow messages into Replicate prompt-shaped inputs (`prompt`, optional `system_prompt`) and forwards common documented LLM inputs: `max_new_tokens`, `temperature`, `top_p` - Preserves whitespace in SSE output chunks and emits RAGFlow `[DONE]` at stream completion ## Tests - `go test -vet=off -run TestReplicate -count=1 ./internal/entity/models` - `go test -vet=off -count=1 ./internal/entity/models` Refs #14736	2026-05-19 11:10:36 +08:00
Haruko386	db9e782747	Go: implement provider: MinerU (#14990 ) ### What problem does this PR solve? Implement MinerU Provider The following functionalities are now supported: MinerU ---- - [x] Parse file - [x] Show task - [ ] ~~List tasks~~ Verified examples from the CLI: ```plaintext RAGFlow(user)> parse with 'vlm@test@mineru' file 'https://arxiv.org/pdf/2505.09358' +--------------------------------------+ \| task_id \| +--------------------------------------+ \| 142ac8ea-d9d0-4a68-a2d1-d3af67635dc9 \| +--------------------------------------+ RAGFlow(user)> show 'test@mineru' task '142ac8ea-d9d0-4a68-a2d1-d3af67635dc9' +--------------------------------------------+-------+ \| content \| index \| +--------------------------------------------+-------+ \| Task is running... Progress: 17 / 18 pages \| 0 \| +--------------------------------------------+-------+ RAGFlow(user)> show 'test@mineru' task '142ac8ea-d9d0-4a68-a2d1-d3af67635dc9' +--------------------------------------------------------------------------------------------+-------+ \| content \| index \| +--------------------------------------------------------------------------------------------+-------+ \| https://cdn-mineru.openxlab.org.cn/pdf/2026-05-18/142ac8ea-d9d0-4a68-a2d1-d3af67635dc9.zip \| 0 \| +--------------------------------------------------------------------------------------------+-------+ ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-19 10:49:33 +08:00
kingloon	525a87be0f	Misc: fix some typos (#14987 ) ### What problem does this PR solve? Fix minor code quality issues: 1. Fix typo in assertion error message: "Can't fine" → "Can't find" 2. Remove duplicate line in common/connection_utils.py ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-19 10:47:06 +08:00
jony376	198f3c4b9a	Fix: validate memory tenant model IDs on update and enforce tenant scope in memory pipeline (#14923 ) ### Related issues Closes #14922 ### What problem does this PR solve? `POST /memories` already resolves `tenant_llm_id` and `tenant_embd_id` through `ensure_tenant_model_id_for_params`, but `PUT /memories/<memory_id>` accepted client-supplied `tenant_llm_id` / `tenant_embd_id` without checking that those `tenant_llm` rows belong to the memory owner’s tenant. A caller could persist another tenant’s row IDs and later trigger extraction or embedding that loaded foreign model credentials via `get_model_config_by_id(tenant_model_id)` with no tenant allow-list. This change aligns the update path with create: updates that change models must go through `llm_id` / `embd_id` and `ensure_tenant_model_id_for_params` scoped to the memory’s `tenant_id` (not only the current user, so team-access cases stay correct). Direct `tenant_*` fields in the body without `llm_id` / `embd_id` are rejected. As defense in depth, `memory_message_service` passes `allowed_tenant_ids` / `requester_tenant_id` into `get_model_config_by_id` for LLM and embedding resolution so mismatched IDs cannot be used even if bad data existed. A regression test rejects payloads that set only `tenant_llm_id` / `tenant_embd_id`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: jony376 <jony376@gmail.com>	2026-05-19 10:11:46 +08:00
Magicbook1108	b69a6a5d80	Feat: full optimization on connector dashboard (#14979 ) ### What problem does this PR solve? This PR improves the connector dashboard task management experience and adds better visibility into connector execution logs. ### Overview: #### Before <img width="700" alt="image" src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052" /> #### Now: <img width="700" alt="Screenshot from 2026-05-18 16-31-30" src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627" /> ### 1. Add a new logging page to the connector dashboard A new logging page has been added so users can view connector task execution logs directly from the connector dashboard. ### 2. Merge the Resume button into Confirm The separate Resume button has been removed. The Confirm button now represents different actions depending on the current task state: - Save: Save form changes and reschedule tasks. - Stop: Cancel currently scheduled or running tasks. - Resume: Create new scheduled tasks after the previous tasks have been stopped. - Start: Start tasks when no task has been started yet. ### 3. Separate syncing and pruning tasks Connector tasks are now separated into syncing and pruning. Pruning is controlled by the Sync deleted files option: - When Sync deleted files is disabled, only syncing tasks are shown. - When Sync deleted files is enabled, both syncing and pruning tasks are shown. Now: Sync deleted files disabled <img width="700" alt="Sync deleted files disabled" src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d" /> Now: Sync deleted files enabled <img width="700" alt="Sync deleted files enabled" src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296" /> ### 4. Update logs in backend <img width="700" alt="image" src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2" /> ### 5. 
Remove connector resume API - Removed: `POST /v1/connectors/<connector_id>/resume` - Replaced by: `PATCH /v1/connectors/<connector_id>` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-19 10:07:11 +08:00
buua436	41a9fc0030	Go: add dataset graph api (#14984 ) ### What problem does this PR solve? add dataset graph api ### Type of change - [x] Refactoring	2026-05-18 20:02:53 +08:00
buua436	d7fb4bdb4e	Go: align document list response (#14982 ) ### What problem does this PR solve? align document list response ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 20:00:11 +08:00
buua436	3290257014	Go: fix forgetting policy validation and fix memory update diff checks (#14976 ) ### What problem does this PR solve? fix forgetting policy validation and fix memory update diff checks ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 19:21:47 +08:00
Jake Armstrong	93d3deb5e4	Fix admin CLI system variable commands (#14956 ) ## What Fixes #12409. Implements admin CLI support for: - `list vars;` - `show var <name-or-prefix>;` - `set var <name> <value>;` ## Changes - Wire Go CLI variable commands to the admin API. - Support integer and quoted string values in `SET VAR`. - Return variable rows as `data_type`, `name`, `setting_type`, and `value`. - Add exact-name lookup with prefix fallback for `SHOW VAR`. - Validate values by stored data type: `string`, `integer`, `bool`, and `json`. - Keep the legacy Python admin CLI/server behavior aligned. - Update admin CLI docs and add focused tests. ## Verification - `go test -count=1 ./internal/cli` - `python3.12 -m py_compile admin/server/services.py admin/server/routes.py api/db/services/system_settings_service.py admin/client/parser.py admin/client/ragflow_client.py` - Python admin CLI parser smoke test for `SET VAR`, quoted values, `SHOW VAR`, and `LIST VARS`. - Attempted `./run_go_tests.sh`; local environment is missing native tokenizer/linker artifacts: - `internal/cpp/cmake-build-release/librag_tokenizer_c_api.a` - `-lstdc++` Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 19:08:45 +08:00
Wang Qi	732e4741c4	Bugfix: fix tag show (#14980 ) ### What problem does this PR solve? Bugfix: fix tag show ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 18:55:01 +08:00
Hamza Amin Khokhar	2dbe3b8a62	fix: metadata_condition returning all docs when filter matches nothing (#14967 ) ### What problem does this PR solve? When _parse_doc_id_filter_with_metadata returns [], the empty list is falsy so the WHERE id IN (...) clause was silently skipped, causing the full dataset to be returned instead of an empty result. Change `if doc_ids:` to `if doc_ids is not None:` in both get_list() and get_by_kb_id() to distinguish between no filter (None) and a filter that matched zero documents ([]). Fixes #14962 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 18:54:30 +08:00
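The `None` versus `[]` distinction in the commit above can be sketched in a few lines of Python. This is only an illustration: `filter_docs` and the in-memory id list are hypothetical stand-ins, not RAGFlow's `get_list()` or `get_by_kb_id()`.

```python
from typing import Optional


def filter_docs(all_ids: list[str], doc_ids: Optional[list[str]]) -> list[str]:
    """Hypothetical stand-in for a query that may carry an id filter.

    doc_ids is None -> no filter was requested, so return everything.
    doc_ids == []   -> a filter was requested but matched nothing.
    """
    # `if doc_ids:` would treat [] as "no filter" and return every document.
    if doc_ids is not None:
        wanted = set(doc_ids)
        return [doc_id for doc_id in all_ids if doc_id in wanted]
    return list(all_ids)


print(filter_docs(["a", "b"], None))
print(filter_docs(["a", "b"], []))
```

With the truthiness check, the second call would have returned both documents instead of an empty list.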
Haruko386	92145dc764	Go: implement provider: DeepInfra, XunFei (#14978 ) ### What problem does this PR solve? This PR implements providers: Mistral, DeepInfra, XunFei. The following functionalities are now supported: DeepInfra - [x] chat / think chat / stream chat / stream think chat - [x] Embedding - [x] ASR - [x] TTS - [x] ListModels - [x] Provider connection checking - [x] Balance - [ ] ~~Rerank~~ XunFei - [x] chat / think chat / stream chat / stream think chat ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-18 16:57:42 +08:00
buua436	b8ac997606	Go: add restful api route aliases (#14977 ) ### What problem does this PR solve? add restful api route aliases ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-18 16:57:14 +08:00
Wang Qi	13b422037f	Refactor: enhance graphrag - part 2 (#14972 ) ### What problem does this PR solve? 1. Expose `batch_chunk_token_size` for configuration. 2. Retrieve chunks when building the subgraph for each document, instead of retrieving all documents' chunks at the beginning. 3. Get all chunks for a document; the limit used to be hard-coded to 10000. 4. Delete the unused method `run_graphrag`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring Follow on: #14617	2026-05-18 16:10:21 +08:00
dev	b12eaee38b	fix(api): enforce tenant access for connector routes (#14747 ) ### What problem does this PR solve? Fixes #14746. Adds tenant access checks for connector-by-id REST routes before reading connector details, mutating connector config/status, deleting connectors, rebuilding, or listing sync logs. Unauthorized callers now receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without reaching the connector/log mutation paths. Validation: - `python3 -m pytest --confcutdir=test/testcases/test_web_api/test_connector_app test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` - `uvx ruff check api/apps/restful_apis/connector_api.py api/db/services/connector_service.py test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>	2026-05-18 16:09:26 +08:00
Wang Qi	56d73d0c2c	Refactor: speed up ragflow server, save startup memory (#14973 ) ### What problem does this PR solve? Refactor: speed up ragflow server, save startup memory, saved 200MiB, and 5-9 seconds start time. ##### Before 1241292 \| \| \_ python3 api/ragflow_server.py RAGFlow server is ready after 25.61845850944519s initialization. ##### After 1019968 \| \| \_ python3 api/ragflow_server.py RAGFlow server is ready after 16.205134391784668s initialization. ### Type of change - [x] Refactoring	2026-05-18 15:55:59 +08:00
buua436	b40b0bf996	Go: fix siliconflow embedding response (#14975 ) ### What problem does this PR solve? fix siliconflow embedding response ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 15:07:07 +08:00
dale053	fe82a96193	Fix: add SSRF guard for agent test_db_connection endpoint (#14860 ) ### What problem does this PR solve? Closes #14858 The `test_db_connection` endpoint in the agent API accepts a user-supplied `host` and connects to it directly via database drivers (MySQL/PostgreSQL) without any validation. This allows an attacker to probe internal network addresses (e.g. `127.0.0.1`, `10.x.x.x`, link-local, etc.) through the server — a classic Server-Side Request Forgery (SSRF) vulnerability. This PR adds an SSRF guard that resolves the host and rejects any address that is not globally routable before the database connection is attempted. Changes: - `common/ssrf_guard.py` — Added `assert_host_is_safe()`, a host-level counterpart of the existing `assert_url_is_safe()`, designed for non-HTTP protocols (database drivers) where there is no URL to parse. - `api/apps/restful_apis/agent_api.py` — Call `assert_host_is_safe(req["host"])` at the top of `test_db_connection` so that non-public hosts are rejected early with a clear error message. Fixes #14858 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 14:32:44 +08:00
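A host-level guard of the kind described in the commit above might look roughly like the following. This is a minimal sketch, not the code in `common/ssrf_guard.py`: the names `resolve_ips` and `check_host_is_public` are hypothetical, and the injectable `resolver` parameter exists only so the check can be exercised without network access.

```python
import ipaddress
import socket


def resolve_ips(host: str) -> list[str]:
    """Resolve a host name to its IP address strings (the scope id is dropped)."""
    infos = socket.getaddrinfo(host, None)
    return [info[4][0].split("%")[0] for info in infos]


def check_host_is_public(host: str, resolver=resolve_ips) -> None:
    """Raise ValueError unless every resolved address is globally routable."""
    ips = resolver(host)
    if not ips:
        raise ValueError(f"host {host!r} did not resolve to any address")
    for ip in ips:
        # is_global is False for loopback, private, link-local and similar ranges.
        if not ipaddress.ip_address(ip).is_global:
            raise ValueError(f"host {host!r} resolves to non-public address {ip}")


# Usage with a stub resolver, so no DNS lookup is performed.
check_host_is_public("db.example", resolver=lambda h: ["8.8.8.8"])
```

A check like this resolves the name once, and the database driver will typically resolve it again when it connects, so a sketch of this shape does not by itself rule out DNS rebinding.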
tmimmanuel	b09da6e347	Go: implement provider: CometAPI (#14930 ) ### What problem does this PR solve? Adds the Go model provider driver for CometAPI, which is listed as unchecked in the Go provider tracking issue #14736 and requested in #14804. Without this, the Go layer falls back to the dummy driver for the `cometapi` provider. Fixes #14804 ### What this PR includes - New `internal/entity/models/cometapi.go` implementing `ModelDriver` for CometAPI. - New `conf/models/cometapi.json` with CometAPI base URLs and representative chat / embedding models from the public catalog. - `factory.go`: route `"cometapi"` to `NewCometAPIModel`. - Unit tests in `internal/entity/models/cometapi_test.go`. ### Method coverage - `ChatWithMessages`: `POST /v1/chat/completions`. - `ChatStreamlyWithSender`: SSE streaming on the same endpoint. - `Embed`: `POST /v1/embeddings`, including optional `dimensions`. - `ListModels`: `GET /api/models` public catalog. - `Balance`: `GET https://query.cometapi.com/user/quota?key=...`. - `CheckConnection`: delegates to the quota query to verify the key. - `Rerank`, ASR, TTS, OCR: return `no such method` for now. No ModelDriver interface change. No new dependencies. ### How was this tested? ```bash go test -vet=off -run TestCometAPI -count=1 ./internal/entity/models/... go test -vet=off -count=1 ./internal/entity/models/... 
``` --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Jin Hai <haijin.chn@gmail.com> Signed-off-by: majiayu000 <1835304752@qq.com> Co-authored-by: 加帆 <Jiafan@users.noreply.github.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: bulexu <baiheng527@gmail.com> Co-authored-by: xubh <xubh@wikiflyer.cn> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Carve_ <75568342+Rynzie02@users.noreply.github.com> Co-authored-by: Paul Y Hui <paulhui@seismic.com> Co-authored-by: LIRUI YU <128563231+LiruiYu33@users.noreply.github.com> Co-authored-by: yun.kou <koopking@gmail.com> Co-authored-by: Yun.kou <yunkou@deepglint.com> Co-authored-by: Ahmad Intisar <168020872+ahmadintisar@users.noreply.github.com> Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local> Co-authored-by: chanx <1243304602@qq.com> Co-authored-by: Syed Shahmeer Ali <syedshahmeerali196@gmail.com> Co-authored-by: Octopus <liyuan851277048@icloud.com> Co-authored-by: lif <1835304752@qq.com>	2026-05-18 14:31:16 +08:00
qinling0210	f1d2383572	Push metadata filters down to Infinity (#14974 ) ### What problem does this PR solve? Push metadata filters down to Infinity ### Type of change - [x] Refactoring	2026-05-18 14:22:04 +08:00
Kevin Hu	7cdc74bbe5	Refactor: Drop the vector fetch for ES (#14970 ) ## Summary - Stop pulling chunk vectors (`q__vec`) back from Elasticsearch in the main retrieval path. ES already knows them; shipping them was pure bandwidth/memory overhead. - Recover the per-chunk cosine similarity via a second KNN-only ES call filtered by the candidate chunk ids. The new `_score` is merged with locally computed term similarity using the user-configured `vector_similarity_weight`. - Lazily fetch the chunk embedding only for the chunks `insert_citations` actually needs. ## Details `rag/nlp/search.py` - `Dealer.search`: no longer appends `q__vec` to the ES select list. OceanBase still gets it (its rerank path is unchanged). - New `Dealer._knn_scores(sres, idx_names, kb_ids)`: a `MatchDenseExpr` over the cached query vector filtered by `id IN sres.ids`, returning `{chunk_id: cosine_score}` via ES `_score`. - New `Dealer.rerank_with_knn(...)`: term similarity from `qryr.token_similarity` plus the ES-supplied KNN score, combined with `tkweight`/`vtweight` and the existing rank-feature bonus. - New `Dealer.fetch_chunk_vectors(chunk_ids, tenant_ids, kb_ids, dim)`: on-demand vector fetch for citation use. - `Dealer.retrieval` routes Infinity → unchanged, OceanBase → existing local `rerank`, ES → new KNN-score path. `common/doc_store/es_conn_base.py` - New `get_scores(res)` helper returning `{_id: _score}` directly from hit headers (ES doesn't surface `_score` through `get_fields`). `api/db/services/dialog_service.py` - New top-level `_hydrate_chunk_vectors(...)` helper. On ES it back-fills `ck["vector"]` from `fetch_chunk_vectors` right before `insert_citations`. No-op on Infinity / OB (their chunks already carry vectors). - Both `decorate_answer` closures became `async` and are `await`-ed at all call sites in `async_chat` and `async_ask`. 
## Backend behavior \| Backend \| Returns chunk vec in main search \| Sim source \| Vectors for citations \| \|---\|---\|---\|---\| \| ES \| No \| second KNN call (`_score`) merged with term sim \| fetched on demand \| \| Infinity \| No (unchanged) \| normalized `_score` \| already on chunks \| \| OceanBase \| Yes (kept) \| local hybrid rerank \| already on chunks \| ## Test plan	2026-05-18 14:21:56 +08:00
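The score merge described for the ES path in the commit above can be illustrated with a small sketch. The commit does not spell out the exact formula, so the weighting below is an assumption: the term weight is taken as `1 - vector_similarity_weight`, the rank-feature bonus is a plain additive term, and `merge_scores` is a hypothetical name rather than `Dealer.rerank_with_knn`.

```python
from typing import Optional


def merge_scores(
    term_sim: dict[str, float],
    knn_score: dict[str, float],
    vector_similarity_weight: float,
    rank_bonus: Optional[dict[str, float]] = None,
) -> dict[str, float]:
    """Blend per-chunk term similarity with KNN scores (assumed formula)."""
    vtweight = vector_similarity_weight
    tkweight = 1.0 - vtweight  # assumption: the two weights sum to 1
    bonus = rank_bonus or {}
    merged = {}
    for chunk_id, tsim in term_sim.items():
        # A chunk missing from the KNN response contributes a vector score of 0.
        vsim = knn_score.get(chunk_id, 0.0)
        merged[chunk_id] = tkweight * tsim + vtweight * vsim + bonus.get(chunk_id, 0.0)
    return merged


scores = merge_scores({"c1": 0.2, "c2": 0.6}, {"c1": 0.8}, vector_similarity_weight=0.5)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The point of the sketch is only that the vector side of the blend can come from a `{chunk_id: score}` mapping rather than from vectors shipped back to the client.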
Rene Arredondo	9f2fb4611f	Fix: guard empty/whitespace embedding inputs in LLMBundle (#14428 ) (#14924 ) Closes #14428 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 14:11:54 +08:00
carlos4s	2eba2c4d75	Add Anthropic Go model provider (#14940 ) ### What problem does this PR solve? Adds the missing Anthropic provider implementation for the Go model provider layer. Closes #14939 ### What changed - Add `conf/models/anthropic.json` with Anthropic Claude chat/vision models and API endpoints. - Add `internal/entity/models/anthropic.go` implementing non-streaming Messages API chat, model listing, and connection checking. - Register `anthropic` in the Go model factory. - Add httptest coverage for headers, payload mapping, response parsing, validation errors, provider errors, model listing, connection checking, factory registration, and unsupported methods. ### Notes Streaming chat is left as an explicit `no such method` follow-up because this initial provider focuses on non-streaming chat and connection checking. ### Tests - `docker run --rm -v /home/ubuntu/Documents/gitTensor_repos/carlos/ragflow:/work -v /tmp/ragflow-go-cache:/go/pkg/mod -v /tmp/ragflow-go-build:/root/.cache/go-build -w /work golang:1.25 go test -vet=off ./internal/entity/models -run Anthropic -count=1 -v` - `docker run --rm -v /home/ubuntu/Documents/gitTensor_repos/carlos/ragflow:/work -v /tmp/ragflow-go-cache:/go/pkg/mod -v /tmp/ragflow-go-build:/root/.cache/go-build -w /work golang:1.25 go test -vet=off ./internal/entity -count=1` - `git diff --check` - `jq . conf/models/anthropic.json >/dev/null` Plain `go test ./internal/entity/models` currently hits pre-existing unrelated vet findings in other provider files (`baidu.go`, `cohere.go`, `fishaudio.go`, `openrouter.go`). --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 12:03:33 +08:00
Jake Armstrong	fe1433d1ff	Go: add Jina chat completions support (#14935 ) ### What problem does this PR solve? This PR adds non-streaming chat support for the Jina Go model provider. The Jina provider was added with embedding, rerank, model listing, and connection checking, but `ChatWithMessages` still returned a not-implemented error even though Jina exposes an OpenAI-compatible `/v1/chat/completions` endpoint. Closes #14933 The following functionalities are now supported: ### Jina: - [x] Chat - [ ] Stream Chat - [x] Embedding - [x] Rerank - [x] Model listing - [x] Provider connection checking - [ ] Balance ### Implementation details: - Implements `JinaModel.ChatWithMessages` - Sends `Authorization: Bearer <api-key>` and JSON chat completion requests - Validates API key, model name, messages, and configured region before making requests - Forwards supported chat config fields: `max_tokens`, `temperature`, `top_p`, and `stop` - Parses the first chat completion choice into `ChatResponse.Answer` - Adds `jina-ai/jina-vlm` as a chat-capable model in `conf/models/jina.json` - Adds focused unit tests for request construction, auth, response parsing, validation errors, provider errors, and region handling Verification: ```plaintext docker run --rm -v $PWD:/repo -w /repo golang:1.25 sh -c '/usr/local/go/bin/gofmt -w internal/entity/models/jina.go internal/entity/models/jina_test.go && /usr/local/go/bin/go test -vet=off ./internal/entity/models -run TestJina -count=1' ok ragflow/internal/entity/models 0.037s ``` Note: `go test ./internal/entity/models -run TestJina -count=1` currently hits unrelated existing vet findings in other provider files, so the focused Jina tests were run with `-vet=off`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 12:03:12 +08:00
Panda Dev	6794ad2f70	Go: implement Embed (embeddings) in Novita driver (#14895 ) ### What problem does this PR solve? Fixes #14893 The Novita Go driver landed in #14850 and shipped a stub `Embed` method that returned `"novita, no such method"`, so Novita could not be used as an embedding provider in RAGFlow. This PR fills that gap. Novita exposes a public embeddings endpoint at `POST https://api.novita.ai/v3/embeddings` that accepts the standard OpenAI-compatible request shape (`{model, input}`) with `Authorization: Bearer <api_key>`. Two embedding models are documented in Novita's model library: `baai/bge-m3` (multilingual, 8192 tokens) and `baai/bge-large-en-v1.5`. ### Changes - `internal/entity/models/novita.go`: implement `NovitaModel.Embed`. - Validate inputs (api key, model name) and short-circuit on empty texts. - Resolve region with the existing `baseURLForRegion` helper. - Build URL from `URLSuffix.Embedding` (the embeddings path lives under `/v3/`, separate from the chat path under `/openai/v1/`). - Send `{model, input}` POST body, add `dimensions` when `embeddingConfig.Dimension > 0` (matches the pattern in #14735). - Bearer auth + JSON content type, mirroring the chat path. - Parse `{data: [{embedding, index}]}` and reorder by `index`, rejecting out-of-range indices, duplicates, and missing entries so the output always lines up with the input. Same shape as the merged Mistral and Upstage Embed implementations. - `conf/models/novita.json`: - Add `"embedding": "v3/embeddings"` to `url_suffix`. - Add default embedding model entries for `baai/bge-m3` and `baai/bge-large-en-v1.5` so they appear in the model picker. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 12:02:28 +08:00
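The index-based reordering described in the commit above can be sketched as follows. The actual driver is written in Go; this Python version is only an illustration of the same idea, and `order_embeddings` is a hypothetical name. It takes the `data` array of an OpenAI-style embeddings response and the number of input texts.

```python
def order_embeddings(data: list[dict], expected: int) -> list[list[float]]:
    """Place each embedding at its `index`, rejecting malformed responses."""
    ordered: list = [None] * expected
    for item in data:
        idx = item["index"]
        if idx < 0 or idx >= expected:
            raise ValueError(f"embedding index {idx} out of range")
        if ordered[idx] is not None:
            raise ValueError(f"duplicate embedding index {idx}")
        ordered[idx] = item["embedding"]
    missing = [i for i, vec in enumerate(ordered) if vec is None]
    if missing:
        raise ValueError(f"missing embeddings for inputs {missing}")
    return ordered


# The provider may return items in any order; the output follows input order.
response_data = [
    {"index": 1, "embedding": [0.3, 0.4]},
    {"index": 0, "embedding": [0.1, 0.2]},
]
print(order_embeddings(response_data, expected=2))
```

Validating range, duplicates, and gaps up front means a caller can rely on position `i` of the result belonging to input `i`.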
Idriss Sbaaoui	e98f3e5c0d	Fix session deletion leaking chat-upload blobs (#14969 ) ### What problem does this PR solve? This fixes a bug where files uploaded in chat were left in storage after the session was deleted. It now removes those chat-uploaded blobs during session deletion. fixes #14965 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 11:14:27 +08:00
qinling0210	9d94527b1d	Bump to infinity v0.7.0 (#14968 ) ### What problem does this PR solve? Upgrade infinity ### Type of change - [x] Refactoring	2026-05-18 10:25:59 +08:00
07heco	e194027b01	refactor: optimize BaseTitleChunker to improve RAG document chunk quality (#14247 ) ## RAG Optimization Description Optimize the core `BaseTitleChunker` in `rag/flow/chunker/title_chunker/common.py` to improve RAG document chunking quality and retrieval accuracy. ## Key Changes 1. Format-branched text processing: Preserve original whitespace & indentation for Markdown/HTML payloads to maintain document semantics and chunk fidelity; only perform full whitespace cleaning on plain text content. 2. Empty chunk filtering: Thoroughly filter invalid pure-blank lines to reduce noisy data in vector database. 3. Code deduplication: Unified markdown/text/html payload extraction logic, removed redundant repeated code blocks. 4. None serialization fix: Avoid converting `None` value into literal `"None"` string in chunk text fields. 5. Production logging: Added input/output line count logging for filter logic, observable in online environment. 6. 100% backward compatible: No changes to chunking hierarchy rules, output format and all existing workflows. ## RAG Business Value - Preserves document format fidelity for structured Markdown/HTML files - Reduces invalid noisy chunks → improves RAG retrieval precision - Cleans plain text data → optimizes vector embedding quality - Improves code maintainability with no breaking changes - Provides observable logging for chunk filtering behavior ## Compatibility - ✅ No API changes - ✅ No chunk logic modifications - ✅ All document parsing/chunking workflows unaffected - ✅ All pre-checks passed, no code conflicts ### Type of change - [x] Refactoring - [x] Performance Improvement	2026-05-18 10:00:18 +08:00
Ricardo-M-L	ff318aba7a	fix: correct literal_eval dispatch and bool isinstance ordering in agent components (#13988 ) ## Summary This PR fixes 3 bugs in agent components: ### Bug 1: `DataOperations._invoke()` dispatches `"literal_eval"` to wrong handler File: `agent/component/data_operations.py`, line 76 The `_invoke()` method compares `self._param.operations` against `"recursive_eval"` (line 76), but the valid value defined in `DataOperationsParam.__init__()` (line 29) and validated in `check()` (line 43) is `"literal_eval"`. This means selecting the `literal_eval` operation from the frontend would never match, and the method `_literal_eval()` would never be called. Fix: Change `"recursive_eval"` to `"literal_eval"` in the dispatch condition. ### Bug 2: `VariableAssigner._clear()` — `bool` branch unreachable File: `agent/component/variable_assigner.py`, lines 95–100 In Python, `bool` is a subclass of `int` (`isinstance(True, int)` is `True`). The `isinstance(variable, int)` check on line 95 catches boolean values before the `isinstance(variable, bool)` check on line 99, making the bool branch unreachable. A boolean variable would be cleared to `0` instead of `False`. Fix: Move the `isinstance(variable, bool)` check before `isinstance(variable, int)`. ### Bug 3: `LoopItem.evaluate_condition()` — `bool` branch unreachable File: `agent/component/loopitem.py`, lines 67–93 Same issue as Bug 2: `isinstance(var, (int, float))` on line 67 catches boolean values before `isinstance(var, bool)` on line 85. Boolean variables would be evaluated with numeric operators (`=`, `≠`, `>`, etc.) instead of boolean operators (`is`, `is not`). Fix: Move the `isinstance(var, bool)` check before `isinstance(var, (int, float))`. 
## Test plan - [ ] Verify `DataOperations` with `literal_eval` operation correctly invokes `_literal_eval()` - [ ] Verify `VariableAssigner._clear()` returns `False` for boolean variables (not `0`) - [ ] Verify `LoopItem.evaluate_condition()` uses boolean operators for `True`/`False` values 🤖 Generated with [Claude Code](https://claude.com/claude-code) ## Summary by CodeRabbit * Bug Fixes * Fixed operation routing logic to correctly dispatch the "literal_eval" operation to its handler. * Refactor * Reorganized conditional branch ordering in agent components to improve code structure and maintainability without affecting functional behavior. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-18 09:58:45 +08:00
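The `isinstance` ordering problem in Bugs 2 and 3 above can be reproduced with a short sketch. The two functions are hypothetical reductions of the pattern, not the code in `variable_assigner.py` or `loopitem.py`.

```python
def clear_value_buggy(variable):
    """The int check runs first, so it also captures True and False."""
    if isinstance(variable, int):
        return 0
    if isinstance(variable, bool):  # unreachable: bool is a subclass of int
        return False
    return None


def clear_value_fixed(variable):
    """The bool check runs first, so booleans reach their own branch."""
    if isinstance(variable, bool):
        return False
    if isinstance(variable, int):
        return 0
    return None


print(repr(clear_value_buggy(True)))
print(repr(clear_value_fixed(True)))
print(repr(clear_value_fixed(7)))
```

The same rule applies whenever a more specific type and one of its base types are tested in sequence: the subclass check has to come first.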
小熊	09d45046e5	Feat/web markdown UI updates (#14214 ) ### What problem does this PR solve? LLM/chat and search UIs render Markdown in several places (document preview, floating chat widget, next-search, etc.). Plugin lists and behavior were duplicated or inconsistent, and single newlines in model output were not always rendered as visible line breaks, which hurts readability for chat-style content. This PR centralizes shared remark/rehype configuration (including `remark-breaks` for newline handling) and wires the main Markdown surfaces to use it, so behavior is consistent and easier to maintain. ### Type of change - [x] Refactoring --------- Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>	2026-05-15 22:29:44 +08:00
Haruko386	bf41d35729	Go: implement PaddleOCR provider and implement ASR for CoHere (#14954 ) ### What problem does this PR solve? This PR implements OCR for Baidu and Mistral, adds the PaddleOCR provider, and implements ASR for Cohere. Verified examples from the CLI: ``` RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... 
\| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... 
\| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ # PaddleOCR RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Bingxin Ke Nando Metzger Photogra Anton Obukhov Rodrigo Caye Daudt netry and Remote Sensing, Shengyu Huang Konrad Schindler ETH Zürich <div style="text-align: c... \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ # Cohere RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}' +-----------------------------------------------------------------------------------------------------------------------+ \| text \| +-----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. 
\| +-----------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-15 18:41:43 +08:00
wdeveloper16	14c0985182	feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767 ) Closes #14753 ## What changed \| File \| Change \| \|---\|---\| \| `pyproject.toml` \| `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` \| \| `Dockerfile` \| `uv python install 3.13`, `uv sync --python 3.13` \| \| `.github/workflows/tests.yml` \| `uv sync --python 3.13` on both matrix legs \| \| `CLAUDE.md` \| dev setup command + requirements note updated \| \| `deepdoc/parser/mineru_parser.py` \| `from strenum import StrEnum` → `from enum import StrEnum` \| \| `agent/tools/code_exec.py` \| same \| `StrEnum` has been in the stdlib since Python 3.11 — the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - "graspologic hard-blocks 3.13" — the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - "free-threading throughput gains for I/O-bound workload" — Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.	2026-05-15 14:40:53 +08:00
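The `numpy` conflict described in the commit above can be checked by intersecting the version bounds. The following is a minimal sketch, not part of the commit: the helper names `parse_version`, `combined_range`, and `is_satisfiable` are hypothetical, and the code handles only plain dotted numeric versions with inclusive lower and exclusive upper bounds. It is not a substitute for a real resolver such as `uv`.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.1.0' into (2, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def combined_range(lower_bounds, upper_bounds):
    """Intersect inclusive lower bounds (>=) with exclusive upper bounds (<)."""
    lowest_allowed = max(parse_version(v) for v in lower_bounds)
    first_excluded = min(parse_version(v) for v in upper_bounds)
    return lowest_allowed, first_excluded


def is_satisfiable(lower_bounds, upper_bounds) -> bool:
    """A range is non-empty only if its lowest allowed version is below its cap."""
    lowest_allowed, first_excluded = combined_range(lower_bounds, upper_bounds)
    return lowest_allowed < first_excluded


# Bounds quoted in the commit message:
#   infiniflow/graspologic fork: numpy>=1.26.4,<2.0.0
#   ml-dtypes>=0.5.1 (via tensorflow-cpu>=2.20.0): numpy>=2.1.0
print(is_satisfiable(["1.26.4", "2.1.0"], ["2.0.0"]))  # False
```

The combined lower bound is 2.1.0 while the cap stays at 2.0.0, so no `numpy` release satisfies both. This is why the commit says the upper bound in the fork has to be loosened before `uv lock --python 3.13` can succeed.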


6295 Commits