ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-05-21 00:36:43 +08:00

Author	SHA1	Message	Date
bitloi	6499bce2a6	fix: Langfuse chat observation (#15026 ) ### What problem does this PR solve? Closes #15025 Langfuse-enabled `dialog_service.async_chat()` regressed to `langfuse_tracer.start_generation(...)` after the earlier Langfuse v4 migration. Langfuse v4 uses `start_observation(as_type="generation")`, so the remaining `start_generation` call can fail when chat tracing is enabled. This restores the migrated `start_observation(as_type="generation")` call for chat observations while preserving the existing trace context, model, input payload, and update/end flow. It also adds a regression test with a fake Langfuse v4-style client that exposes `start_observation()` but not `start_generation()`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Tests - `.venv/bin/pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -q` - `.venv/bin/ruff check api/db/services/dialog_service.py test/unit_test/api/db/services/test_dialog_service_final_answer.py`	2026-05-20 15:01:19 +08:00
Rene Arredondo	ce3402cbb9	Fix: restore saved api_key fallback in add_llm (#14921 ) (#14941 ) ## Summary Closes #14921. Reconfiguring an existing LLM provider to enable tool call or vision fails with `Your API key is invalid. Fail to access model.` even when the saved API key is correct. The most visible report is VLLM ("Cannot add vllm model" once `--enable-auto-tool-choice` / vision is toggled on), but the bug applies to every provider whose api_key field stays blank in edit mode. ## Root cause PR #14885 ("Fix: llm add api key overridden") removed the existing-key lookup in `api/apps/llm_app.py::add_llm`. The intent was correct — stop the saved key from clobbering a user-provided new one — but the removal was unconditional, so the edit path now has no fallback at all: 1. `web/src/pages/user-setting/setting-model/hooks.tsx:230` sets the initial `api_key` form value to `''` in edit mode (the real key is never returned to the browser). 2. The user toggles `is_tools` / `vision` without retyping the key. 3. `hooks.tsx:183-185` strips the empty `api_key` from the payload. 4. `add_llm` defaults to the placeholder `"x"` (`api/apps/llm_app.py:182`). 5. The upstream provider rejects `"x"` with `Your API key is invalid`. ## Fix Restore the fallback narrowly, before any factory-specific handler runs: - If `req.get("api_key") is None`, look up the tenant's existing record (using the correctly suffixed `llm_name` for VLLM / OpenAI-API-Compatible / LocalAI / HuggingFace). - Decode the saved blob with `_decode_api_key_config` and write only the decoded `api_key` string back into `req["api_key"]`. Never use the raw JSON payload — that was the exact thing PR #14885 was trying to avoid. - When the user does type a new key, `req.get("api_key")` is not `None` and the fallback is skipped, so PR #14885's fix is preserved. 
| Scenario | Before this PR | After this PR |
|---|---|---|
| Plain factory (VLLM, Ollama, …), retype key | OK | OK |
| Plain factory, blank key in edit (the bug) | Fails with "API key is invalid" | Recovers saved key, validates against the real one |
| OpenRouter / Bedrock, change `provider_order` only | Fails | `apikey_json([...])` rebuilds the JSON with saved `api_key` + new field |
| User clears the form and types a brand-new key | OK (key replaced) | OK (key replaced; fallback skipped) |

## Files changed - `api/apps/llm_app.py`: restored fallback in `add_llm` (no other call sites touched). ## Test plan - [ ] Add a VLLM chat model with a valid api_key, no toggles → save succeeds. - [ ] Edit the same model, toggle tool call on, leave api_key blank → save succeeds, validation runs against the saved key. - [ ] Edit again, toggle vision on (model_type → `image2text`), leave api_key blank → save succeeds. - [ ] Edit again and type a new api_key → the new key replaces the saved one (`is None` check skips the fallback). Verify via the DB row or by deliberately typing a wrong key and observing the validation failure. - [ ] Repeat the blank-key edit with OpenRouter, changing only `provider_order` → resulting api_key JSON contains the saved `api_key` and the new `provider_order`. - [ ] First-time add of a new model name → no existing record, fallback no-ops, behaves as before. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-05-19 15:32:09 +08:00
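The fallback described in this PR can be sketched roughly as below. This is a minimal illustration, not the code in `api/apps/llm_app.py`: `resolve_api_key` and `decode_saved_key` are hypothetical names (the latter stands in for `_decode_api_key_config`), and the assumption that a saved blob is either a plain string or a JSON object with an `api_key` field is mine.

```python
import json

PLACEHOLDER_KEY = "x"  # the placeholder the PR says add_llm defaults to


def decode_saved_key(blob):
    """Hypothetical stand-in for _decode_api_key_config.

    Returns only the api_key string, never the raw JSON payload.
    """
    try:
        parsed = json.loads(blob)
    except (TypeError, ValueError):
        return blob  # not JSON: treat the blob as a plain key string
    if isinstance(parsed, dict):
        return parsed.get("api_key")
    return blob


def resolve_api_key(req, saved_blob):
    """Pick the key to validate with: new key, else saved key, else placeholder."""
    key = req.get("api_key")
    if key is not None:
        return key  # the user typed a key, so the fallback is skipped
    if saved_blob is not None:
        decoded = decode_saved_key(saved_blob)
        if decoded:
            return decoded
    return PLACEHOLDER_KEY  # first-time add with no saved record
```

Checking `is None` rather than truthiness is what keeps a freshly typed key from being overridden by the saved one.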
plind	f169ab4b39	feat(tts): cache synthesized speech in Redis to avoid redundant calls (#14851 ) ## What problem does this PR solve? Closes #12017. TTS output is deterministic for a given `(model, text)` pair, so re-running the same text through the same TTS model produces the same bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on every request. That's slow and wastes provider quota whenever the same assistant response is replayed, shared across users, or repeated within a session. ### Change New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl, cleaned_text)`: - Key: `tts:cache:{model_id}:{sha256(text)}` — separate namespace per model, identical cleaned text reuses a single entry across both call sites. - Value: the hex-encoded audio blob both call sites already returned. No format change for downstream consumers. - TTL: 7 days by default, configurable via `RAGFLOW_TTS_CACHE_TTL_SECONDS`. - Failure modes: a Redis hiccup falls back to direct synthesis; a failed synthesis still returns `None` (existing contract preserved). [`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724) and [`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380) now route through the helper; the per-file bytes-accumulation/hex-encode loop has been removed in favor of one shared implementation. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Test plan - [ ] Cache hit, chat path: Configure a dialog with TTS enabled, ask the same question twice with `stream=false`. Verify the second response returns the same `audio_binary` and that the second invocation doesn't hit the TTS provider (e.g., observe provider-side logs / usage counters; check no `LLMBundle.tts can't update token usage` log line on the second run). 
- [ ] Cache hit, agent path: Same exercise via a Conversational Agent that includes a Message component playing back the answer. - [ ] Cache isolation per model: Switch tenant's `tts_id` between two models, run the same text against each — confirm the second model's first synthesis still happens (no cross-model hits). - [ ] TTL override: Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm the entry expires after 2 minutes. - [ ] Redis unavailable: Stop Redis (or break the connection). Verify the TTS endpoint still works — synthesis falls back to direct calls, with a `TTS cache lookup failed` / `TTS cache store failed` warning logged. - [ ] Failure path: Configure a TTS model with an invalid API key, ensure the response still returns successfully with `audio_binary=None` (no regression vs. current behavior).	2026-05-19 14:20:40 +08:00
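A rough sketch of the caching scheme described above, assuming a store object with `get` and `set` methods. `InMemoryStore` and `cached_tts` are hypothetical names used only for illustration; the real helper is `synthesize_with_cache(tts_mdl, cleaned_text)` and talks to Redis. Only the key layout, the hex-encoded value, the 7-day default TTL, and the fall-back-on-failure behavior come from the PR text.

```python
import hashlib

DEFAULT_TTL_SECONDS = 7 * 24 * 3600  # 7 days, per the PR description


class InMemoryStore:
    """Toy stand-in for a Redis client; the TTL is accepted but not enforced."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl):
        self.data[key] = value


def cache_key(model_id, text):
    """Build the tts:cache:{model_id}:{sha256(text)} key."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"tts:cache:{model_id}:{digest}"


def cached_tts(store, model_id, text, synthesize, ttl=DEFAULT_TTL_SECONDS):
    """Return hex-encoded audio, synthesizing only on a cache miss."""
    key = cache_key(model_id, text)
    try:
        cached = store.get(key)
        if cached is not None:
            return cached
    except Exception:
        pass  # cache lookup failed: fall back to direct synthesis
    audio = synthesize(text)
    if audio is None:
        return None  # failed synthesis is not cached
    hex_audio = audio.hex()
    try:
        store.set(key, hex_audio, ttl)
    except Exception:
        pass  # cache store failed: still return the fresh result
    return hex_audio
```

Hashing the text keeps keys at a fixed length regardless of how long the assistant response is, and including the model id in the key prevents cross-model hits.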
buua436	87d22a4415	Fix: agent session log message (#14991 ) ### What problem does this PR solve? agent session log message ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-19 12:00:02 +08:00
kingloon	525a87be0f	Misc: fix some typos (#14987 ) ### What problem does this PR solve? Fix minor code quality issues: 1. Fix typo in assertion error message: "Can't fine" → "Can't find" 2. Remove duplicate line in common/connection_utils.py ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-19 10:47:06 +08:00
jony376	198f3c4b9a	Fix: validate memory tenant model IDs on update and enforce tenant scope in memory pipeline (#14923 ) ### Related issues Closes #14922 ### What problem does this PR solve? `POST /memories` already resolves `tenant_llm_id` and `tenant_embd_id` through `ensure_tenant_model_id_for_params`, but `PUT /memories/<memory_id>` accepted client-supplied `tenant_llm_id` / `tenant_embd_id` without checking that those `tenant_llm` rows belong to the memory owner’s tenant. A caller could persist another tenant’s row IDs and later trigger extraction or embedding that loaded foreign model credentials via `get_model_config_by_id(tenant_model_id)` with no tenant allow-list. This change aligns the update path with create: updates that change models must go through `llm_id` / `embd_id` and `ensure_tenant_model_id_for_params` scoped to the memory’s `tenant_id` (not only the current user, so team-access cases stay correct). Direct `tenant_*` fields in the body without `llm_id` / `embd_id` are rejected. As defense in depth, `memory_message_service` passes `allowed_tenant_ids` / `requester_tenant_id` into `get_model_config_by_id` for LLM and embedding resolution so mismatched IDs cannot be used even if bad data existed. A regression test rejects payloads that set only `tenant_llm_id` / `tenant_embd_id`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: jony376 <jony376@gmail.com>	2026-05-19 10:11:46 +08:00
Magicbook1108	b69a6a5d80	Feat: full optimization on connector dashboard (#14979 ) ### What problem does this PR solve? This PR improves the connector dashboard task management experience and adds better visibility into connector execution logs. ### Overview: #### Before <img width="700" alt="image" src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052" /> #### Now: <img width="700" alt="Screenshot from 2026-05-18 16-31-30" src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627" /> ### 1. Add a new logging page to the connector dashboard A new logging page has been added so users can view connector task execution logs directly from the connector dashboard. ### 2. Merge the Resume button into Confirm The separate Resume button has been removed. The Confirm button now represents different actions depending on the current task state: - Save: Save form changes and reschedule tasks. - Stop: Cancel currently scheduled or running tasks. - Resume: Create new scheduled tasks after the previous tasks have been stopped. - Start: Start tasks when no task has been started yet. ### 3. Separate syncing and pruning tasks Connector tasks are now separated into syncing and pruning. Pruning is controlled by the Sync deleted files option: - When Sync deleted files is disabled, only syncing tasks are shown. - When Sync deleted files is enabled, both syncing and pruning tasks are shown. Now: Sync deleted files disabled <img width="700" alt="Sync deleted files disabled" src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d" /> Now: Sync deleted files enabled <img width="700" alt="Sync deleted files enabled" src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296" /> ### 4. Update logs in backend <img width="700" alt="image" src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2" /> ### 5. 
Remove connector resume API - Removed: `POST /v1/connectors/<connector_id>/resume` - Replaced by: `PATCH /v1/connectors/<connector_id>` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-19 10:07:11 +08:00
Jake Armstrong	93d3deb5e4	Fix admin CLI system variable commands (#14956 ) ## What Fixes #12409. Implements admin CLI support for: - `list vars;` - `show var <name-or-prefix>;` - `set var <name> <value>;` ## Changes - Wire Go CLI variable commands to the admin API. - Support integer and quoted string values in `SET VAR`. - Return variable rows as `data_type`, `name`, `setting_type`, and `value`. - Add exact-name lookup with prefix fallback for `SHOW VAR`. - Validate values by stored data type: `string`, `integer`, `bool`, and `json`. - Keep the legacy Python admin CLI/server behavior aligned. - Update admin CLI docs and add focused tests. ## Verification - `go test -count=1 ./internal/cli` - `python3.12 -m py_compile admin/server/services.py admin/server/routes.py api/db/services/system_settings_service.py admin/client/parser.py admin/client/ragflow_client.py` - Python admin CLI parser smoke test for `SET VAR`, quoted values, `SHOW VAR`, and `LIST VARS`. - Attempted `./run_go_tests.sh`; local environment is missing native tokenizer/linker artifacts: - `internal/cpp/cmake-build-release/librag_tokenizer_c_api.a` - `-lstdc++` Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 19:08:45 +08:00
Wang Qi	732e4741c4	Bugfix: fix tag show (#14980 ) ### What problem does this PR solve? Bugfix: fix tag show ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 18:55:01 +08:00
Hamza Amin Khokhar	2dbe3b8a62	fix: metadata_condition returning all docs when filter matches nothing (#14967 ) ### What problem does this PR solve? When _parse_doc_id_filter_with_metadata returns [], the empty list is falsy so the WHERE id IN (...) clause was silently skipped, causing the full dataset to be returned instead of an empty result. Change `if doc_ids:` to `if doc_ids is not None:` in both get_list() and get_by_kb_id() to distinguish between no filter (None) and a filter that matched zero documents ([]). Fixes #14962 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 18:54:30 +08:00
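The truthiness pitfall behind this fix can be reproduced in a few lines. The functions below are hypothetical stand-ins that filter an in-memory list instead of building a SQL `WHERE id IN (...)` clause.

```python
def filter_docs_buggy(docs, doc_ids=None):
    """Old behavior: an empty filter result is falsy, so the filter is skipped."""
    if doc_ids:
        wanted = set(doc_ids)
        return [d for d in docs if d["id"] in wanted]
    return list(docs)


def filter_docs(docs, doc_ids=None):
    """Fixed behavior: None means 'no filter', [] means 'matched nothing'."""
    if doc_ids is not None:
        wanted = set(doc_ids)
        return [d for d in docs if d["id"] in wanted]
    return list(docs)


docs = [{"id": "a"}, {"id": "b"}]
print(filter_docs_buggy(docs, []))  # whole dataset leaks through
print(filter_docs(docs, []))        # empty result, as intended
```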
Wang Qi	13b422037f	Refactor: enhance graphrag - part 2 (#14972) ### What problem does this PR solve? 1. Expose `batch_chunk_token_size` for configuration. 2. Retrieve chunks when building the subgraph for each document, instead of retrieving all documents' chunks at the beginning. 3. Get all chunks for a document; the limit used to be hard-coded to 10000. 4. Delete the unused method `run_graphrag`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring Follow on: #14617	2026-05-18 16:10:21 +08:00
dev	b12eaee38b	fix(api): enforce tenant access for connector routes (#14747 ) ### What problem does this PR solve? Fixes #14746. Adds tenant access checks for connector-by-id REST routes before reading connector details, mutating connector config/status, deleting connectors, rebuilding, or listing sync logs. Unauthorized callers now receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without reaching the connector/log mutation paths. Validation: - `python3 -m pytest --confcutdir=test/testcases/test_web_api/test_connector_app test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` - `uvx ruff check api/apps/restful_apis/connector_api.py api/db/services/connector_service.py test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>	2026-05-18 16:09:26 +08:00
Wang Qi	56d73d0c2c	Refactor: speed up ragflow server, save startup memory (#14973) ### What problem does this PR solve? Speeds up RAGFlow server startup and reduces startup memory: about 200 MiB less memory and 5-9 seconds less startup time. ##### Before `1241292 | | \_ python3 api/ragflow_server.py` RAGFlow server is ready after 25.61845850944519s initialization. ##### After `1019968 | | \_ python3 api/ragflow_server.py` RAGFlow server is ready after 16.205134391784668s initialization. ### Type of change - [x] Refactoring	2026-05-18 15:55:59 +08:00
dale053	fe82a96193	Fix: add SSRF guard for agent test_db_connection endpoint (#14860 ) ### What problem does this PR solve? Closes #14858 The `test_db_connection` endpoint in the agent API accepts a user-supplied `host` and connects to it directly via database drivers (MySQL/PostgreSQL) without any validation. This allows an attacker to probe internal network addresses (e.g. `127.0.0.1`, `10.x.x.x`, link-local, etc.) through the server — a classic Server-Side Request Forgery (SSRF) vulnerability. This PR adds an SSRF guard that resolves the host and rejects any address that is not globally routable before the database connection is attempted. Changes: - `common/ssrf_guard.py` — Added `assert_host_is_safe()`, a host-level counterpart of the existing `assert_url_is_safe()`, designed for non-HTTP protocols (database drivers) where there is no URL to parse. - `api/apps/restful_apis/agent_api.py` — Call `assert_host_is_safe(req["host"])` at the top of `test_db_connection` so that non-public hosts are rejected early with a clear error message. Fixes #14858 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 14:32:44 +08:00
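A host-level guard of the kind described above might look like the following sketch. It is not the code in `common/ssrf_guard.py`: `is_public_host` and `check_host_is_public` are hypothetical names, and treating unresolvable hosts as unsafe is an assumption. It uses only the standard `socket` and `ipaddress` modules.

```python
import ipaddress
import socket


def resolve_addresses(host):
    """Resolve a host name or IP literal to the set of addresses it maps to."""
    infos = socket.getaddrinfo(host, None)
    return {info[4][0] for info in infos}


def is_public_host(host):
    """True only if every resolved address is globally routable."""
    try:
        addresses = resolve_addresses(host)
    except socket.gaierror:
        return False  # assumption: unresolvable hosts are treated as unsafe
    if not addresses:
        return False
    for addr in addresses:
        # Strip an IPv6 scope id such as "%eth0" before parsing.
        ip = ipaddress.ip_address(addr.split("%")[0])
        if not ip.is_global:
            return False
    return True


def check_host_is_public(host):
    """Raise before any database driver is handed a non-public host."""
    if not is_public_host(host):
        raise ValueError(f"Host {host!r} is not a public address")
```

Requiring every resolved address to be public, rather than any one of them, avoids a host that resolves to both a public and an internal address slipping through. A check like this does not by itself close the window between resolution and the driver's own lookup.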
qinling0210	f1d2383572	Push metadata filters down to Infinity (#14974 ) ### What problem does this PR solve? Push metadata filters down to Infinity ### Type of change - [x] Refactoring	2026-05-18 14:22:04 +08:00
Kevin Hu	7cdc74bbe5	Refactor: Drop the vector fetch for ES (#14970 ) ## Summary - Stop pulling chunk vectors (`q__vec`) back from Elasticsearch in the main retrieval path. ES already knows them; shipping them was pure bandwidth/memory overhead. - Recover the per-chunk cosine similarity via a second KNN-only ES call filtered by the candidate chunk ids. The new `_score` is merged with locally computed term similarity using the user-configured `vector_similarity_weight`. - Lazily fetch the chunk embedding only for the chunks `insert_citations` actually needs. ## Details `rag/nlp/search.py`* - `Dealer.search`: no longer appends `q__vec` to the ES select list. OceanBase still gets it (its rerank path is unchanged). - New `Dealer._knn_scores(sres, idx_names, kb_ids)`: a `MatchDenseExpr` over the cached query vector filtered by `id IN sres.ids`, returning `{chunk_id: cosine_score}` via ES `_score`. - New `Dealer.rerank_with_knn(...)`: term similarity from `qryr.token_similarity` plus the ES-supplied KNN score, combined with `tkweight`/`vtweight` and the existing rank-feature bonus. - New `Dealer.fetch_chunk_vectors(chunk_ids, tenant_ids, kb_ids, dim)`: on-demand vector fetch for citation use. - `Dealer.retrieval` routes Infinity → unchanged, OceanBase → existing local `rerank`, ES → new KNN-score path. `common/doc_store/es_conn_base.py`* - New `get_scores(res)` helper returning `{_id: _score}` directly from hit headers (ES doesn't surface `_score` through `get_fields`). `api/db/services/dialog_service.py` - New top-level `_hydrate_chunk_vectors(...)` helper. On ES it back-fills `ck["vector"]` from `fetch_chunk_vectors` right before `insert_citations`. No-op on Infinity / OB (their chunks already carry vectors). - Both `decorate_answer` closures became `async` and are `await`-ed at all call sites in `async_chat` and `async_ask`. 
## Backend behavior

| Backend | Returns chunk vec in main search | Sim source | Vectors for citations |
|---|---|---|---|
| ES | No | second KNN call (`_score`) merged with term sim | fetched on demand |
| Infinity | No (unchanged) | normalized `_score` | already on chunks |
| OceanBase | Yes (kept) | local hybrid rerank | already on chunks |

## Test plan	2026-05-18 14:21:56 +08:00
Rene Arredondo	9f2fb4611f	Fix: guard empty/whitespace embedding inputs in LLMBundle (#14428 ) (#14924 ) Closes #14428 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 14:11:54 +08:00
Idriss Sbaaoui	e98f3e5c0d	Fix session deletion leaking chat-upload blobs (#14969 ) ### What problem does this PR solve? This fixes a bug where files uploaded in chat were left in storage after the session was deleted. It now removes those chat-uploaded blobs during session deletion. fixes #14965 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 11:14:27 +08:00
wdeveloper16	14c0985182	feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767) Closes #14753 ## What changed

| File | Change |
|---|---|
| `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` |
| `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` |
| `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix legs |
| `CLAUDE.md` | dev setup command + requirements note updated |
| `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` → `from enum import StrEnum` |
| `agent/tools/code_exec.py` | same |

`StrEnum` has been in the stdlib since Python 3.11, so the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - "graspologic hard-blocks 3.13": the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - "free-threading throughput gains for I/O-bound workload": Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.	2026-05-15 14:40:53 +08:00
plind	c9622d0924	fix(agentbot): aggregate structured output in non-streaming completions (#14848 ) ## What problem does this PR solve? Closes #13384. The `/api/v1/agentbots/<agent_id>/completions` non-streaming path returned the first yielded SSE chunk and exited: ```python async for answer in agent_completion(objs[0].tenant_id, agent_id, **req): return get_result(data=answer) ``` That meant structured output, the full assistant message, and reference data were all dropped when an agent was called with `stream=false`. Streaming worked because each event was forwarded individually; non-streaming was returning a raw SSE-formatted string from a single early event. The v1 endpoint at [`agent_api.py:1006-1050`](https://github.com/infiniflow/ragflow/blob/main/api/apps/restful_apis/agent_api.py#L1006-L1050) already handles this correctly. This PR mirrors that aggregation in the SDK beta endpoint: parse each SSE line, accumulate `content` from `message` events, merge `reference`, collect `outputs.structured` from each `node_finished` event keyed by `component_id`, and attach all of them to the final response. ## Type of change - [x] Bug fix (non-breaking change which fixes an issue) ## Test plan - [ ] Build an agent with a node that emits structured output, call `POST /api/v1/agentbots/<agent_id>/completions` with `stream=false` and a beta API token, verify `data.structured.<component_id>` is present in the response. - [ ] Same agent with `stream=true` — verify behavior is unchanged. - [ ] Agent without structured output — verify `data.structured` is omitted, `content` and `reference` still aggregated correctly.	2026-05-15 12:42:33 +08:00
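The aggregation this PR describes can be sketched as below. The wire format is an assumption: each line is taken to be `data: {json}` with an `event` name and a `data` object, and `reference` is taken to be a dict carried on `message` events. Only the event names and the `content`, `reference`, `outputs.structured`, and `component_id` fields come from the PR text; `aggregate_sse` is a hypothetical name.

```python
import json


def aggregate_sse(lines):
    """Fold a stream of SSE lines into one non-streaming response body."""
    content_parts = []
    reference = {}
    structured = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        try:
            event = json.loads(line[len("data:"):])
        except ValueError:
            continue  # skip keep-alives and malformed payloads
        if not isinstance(event, dict):
            continue
        data = event.get("data")
        if not isinstance(data, dict):
            continue
        kind = event.get("event")
        if kind == "message":
            content_parts.append(data.get("content") or "")
            if isinstance(data.get("reference"), dict):
                reference.update(data["reference"])
        elif kind == "node_finished":
            outputs = data.get("outputs")
            if isinstance(outputs, dict) and outputs.get("structured"):
                # Key structured output by the component that produced it.
                structured[data.get("component_id")] = outputs["structured"]
    result = {"content": "".join(content_parts), "reference": reference}
    if structured:
        result["structured"] = structured  # omitted when no node emitted any
    return result
```

The important difference from the old behavior is that the loop runs to the end of the stream instead of returning on the first yielded chunk.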
Sebastion	547b8cf9d8	security: always use RestrictedUnpickler in deserialize_b64 (CWE-502) (#14803 ) ## Summary Harden `api/utils/configs.deserialize_b64` so that it always routes pickle data through the existing `RestrictedUnpickler` (`restricted_loads`) rather than falling back to bare `pickle.loads()`. - CWE-502 — Deserialization of Untrusted Data - File / function: `api/utils/configs.py` → `deserialize_b64` - Caller: `SerializedField.python_value` in `api/db/db_models.py` (invoked by Peewee whenever a pickled DB column is read) ## The issue Before this change, `deserialize_b64` consulted a `use_deserialize_safe_module` config flag that defaults to `False` and is not set anywhere in the repository: ```python use_deserialize_safe_module = get_base_config('use_deserialize_safe_module', False) if use_deserialize_safe_module: return restricted_loads(src) return pickle.loads(src) # <-- default path ``` So the default code path was unrestricted `pickle.loads()` on bytes read from a MySQL `SerializedField(serialized_type=PICKLE)` column. Any attacker who can influence those bytes (SQL injection elsewhere, compromised DB credentials, a backup restored from an untrusted source, or a compromised replication peer) can craft a pickle payload that achieves arbitrary code execution on the ragflow application server when the field is next read. Today no model in-tree instantiates a `SerializedField` with the default PICKLE type — only `JsonSerializedField` is used in practice — so the attack surface is currently latent rather than actively reachable through an HTTP endpoint. But the insecure-by-default behaviour is a sharp edge: any future field that uses the default PICKLE serialization would silently inherit RCE-on-read semantics. 
## The fix ```diff - use_deserialize_safe_module = get_base_config( - 'use_deserialize_safe_module', False) - if use_deserialize_safe_module: - return restricted_loads(src) - return pickle.loads(src) + return restricted_loads(src) ``` `restricted_loads` is the existing `RestrictedUnpickler` already defined in the same file, which limits permitted modules to `numpy` and `rag_flow`. The config flag (and the now-dead `get_base_config` import) are removed. Diff is 1 insertion / 6 deletions, scoped to a single function. ## Testing - Built a malicious pickle whose `__reduce__` resolves to `posix.system('id')`. Pre-fix: executes. Post-fix: `restricted_loads` raises `UnpicklingError: global 'posix.system' is forbidden`. - Round-tripped a benign `numpy.ndarray` through `serialize_b64` → `deserialize_b64`. Values preserved bit-for-bit. - Confirmed `use_deserialize_safe_module` is not set in any config file in the tree, so removing the flag does not change any operator-facing knob that was actually in use. ## A note on `restricted_loads` itself The existing `SECURITY.md` notes that `restricted_loads`'s `numpy` allow-list can still be reached via `numpy.f2py.diagnose.run_command`. This PR does not attempt to fix that — it is a separate hardening question about tightening the allow-list to specific symbols rather than whole modules. The change here strictly improves on the status quo (bare `pickle.loads`) and brings the default path in line with what the `restricted_loads` helper was clearly designed for. Happy to follow up with a separate PR narrowing the allow-list if that direction is welcome. ## Adversarial review Before submitting, we tried to argue this finding away. 
The two strongest objections are (1) "no field uses PICKLE today, so this is unreachable" — true, but the default behaviour of a security-sensitive helper still matters because new fields silently inherit it; and (2) "the attacker already needs DB write access, which is game over" — partially true, but pickle-RCE meaningfully escalates data tampering into code execution on the application host (filesystem, internal network, in-process secrets), which is not equivalent. The fix is one line of real code, has no behavioural cost for legitimate callers, and removes an insecure default. We decided it was worth filing. --- <sub>_Submitted by Sebastion — autonomous open-source security research from [Foundation Machines](https://foundationmachines.ai). Free for public repos via the [Sebastion AI GitHub App](https://github.com/marketplace/sebastion-ai)._</sub>	2026-05-15 10:58:27 +08:00
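For readers unfamiliar with the pattern, a restricted unpickler in the style described above can be sketched with the standard `pickle.Unpickler.find_class` hook. This is an illustration, not the code in `api/utils/configs.py`: the allow-list below contains `collections` purely for demonstration, whereas the PR states the real list permits `numpy` and `rag_flow`.

```python
import io
import pickle

# Illustrative allow-list only; the PR says the real one is numpy and rag_flow.
ALLOWED_MODULE_ROOTS = {"collections"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve globals outside the allow-list."""

    def find_class(self, module, name):
        if module.split(".")[0] in ALLOWED_MODULE_ROOTS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Drop-in replacement for pickle.loads that goes through the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers and scalars never trigger `find_class`, so benign data round-trips unchanged, while a payload whose `__reduce__` points at something like `os.system` fails at load time. As the PR itself notes, allowing whole modules is coarser than allowing specific symbols.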
buua436	58819f5d3e	fix: add document download endpoint and refactor existing download function (#14927 ) ### What problem does this PR solve? add document download endpoint and refactor existing download function ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-15 09:36:58 +08:00
wdeveloper16	a98994ff91	fix: close db connections reliably in test_db_connection (#14777 ) ## Summary - Fixes resource-management bugs in the `POST /agents/test_db_connection` endpoint where database connections could be left open on error (part of #14750) ## Changes - `api/apps/restful_apis/agent_api.py` — `test_db_connection`: - mysql / mariadb / oceanbase / postgres: replaced bare `db.connect()` / `db.close()` fallthrough with `with db.connection_context()` and a probe `SELECT 1` — guaranteed close on both success and exception - mssql: nested `try/finally` blocks so `cursor.close()` and `db.close()` are always called even when `cursor.execute()` raises - trino: wrapped cursor ops in `try/finally` for the same reason - Removed the `if req["db_type"] != "mssql": db.connect(); db.close()` shared fallthrough block — each branch now owns its teardown - Consolidated to a single `return get_json_result(...)` after the if/elif chain	2026-05-14 16:45:44 +08:00
dale053	bd99a22661	fix: atomic chunk/token counter updates for documents and knowledge b… (#14867 ) ### What problem does this PR solve? Fixes #14866. Previously, `DocumentService.increment_chunk_num` and `decrement_chunk_num` updated the `Document` row and its parent `Knowledgebase` row in two separate, non-transactional statements. If the second update failed (DB error, connection drop, etc.) after the first one succeeded, the document and knowledge base chunk/token counters would drift apart and stay inconsistent. There was also a behavioral asymmetry between the two methods: - `increment_chunk_num` only logged a warning when the document row was missing and returned a value that callers usually treated as success. - `decrement_chunk_num` raised `LookupError` in the same situation. This PR makes the counter updates atomic and aligns the missing-document behavior between the two methods: - Wrap the `Document` and `Knowledgebase` updates in `increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()` block so both succeed or both roll back together. - Raise `LookupError` from `increment_chunk_num` when the target document no longer exists, matching `decrement_chunk_num`. - Update `reset_document_for_reparse` in `document_api_service.py` to catch the new `LookupError` and return a proper "Document not found!" API error instead of propagating the exception. No schema changes, no API contract changes for the success path; only the failure mode for a missing document during reparse is now a clean error response instead of an uncaught exception. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 14:48:52 +08:00
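The both-or-neither behavior can be illustrated with the standard `sqlite3` module standing in for the project's Peewee `DB.atomic()` block. The table layout and function names are hypothetical, and the second update is made to fail via a missing knowledge base row purely to demonstrate the rollback.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one document and one knowledge base."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, chunk_num INTEGER)")
    conn.execute("CREATE TABLE knowledgebase (id TEXT PRIMARY KEY, chunk_num INTEGER)")
    with conn:
        conn.execute("INSERT INTO document VALUES ('d1', 0)")
        conn.execute("INSERT INTO knowledgebase VALUES ('k1', 0)")
    return conn


def increment_chunk_num(conn, doc_id, kb_id, delta):
    """Update both counters in one transaction: commit together or roll back together."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE document SET chunk_num = chunk_num + ? WHERE id = ?",
            (delta, doc_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"Document {doc_id} not found")
        cur = conn.execute(
            "UPDATE knowledgebase SET chunk_num = chunk_num + ? WHERE id = ?",
            (delta, kb_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"Knowledge base {kb_id} not found")


def get_chunk_num(conn, table, row_id):
    """Read a counter; `table` is a fixed literal supplied by this demo only."""
    row = conn.execute(f"SELECT chunk_num FROM {table} WHERE id = ?", (row_id,)).fetchone()
    return row[0]
```

If the second statement fails, the first update is undone as well, so the document and knowledge base counters cannot drift apart.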
Ethan T.	ba8cb9dd4a	fix: replace mutable default arguments with None in LLM chat models (#13513 ) ## Summary - Replace `gen_conf={}` with `gen_conf=None` + guard in `rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat, LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat classes) - Replace `doc_ids=[]` with `doc_ids=None` + guard in `api/db/services/document_service.py` (1 instance) - Mutable default arguments are shared across all calls, causing potential cross-request state contamination - See Python docs: https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects ## Test plan - [x] Verify LLM calls work with and without explicit gen_conf - [x] No behavior change for existing callers — `None` is replaced with `{}` at function entry 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 14:46:47 +08:00
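The shared-default problem is easy to reproduce. The two functions below are hypothetical and only mimic the shape of a chat method that mutates its `gen_conf` argument.

```python
def chat_buggy(message, gen_conf={}):
    """The default dict is created once and shared by every call."""
    gen_conf.setdefault("temperature", 0.1)
    gen_conf["last_message"] = message
    return gen_conf


def chat_fixed(message, gen_conf=None):
    """A fresh dict is created per call when the caller passes nothing."""
    if gen_conf is None:
        gen_conf = {}
    gen_conf.setdefault("temperature", 0.1)
    gen_conf["last_message"] = message
    return gen_conf


first = chat_buggy("request A")
second = chat_buggy("request B")
print(first is second)          # True: both calls returned the same dict
print(first["last_message"])    # "request B": state leaked between calls
```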
dale053	714f777fa0	Fix: missing authentication on agent file upload and download endpoints (#14854 ) ### What problem does this PR solve? Closes #14853 The `/agents/download` and `/agents/<agent_id>/upload` endpoints in the agent API are missing `@login_required` and `@add_tenant_id_to_kwargs` decorators, allowing unauthenticated access. This is a security issue — any user can upload files to or download files from an agent without being logged in. Additionally, the upload endpoint bypasses canvas access control (`@_require_canvas_access_async`). This PR adds the missing authentication and authorization decorators to both endpoints and replaces the manual `user_id` / `created_by` lookups with the `tenant_id` provided by the auth middleware, making these endpoints consistent with the rest of the agent API. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 13:48:41 +08:00
Ricardo-M-L	48b4aa3e93	Fix WebDriver resource leak in HTML-to-PDF conversion (#14310 ) ### What problem does this PR solve? In `api/utils/web_utils.py`, `__get_pdf_from_html()` creates a Chrome WebDriver but only calls `driver.quit()` inside the `TimeoutException` handler. If the page element becomes stale before the timeout (no exception raised), the WebDriver is never quit, leaking the Chrome browser process and returning `None`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Changes - Move the PDF printing logic and `driver.quit()` outside the `except` block so they execute on all code paths - Use `try/finally` to ensure `driver.quit()` is always called, even if the `Page.printToPDF` DevTools call fails Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 13:28:58 +08:00
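The `try/finally` shape described in 48b4aa3e93 can be sketched without Selenium. `FakeDriver`, its `print_pdf` method, and `render_pdf` are hypothetical stand-ins for the real WebDriver and `__get_pdf_from_html()`; only the cleanup pattern is the point.

```python
class FakeDriver:
    """Hypothetical stand-in for a browser driver that must always be quit."""

    def __init__(self, fail=False):
        self.fail = fail
        self.quit_called = False

    def print_pdf(self):
        # Simulates the PDF-printing call, which may raise.
        if self.fail:
            raise RuntimeError("printing failed")
        return b"%PDF-"

    def quit(self):
        self.quit_called = True


def render_pdf(driver):
    try:
        return driver.print_pdf()
    finally:
        # Runs on success and on any exception, so the browser process
        # is released on every code path.
        driver.quit()


ok_driver = FakeDriver()
pdf_bytes = render_pdf(ok_driver)

bad_driver = FakeDriver(fail=True)
try:
    render_pdf(bad_driver)
except RuntimeError:
    pass  # the error still propagates, but quit() has already run
```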
Br1an	d46bbd30f7	Fix: send input and output token usage to Langfuse (#13294 ) ### What problem does this PR solve? Closes #9837 The Langfuse integration currently only sends the output text to `langfuse_generation.update()` without including token usage information. This means Langfuse cannot track input/output token consumption for cost analysis and monitoring. ### Solution Add the `usage` parameter to `langfuse_generation.update()` with: - `input`: approximate input token count from `message_fit_in()` - `output`: approximate output token count from `num_tokens_from_string(answer)` - `total`: sum of input and output ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 13:11:37 +08:00
buua436	b89878c593	Fix: dataset document download route (#14910 ) ### What problem does this PR solve? dataset document download route ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 10:59:06 +08:00
plind	dd76653dc1	feat: add tag management for Agents with filtering and sorting (#14774 ) (#14799 ) ## Summary Closes #14774. Adds free-form tags on agents (UserCanvas) with full UI + API: - Stored as comma-separated `tags` column on `UserCanvas` with online migration. - New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT /v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query. - "Edit tags" item in agent dropdown opens a chip-style editor dialog; tags render as badges on each agent card. - New "Tags" facet in the agents filter bar, with counts. ## Implementation notes - Tag matching is exact-token: the SQL filter wraps stored tags as `,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`. - Server-side normalization in `UserCanvasService.update_tags`: dedup (case-insensitive), per-tag cap of 64 chars, total length capped at 512 chars to fit the column, commas inside tag values are replaced with spaces. - Tenant authorization: `PUT /v1/agent/<id>/tags` gates on `UserCanvasService.accessible(canvas_id, tenant_id)`. - Tag listing scope: `UserCanvasService.list_tags` follows the same own + team-shared rule as `get_by_tenant_ids`. - i18n: keys added to `en.ts` and `zh.ts` only (per project convention; other locales fall back). - `HomeCard` gets a non-breaking `extra?: ReactNode` slot for the chip row; no `src/components/ui/` files modified. ## Test plan - [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column exists (`DESCRIBE user_canvas`). - [ ] Agents page renders cards normally (no console error from missing field). - [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog was unmounting with the dropdown). - [ ] Typing a tag without pressing Enter and clicking Save persists it (regression: last typed tag was being dropped). - [ ] Chip input supports Enter/comma to commit, Backspace on empty to remove, `×` to remove individual chip. 
- [ ] Tag containing a comma sent via API is stored with the comma replaced by a space. - [ ] 20 long tags sent via API does not error (length cap silently truncates). - [ ] "Tags" filter in the filter bar shows counts and narrows the list. - [ ] Filtering by `ml` does not return agents tagged `ml-ops`. - [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc. - [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or no permission.`	2026-05-13 21:41:32 +08:00
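The tag normalization and exact-token matching rules listed in dd76653dc1 can be sketched in plain Python. This is an assumption-laden illustration: `normalize_tags` and `has_tag` are hypothetical names, the real `UserCanvasService.update_tags` may order or truncate differently, and the real filter runs in SQL rather than in Python.

```python
def normalize_tags(tags, max_tag_len=64, max_total_len=512):
    """Hypothetical sketch: dedup case-insensitively, cap lengths, strip commas."""
    seen = set()
    kept = []
    total = 0
    for raw in tags:
        # Commas would break the comma-separated storage format.
        tag = raw.replace(",", " ").strip()[:max_tag_len]
        key = tag.lower()
        if not tag or key in seen:
            continue
        added = len(tag) + (1 if kept else 0)  # +1 for the joining comma
        if total + added > max_total_len:
            break  # silently drop tags that no longer fit the column
        seen.add(key)
        kept.append(tag)
        total += added
    return ",".join(kept)


def has_tag(stored, wanted):
    # Wrap both sides in commas so "ml" cannot match inside "ml-ops".
    return f",{wanted}," in f",{stored},"


stored = normalize_tags(["ML", "ml", "a,b"])
long_stored = normalize_tags([str(i) + "x" * 100 for i in range(20)])
```

The comma wrapping is what makes the match token-exact: a plain substring test on `ml` would also hit `ml-ops`.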
Ethan T.	8c5845f6ca	fix: use context manager for pdfplumber to prevent resource leak (#13512 ) ## Summary - Convert `pdfplumber.open()` to use `with` context manager in `api/utils/file_utils.py` (`thumbnail_img` function) - If any exception occurs between `open()` and `close()`, the PDF file handle leaks - The rest of the codebase (e.g. `read_potential_broken_pdf` in the same file) already uses `with pdfplumber.open(...)` correctly ## Test plan - [x] PDF thumbnail generation works correctly with context manager - [x] Resources properly cleaned up on exceptions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-13 21:09:51 +08:00
Ahmad Intisar	e994051eb9	Feature/generic api connector (#13545 ) # feat: Add Generic REST API Connector ## What problem does this PR solve? RAGFlow supports many specific data source connectors (MySQL, Slack, Google Drive, etc.), but there was no way to connect an arbitrary REST API as a data source. Users with custom or third-party APIs had to write a new connector class for each one. This PR adds a generic, configuration-driven REST API connector that lets users connect any REST API as a data source entirely through the UI — no code changes needed per API. --- ## Features ### Core Connector (`common/data_source/rest_api_connector.py`) - Implements `LoadConnector` and `PollConnector` interfaces for full and incremental sync - Configurable authentication: None, API Key (custom header), Bearer Token, Basic Auth - Pluggable pagination: Page-based, Offset-based, Cursor-based, or None - Smart page-size inference from user's query parameters to avoid duplicate/conflicting params - Configurable request delay between pages to prevent API rate limiting - Auto-detection of the items array in JSON responses (`items`, `results`, `data`, `records`, or first list found) - Advanced field mapping with dot-notation (`country.name`), array wildcards (`newsType[].name`), type hints, and default values - Optional content template rendering (`"Title: {title}\nBody: {body}"`) - HTML stripping for content fields - Stable document IDs via `hash128` from a configurable ID field or auto-generated from item content - Pydantic configuration schema with automatic coercion of UI string inputs to dicts/lists ### Backend Registration (`rag/svr/sync_data_source.py`, `common/constants.py`, `common/data_source/config.py`) - `REST_API` sync class wired into RAGFlow's `func_factory` - Full sync (`load_from_state`) and incremental polling (`poll_source`) support - Credentials and config passed from task to connector following existing patterns (MySQL, SeaFile, etc.) 
### Test Connection Endpoint (`api/apps/connector_app.py`) - `POST /v1/connector/<id>/test` validates config schema, authentication, and API connectivity without triggering a sync - Clear error messages for auth failures vs. config issues ### Frontend UI (`web/src/pages/user-setting/data-source/constant/`) - Postman-style configuration: Base URL, Query Parameters (key=value per line), Auth, Content Fields, Metadata Fields, Pagination Type - Auth-type-aware form: fields for API key header/value, Bearer token, or Basic username/password appear only when relevant - Advanced Settings toggle for: Custom Headers, Max Pages, Request Delay, Poll Timestamp Field, Request Body (POST) - Connector icon (SVG) and i18n strings (English) - "Test Connection" button to validate before syncing --- ## Controls & Safety - Configurable max pages safety cap (default: 1000, adjustable in UI) - Configurable request delay between pages (default: 0.5s, adjustable in UI) - Auth errors (401/403) fail immediately without retries; transient errors retry with exponential backoff - Diagnostic logging: auth setup confirmation, request details on failure, content field extraction status --- ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Visual Screenshots of Features <img width="482" height="510" alt="Screenshot 2026-03-11 at 5 19 52 PM" src="https://github.com/user-attachments/assets/dcb7ab4a-1622-44f3-bb02-d6f0527314c4" /> (Connector can be configured within the external data sources tab) Configuration Parameters: <img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 46 PM" src="https://github.com/user-attachments/assets/5e154e71-4ab5-4872-bfb2-04f02b73c18a" /> <img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 54 PM" src="https://github.com/user-attachments/assets/00cb14b7-0bcf-4b94-9d71-34e93369ecb2" /> Connection can be tested before attaching to dataset: <img width="981" height="681" alt="Screenshot 2026-03-11 at 5 21 40 PM" 
src="https://github.com/user-attachments/assets/aaa6eeeb-89a7-4349-bc34-2423bf8be9ee" /> Ingestion tested with API connector (works perfectly fine): <img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 22 30 PM" src="https://github.com/user-attachments/assets/afcd0d58-cadd-4152-badc-d2f14d96fbec" /> Search & Retrieval works as well with metadata flow: <img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 23 05 PM" src="https://github.com/user-attachments/assets/d41ee935-dcf7-4456-b317-22a76ca032c0" /> --------- Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-13 20:35:01 +08:00
jony376	7f699d1202	Fix: enforce tenant authorization for `tenant_rerank_id` in retrieval flows (#14782 ) ### Related issues Closes #14781 ### What problem does this PR solve? Some retrieval endpoints accepted caller-supplied `tenant_rerank_id` and resolved it through `get_model_config_by_id(...)`. That helper loaded `TenantLLM` rows by global database id and returned decoded model configuration without checking whether the model belonged to the authenticated tenant or the dataset owner tenant. This meant dataset access was validated, but rerank-model selection was not. A caller who knew or could guess another tenant's `tenant_rerank_id` could attempt retrieval with a foreign rerank model config, creating a cross-tenant authorization gap for model usage. This PR closes that gap by making `tenant_rerank_id` resolution tenant-aware across the retrieval paths that accept it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Solution - Extend `get_model_config_by_id(...)` to accept an optional `allowed_tenant_ids` set and reject `TenantLLM` rows whose `tenant_id` is outside that set. - Pass the allowed tenant scope from retrieval endpoints that accept `tenant_rerank_id`: - `api/apps/sdk/doc.py` - `api/apps/sdk/session.py` - `api/apps/services/dataset_api_service.py` - Use the authenticated tenant plus dataset-owner tenant ids already derived by each retrieval flow as the authorization boundary for rerank model selection. - Add focused unit coverage to assert unauthorized `tenant_rerank_id` values are rejected and that the allowed tenant set is propagated correctly. 
### Testing - `python -m py_compile` on: - `api/db/joint_services/tenant_model_service.py` - `api/apps/services/dataset_api_service.py` - `api/apps/sdk/doc.py` - `api/apps/sdk/session.py` - Added unit tests in: - `test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py` - `test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py` ### Notes for reviewers - This change is intentionally narrow: it affects only the `tenant_rerank_id` path, not the normal `rerank_id` name-based resolution path. - Local lint/syntax checks passed. - Full pytest execution could not be completed in this environment because the local test runtime is missing `strenum`, so the route-test files fail during collection before exercising the updated cases. --------- Co-authored-by: jony376 <jony376@gmail.com>	2026-05-13 19:53:08 +08:00
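The tenant-aware lookup described in 7f699d1202 can be sketched with an in-memory table. Everything here is hypothetical: `TENANT_LLM_ROWS` stands in for the `TenantLLM` table, and `resolve_model_config` only mirrors the idea of an optional `allowed_tenant_ids` check; the real `get_model_config_by_id(...)` signature and error types may differ.

```python
# Hypothetical in-memory stand-in for TenantLLM rows keyed by global database id.
TENANT_LLM_ROWS = {
    1: {"tenant_id": "tenant-a", "llm_name": "rerank-a"},
    2: {"tenant_id": "tenant-b", "llm_name": "rerank-b"},
}


def resolve_model_config(row_id, allowed_tenant_ids=None):
    """Return the row only if its owner is inside the caller's tenant scope."""
    row = TENANT_LLM_ROWS.get(row_id)
    if row is None:
        raise LookupError(f"model config {row_id} not found")
    # When a scope is supplied, a row owned by any other tenant is rejected
    # instead of being returned by global id alone.
    if allowed_tenant_ids is not None and row["tenant_id"] not in allowed_tenant_ids:
        raise PermissionError(f"model config {row_id} is outside the allowed tenants")
    return row


own_config = resolve_model_config(1, allowed_tenant_ids={"tenant-a"})

try:
    resolve_model_config(2, allowed_tenant_ids={"tenant-a"})
    foreign_rejected = False
except PermissionError:
    foreign_rejected = True
```

Keeping the scope optional preserves existing callers that do not pass it, while retrieval endpoints can pass the authenticated tenant plus dataset-owner tenants as the boundary.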
Wang Qi	f3b3596c29	Speed up ragflow server (#14894 ) ### What problem does this PR solve? Speed up ragflow server ### Type of change - [ ] Refactoring	2026-05-13 18:01:33 +08:00
buua436	8cb2bf04fb	Fix: llm add api key overridden (#14885 ) ### What problem does this PR solve? llm add api key overridden ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-13 17:15:32 +08:00
Wang Qi	ff685d3131	Delete duplicate route (#14883 ) ### What problem does this PR solve? The delete `/graph` route duplicates `/datasets/<dataset_id>/<index_type>`, so this PR removes it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-13 15:57:44 +08:00
Wang Qi	45d676bc05	Fix delete graphrag not take effect in UI (#14879 ) ### What problem does this PR solve? Fixes deleting GraphRAG not taking effect in the UI. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-13 13:49:16 +08:00
Wang Qi	64bd0130d3	Add REST API backward compatibility (#14872 ) ### What problem does this PR solve? Add REST API backward compatibility ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-13 11:44:40 +08:00
dale053	5a5e766386	fix(api): authorize owner_ids for list chats and search apps (#14775 ) Closes #14768 ### What problem does this PR solve? The `list_chats` and `list_searches` REST API endpoints did not enforce authorization on the `owner_ids` query parameter. Any authenticated user could pass arbitrary tenant IDs to `owner_ids` and retrieve chats or search apps belonging to other tenants they are not a member of. This PR resolves the issue by: 1. Looking up the current user's authorized tenants via `TenantService.get_joined_tenants_by_user_id` and rejecting any `owner_ids` that fall outside that set. 2. When no `owner_ids` are provided, scoping the query to only the user's authorized tenants instead of returning an unfiltered result. 3. Adding unit tests that verify unauthorized `owner_ids` are rejected with `OPERATING_ERROR`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-13 09:43:44 +08:00
CaptainTimon	2717ee283f	feat(raptor): add Psi tree builder with original-space ranking and safe migration (#14679 ) ### What problem does this PR solve? Closes #14674. This PR improves RAPTOR configuration and tree construction while preserving the existing RAPTOR behavior as the default. RAPTOR currently builds summary layers with the original UMAP + GMM clustering path. This PR keeps that default path, and adds: - A hidden backend tree-builder option: - `tree_builder="raptor"`: default, existing RAPTOR behavior. - `tree_builder="psi"`: rank-aware Psi-style tree builder using original embedding-space cosine ranking. - A user-facing clustering method option for the default RAPTOR builder: - `clustering_method="gmm"`: existing default. - `clustering_method="ahc"`: agglomerative hierarchical clustering path. - A RAPTOR UI setting for `Clustering method` and `Max cluster`. ### What changed #### Backend - Added `tree_builder` support for RAPTOR/Psi. - Added `clustering_method` support for GMM/AHC. - Kept existing RAPTOR + GMM as the default. - Added Psi tree building from original-space cosine similarity. - Added bucketed Psi building controls for large inputs: - `raptor.ext.psi_exact_max_leaves` - `raptor.ext.psi_bucket_size` - Added method-aware RAPTOR summary metadata using existing `extra.raptor_method`. - Avoided adding a dedicated DB schema field for experimental method tracking. - Added cleanup/migration logic to avoid mixing stale RAPTOR summary trees. - Added defensive checks for Psi tree construction and summary failures. #### Frontend/UI - Added `Clustering method` in RAPTOR settings with `GMM` and `AHC`. - Added/kept `Max cluster` in RAPTOR settings. - Enlarged max cluster UI limit to `1024`, matching backend validation. - Kept AHC editable even when a RAPTOR task has already finished. 
- Fixed the UI save payload so `clustering_method` and `tree_builder` are serialized through `parser_config.raptor.ext`, avoiding backend validation errors for extra top-level RAPTOR fields. Example saved RAPTOR config: ```json { "raptor": { "max_cluster": 317, "ext": { "clustering_method": "ahc", "tree_builder": "raptor" } } } ``` Co-authored-by: CaptainTimon <CaptainTimon@users.noreply.github.com>	2026-05-12 09:42:31 +08:00
黄圣祺	415169d497	fix(dify): add GET method support to /dify/retrieval for health check (#13837 ) ## Summary - Add GET method handler to `/api/v1/dify/retrieval` endpoint for Dify external knowledge base connectivity verification - GET requests return a simple success response; POST requests retain existing retrieval logic unchanged ## Problem When Dify integrates with RAGFlow as an external knowledge base, it sends periodic GET requests to the retrieval endpoint for health/connectivity checks. The endpoint only accepted POST, causing werkzeug to return `405 Method Not Allowed`. After several successful POST retrievals, the failing GET health checks trigger Dify's circuit breaker, causing all subsequent requests to fail. Traceback from the issue: ``` werkzeug.exceptions.MethodNotAllowed: 405 Method Not Allowed: The method is not allowed for the requested URL. ``` ## Changes - `api/apps/sdk/dify_retrieval.py`: Added a separate GET route handler (`retrieval_health_check`) that returns `get_json_result(data=True)` ## Test plan - [ ] Verify `GET /api/v1/dify/retrieval` returns `{"code": 0, "message": "success", "data": true}` - [ ] Verify `POST /api/v1/dify/retrieval` with valid API key and body still works as before - [ ] Verify Dify external knowledge base integration no longer returns 405 errors Closes #13788 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Asksksn <Asksksn@noreply.gitcode.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-12 09:37:07 +08:00
tmimmanuel	663fc1d42c	fix(opensearch): implement doc-meta dispatch surface on OSConnection (#14577 ) ### What problem does this PR solve? Fixes #14570. On OpenSearch backends (`DOC_ENGINE=opensearch`) every document-metadata write failed with `'OSConnection' object has no attribute 'create_doc_meta_idx'`, so both `PATCH /api/v1/datasets/{ds}/documents/{doc}` with `meta_fields` and `POST /api/v1/datasets/{ds}/metadata/update` were unusable while every other document operation (retrieval, parsing, name update, chunk management) worked correctly on the same OpenSearch cluster. The bug runs deeper than the missing method name in the error message suggests. `DocMetadataService` also reached into `settings.docStoreConn.es.*` directly for the index refresh, the scripted partial update, and the count call, which means that even after adding `create_doc_meta_idx` to `OSConnection` the very next call in the same metadata flow would still raise `AttributeError` because `OSConnection` exposes `self.os` rather than `self.es`. Fixing only the reported symptom would have moved the failure one line down without restoring the feature. This PR adds a uniform document-metadata dispatch surface to both connection classes so they present the same abstract API, and routes the service layer through that surface via `getattr` guards instead of poking at backend-specific attributes. The four new methods on `OSConnection` and `ESConnectionBase` are `create_doc_meta_idx`, `refresh_idx`, `count_idx`, and `replace_meta_fields`. `OSConnection.create_doc_meta_idx` reuses the existing `conf/doc_meta_es_mapping.json` schema in the OpenSearch `body=` form because OpenSearch and Elasticsearch share the same index-creation payload, and `replace_meta_fields` emits a full scripted assignment (`ctx._source.meta_fields = params.meta_fields`) on both backends so removed keys actually disappear instead of being preserved by deep-merge semantics. 
The `getattr`-guarded dispatch in `DocMetadataService` keeps the existing fall-through paths intact for Infinity and OceanBase, which continue to rely on their search-based count fallback and on the delete-then-insert metadata replacement they used before, so this change is strictly additive for those two backends. Verification: `pytest test/unit_test/rag/utils/test_opensearch_doc_meta.py` runs 16 new unit tests that pass locally and pin the `OSConnection` dispatch surface, the `create_doc_meta_idx` short-circuit when the index already exists, the mapping-file payload routing, the `IndicesClient.create` failure path, the `refresh_idx` and `count_idx` success and error sentinels, and the full-assignment script emitted by `replace_meta_fields`. The test module stubs `common.settings` and `rag.nlp` at import time so the suite runs without the heavy backend SDKs that the rest of the repository pulls in transitively. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>	2026-05-11 17:04:28 +08:00
box4wangjing	292b0b8bce	chore: fix some comments to improve readability (#14756 ) ### What problem does this PR solve? fix some comments to improve readability ### Type of change - [x] Documentation Update --------- Signed-off-by: box4wangjing <box4wangjing@outlook.com>	2026-05-11 16:48:48 +08:00
Sank	592dba1489	Refact: Added a private helper _visibility_and_status_filter (#13627 ) ### What problem does this PR solve? Added a private helper _visibility_and_status_filter(joined_tenant_ids, user_id) that returns the Peewee condition: visible to user (team or own) and status is VALID. ### Type of change - [x] Refactoring --------- Co-authored-by: Serobabov Aleksandr <40SerobabovAS@region.cbr.ru> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-05-11 15:21:41 +08:00
tmimmanuel	6ce014c23b	fix: offload blocking DB/Redis calls to thread pool for high-concurrency support (#13825 ) (#13941 ) ### What problem does this PR solve? Addresses event-loop blocking under high concurrency reported in #13825. When multiple requests hit the API simultaneously, synchronous DB/Redis calls block the async event loop, preventing Quart from handling other requests and causing cascading 502/504 timeouts. This PR wraps all remaining blocking DB/Redis calls in `canvas_app.py`, `chat_api.py`, `session.py`, and `canvas_service.py` with `await thread_pool_exec()` - Offload all synchronous `Service.`, `REDIS_CONN.`, and `APIToken.query` calls to the thread pool - Convert sync endpoint handlers (`list_chats`, `get_chat`, `templates`, `sessions`, etc.) to `async def` - Convert sync helper functions (`_ensure_owned_chat`, `_validate_llm_id`, `_validate_dataset_ids`, etc.) to async - no duplicate sync/async pairs - Wrap `CanvasReplicaService` Redis IO calls (`bootstrap`, `replace_for_set`, `commit_after_run`) - Use `asyncio.gather()` for concurrent file uploads and chat response building Note: This fixes the code-level event-loop blocking, which is a prerequisite for handling concurrent requests. For the full "30 concurrent requests without 502/504" goal described in the issue, users should also tune deployment config: - `WS=4` or higher (HTTP worker processes, default 1) - `MAX_CONCURRENT_CHATS=50` (default 10) - `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` for workflow-heavy workloads ### Performance verification Reviewer asked for a before-vs-after comparison ([comment](https://github.com/infiniflow/ragflow/pull/13941#issuecomment-4393667231)). 
I built a self-contained microbenchmark that reproduces the exact failure mode this PR targets: an async handler that performs blocking DB/Redis-style calls (50 ms each, 3 per request, 30 concurrent requests) is run twice — once with the pre-PR pattern (sync call directly inside the async handler) and once with the post-PR pattern (`await thread_pool_exec(...)`). The benchmark imports nothing from RAGFlow except `thread_pool_exec` itself, so it is hermetic and reproducible (`THREAD_POOL_MAX_WORKERS=128`, Python 3.13.12). Throughput — wall-clock for 30 concurrent requests (lower is better) \| flavour \| wall(s) \| p50(s) \| p95(s) \| max(s) \| \|---\|---:\|---:\|---:\|---:\| \| before \| 4.986 \| 0.158 \| 0.207 \| 0.269 \| \| after \| 0.248 \| 0.181 \| 0.230 \| 0.231 \| The pre-PR handler serializes the entire load on the event-loop thread, so 30 × 3 × 50 ms ≈ 4.5 s shows up as the wall time. The post-PR handler parallelizes the blocking work across the thread pool and finishes the same load in 248 ms — a ~20× speedup on this workload. Event-loop responsiveness — latency of an unrelated probe coroutine while the 30 slow requests are running (lower is better) \| flavour \| samples \| probe p50 (ms) \| probe p95 (ms) \| probe max (ms) \| \|---\|---:\|---:\|---:\|---:\| \| before \| 1 \| 5442.26 \| 5442.26 \| 5442.26 \| \| after \| 28 \| 0.88 \| 11.53 \| 98.02 \| This is the metric that maps directly to "the API still answers other requests while one is busy". A 5 ms-interval probe was scheduled while the 30 slow handlers ran. With the pre-PR code the event loop was frozen for the entire duration of the blocking work, so only one probe sample was ever picked up and it waited 5,442 ms. After the PR, 28 probe samples landed with p50 0.88 ms / p95 11.53 ms, meaning unrelated requests are no longer starved by the slow ones. That is the regression mode behind the cascading 502/504s reported in #13825. 
<details> <summary>Raw benchmark output</summary> ``` config: 30 concurrent requests, 3 blocking calls of 50ms each per request, THREAD_POOL_MAX_WORKERS=128 === Throughput (lower wall is better) === flavour wall(s) p50(s) p95(s) max(s) before 4.986 0.158 0.207 0.269 after 0.248 0.181 0.230 0.231 === Event-loop responsiveness (lower probe latency is better) === flavour samples probe p50(ms) probe p95(ms) probe max(ms) before 1 5442.26 5442.26 5442.26 after 28 0.88 11.53 98.02 ``` </details> The benchmark script is included as a comment on the PR for reproducibility. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Performance Improvement Closes [#13825](https://github.com/infiniflow/ragflow/issues/13825) --------- Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-11 15:08:55 +08:00
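The before/after pattern measured in 6ce014c23b can be sketched with the standard library. This is not the author's benchmark: `asyncio.to_thread` is used here as a stand-in for RAGFlow's `thread_pool_exec`, and `blocking_query`, the 50 ms sleep, and the request count of 10 are illustrative assumptions.

```python
import asyncio
import time


def blocking_query():
    # Stand-in for a synchronous DB/Redis call.
    time.sleep(0.05)
    return "row"


async def handler_blocking():
    # Pre-fix shape: the sync call runs on the event-loop thread,
    # so concurrent handlers are served one after another.
    return blocking_query()


async def handler_offloaded():
    # Post-fix shape: the sync call runs in a worker thread and the
    # event loop stays free to schedule other handlers.
    return await asyncio.to_thread(blocking_query)


async def run_load(handler, n=10):
    start = time.perf_counter()
    results = await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start, results


blocking_wall, blocking_results = asyncio.run(run_load(handler_blocking))
offloaded_wall, offloaded_results = asyncio.run(run_load(handler_offloaded))
print(f"blocking: {blocking_wall:.3f}s  offloaded: {offloaded_wall:.3f}s")
```

Absolute timings depend on the machine and the default thread-pool size, so only the relative ordering of the two wall times should be read from this sketch.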
Paul Y Hui	a0efc453f3	Fix: safe argument guard and remove redundant redis call (#14060 ) ### What problem does this PR solve? - Moved the `if not all([email, new_pwd, new_pwd2])` guard to the top, before any decryption that could crash on a `None` value - Removed the redundant `REDIS_CONN.get()` call; one call is sufficient ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-11 15:02:24 +08:00
Ricardo-M-L	5ef7f50eef	fix: use context manager for ThreadPoolExecutor in file_service.py (#14144 ) ## Summary - Wrap 2 `ThreadPoolExecutor` instances in `file_service.py` with `with` statement - Ensures threads are properly shut down after all futures complete ## Problem `parse_docs()` (line 532) and the file processing method (line 694) create `ThreadPoolExecutor` instances that are never shut down. In a long-running server process, this leaks thread resources on every invocation — threads remain alive consuming memory even after all submitted work is complete. ## Fix Replace bare `ThreadPoolExecutor()` with `with ThreadPoolExecutor() as exe:` context manager, which calls `executor.shutdown(wait=True)` on exit. ## Test plan - [x] Verified both call sites use `with` statement after fix - [x] No remaining bare `ThreadPoolExecutor` in `file_service.py` - [x] `document_service.py:1066` is a module-level executor (different pattern, not changed in this PR) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-11 14:02:45 +08:00
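The context-manager form described in 5ef7f50eef can be sketched as below. `parse_one` and this `parse_docs` body are hypothetical simplifications, not the code in `file_service.py`; the pattern of interest is the `with` block, which shuts the executor down on exit.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_one(name):
    # Hypothetical per-document work.
    return name.upper()


def parse_docs(names):
    # Leaving the with-block calls shutdown(wait=True), so worker threads
    # are joined instead of lingering after the futures complete.
    with ThreadPoolExecutor(max_workers=4) as exe:
        futures = [exe.submit(parse_one, n) for n in names]
        return [f.result() for f in futures]


parsed = parse_docs(["a.pdf", "b.pdf"])
```

Collecting results from the futures in submission order keeps the output aligned with the input list even though the work may finish out of order.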
buua436	a03b95f8c4	Fix: shared dataset chunk index lookup (#14764 ) ### What problem does this PR solve? shared dataset chunk index lookup ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-11 13:50:08 +08:00
buua436	024c8cb0b5	Fix: dataset search rerank id type (#14759 ) ### What problem does this PR solve? issue: https://github.com/infiniflow/ragflow/issues/14748 change: dataset search rerank id type ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-11 13:48:05 +08:00
jony376	46897d6fa4	Fix: bind memory message `user_id` to authenticated user for JWT auth (#14745 ) ### Related issues Closes #14744 ### What problem does this PR solve? The Memory REST endpoint `POST /api/v1/messages` previously persisted whatever `user_id` the client sent in the JSON body. Memory rows were therefore attributed to an arbitrary string, even when the caller authenticated as a normal workspace user via JWT (browser/session-style bearer token decoded into an access token). That broke attribution and audit semantics for shared memories (team visibility): any authorized writer could spoof another subject id. The Python SDK already sends an optional `user_id` for integrations using API keys (`APIToken`) to tag an external subject distinct from the tenant owner user. ### Solution - Record `g.auth_via_api_token` in `_load_user` (`api/apps/__init__.py`): set `True` only when authentication resolves via `APIToken`, otherwise `False` after JWT-based login succeeds. - In `POST /messages` (`memory_api.add_message`): if the request was authenticated with an API key, keep accepting optional `user_id` from the body (default empty string). For JWT-authenticated users, always set stored `user_id` to `current_user.id` and ignore the client field. - Guard reads of `g` with `RuntimeError` handling so isolated imports or tests without a Quart application context do not fail when resolving `user_id`. - Document on `RAGFlow.add_message` that `user_id` is only meaningful for API-key authentication. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Testing - `python -m py_compile` on modified modules (`api/apps/__init__.py`, `api/apps/restful_apis/memory_api.py`). 
- Recommended: run web/SDK memory message tests (`test_add_message`, `test_message_routes_unit`) against a full environment with `quart` and configured services. ### Notes for reviewers - Behavior change only for callers using JWT-style authorization on `POST /messages`; API-key callers keep prior optional `user_id` semantics. Co-authored-by: jony376 <jony376@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 13:26:05 +08:00

1634 Commits