Commit Graph

5421 Commits

Author SHA1 Message Date
ef4cbe72a3 refactor(ui): adjust global navigation bar style (#13419)
### What problem does this PR solve?

Renovate global navigation bar, align styles to the design.
(May causes minor layout issues in sub-pages, will check and fix soon)

### Type of change

- [x] Refactoring
2026-03-05 20:47:29 +08:00
9e0e128ce5 Add checksum/values annotation to ragflow.yaml (#13409)
Add checksum annotation for values in ragflow.yaml

### What problem does this PR solve?

This PR is about this ticket: #13408

Ragflow helm charts do not include the Values.yaml in the list of
watched changes.
If you update the Values.yaml for an existing deployment, helm will not
detect it and not update the deployment.

This PR fixes that.

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2026-03-05 20:27:38 +08:00
963e31e9b5 Refact: Updated the doc structure. (#13414)
### What problem does this PR solve?

Updated the doc structure.

### Type of change


- [x] Documentation Update
2026-03-05 19:04:56 +08:00
d90d6026af Playwright : new chat multi model test (#13402)
### What problem does this PR solve?

new test for chat multiple model and other chat parameters under
playwright

### Type of change

- [x] Other (please describe): new test/ data-testid
2026-03-05 18:51:57 +08:00
d9785ea2ce Fix: Alibaba cloud OSS config issue (#13406)
### What problem does this PR solve?

 Alibaba Could OSS config issue #13390.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-05 18:13:45 +08:00
8b534c895e Fix: UI Placeholder and Hint Optimization (#13416)
### What problem does this PR solve?

Fix: UI Placeholder and Hint Optimization

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-05 18:13:19 +08:00
35fc5edc93 feat: Adds the tenant model ID field to the interface definition. (#13274)
### What problem does this PR solve?

feat: Adds the tenant model ID field to the interface definition

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-05 17:27:34 +08:00
62cb292635 Feat/tenant model (#13072)
### What problem does this PR solve?

Add id for table tenant_llm and apply in LLMBundle.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Liu An <asiro@qq.com>
2026-03-05 17:27:17 +08:00
47540a4147 Feat: published agent version control (#13410)
### What problem does this PR solve?

Feat: published agent version control

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-03-05 17:26:39 +08:00
8c9b080499 fix: update axios to 1.13.5+ to remediate CVE-2026-25639 DoS vulnerability (#13380)
### What problem does this PR solve?

This PR remediates CVE-2026-25639, a HIGH severity Denial of Service
vulnerability in axios caused by __proto__ pollution in the mergeConfig
function. The vulnerability affects both the web frontend and the
sandbox nodejs environment.

Trivy security scan identified axios versions below 1.13.5 as
vulnerable. This PR updates axios to secure versions (1.13.6 in web,
1.13.5 in sandbox) to eliminate the security risk.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-05 17:26:04 +08:00
f13a1fb007 Refa: improve model verification ux (#13392)
### What problem does this PR solve?

Improve model verification UX. #13395 

### Type of change

- [x] Refactoring

---------

Co-authored-by: Liu An <asiro@qq.com>
2026-03-05 17:23:47 +08:00
3124fa955e chore: add bin and internal dirs to .gitignore for Go server build output (#13407)
### What problem does this PR solve?

add bin and internal dirs to .gitignore for Go server build output
2026-03-05 15:52:01 +08:00
e1f1184b01 test: add unit tests for graphrag/utils.py (87 test cases) (#13328)
Add comprehensive unit tests for `graphrag/utils.py`, covering 15
functions/classes with 87 test cases.

Tested functions:
- clean_str, dict_has_keys_with_types, perform_variable_replacements
- get_from_to, compute_args_hash, is_float_regex
- GraphChange dataclass
- handle_single_entity_extraction, handle_single_relationship_extraction
- graph_merge, tidy_graph
- split_string_by_multi_markers, pack_user_ass_to_openai_messages
- is_continuous_subsequence, merge_tuples, flat_uniq_list

All 327 existing + new tests pass with no regressions.
2026-03-05 15:30:43 +08:00
3e3b665b89 RAGFlow admin server go version (#13394)
### What problem does this PR solve?

1. init go admin server
2. refactor api server router
3. add benchmark CI to 450s time limit
4. remove docker builder container after building

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-05 15:18:40 +08:00
6f5bd4d2e9 feat: add bin and internal dirs to .gitignore for Go server build output (#13391)
### What problem does this PR solve?

add bin and internal dirs to .gitignore for Go server build output
2026-03-05 14:26:40 +08:00
118f737b3a Feat:Enhance chunk management by adding support for 'available', 'tag_kwd' and 'tag_feas' (#13383)
### What problem does this PR solve?

Enhance chunk management by adding support for 'available', 'tag_kwd'
and 'tag_feas' fields in list, add, and update chunk functions just like
chunk_app.py.This improves data handling and flexibility in chunk
processing.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-05 13:45:39 +08:00
61209ff3bf Feat: File uploads for future conversations on SDK API (#13378)
### What problem does this PR solve?

This PR aims to:

1. Enable file uploads for the public API, similarly to what
/document/upload_info accomplishes for the frontend;
2. Enable files sent to the /chat/:chat_id/completions endpoint to be
used within the conversation.
We classify the first item as a new future, while classifying the second
one as a bug fix.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

*The work related to this PR was co-authored by*

[Bruno Ferreira](https://github.com/brunopferreira): Custom Solutions
Manager @ [Orbcom](https://orbcom.pt/)
[Pedro Ferreira](https://github.com/sirj0k3r): Lead Software Developer @
[Orbcom](https://orbcom.pt/)
[Pedro Cardoso](https://github.com/pedromiguel4560): Associate Software
Developer @ [Orbcom](https://orbcom.pt/)

*This PR replaces #13248*

---------

Co-authored-by: Pedro Cardoso <pedrocardoso@orbcom.pt>
Co-authored-by: Pedro Ferreira <pedroferreira@orbcom.pt>
2026-03-04 22:26:58 +08:00
020068dd16 Fix: preserve field boundaries in chunked documents from MySQL… (#13369)
### What problem does this PR solve?

When multiple columns are used as content columns in RDBMS connector,
the generated document text gets chunked by TxtParser which strips
newline delimiters during merge. This causes field names and values from
different columns to be concatenated without any separator, making the
content unreadable.

Changes:
- txt_parser.py: restore newline separator when merging adjacent text
segments within a chunk, so that split sections are not directly
concatenated
- rdbms_connector.py: use double newline between fields and place field
value on a new line after the field name bracket, giving TxtParser
clearer boundaries to work with

Closes #13001

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: tunsuytang <tunsuytang@tencent.com>
2026-03-04 21:42:02 +08:00
9deb3a6249 Refact: Fine tweaks to the doc structure. (#13379)
### What problem does this PR solve?

Fine tweaks to the doc structure.

### Type of change


- [x] Documentation Update
2026-03-04 21:30:28 +08:00
be231faec0 Feat: Write the row and column numbers into the element's data attribute for easy code location. (#13368)
### What problem does this PR solve?

Feat: Write the row and column numbers into the element's data attribute
for easy code location.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Liu An <asiro@qq.com>
2026-03-04 20:50:58 +08:00
b3a7332c08 playwright : add data-testids for new test (#13364)
### What problem does this PR solve?

add data-testids for new test

### Type of change

- [x] Other (please describe): add data-testids for new test
2026-03-04 19:28:36 +08:00
c99b53064d fix: remove company info from resume_summary to prevent over-retrieval (#13358)
### What problem does this PR solve?

Problem: When searching for a specific company name like(Daofeng
Technology), the search would incorrectly return unrelated resumes
containing generic terms like (Technology) in their company names

Root Cause: The `corporation_name_tks` field was included in the
identity fields that are redundantly written to every chunk. This caused
common words like "科技" to match across all chunks, leading to
over-retrieval of irrelevant resumes.

Solution: Remove `corporation_name_tks` from the `_IDENTITY_FIELDS`
list. Company information is still preserved in the "Work Overview"
chunk where it belongs, allowing proper company-based searches while
preventing false positives from generic terms.

---------

Co-authored-by: Aron.Yao <yaowei@192.168.1.68>
Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
Co-authored-by: Liu An <asiro@qq.com>
2026-03-04 19:24:49 +08:00
70e9743ef1 RAGFlow go API server (#13240)
# RAGFlow Go Implementation Plan 🚀

This repository tracks the progress of porting RAGFlow to Go. We'll
implement core features and provide performance comparisons between
Python and Go versions.

## Implementation Checklist

- [x] User Management APIs
- [x] Dataset Management Operations
- [x] Retrieval Test
- [x] Chat Management Operations
- [x] Infinity Go SDK

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
2026-03-04 19:17:16 +08:00
2508c46c8f Playwright : add new test for configuration tab in datasets (#13365)
### What problem does this PR solve?

this pr adds new tests, for the full configuration tab in datasests

### Type of change

- [x] Other (please describe): new tests
2026-03-04 19:10:06 +08:00
88e8509159 benchmark fail in ci (#13377)
### What problem does this PR solve?
ci fails in elastic search because of benchmark

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-04 19:01:41 +08:00
c7d17c84b2 Refa:improve excel parser logic (#13372)
### What problem does this PR solve?

improve excel parser logic

### Type of change
- [x] Refactoring
2026-03-04 18:00:17 +08:00
6bb00e2762 Update graspologic to gitee (#13362)
### What problem does this PR solve?

Accelerate python module downloading

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-04 17:48:47 +08:00
8a7272f423 Test: add scenario for embedding_model update when chunk_count > 0 (#13351)
### What problem does this PR solve?

Guard embedding_model change when dataset has existing chunks. API must
return code 102 with message 'When chunk_num (N) > 0, embedding_model
must remain <current_model>' to prevent silent embedding drift.

### Type of change

- [x] Add Testcases

Co-authored-by: Liu An <asiro@qq.com>
2026-03-04 17:41:35 +08:00
f47c47df99 Disable benchmark (#13370)
### What problem does this PR solve?

benchmark always failed in new CI machine. please enable it after the
issue is fixed.

### Type of change

- [x] Other (please describe): disable benchmark

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-04 16:36:42 +08:00
5eb602166c Enhance local model deployment documentation support gpustack guide (#13339)
### Type of change

- [X] Documentation Update:Enhance local model deployment documentation
support gpustack guide
2026-03-04 13:54:20 +08:00
54ae5b4a27 Fix Dify external retrieval by providing metadata.document_id (#13337)
### What problem does this PR solve?

## Summary                                                           
  Dify’s external retrieval expects `records[].metadata.document_id` to
  be a non-empty string.                                               
  RAGFlow currently only sets `metadata.doc_id`, which causes Dify     
  validation to fail.                                                  
                                                                       
  This PR adds `metadata.document_id` (mapped from `doc_id`) in the    
  Dify-compatible retrieval response.                                  
                                                                       
  ## Changes                                                           
- Add `meta["document_id"] = c["doc_id"]` in
`api/apps/sdk/dify_retrieval.py`
                                                                       
  ## Testing                                                           
  - Not run (logic-only change).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-04 13:23:37 +08:00
b9ad014f63 Supports login cross multiple RAGFlow servers (#13322)
### What problem does this PR solve?

1. Use redis to store the secret key.
2. During startup API server will read the secret from redis. If no such
secret key, generate one and store it into redis, atomically.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-04 13:07:45 +08:00
5f8966608d Fix: The dropdown menu for large models does not automatically focus on the search box. #13313 (#13360)
### What problem does this PR solve?

Fix: The dropdown menu for large models does not automatically focus on
the search box. #13313

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-04 12:48:35 +08:00
93d621a666 Fix: Correct PDF chunking parameter name in naive (#13357)
### What problem does this PR solve?

Fix: Correct PDF chunking parameter name in naive #13325

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-04 11:51:10 +08:00
733a64f0d6 Fix: Change the background color of the message notification button. (#13344)
### What problem does this PR solve?

Fix: Change the background color of the message notification button.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-04 11:10:05 +08:00
839b603768 feat: Add PDF parser selection to Agent Begin and Await Response comp… (#13325)
### Issue: #12756

### What problem does this PR solve?

When users upload files through Agent's Begin or Await Response
components, the parsing is hardcoded to "Plain Text", ignoring all other
available parsers (DeepDOC, TCADP, Docling, MinerU, PaddleOCR). This PR
adds a PDF parser dropdown to these components so users can select the
appropriate parser for their file inputs.


### Changes

**Backend**
- `agent/component/fillup.py` - Added `layout_recognize` param to
`UserFillUpParam`, forwarded to `FileService.get_files()`
- `agent/component/begin.py` - Same forwarding in `Begin._invoke()`
- `agent/canvas.py` - Extract Begin's `layout_recognize` for `sys.files`
parsing, added param to `get_files_async()` / `get_files()`
- `api/db/services/file_service.py` - Added `layout_recognize` param to
`parse()` and `get_files()`, replacing hardcoded `"Plain Text"`
- `rag/app/naive.py` - Added `"plain text"` and `"tcadp parser"` aliases
to PARSERS dict to match dropdown values after `.lower()`

**Frontend**
- `web/src/pages/agent/form/begin-form/index.tsx` - Show
`LayoutRecognizeFormField` dropdown when file inputs exist
- `web/src/pages/agent/form/begin-form/schema.ts` - Added
`layout_recognize` to Zod schema
- `web/src/pages/agent/form/user-fill-up-form/index.tsx` - Same dropdown
for Await Response component


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-04 11:09:33 +08:00
7715bad04e refactor: reorganize unit test files into appropriate directories (#13343)
### What problem does this PR solve?

Move test files from utils/ to their corresponding functional
directories:
- api/db/ for database related tests
- api/utils/ for API utility tests
- rag/utils/ for RAG utility tests

### Type of change

- [x] Refactoring
2026-03-04 11:02:56 +08:00
33ba955b02 Translate Chinese text to English in agent/sandbox (#13356)
Chinese text remained in generated code comments, log messages, field
descriptions, and documentation files under `agent/sandbox/`.

### Changes

- **`tests/MIGRATION_GUIDE.md`** — Full EN translation (migration guide
from OpenSandbox → Code Interpreter)
- **`tests/QUICKSTART.md`** — Full EN translation (quick test guide for
Aliyun sandbox provider)
- **`providers/aliyun_codeinterpreter.py`** — Removed `(主账号ID)` from
docstring, error log, and config field description
- **`sandbox_spec.md`** — Removed `(主账号ID)` from `account_id` field
description
- **`tests/test_aliyun_codeinterpreter_integration.py`** — Removed
`(主账号ID)` from inline comment

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

<!-- START COPILOT CODING AGENT TIPS -->
---

💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: yuzhichang <153784+yuzhichang@users.noreply.github.com>
2026-03-04 10:49:38 +08:00
0a4c0c38c7 Feat: expose admin service in helm configuration (#13345)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
For helm deployment, there is also requirement to enable the Admin
Service for administrative operations.
So expose the ability of enable/disable this function by helm
configuration.
        When it's enabled (by default),
<img width="486" height="190" alt="image"
src="https://github.com/user-attachments/assets/4db0dc3d-bd94-4ad9-bb5d-a240aac5e1c5"
/>
        Admin access and operations would be feasible like below,
<img width="2530" height="876" alt="image"
src="https://github.com/user-attachments/assets/3e948e1b-7522-4f8d-8dc0-c80a22242022"
/>
Something like 'user management' is very much important for Ragflow
User/Owner to control their clients.
2026-03-04 10:26:10 +08:00
2f4ca38adf Fix : make playwright tests idempotent (#13332)
### What problem does this PR solve?

Playwright tests previously depended on cross-file execution order
(`auth -> provider -> dataset -> chat`).
This change makes setup explicit and idempotent via fixtures so tests
can run independently.

- Added/standardized prerequisite fixtures in
`test/playwright/conftest.py`:
- `ensure_auth_context`, `ensure_model_provider_configured`,
`ensure_dataset_ready`, `ensure_chat_ready`
- Made provisioning reusable/idempotent with `RUN_ID`-based resource
naming.
- Synced auth envs (`E2E_ADMIN_EMAIL`, `E2E_ADMIN_PASSWORD`) into seeded
creds.
- Fixed provider cache freshness (`auth_header`/`page` refresh on cache
hit).

Also included minimal stability fixes:
- dataset create stale-element click handling,
- search wait logic for results/empty-state,
- agent create-menu handling,
- agent run-step retry when run UI doesn’t open first click.

### Type of change

- [x] Test fix
- [x] Refactoring

---------

Co-authored-by: Liu An <asiro@qq.com>
2026-03-04 10:07:14 +08:00
1c87f97dde Docs: Minor document structure tweak. (#13346)
### What problem does this PR solve?

Refactored the document architecture.

### Type of change

- [x] Documentation Update
2026-03-03 20:09:34 +08:00
f7c808383f Docs: Refactored documentation (#13340)
### What problem does this PR solve?

Refactored documentation. 

### Type of change

- [x] Documentation Update
2026-03-03 17:48:48 +08:00
48755a3352 Fix: (resume) Cross-verify project experience and work experience, and remove duplicate text (#13323)
Cross-verify project experience and work experience, and remove
duplicate text

---------

Co-authored-by: Aron.Yao <yaowei@192.168.1.68>
Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
2026-03-03 14:53:46 +08:00
eca60208e3 Fix: The document generation node cannot generate the output content of a large model to a file. #13321 (#13326)
### What problem does this PR solve?

Fix: The document generation node cannot generate the output content of
a large model to a file. #13321
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-03 11:05:24 +08:00
4f09b3e2a4 Fix: pipeline canvas category (#13319)
### What problem does this PR solve?

Fix: pipeline canvas category

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-02 20:27:36 +08:00
707de2461a Fix: use async_chat with sync wrapper in resume parser (#13320)
### What problem does this PR solve?

Fix AttributeError when calling llm.chat() in resume parser. LLMBundle
only has async_chat method, not chat method. Use `_run_coroutine_sync`
wrapper to call async_chat synchronously.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-02 19:51:06 +08:00
ef264b52c7 Fix: Fixed some errors in the console (#13317)
### What problem does this PR solve?

Fix: Fixed some errors in the console
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-02 19:19:15 +08:00
a806f7b707 Potential fix for code scanning alert no. 71: Incomplete URL substring sanitization (#13318)
Potential fix for
[https://github.com/infiniflow/ragflow/security/code-scanning/71](https://github.com/infiniflow/ragflow/security/code-scanning/71)

In general, instead of using `String.prototype.includes` on the entire
URL string, parse the URL and make decisions based on its `host` (or
`hostname`) field. This avoids cases where the trusted domain appears in
the path, query, or as part of a different hostname.

Here, `payload.source_fid` is set to `'siliconflow_intl'` if
`postBody.base_url` “contains” `api.siliconflow.com`. To keep behavior
for correct inputs but close the hole, we should:

1. Safely parse `postBody.base_url` using the standard `URL` class.
2. Extract the hostname (`url.hostname`).
3. Compare it appropriately:
- If we only want the exact host `api.siliconflow.com`, use strict
equality.
- If international endpoints may include subdomains like
`foo.api.siliconflow.com`, allow those via suffix check on the hostname.
4. Fall back to `LLMFactory.SILICONFLOW` if parsing fails or the host
does not match.

Concretely, in `web/src/pages/user-setting/setting-model/hooks.tsx`, in
the `onApiKeySavingOk` callback where `payload.source_fid` is set,
replace the `toLowerCase().includes('api.siliconflow.com')` logic with a
small block that:

- Initializes a local `let sourceFid = LLMFactory.SILICONFLOW;`
- If `postBody.base_url` is present, attempts `new
URL(postBody.base_url)` inside a `try/catch`, lowercases `url.hostname`,
and checks whether it equals `api.siliconflow.com` or ends with
`.api.siliconflow.com`.
- Assigns `payload.source_fid = sourceFid`.

No new external dependencies are required; `URL` is available in modern
browsers and Node, and TypeScript understands it.


_Suggested fixes powered by Copilot Autofix. Review carefully before
merging._

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-03-02 19:11:52 +08:00
b0ace2c5d0 feat: enable Arabic in production UI and add complete Arabic documentation (#13315)
### What problem does this PR solve?

This PR adds end-to-end Arabic support in production. It also adds a
full Arabic README

### Type of change

 - [x] New Feature (non-breaking change which adds functionality)
 - [x] Documentation Update
2026-03-02 19:10:11 +08:00
f8c91e8854 Refa: Resume parsing module (architectural optimizations based on SmartResume Pipeline) (#13255)
Core optimizations (refer to arXiv:2510.09722):

1. PDF text fusion: Metadata + OCR dual-path extraction and fusion

2. Page-aware reconstruction: YOLOv10 page segmentation + hierarchical
sorting + line number indexing

3. Parallel task decomposition: Basic information/work
experience/educational background three-way parallel LLM extraction

4. Index pointer mechanism: LLM returns a range of line numbers instead
of generating the full text, reducing the illusion of full text.

---------

Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
Co-authored-by: Aron.Yao <yaowei@192.168.1.68>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-03-02 19:05:50 +08:00