ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-08 08:07:21 +08:00

Author	SHA1	Message	Date
Zhichang Yu	b7744e053e	fix: support dense_vector from ES fields response (ES 9.x compatibility) (#13972 ) - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Configuration Chore (non-breaking change which updates configuration) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * More accurate handling and unwrapping of dense-vector fields so returned values have correct shapes. * Field selection reliably limits returned data and falls back to alternate result locations when needed. * Consistent use of hit-level result IDs and tolerant handling when score values are missing. * Chores / Configuration * Increased build memory and adjusted build-time flags for the frontend build. * Simplified runtime model/GPU checks: removed runtime GPU detection for model initialization (CPU initialization is forced) and removed an automated runtime GPU-install attempt. * Build Fixes * `web/vite.config.ts`: make `build.minify` and `build.sourcemap` respect `VITE_MINIFY` and `VITE_BUILD_SOURCEMAP` env vars from Dockerfile instead of hardcoding `terser` and `true`. * Environment * Allow stack version override and default the runtime image tag to "latest". <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 17:44:13 +08:00
Magicbook1108	107fe6cf90	Feat: support doc for pipeline parser in word (#14005 ) ### What problem does this PR solve? Feat: support doc for pipeline parser in word ### Type of change - [x] New Feature (non-breaking change which adds functionality) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added support for processing legacy Word `.doc` file formats, extending document compatibility. * Bug Fixes * Enhanced error handling during document parsing to improve reliability and prevent processing failures.	2026-04-09 16:40:42 +08:00
Magicbook1108	8d52ef2893	Feat: enable sync deleted files for connector (#14000 ) ### What problem does this PR solve? Feat: enable sync deleted files for connector 1. Supported first for the GitHub connector ### Type of change - [x] New Feature (non-breaking change which adds functionality) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added "sync deleted files" feature for data sources, enabling automatic removal of files deleted from the source system. * Added multilingual support for the new sync deleted files setting across multiple languages. * UI Improvements * Improved checkbox form field rendering and layout. * Enhanced full-width display for authentication token input fields.	2026-04-09 16:40:14 +08:00
Jack	577c96bf2a	Refactor: Merge document update API (#13962 ) ### What problem does this PR solve? Refactor: merge document.rename into document.update_document ### Type of change - [x] Refactoring <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added a unified document update API (PUT) supporting name, metadata, parser/chunk settings, and status changes. * Breaking Changes * Legacy single-parameter rename endpoint removed; renames now require dataset + document identifiers. * `/list` now reads dataset id from a different query parameter. * Validation / Bug Fixes * Stricter meta_fields and parser-config validation; unauthenticated requests return 401. * Frontend * UI now sends dataset id when saving document names. * Tests * Numerous unit and HTTP tests adjusted or removed to match new API and validations. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: MkDev11 <94194147+MkDev11@users.noreply.github.com> Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com> Co-authored-by: mkdev11 <MkDev11@users.noreply.github.com> Co-authored-by: Qi Wang <wangq8@outlook.com> Co-authored-by: dataCenter430 <161712630+dataCenter430@users.noreply.github.com> Co-authored-by: balibabu <cike8899@users.noreply.github.com>	2026-04-09 11:17:38 +08:00
Ricardo-M-L	c13f8856a1	fix: correct typos in agent component filename and templates (#13930 ) ## Summary - Rename misspelled file `varaiable_aggregator.py` → `variable_aggregator.py` - Fix `unkown` → `unknown` in template and frontend constant (3 instances) - Fix `Finale` → `Final` in customer feedback template (2 instances) ## Test plan - [ ] Verify variable aggregator component loads correctly - [ ] Verify agent templates render properly 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: yuj <yuj@ztjzsoft.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-04-09 11:06:01 +08:00
Lynn	dbfb439239	Feat: migrate script (#13976 ) ### What problem does this PR solve? Add stages for migrating tenant_llm data into the tenant_model_instance and tenant_model tables. ### Type of change - [x] Other (please describe): tool script <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Added two new migration stages to move tenant model and instance records into new target tables, with dry-run, full-execute, and "create table only" modes; migration skips already-migrated rows to avoid duplicates. * Bug Fixes * Cleaned up migration header logging for clearer output. * Documentation * Added usage guide describing stages, options, modes, config format, examples, and expected logs. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-09 11:03:39 +08:00
Magicbook1108	c5871c1078	Fix: dsl import/export (#13992 ) ### What problem does this PR solve? Fix: dsl import/export ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Enhanced JSON import functionality for agents to automatically populate components from imported graph structures. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Zhichang Yu <yuzhichang@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 10:55:22 +08:00
qinling0210	82fa85c837	Implement Delete in GO and refactor functions (#13974 ) ### What problem does this PR solve? Implement Delete in GO and refactor functions ### Type of change - [x] Refactoring <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added a remove_chunks command to delete specific or all chunks from a document. * Added new endpoints for chunk removal and chunk update. * Refactor * Renamed index commands to dataset/metadata table terminology and updated REST routes accordingly. * Updated chunk update flow to a JSON POST style and improved metadata error messages. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2026-04-09 09:52:31 +08:00
Jack	3b7723855c	Fix: revert xgboost version to 1.6.0 (#13984 ) ### What problem does this PR solve? Revert xgboost version to 1.6.0 ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Updated xgboost dependency from version 3.2.0 to 1.6.0 <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 19:53:47 +08:00
Jin Hai	5fe6f7c9ac	Go CLI: Add list configs and set log level command (#13983 ) ### What problem does this PR solve? 1. list configs 2. set log level debug/info/warn/error/fatal/panic ``` RAGFlow(user)> list configs; +--------------------+-----------------------+ \| key \| value \| +--------------------+-----------------------+ \| redis_host \| localhost:6379 \| \| doc_engine \| elasticsearch \| \| elasticsearch_host \| http://localhost:1200 \| \| log_level \| info \| \| database \| mysql \| \| database_host \| localhost:3306 \| \| admin \| 0.0.0.0:9383 \| \| storage_engine \| minio \| \| minio_host \| localhost:9000 \| +--------------------+-----------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * New Features * Added `LIST CONFIGS` command to view system configuration details (Redis, database, log level, storage engine, and host settings). * Added `SET LOG LEVEL` command to adjust logging verbosity at runtime. * Improvements * Enhanced log level configuration defaults and runtime state management. * Reorganized token management and system endpoints under `/system/` routes for better API organization. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-08 19:32:53 +08:00
balibabu	86900dca99	Refactor: Remove unused API code (#13978 ) ### What problem does this PR solve? Refactor: Remove unused API code ### Type of change - [x] New Feature (non-breaking change which adds functionality) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Style * Updated table header styling in dataset settings by removing a hard-coded background color class, allowing the header to use default or inherited component styling instead. * Refactor * Removed token management endpoints from the API service. Token creation, listing, and removal functions are no longer available. * Removed the statistics data endpoint from available API routes. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 18:46:08 +08:00
balibabu	c0c3287af4	Fix: Error message: Use 'const' instead. (#13982 ) ### What problem does this PR solve? Fix: Linter error message: Use 'const' instead. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Refactor * Updated variable declarations across form components, agent utilities, memory management hooks, and data handling functions to enhance code consistency and maintainability throughout the application codebase. * Style * Added ESLint suppressions to document intentional constant-condition patterns in asynchronous event streaming operations. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 18:13:14 +08:00
Yongteng Lei	3064895bbb	Fix: import error in sandbox provider (#13971 ) ### What problem does this PR solve? Fix import error in sandbox provider. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Updated internal configuration import mechanism for sandbox provider initialization. No end-user impact. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 15:35:30 +08:00
Jin Hai	fa75aee3b9	Refactor system API (#13958 ) ### What problem does this PR solve? - ping - token - log level ### Type of change - [x] Refactoring <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Refactor * System endpoints consolidated under /api/v1/system: ping, health check, and token management moved to the centralized API surface. * Token management unified at /api/v1/system/tokens with list/create/delete behavior. * Documentation * API reference updated to reflect the new /api/v1/system paths. * Tests * Client fixtures and test utilities updated to use /api/v1/system/tokens; one unit test for health/oceanbase status removed. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-08 15:26:18 +08:00
Jin Hai	ad789f5c43	Fix list files (#13960 ) ### What problem does this PR solve? As title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Standardized the query parameter used when listing documents so listings behave consistently across the web and client interfaces. * Clarified the error message shown when a required dataset ID is missing to give clearer guidance to users. * Tests * Updated test coverage to reflect the standardized dataset identifier usage. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-08 13:38:30 +08:00
balibabu	b8764cfa11	Fix: The document management table cannot be displayed. (#13967 ) ### What problem does this PR solve? Fix: The document management table cannot be displayed. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Improved table layout and overflow behavior in the files view to ensure proper scrolling and display. * Chores * Removed unused system status functionality and cleaned up service methods. * Updated TypeScript configuration for compatibility. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 11:37:27 +08:00
dataCenter430	62a1333cf2	Feat: expose parent-child chunking configuration via HTTP API and Python SDK (#13940 ) … ### What problem does this PR solve? Closes #13857 Parent-child chunking was introduced in v0.23.0 but is only configurable through the web UI. Users managing datasets programmatically cannot enable it via the HTTP API or Python SDK because `ParserConfig` uses `extra="forbid"`, rejecting the `children_delimiter` field at validation. ### What does this PR change? Adds a `parent_child` nested config to `ParserConfig`, following the same pattern as `raptor` and `graphrag`: ```json "parser_config": { "parent_child": { "use_parent_child": true, "children_delimiter": "\n" } } ``` - api/utils/validation_utils.py — new ParentChildConfig model, added to ParserConfig - api/utils/api_utils.py — naive defaults + flatten to children_delimiter for the execution layer - api/apps/services/dataset_api_service.py — flatten on the update path - test/testcases/configs.py — updated DEFAULT_PARSER_CONFIG - test/testcases/test_http_api/test_dataset_management/test_create_dataset.py — 4 valid + 2 invalid test cases No changes to the execution layer (rag/app/naive.py, rag/nlp/search.py). Existing UI flow via ext is unaffected. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added parent-child chunking configuration for dataset creation and updates with new `use_parent_child` toggle and customizable `children_delimiter` setting to specify how parent chunks are split into child chunks. * Documentation * Updated HTTP and Python API references with parent-child chunking configuration details and examples. 
<!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 11:36:57 +08:00
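The flattening step described in the entry above (nested `parent_child` config reduced to a top-level `children_delimiter` for the execution layer) might look roughly like the sketch below. The helper name `flatten_parent_child` is hypothetical, and the choice to emit no `children_delimiter` when the feature is disabled is an assumption, not something the commit message states.

```python
from copy import deepcopy


def flatten_parent_child(parser_config: dict) -> dict:
    """Hypothetical helper: lift parent_child.children_delimiter to the top level.

    Assumption: when use_parent_child is false or absent, no children_delimiter
    is emitted. The real RAGFlow code may behave differently.
    """
    flat = deepcopy(parser_config)  # do not mutate the caller's config
    parent_child = flat.pop("parent_child", None) or {}
    if parent_child.get("use_parent_child"):
        # Default delimiter "\n" mirrors the JSON example in the commit message.
        flat["children_delimiter"] = parent_child.get("children_delimiter", "\n")
    return flat


if __name__ == "__main__":
    config = {"parent_child": {"use_parent_child": True, "children_delimiter": "\n"}}
    print(flatten_parent_child(config))
```

Keeping the nested form at the API boundary and the flat form internally matches the commit's note that the execution layer itself is unchanged.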
Qi Wang	0ced071a0b	Use uv run python3 x.py instead of uv run x.py (#13966 ) ### Use uv run python3 x.py instead of uv run x.py Calling `uv run x.py` directly uses the Python named in the script's shebang, which does not work if that default Python lacks some packages, so the command is changed to the recommended form `uv run python3 x.py`. ### Type of change - [x] Documentation Update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * Documentation * Updated development setup instructions across all README files (English and multiple language translations) to use explicit Python interpreter invocation for the dependency download command. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-04-08 10:33:46 +08:00
MkDev11	cfee2bc9db	feat: Auto-adjust chunk recall weights based on user feedback (#12689 ) ### What problem does this PR solve? Implements automatic adjustment of knowledge base chunk recall weights based on user feedback (upvotes/downvotes). When users upvote or downvote a response, the system locates the corresponding knowledge snippets and adjusts their recall weight to improve future retrieval quality. Closes #12670 How it works: 1. User upvotes/downvotes a response via `POST /thumbup` 2. System extracts chunk IDs from the conversation reference 3. For each referenced chunk: - Reads current `pagerank_fea` value from document store - Increments (+1) for upvote or decrements (-1) for downvote - Clamps weight to [0, 100] range - Updates chunk in ES/Infinity/OceanBase 4. Future retrievals score these chunks higher/lower based on accumulated feedback Files changed: - `api/db/services/chunk_feedback_service.py` - New service for updating chunk pagerank weights - `api/apps/conversation_app.py` - Integrated feedback service into thumbup endpoint - `test/testcases/test_web_api/test_chunk_feedback/` - Unit tests ### Type of change - [x] New Feature (non-breaking change which adds functionality) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Chat message feedback now updates per-chunk relevance weights (feature-flag gated), with configurable weighting and atomic updates across storage backends. * Bug Fixes * Stricter validation for message feedback inputs and more robust handling of feedback transitions. * Tests * Expanded test coverage for chunk-feedback behavior, weighting strategies, storage backends, and thumb-flip scenarios. * Chores * CI workflow extended to run the new chunk-feedback web API tests. 
<!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com> Co-authored-by: mkdev11 <MkDev11@users.noreply.github.com>	2026-04-08 09:52:18 +08:00
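The weight update described in step 3 of the entry above can be sketched as follows. This is a minimal in-memory illustration, not the code in `chunk_feedback_service.py`: the function names, the plain `dict` standing in for the document store, and the default of 0 for a chunk with no stored weight are all assumptions.

```python
from typing import Dict, Iterable

PAGERANK_MIN = 0    # lower clamp bound, from the commit message
PAGERANK_MAX = 100  # upper clamp bound, from the commit message


def adjust_weight(current: int, upvote: bool) -> int:
    """Add 1 for an upvote, subtract 1 for a downvote, then clamp to [0, 100]."""
    delta = 1 if upvote else -1
    return max(PAGERANK_MIN, min(PAGERANK_MAX, current + delta))


def apply_feedback(store: Dict[str, int], chunk_ids: Iterable[str], upvote: bool) -> None:
    """Update the weight of every referenced chunk in a dict-backed store.

    Assumption: a chunk with no stored weight starts at 0.
    """
    for chunk_id in chunk_ids:
        store[chunk_id] = adjust_weight(store.get(chunk_id, 0), upvote)


if __name__ == "__main__":
    weights = {"c1": 0, "c2": 100, "c3": 41}
    apply_feedback(weights, ["c1", "c2", "c3"], upvote=False)
    print(weights)
```

The clamp keeps repeated votes from pushing a weight outside the range, so a chunk already at 0 stays at 0 on a further downvote.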
Jin Hai	4a2a17c27a	Fix typos (#13961 ) ### What problem does this PR solve? as title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Internal code quality improvements with no user-facing changes. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 23:16:52 +08:00
Jin Hai	931021875a	Refactor system/version API to RESTful style (#13956 ) ### What problem does this PR solve? Refactor version API to RESTful style. Python and go server API also updated. ### Type of change - [x] Refactoring <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * Refactor * Migrated core API endpoints to the `/api/v1/` namespace for improved consistency and organization. * Standardized system version, search, and chat list endpoints under the new API versioning structure. * New Features * Added MinIO region configuration support, allowing specification of storage engine regional settings via environment variables or configuration files. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 19:07:47 +08:00
Yang_Ming	bc8d67ce78	feat: add region parameter support to MinIO connection (#13954 ) ## Summary - Add optional `region` parameter to `Minio()` client constructor in `rag/utils/minio_conn.py` - Reads from `MINIO.region` in settings, defaults to `None` when not configured - Required by some S3-compatible storage services (e.g., AWS S3, Tencent COS) for proper bucket access ## Motivation When using RAGFlow with S3-compatible storage that requires a region (such as AWS S3 or Tencent Cloud COS), the MinIO client fails to access buckets because the `region` parameter is not passed through. The `Minio()` Python client already supports the `region` parameter natively — this PR simply wires it up from the RAGFlow configuration. ## Changes - `rag/utils/minio_conn.py`: Pass `region=settings.MINIO.get("region", None) or None` to `Minio()` constructor ## Backward Compatibility - No breaking changes. When `region` is not configured, it defaults to `None`, preserving the existing behavior exactly. ## Test Plan - [ ] Verified with MinIO (no region set) — works as before - [x] Verified with S3-compatible storage requiring region — bucket access succeeds <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Enhanced MinIO client initialization with regional configuration support for improved compatibility with region-specific deployments. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Jarry Wang <code-better-life@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 16:38:23 +08:00
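The expression `settings.MINIO.get("region", None) or None` quoted above normalizes both a missing key and an empty string to `None`. A small sketch of that behavior follows; `build_minio_kwargs` and the config keys `host`, `user`, and `password` are hypothetical names for illustration, and only the `region` handling comes from the commit message.

```python
from typing import Any, Dict, Optional


def resolve_region(minio_conf: Dict[str, Any]) -> Optional[str]:
    """Return the configured region, or None when it is missing or empty."""
    return minio_conf.get("region", None) or None


def build_minio_kwargs(minio_conf: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for a MinIO client.

    The config key names here are assumptions. The resulting dict is meant
    to be passed on to the client constructor along with the endpoint.
    """
    return {
        "access_key": minio_conf.get("user"),
        "secret_key": minio_conf.get("password"),
        "region": resolve_region(minio_conf),
    }


if __name__ == "__main__":
    print(build_minio_kwargs({"host": "localhost:9000", "user": "u", "password": "p"}))
    print(build_minio_kwargs({"host": "s3.example.com", "user": "u", "password": "p", "region": "us-east-1"}))
```

The trailing `or None` matters for configurations that leave `region: ""` in place: an empty string would otherwise be passed through as a region name.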
Jin Hai	68f665be7a	CLI: Add float parsing (#13955 ) ### What problem does this PR solve? Add float parsing ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 15:09:45 +08:00
Jin Hai	393efa9b7c	Refactor variable of front end (#13953 ) ### What problem does this PR solve? api_host -> webAPI ExternalApi -> restAPIv1 ### Type of change - [x] Refactoring <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Refactor * Updated internal API endpoint configuration to use consolidated base URL constants for improved maintainability and consistency across the application. * Chores * Updated server-side protocol validation for admin connectivity checks. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 15:08:11 +08:00
balibabu	38acf34724	Fix: The agent selected a knowledge base, but the API returned the error: "No dataset is selected". (#13950 ) ### What problem does this PR solve? Fix: The agent selected a knowledge base, but the API returned the error: "No dataset is selected". ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: balibabu <assassin_cike@163.com>	2026-04-07 14:16:37 +08:00
auyua9	fa08fa2a17	docs: fix broken internal links in guides (#13935 ) ### What problem does this PR solve? This fixes two broken internal documentation links in the guides: - `docs/develop/mcp/launch_mcp_server.md` linked `./acquire_ragflow_api_key.md`, but the target page lives one level up as `../acquire_ragflow_api_key.md`. - `docs/guides/dataset/run_retrieval_test.md` linked `./construct_knowledge_graph.md`, but the actual page lives under `./advanced/construct_knowledge_graph.md`. These broken links make it harder to follow the MCP and retrieval-test docs from the local docs tree. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [x] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-04-07 14:01:12 +08:00
Jin Hai	9ac5d28f06	Refactor context command (#13952 ) ### What problem does this PR solve? Refactor context search command ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 13:59:27 +08:00
Ricardo-M-L	424aee5bec	fix: correct typos in code comments, docstrings and docs (#13931 ) ## Summary - Fix `a image` → `an image` in README and log message - Fix `colomn` → `column` in table structure recognizer comment - Fix `formated` → `formatted` in confluence connector docstring - Fix `tabel of content` → `table of contents` in TOC prompt ## Test plan - [ ] Documentation and comment changes, no functional impact 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: yuj <yuj@ztjzsoft.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 13:05:39 +08:00
Ricardo-M-L	29cf8aba48	fix: correct typos in locale files and search hooks (#13932 ) ## Summary - Fix `Refrence` → `Reference` in zh, id, zh-traditional locale files (en.ts already correct) - Fix `from from` → `from` and `this files` → `this file` in en.ts - Fix variable name `reponse` → `response` in search hooks ## Test plan - [ ] Verify UI strings display correctly - [ ] Verify search functionality works with renamed variable 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: yuj <yuj@ztjzsoft.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 12:26:25 +08:00
Yongteng Lei	112007243d	Refa: refine code_exec component (#13925 ) ### What problem does this PR solve? Refine code_exec component. ### Type of change - [x] Refactoring	2026-04-07 11:48:29 +08:00
Jack	c4b0aaa874	Fix: #6098 - Add validation logic for parser_config when update document (#13911 ) ### What problem does this PR solve? Add validation logic for parser_config and refactor the processing flow. Before this change, validation logic and update logic were mixed together: some validation ran, then some update logic ran, and then another round of "validate and then update" followed, which is not good. After this change, all validation logic runs first, and update logic runs only after ALL validation logic has run. Parameters that come from the front end are validated with Pydantic. Validation logic that depends on data from the DB lives in separate methods. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-04-07 11:33:05 +08:00
Jin Hai	5673245134	Refactor context command (#13948 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-07 11:30:09 +08:00
Idriss Sbaaoui	ff27ce86d6	fix: gpt-5 name-based config clearing from base chat path (#13949 ) ### What problem does this PR solve? Fixes #13944, where OpenAI-compatible custom endpoints failed verification when model names contained `gpt-5` because of incorrect name-based handling in the Base/backend=`base` path. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-07 11:24:47 +08:00
buildearth	a0be7c7ca7	Fix(connector): expose id_column, timestamp_column, metadata_columns for MySQL/PostgreSQL incremental sync (#13849 ) ### What problem does this PR solve? The MySQL and PostgreSQL sync classes in `sync_data_source.py` were not passing `id_column`, `timestamp_column`, and `metadata_columns` to `RDBMSConnector`, making incremental sync and document update impossible even when configured. - Without `id_column`: updated records generate new documents instead of overwriting existing ones (doc ID is derived from content hash, so any change produces a new ID). - Without `timestamp_column`: `poll_source` always falls back to full sync, ignoring the configured time range. - The three fields existed in the frontend default values but had no form inputs, so users had no way to fill them in. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) ### Changes - Backend (`rag/svr/sync_data_source.py`): pass `id_column`, `timestamp_column`, and `metadata_columns` from `self.conf` to `RDBMSConnector` for both `MySQL` and `PostgreSQL` sync classes. - Frontend (`web/src/pages/user-setting/data-source/constant/index.tsx`): add `ID Column`, `Timestamp Column`, and `Metadata Columns` form fields to MySQL and PostgreSQL data source configuration UI with tooltips. Signed-off-by: lixintao <lixintao@uniontech.com> Co-authored-by: lixintao <lixintao@uniontech.com>	2026-04-07 10:24:30 +08:00
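The `id_column` problem described above (a content-derived ID changes whenever the row changes) can be illustrated with the sketch below. The function `document_id`, the MD5 hash, and the ID format are assumptions made for the example, and `RDBMSConnector` may derive IDs differently.

```python
import hashlib
from typing import Any, Dict, Optional


def document_id(row: Dict[str, Any], id_column: Optional[str] = None) -> str:
    """Hypothetical ID derivation for a database row.

    With id_column set, the ID depends only on the primary key, so an updated
    row maps to the same document. Without it, the ID is a hash of the row
    content, so any change produces a new ID.
    """
    if id_column and row.get(id_column) is not None:
        return f"{id_column}:{row[id_column]}"
    content = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.md5(content.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    before = {"id": 7, "title": "Draft"}
    after = {"id": 7, "title": "Final"}
    print(document_id(before) == document_id(after))              # content hash differs
    print(document_id(before, "id") == document_id(after, "id"))  # stable key matches
```

With a stable ID, a sync run can overwrite the existing document instead of creating a second one for the edited row.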
qinling0210	49386bc1b5	Implement UpdateDataset and UpdateMetadata in GO (#13928 ) ### What problem does this PR solve? Implement UpdateDataset and UpdateMetadata in GO. Add CLI commands: `UPDATE CHUNK <chunk_id> OF DATASET <dataset_name> SET <update_fields>`; `REMOVE TAGS 'tag1', 'tag2' from DATASET 'dataset_name';` and `SET METADATA OF DOCUMENT <doc_id> TO <meta>` ### Type of change - [ ] Refactoring	2026-04-07 09:44:51 +08:00
Lynn	60ec5880e5	Feat: mysql data migrate script (#13927 ) ### What problem does this PR solve? Add a script to migrate data in tenant_llm into tenant_model_provider. ### Type of change - [x] Other (please describe): tool script.	2026-04-03 20:01:37 +08:00
Magicbook1108	69264b3a70	Feat: Refact pipeline (#13826 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Co-authored-by: Zhichang Yu <yuzhichang@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 19:26:45 +08:00
Jin Hai	6d9430a125	Add think chat to CLI (#13922 ) ### What problem does this PR solve? Users can now use 'think mode' to chat with the LLM. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-03 18:11:23 +08:00
Yingfeng	e518c20736	Update README (#13924 ) ### Type of change - [x] Documentation Update	2026-04-03 17:29:48 +08:00
akie	35b2a714f9	Fix: tag datasets not visible in tag sets dropdown (#13921 ) ## Problem Description When a user creates Dataset A using the Tag parser (for CSV/Excel files with tag definitions), and then creates Dataset B, the Tag Sets dropdown in Dataset B's Configuration page cannot display Dataset A. ### Steps to Reproduce 1. Create Dataset A with Tag as the chunking method 2. Upload a CSV file to Dataset A to generate tags 3. Create Dataset B 4. Navigate to Dataset B → Configuration → Tag Sets 5. Expected: Dataset A should appear in the dropdown 6. Actual: The dropdown is empty, Dataset A is not visible --- ## Root Cause Analysis After thorough code review, the original code logic is correct. The `chunk_method` field flows properly through the system: ### Data Flow ```mermaid sequenceDiagram participant Frontend participant Pydantic participant API participant Database Note over Frontend,Database: Creating a Tag Dataset Frontend->>Pydantic: POST {chunk_method: "tag"} Pydantic->>API: serialization_alias converts<br/>chunk_method → parser_id API->>Database: INSERT {parser_id: "tag"} Note over Frontend,Database: Querying Datasets Frontend->>API: GET /api/v1/datasets API->>Database: SELECT parser_id, ... Database-->>API: Returns {parser_id: "tag"} API->>API: remap_dictionary_keys()<br/>parser_id → chunk_method API-->>Frontend: {chunk_method: "tag"} Note over Frontend: Filter: x.chunk_method === 'tag' Note over Frontend: ✅ Match found! 
``` ### Field Mapping Location: `api/utils/api_utils.py:657-662` ```python DEFAULT_KEY_MAP = { "chunk_num": "chunk_count", "doc_num": "document_count", "parser_id": "chunk_method", # Maps DB field to API response "embd_id": "embedding_model", } ``` ### Frontend Filtering (Already Correct) Location: `web/src/pages/dataset/dataset-setting/components/tag-item.tsx:24` ```typescript const knowledgeOptions = knowledgeList .filter((x) => x.chunk_method === 'tag') // ✅ Correct field .map((x) => ({...})); ``` --- ## Actual Issue The most likely causes for the "bug" are: 1. Browser Cache: Old data cached before proper deployment 2. Stale Data: Datasets created before the code was fully deployed 3. Container Not Restarted: Changes not applied to running container --- ## Resolution No code changes are needed. The existing code correctly: 1. Accepts `chunk_method` from frontend 2. Converts to `parser_id` via Pydantic serialization_alias 3. Stores in database as `parser_id` 4. Maps back to `chunk_method` in API response 5. Frontend filters by `chunk_method === 'tag'`	2026-04-03 17:29:10 +08:00
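The round trip described in the commit above (the database stores `parser_id`, the API returns `chunk_method`, and the frontend filters on `chunk_method === 'tag'`) can be sketched roughly as follows. This is a minimal illustration only: `remap_keys` is a hypothetical stand-in for the `remap_dictionary_keys()` helper named in the message, whose real signature is not shown there, and the sample rows (including the `"naive"` parser value) are invented.

```python
# Key map copied from the commit message above.
DEFAULT_KEY_MAP = {
    "chunk_num": "chunk_count",
    "doc_num": "document_count",
    "parser_id": "chunk_method",  # Maps DB field to API response
    "embd_id": "embedding_model",
}


def remap_keys(record: dict, key_map: dict = DEFAULT_KEY_MAP) -> dict:
    """Hypothetical stand-in for remap_dictionary_keys().

    Renames every key found in key_map and leaves all other keys untouched.
    """
    return {key_map.get(key, key): value for key, value in record.items()}


def tag_datasets(datasets: list) -> list:
    """Python equivalent of the frontend filter x.chunk_method === 'tag'."""
    return [d for d in datasets if d.get("chunk_method") == "tag"]


# Invented sample rows, shaped like database records.
rows = [
    {"name": "Dataset A", "parser_id": "tag", "chunk_num": 12},
    {"name": "Dataset B", "parser_id": "naive", "chunk_num": 0},
]

api_response = [remap_keys(row) for row in rows]
print([d["name"] for d in tag_datasets(api_response)])
```

Under these assumptions only Dataset A survives the filter, which matches the commit's conclusion that the field mapping itself is not the cause of the empty dropdown.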
LeonTung	0b724be521	chore(templates): Update the customer feedback dispatcher template (#13919 ) ### What problem does this PR solve? Update the customer feedback dispatcher template and introduce a new operator `Variable Aggregator`. ### Type of change - [x] Other (please describe): Template change --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-04-03 16:51:39 +08:00
balibabu	5b43c7cf16	Feat: Place the language configuration in web/.env for easy user configuration. (#13920 ) ### What problem does this PR solve? Feat: Place the language configuration in web/.env for easy user configuration. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-03 16:50:18 +08:00
Ricardo-M-L	354108922b	fix: use f-string with separator in switch operator error message (#13915 ) `switch.py` line 137 concatenates the operator directly after the text without separator: `'Not supported operator' + operator` → produces `"Not supported operatorXXX"` Changed to: `f'Not supported operator: {operator}'`	2026-04-03 16:49:28 +08:00
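The difference between the two message forms quoted in the commit above is easy to reproduce in isolation. The function names below are hypothetical and exist only for this comparison; only the two string expressions come from the commit message.

```python
def message_before(operator: str) -> str:
    # Original form: plain concatenation, so the operator runs into the text.
    return 'Not supported operator' + operator


def message_after(operator: str) -> str:
    # Fixed form: the f-string adds a ": " separator before the operator.
    return f'Not supported operator: {operator}'


print(message_before("XXX"))
print(message_after("XXX"))
```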
chanx	21af67f6f9	feat(File Management): Refactor File List API and Add Knowledge Base Document Initialization (#13914 ) ### What problem does this PR solve? feat(File Management): Refactor File List API and Add Knowledge Base Document Initialization - Migrate the file list API endpoint from `/v1/file/list` to `/api/v1/files` to align with the Python implementation. - Add logic for initializing knowledge base documents; automatically create the `.knowledgebase` folder and associated documents when retrieving the root directory. - Enhance parameter validation and error handling, including the introduction of a new `CodeParamError` error code. - Optimize the file list response structure to match the implementation on the Python side. - Update the Vite configuration to support proxying the new `/api/v1/files` endpoint. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-03 15:08:43 +08:00
writinwaters	6263857c1e	Agent templates regrouped and renamed (#13873 ) ### What problem does this PR solve? Regrouped and renamed agent templates to increase user engagement. ### Type of change - [x] Refactoring	2026-04-03 13:43:25 +08:00
Zhichang Yu	ab358fe949	feat: make Azure cloud authority configurable for SPN auth (#13898 ) ## Summary - The Azure SPN storage handler hardcoded `AzureAuthorityHosts.AZURE_CHINA`, preventing users in Azure Public Cloud regions (UK-South, EU, US, etc.) from authenticating - Add a `cloud` config option (env: `AZURE_CLOUD`) supporting all four Azure sovereignties: `public`, `china`, `government`, `germany` - Defaults to `public` (global Azure) — the most common international use case Closes #13259 ## Test plan - [ ] Verify default (`cloud: public`) connects to Azure Public Cloud endpoints - [ ] Verify `cloud: china` retains existing behavior for Azure China users - [ ] Verify `AZURE_CLOUD` env var overrides the config file value 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 12:51:26 +08:00
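One way the `cloud` option from the commit above could resolve to an authority host is sketched below. It rests on several assumptions: `resolve_authority` and `AUTHORITY_HOSTS` are hypothetical names, not the handler's actual code, and the host strings are written out as plain literals (intended to mirror the `AzureAuthorityHosts` constants in `azure-identity`) so the sketch runs without third-party packages. The precedence follows the test plan: `AZURE_CLOUD` overrides the config value, and `public` is the default.

```python
import os
from typing import Mapping, Optional

# Assumed authority hosts per cloud; in real code these would more likely come
# from azure.identity.AzureAuthorityHosts than from string literals.
AUTHORITY_HOSTS = {
    "public": "login.microsoftonline.com",
    "china": "login.chinacloudapi.cn",
    "government": "login.microsoftonline.us",
    "germany": "login.microsoftonline.de",
}


def resolve_authority(config: Mapping[str, str],
                      env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the authority host: AZURE_CLOUD env var, then config, then 'public'."""
    env = os.environ if env is None else env
    cloud = (env.get("AZURE_CLOUD") or config.get("cloud") or "public").strip().lower()
    if cloud not in AUTHORITY_HOSTS:
        supported = ", ".join(sorted(AUTHORITY_HOSTS))
        raise ValueError(f"Unsupported Azure cloud {cloud!r}; expected one of: {supported}")
    return AUTHORITY_HOSTS[cloud]


# Config file says china, but the environment variable takes precedence.
print(resolve_authority({"cloud": "china"}, env={"AZURE_CLOUD": "public"}))
```

Passing `env` explicitly keeps the function testable without touching the real process environment.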
Zhichang Yu	384fa6fc6e	Replace MinIO official image with pgsty/minio fork (#13896 ) ## Summary - Replace `quay.io/minio/minio` with `pgsty/minio` community fork in `docker/docker-compose-base.yml` MinIO stopped distributing pre-built Docker images and changed its license. The pgsty/minio fork provides drop-in compatible images under AGPLv3. Closes #13840 ## Test plan - [x] Verify `docker compose -f docker/docker-compose-base.yml up -d` pulls the pgsty/minio image successfully - [ ] Verify MinIO console accessible on port 9001 - [ ] Verify RAGFlow backend can connect to MinIO and perform file operations normally 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-02 22:03:02 +08:00
Yongteng Lei	b7daf6285b	Refa: Chat conversations /conversation API to RESTful (#13893 ) ### What problem does this PR solve? Chat conversations /conversation API to RESTful. ### Type of change - [x] Refactoring	2026-04-02 20:49:23 +08:00
chanx	bbb9b1df85	feat: Implement file upload and folder creation features in Go (#13903 ) ### What problem does this PR solve? feat: Implement file upload and folder creation features - Add file upload route in router.go - Add file operation methods in dao/file.go - Add util/file.go for file type detection and filename handling - Implement file upload and folder creation endpoints in handler/file.go - Implement file upload and folder creation logic in service/file.go - Modify response message format in memory.go - Add document count method in dao/document.go ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-02 20:21:04 +08:00
Jin Hai	6c29128de1	Refactor model provider and command (#13887 ) ### What problem does this PR solve? Introduce 5 new tables, including model groups and provider instance. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-02 20:20:35 +08:00

5699 Commits