ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-19 03:35:11 +08:00

Author	SHA1	Message	Date
Jin Hai	38f0a92da9	Use RAGFlow CLI to replace RAGFlow Admin CLI (#12653 ) ### What problem does this PR solve? ``` $ python admin/client/ragflow_cli.py -t user -u aaa@aaa.com -p 9380 ragflow> list datasets; ragflow> list default models; ragflow> show version; ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com> nightly	2026-01-17 17:52:38 +08:00
writinwaters	067ddcbf23	Docs: Added configure memory (#12665 ) ### What problem does this PR solve? As title. ### Type of change - [x] Documentation Update	2026-01-17 17:49:19 +08:00
Hetavi Shah	46305ef35e	Add User API Token Management to Admin API and CLI (#12595 ) ## Summary This PR extends the RAGFlow Admin API and CLI with comprehensive user API token management capabilities. Administrators can now generate, list, and delete API tokens for users through both the REST API and the Admin CLI interface. ## Changes ### Backend API (`admin/server/`) #### New Endpoints - POST `/api/v1/admin/users/<username>/new_token` - Generate a new API token for a user - GET `/api/v1/admin/users/<username>/token_list` - List all API tokens for a user - DELETE `/api/v1/admin/users/<username>/token/<token>` - Delete a specific API token for a user #### Service Layer Updates (`services.py`) - Added `get_user_api_key(username)` - Retrieves all API tokens for a user - Added `save_api_token(api_token)` - Saves a new API token to the database - Added `delete_api_token(username, token)` - Deletes an API token for a user ### Admin CLI (`admin/client/`) #### New Commands - `GENERATE TOKEN FOR USER <username>;` - Generate a new API token for the specified user - `LIST TOKENS OF <username>;` - List all API tokens associated with a user - `DROP TOKEN <token> OF <username>;` - Delete a specific API token for a user ### Testing Added comprehensive test suite in `test/testcases/test_admin_api/`: - `test_generate_user_api_key.py` - Tests for API token generation - `test_get_user_api_key.py` - Tests for listing user API tokens - `test_delete_user_api_key.py` - Tests for deleting API tokens - `conftest.py` - Shared test fixtures and utilities ## Technical Details ### Token Generation - Tokens are generated using `generate_confirmation_token()` utility - Each token includes metadata: `tenant_id`, `token`, `beta`, `create_time`, `create_date` - Tokens are associated with user tenants automatically ### Security Considerations - All endpoints require admin authentication (`@check_admin_auth`) - Tokens are URL-encoded when passed in DELETE requests to handle special characters - Proper error handling for unauthorized access and missing resources ### API Response Format All endpoints follow the standard RAGFlow response format: ```json { "code": 0, "data": {...}, "message": "Success message" } ``` ## Files Changed - `admin/client/admin_client.py` - CLI token management commands - `admin/server/routes.py` - New API endpoints - `admin/server/services.py` - Token management service methods - `docs/guides/admin/admin_cli.md` - CLI documentation updates - `test/testcases/test_admin_api/conftest.py` - Test fixtures - `test/testcases/test_admin_api/test_user_api_key_management/*` - Test suites ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Alexander Strasser <alexander.strasser@ondewo.com> Co-authored-by: Hetavi Shah <your.email@example.com>	2026-01-17 15:21:00 +08:00
He Wang	bd9163904a	fix(ob_conn): ignore duplicate errors when executing 'create_idx' (#12661 ) ### What problem does this PR solve? Skip duplicate errors to avoid 'create_idx' failures caused by slow metadata refresh or external modifications. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 20:46:37 +08:00
Kevin Hu	b6d7733058	Feat: metadata settings in KB. (#12662 ) ### What problem does this PR solve? #11910 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-16 20:14:02 +08:00
6ba3i	4f036a881d	Fix: Infinity keyword round-trip, highlight fallback, and KB update guards (#12660 ) ### What problem does this PR solve? Fixes Infinity-specific API regressions: preserves ```important_kwd``` round‑trip for ```[""]```, restores required highlight key in retrieval responses, and enforces Infinity guards for unsupported ```parser_id=tag``` and pagerank in ```/v1/kb/update```. Also removes a slow/buggy pandas row-wise apply that was throwing ```ValueError``` and causing flakiness. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 20:03:52 +08:00
6ba3i	59075a0b58	Fix : p3 level sdk test error for update chat (#12654 ) ### What problem does this PR solve? fix for update chat failing ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 17:47:12 +08:00
PentaFDevs	30bd25716b	Fix PDF Generator output variables not appearing in subsequent agent steps (#12619 ) This commit fixes multiple issues preventing PDF Generator (Docs Generator) output variables from being visible in the Output section and available to downstream nodes. ### What problem does this PR solve? Issues Fixed: 1. PDF Generator nodes initialized with empty object instead of proper initial values 2. Output structure mismatch (had 'value' property that system doesn't expect) 3. Missing 'download' output in form schema 4. Output list computed from static values instead of form state 5. Added null/undefined guard to transferOutputs function ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Changes: - web/src/pages/agent/constant/index.tsx: Fixed output structure in initialPDFGeneratorValues - web/src/pages/agent/hooks/use-add-node.ts: Initialize PDF Generator with proper values - web/src/pages/agent/form/pdf-generator-form/index.tsx: Fixed schema and use form.watch - web/src/pages/agent/form/components/output.tsx: Added null guard and spacing	2026-01-16 16:50:53 +08:00
balibabu	99dae3c64c	Fix: In the agent loop, if the await response is selected as the variable, the operator cannot be selected. #12656 (#12657 ) ### What problem does this PR solve? Fix: In the agent loop, if the await response is selected as the variable, the operator cannot be selected. #12656 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 16:49:48 +08:00
Magicbook1108	045314a1aa	Fix: duplicate content in chunk (#12655 ) ### What problem does this PR solve? Fix: duplicate content in chunk #12336 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 15:32:04 +08:00
6ba3i	2b20d0b3bb	Fix : Web API tests by normalizing errors, validation, and uploads (#12620 ) ### What problem does this PR solve? Fixes web API behavior mismatches that caused test failures by normalizing error responses, tightening validations, correcting error messages, and closing upload file handles. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 11:09:22 +08:00
zagnaan	59f4c51222	fix(entrypoint): Preserve $ in passwords during template expansion (#12509 ) ### What problem does this PR solve? Fix shell variable expansion to preserve $ in password defaults when env vars are unset. Fixes Azure RDS auto-rotated passwords (that contain $) being truncated during template processing. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-15 19:30:33 +08:00
chanx	8c1fbfb130	Fix：Some bugs (#12648 ) ### What problem does this PR solve? Fix: Modified and optimized the metadata condition card component. Fix: Use startOfDay and endOfDay to ensure the date range includes a full day. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-15 19:28:22 +08:00
Kevin Hu	cec06bfb5d	Fix: empty chunk issue. (#12638 ) #12570 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-15 17:46:21 +08:00
writinwaters	2167e3a3c0	Docs: Added share memory (#12647 ) ### Type of change - [x] Documentation Update	2026-01-15 17:21:36 +08:00
liuxiaoyusky	2ea8dddef6	fix(infinity): Use comma separator for important_kwd to preserve mult… (#12618 ) ## Problem The \`important_kwd\` field in Infinity connector was using mismatched separators: - Storage: \`list2str(v)\` uses space as default separator - Reading: \`v.split()\` splits by all whitespace This causes multi-word keywords like \`\"Senior Fund Manager\"\` to be incorrectly split into \`[\"Senior\", \"Fund\", \"Manager\"]\`. ## Solution Use comma \`,\` as separator for both storing and reading, consistent with: 1. The LLM output format in \`keyword_prompt.md\` (\"delimited by ENGLISH COMMA\") 2. The \`cached.split(\",\")\` in \`task_executor.py\` ## Changes - \`insert()\`: \`list2str(v)\` → \`list2str(v, \",\")\` - \`update()\`: \`list2str(v)\` → \`list2str(v, \",\")\` - \`get_fields()\`: \`v.split()\` → \`v.split(\",\") if v else []\` ## Impact This bug affects: - Python-level reranking weight calculation (\`important_kwd * 5\`) - API response keyword display - Search precision due to fragmented keywords	2026-01-15 15:32:40 +08:00
longbingljw	18867daba7	chore: bump pyobvector from 0.2.18 to 0.2.22 (#12640 ) ### What problem does this PR solve? Update ob client ### Type of change - [x] Other (please describe):dependency upgrade	2026-01-15 15:21:34 +08:00
longbingljw	d68176326d	feat: add oceanbase mount to gitignore (#12642 ) ### What problem does this PR solve? feat: add oceanbase mount to .gitignore ### Type of change - [x] Refactoring	2026-01-15 15:20:40 +08:00
balibabu	d531bd4f1a	Fix: Editing the agent greeting causes the greeting to be continuously added to the message list. #12635 (#12636 ) ### What problem does this PR solve? Fix: Editing the agent greeting causes the greeting to be continuously added to the message list. #12635 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-15 14:55:19 +08:00
Vedant Madane	ac936005e6	fix: ensure deleted chunks are not returned in retrieval (#12520 ) (#12546 ) ## Summary Fixes #12520 - Deleted chunks should not appear in retrieval/reference results. ## Changes ### Core Fix - api/apps/chunk_app.py: Include \doc_id\ in delete condition to properly scope the delete operation ### Improved Error Handling - api/db/services/document_service.py: Better separation of concerns with individual try-catch blocks and proper logging for each cleanup operation ### Doc Store Updates - rag/utils/es_conn.py: Updated delete query construction to support compound conditions - rag/utils/opensearch_conn.py: Same updates for OpenSearch compatibility ### Tests - test/testcases/.../test_retrieval_chunks.py: Added \TestDeletedChunksNotRetrievable\ class with regression tests - test/unit/test_delete_query_construction.py: Unit tests for delete query construction ## Testing - Added regression tests that verify deleted chunks are not returned by retrieval API - Tests cover single chunk deletion and batch deletion scenarios	2026-01-15 14:45:55 +08:00
Pegasus	d8192f8f17	Fix: validate regex pattern in split_with_pattern to prevent crash (#12633 ) ### What problem does this PR solve? Fix regex pattern validation in split_with_pattern (#12605) - Add try-except block to validate user-provided regex patterns before use - Gracefully fallback to single chunk when invalid regex is provided - Prevent server crash during DOCX parsing with malformed delimiters ## Problem Parsing DOCX files with custom regex delimiters crashes with `re.error: nothing to repeat at position 9` when users provide invalid regex patterns. Closes #12605 ## Solution Validate and compile regex pattern before use. On invalid pattern, log warning and return content as single chunk instead of crashing. ## Changes - `rag/nlp/__init__.py`: Add regex validation in `split_with_pattern()` function ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=42954461	2026-01-15 14:24:51 +08:00
Kevin Hu	eb35e2b89f	Fix: async invocation isssue. (#12634 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-15 14:22:16 +08:00
MkDev11	97b983fd0b	fix: add fallback parser list for empty parser_ids (#12632 ) ### What problem does this PR solve? Fixes #12570 - The slicing method dropdown was empty when deploying RAGFlow v0.23.1 from source code. The issue occurred because `parser_ids` from the tenant info was empty or undefined, causing `useSelectParserList` to return an empty array. This PR adds a fallback to a default parser list when `parser_ids` is empty, ensuring the dropdown always has options. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --- Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=94194147	2026-01-15 14:05:25 +08:00
Magicbook1108	b40a7b2e7d	Feat: Hash doc id to avoid duplicate name. (#12573 ) ### What problem does this PR solve? Feat: Hash doc id to avoid duplicate name. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-15 14:02:15 +08:00
Kevin Hu	9a10558f80	Refa: async retrieval process. (#12629 ) ### Type of change - [x] Refactoring - [x] Performance Improvement	2026-01-15 12:28:49 +08:00
SID	f82628c40c	Fix: langfuse connection error handling #12621 (#12626 ) ## Description Fixes connection error handling when langfuse service is unavailable. The application now gracefully handles connection failures instead of crashing. ## Changes - Wrapped `langfuse.auth_check()` calls in try-except blocks in: - `api/db/services/dialog_service.py` - `api/db/services/tenant_llm_service.py` ## Problem When langfuse service is unavailable or connection is refused, `langfuse.auth_check()` throws `httpx.ConnectError: [Errno 111] Connection refused`, causing the application to crash during document parsing or dialog operations. ## Solution Added try-except blocks around `langfuse.auth_check()` calls to catch connection errors and gracefully skip langfuse tracing instead of crashing. The application continues functioning normally even when langfuse is unavailable. ## Related Issue Fixes #12621 --- Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=158349177	2026-01-15 11:23:15 +08:00
chanx	7af98328f5	Fix: the styles of the multi-select component and the filter pop-up. (#12628 ) ### What problem does this PR solve? Fix: Fix the styles of the multi-select component and the filter pop-up. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-15 10:53:18 +08:00
MkDev11	678a4f959c	Fix: skip internal bookmark references in DOCX parsing (#12604 ) (#12611 ) ### What problem does this PR solve? Fixes #12604 - DOCX files containing hyperlinks to internal bookmarks (e.g., `#_文档目录`) cause a `KeyError` during parsing: ``` KeyError: "There is no item named 'word/#_文档目录' in the archive" ``` This happens because python-docx incorrectly tries to read internal bookmark references as files from the ZIP archive. Internal bookmarks are relationship targets starting with `#` and are not actual files. This PR extends the existing `load_from_xml_v2` workaround (which already handles `NULL` targets) to also skip relationship targets starting with `#`. Related upstream issue: https://github.com/python-openxml/python-docx/issues/902 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --- Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=94194147	2026-01-14 19:08:46 +08:00
Kevin Hu	15a8bb2e9c	Fix: chunk list async issue. (#12615 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-14 17:32:07 +08:00
Pegasus	b091ff2730	Fix enable_thinking parameter for Qwen3 models (#12603 ) ### Issue When using Qwen3 models (`qwen3-32b`, `qwen3-max`) through the Tongyi-Qianwen provider for non-streaming calls (e.g., knowledge graph generation), the API fails with: Closes #12424 ``` parameter.enable_thinking must be set to false for non-streaming calls ``` ### Root Cause In `LiteLLMBase.async_chat()`, the `extra_body={"enable_thinking": False}` was set in `kwargs` but never forwarded to `_construct_completion_args()`. ### What problem does this PR solve? Pass merged kwargs to `_construct_completion_args()` using `{gen_conf, **kwargs}` to safely handle potential duplicate parameters. ### Changes - `rag/llm/chat_model.py`: Forward kwargs containing `extra_body` to `_construct_completion_args()` in `async_chat()` _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=42954461	2026-01-14 16:35:46 +08:00
6ba3i	5b22f94502	Feat: Benchmark CLI additions and documentation (#12536 ) ### What problem does this PR solve? This PR adds a dedicated HTTP benchmark CLI for RAGFlow chat and retrieval endpoints so we can measure latency/QPS. ### Type of change - [x] Documentation Update - [x] Other (please describe): Adds a CLI benchmarking tool for chat/retrieval latency/QPS --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-14 13:49:16 +08:00
Yongteng Lei	a7671583b3	Feat: add CN regions for AWS (#12610 ) ### What problem does this PR solve? Add CN regions for AWS. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-14 12:34:55 +08:00
balibabu	d32fa02d97	Fix: Unable to copy category node. #12607 (#12609 ) ### What problem does this PR solve? Fix: Unable to copy category node. #12607 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-14 11:45:31 +08:00
lys1313013	f72a35188d	refactor: remove debug print statements (#12598 ) ### What problem does this PR solve? This PR eliminates unnecessary debug print statements that were left in hot paths of the codebase. ### Type of change - [x] Refactoring	2026-01-14 10:05:34 +08:00
6ba3i	ea619dba3b	Added to the HTTP API test suite (#12556 ) ### What problem does this PR solve? This PR adds missing HTTP API test coverage for dataset graph/GraphRAG/RAPTOR tasks, metadata summary, chat completions, agent sessions/completions, and related questions. It also introduces minimal HTTP test helpers to exercise these endpoints consistently with the existing suite. ### Type of change - [x] Other (please describe): Test coverage (HTTP API tests) --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-14 10:02:30 +08:00
writinwaters	36b0835740	Docs: Use memory (#12599 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2026-01-14 09:40:31 +08:00
6ba3i	0795616b34	Align p3 HTTP/SDK tests with current backend behavior (#12563 ) ### What problem does this PR solve? Updates pre-existing HTTP API and SDK tests to align with current backend behavior (validation errors, 404s, and schema defaults). This ensures p3 regression coverage is accurate without changing production code. ### Type of change - [x] Other (please describe): align p3 HTTP/SDK tests with current backend behavior --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-13 19:22:47 +08:00
Yongteng Lei	941651a16f	Fix: wrong input trace in Category component (#12590 ) ### What problem does this PR solve? Wrong input trace in Category component ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-13 17:54:57 +08:00
He Wang	360114ed42	fix(ob_conn): avoid reusing SQLAlchemy Column objects in DDL (#12588 ) ### What problem does this PR solve? When there are multiple users, parsing a document for a new user can trigger the reuse of column objects, leading to the error `sqlalchemy.exc.ArgumentError: Column object 'id' already assigned to Table xxx`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-13 17:39:20 +08:00
chanx	ffedb2c6d3	Feat: The MetadataFilterConditions component supports adding values via search. (#12585 ) ### What problem does this PR solve? Feat: The MetadataFilterConditions component supports adding values via search. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-13 17:03:25 +08:00
LIRUI YU	947e63ca14	Fixed typos and added pptx preview for frontend (#12577 ) ### What problem does this PR solve? Previously, we added support for previewing PPT and PPTX files in the backend. Now, we are adding it to the frontend, so when the slides in the chat interface are referenced, they will no longer be blank. ### Type of change - Bug Fix (non-breaking change which fixes an issue)	2026-01-13 17:02:36 +08:00
He Wang	34d74d9928	fix: add uv-aarch64-unknown-linux-gnu.tar.gz to deps image (#12516 ) ### What problem does this PR solve? Add uv-aarch64-unknown-linux-gnu.tar.gz to support building ARM64 Docker images. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Liu An <asiro@qq.com>	2026-01-13 15:37:32 +08:00
balibabu	accae95126	Feat: Exported Agent JSON Should Include Conversation Variables Configuration #11796 (#12579 ) ### What problem does this PR solve? Feat: Exported Agent JSON Should Include Conversation Variables Configuration #11796 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-13 15:35:45 +08:00
Yongteng Lei	68e5c86e9c	Fix: image not displaying thumbnails when using pipeline (#12574 ) ### What problem does this PR solve? Fix image not displaying thumbnails when using pipeline. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-13 12:54:13 +08:00
Yongteng Lei	64c75d558e	Fix: zip extraction vulnerabilities in MinerU and TCADP (#12527 ) ### What problem does this PR solve? Fix zip extraction vulnerabilities: - Block symlink entries in zip files. - Reject encrypted zip entries. - Prevent absolute path attacks (including Windows paths). - Block path traversal attempts (../). - Stop zip slip exploits (directory escape). - Use streaming for memory-safe file handling. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-13 12:24:50 +08:00
LIRUI YU	41c84fd78f	Add MIME types for PPT and PPTX files (#12562 ) Otherwise, slide files cannot be opened in Chat module ### What problem does this PR solve? Backend Reason (API): In the api/utils/web_utils.py file of the backend, the CONTENT_TYPE_MAP dictionary is missing ppt and pptx. MIME type mapping. This means that when the frontend requests a PPTX file, the backend cannot correctly inform the browser that it is a PPTX file, resulting in the file being displayed incorrectly. Type identification error. ### Type of change - Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-01-13 12:17:49 +08:00
LGRY	d76912ab15	Fix: Use uv pip install for Docling installation (#12567 ) Fixes #12440 ### What problem does this PR solve? The current implementation uses `python3 -m pip` which can fail in certain environments. This change leverages `uv pip install` instead, which aligns with the project's existing tooling. ### Type of change - Removed the ensurepip line (not needed since uv manages pip) - Changed python3 to "$PY" for consistency with the rest of the script - Changed python3 -m pip install to uv pip install Co-authored-by: Gongzi <gongzi@192.168.0.100>	2026-01-13 11:48:42 +08:00
Lin Manhui	4fe3c24198	feat: PaddleOCR PDF parser supports thumnails and positions (#12565 ) ### What problem does this PR solve? 1. PaddleOCR PDF parser supports thumnails and positions. 2. Add FAQ documentation for PaddleOCR PDF parser. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-13 09:51:08 +08:00
Kevin Hu	44bada64c9	Feat: support tree structured deep-research policy. (#12559 ) ### What problem does this PR solve? #12558 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-13 09:41:35 +08:00
Jimmy Ben Klieve	867ec94258	revert white-space changes in docs (#12557 ) ### What problem does this PR solve? Trailing white-spaces in commit `6814ace1aa` got automatically trimmed by code editor may causes documentation typesetting broken. Mostly for double spaces for soft line breaks. ### Type of change - [x] Documentation Update	2026-01-13 09:41:02 +08:00

1 2 3 4 5 ...

5071 Commits