ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-01 05:17:51 +08:00

Author	SHA1	Message	Date
Magicbook1108	7143954b48	Fix: chats_openai in non-stream condition (#13495 ) ### What problem does this PR solve? Fix: chats_openai in non-stream condition #13453 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-10 13:44:17 +08:00
qinling0210	7c92f51133	Fix retrieval function when metadata_condition is specified in retrieval API (#13473 ) ### What problem does this PR solve? Fix https://github.com/infiniflow/ragflow/issues/13388 The following command returns an empty result even when a document with the matching metadata exists ``` curl --request POST \ --url http://localhost:9222/api/v1/retrieval \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ragflow-fO3mPFePfLgUYg8-9gjBVVXbvHqrvMPLGaW0P86PvAk' \ --data '{ "question": "any question", "dataset_ids": ["9bb4f0591b8811f18a4a84ba59049aa3"], "metadata_condition": { "logic": "and", "conditions": [ { "name": "character", "comparison_operator": "is", "value": "刘备" } ] } }' ``` When metadata_condition is specified in the retrieval API, it is converted to doc_ids, and doc_ids is passed to the retrieval function. In the retrieval function, when doc_ids is explicitly provided, we should bypass the threshold. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-10 11:57:32 +08:00
Liu An	7166a7e50e	Test: adjust test priority markers for API tests (#13450 ) ### What problem does this PR solve? Changed test priority markers from p1/p2 to p3 in three test files: - test_table_parser_dataset_chat.py: Adjusted priority for table parser dataset chat test - test_delete_chunks.py: Updated priority for chunk deletion test with invalid IDs - test_retrieval_chunks.py: Modified priority for chunks retrieval pagination test These changes demote the priority of specific test cases to p3, indicating they are lower priority tests that can run later in the test suite execution. ### Type of change - [x] Test update	2026-03-06 20:17:39 +08:00
OliverW	3ed91345aa	fix(auth): return HTTP 401 for token-auth failures (#13420 ) Follow-up to #12488 #13386 ### What problem does this PR solve? Previously, token authentication failures returned HTTP 200 with an error code in the response body. This PR updates `token_required` to raise `Unauthorized` and relies on the global error handler to return a structured JSON response with HTTP 401 status. The response body structure (`code`, `message`, `data`) remains unchanged to preserve compatibility with the official SDK. Frontend logic has been updated to handle HTTP 401 responses in addition to checking `data.code`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-06 18:18:14 +08:00
Yongteng Lei	51be1f1442	Refa: empty ids means no-op operation (#13439 ) ### What problem does this PR solve? Empty ids means no-op operation. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Documentation Update - [x] Refactoring --------- Co-authored-by: writinwaters <cai.keith@gmail.com>	2026-03-06 18:16:42 +08:00
Lynn	62cb292635	Feat/tenant model (#13072 ) ### What problem does this PR solve? Add id for table tenant_llm and apply in LLMBundle. ### Type of change - [x] Refactoring --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Liu An <asiro@qq.com>	2026-03-05 17:27:17 +08:00
Good0987	8a7272f423	Test: add scenario for embedding_model update when chunk_count > 0 (#13351 ) ### What problem does this PR solve? Guard embedding_model change when dataset has existing chunks. API must return code 102 with message 'When chunk_num (N) > 0, embedding_model must remain <current_model>' to prevent silent embedding drift. ### Type of change - [x] Add Testcases Co-authored-by: Liu An <asiro@qq.com>	2026-03-04 17:41:35 +08:00
Idriss Sbaaoui	9d78d3ddb1	Tests: fix failing http in CI (#13301 ) ### What problem does this PR solve? test_doc_sdk_routes_unit had two flaky/incorrect branch assumptions: 1. parse/stop_parsing production logic gates on doc.run, but tests used progress, causing branch mismatch and unintended fallthrough into mutation/DB paths. 2. stop_parsing invalid-state test asserted an outdated message fragment, making the contract brittle. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-02 10:44:33 +08:00
天海蒼灆	983150b936	Fix (api): fix the document parsing status check logic (#12504 ) ### What problem does this PR solve? When the original code terminates a parsing task halfway, the progress may be neither 0 nor 1, which makes it impossible to call the interface to parse the document again. - Change the document parsing progress check to a task status check, using TaskStatus.RUNNING.value to judge - Update the condition for stopping document parsing to check whether the task is running instead ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-28 14:38:55 +08:00
qinling0210	8b6d363a98	Use pagination in _search_metadata (#13238 ) ### What problem does this PR solve? Fix [#13210](https://github.com/infiniflow/ragflow/issues/13210) Remove limit in _search_metadata, use pagination in _search_metadata. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-27 11:24:49 +08:00
6ba3i	22c4d72891	tests: improve RAGFlow coverage based on Codecov report (#13219 ) ### What problem does this PR solve? Codecov’s coverage report shows that several RAGFlow code paths are currently untested or under-tested. This makes it easier for regressions to slip in during refactors and feature work. This PR adds targeted automated tests to cover the files and branches highlighted by Codecov, improving confidence in core behavior while keeping runtime functionality unchanged. ### Type of change - [x] Other (please describe): Test coverage improvement (adds/extends unit and integration tests to address Codecov-reported gaps)	2026-02-26 19:03:26 +08:00
6ba3i	38011f2c16	tests: improve RAGFlow coverage based on Codecov report (#13200 ) ### What problem does this PR solve? Codecov’s coverage report shows that several RAGFlow code paths are currently untested or under-tested. This makes it easier for regressions to slip in during refactors and feature work. This PR adds targeted automated tests to cover the files and branches highlighted by Codecov, improving confidence in core behavior while keeping runtime functionality unchanged. ### Type of change - [x] Other (please describe): Test coverage improvement (adds/extends unit and integration tests to address Codecov-reported gaps)	2026-02-25 19:12:11 +08:00
6ba3i	fabbfcab90	Fix: failing p3 test for SDK/HTTP APIs (#13062 ) ### What problem does this PR solve? Adjust highlight parsing, add row-count SQL override, tweak retrieval thresholding, and update tests with engine-aware skips/utilities. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-09 14:56:10 +08:00
Liu An	c4f60b349d	Fix(test): downgrade test priorities (#12913 ) ### What problem does this PR solve? Changed test priorities in multiple test files, downgrading from p1 to p2 and p2 to p3. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-30 20:02:56 +08:00
Liu An	4947e9473a	Fix(test): Update error message assertions for unsupported content type tests (#12901 ) ### What problem does this PR solve? This commit updates test cases for create, delete, and update dataset endpoints to expect consistent error messages when an unsupported content type is provided. ### Type of change - [x] Bug Fix (test)	2026-01-30 09:45:04 +08:00
qinling0210	9a5208976c	Put document metadata in ES/Infinity (#12826 ) ### What problem does this PR solve? Put document metadata in ES/Infinity. Index name of meta data: ragflow_doc_meta_{tenant_id} ### Type of change - [x] Refactoring	2026-01-28 13:29:34 +08:00
Stephen Hu	52da81cf9e	Fix: Redis configuration template error in v0.22.1 (#12685 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/12674 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-27 12:47:46 +08:00
Julien Deveaux	6be197cbb6	Fix: Use tiktoken for proper token counting in OpenAI-compatible endpoint #7850 (#12760 ) ### What problem does this PR solve? The OpenAI-compatible chat endpoint (`/chats_openai/<chat_id>/chat/completions`) was not returning accurate token usage in streaming responses. The token counts were either missing or inaccurate because the underlying LLM API responses weren't being properly parsed for usage data. This PR adds proper token counting using tiktoken (cl100k_base encoding) as a fallback when the LLM API doesn't provide usage data in streaming chunks. This ensures clients always receive token usage information in the response, which is essential for billing and quota management. Changes: - Add tiktoken-based token counting for streaming responses in OpenAI-compatible endpoint - Ensure `usage` field is always populated in the final streaming chunk - Add unit tests for token usage calculation Fixes #7850 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-23 09:36:21 +08:00
Kevin Hu	3beb85efa0	Feat: enhance metadata arranging. (#12745 ) ### What problem does this PR solve? #11564 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-22 15:34:08 +08:00
Liu An	f98abf14a8	Refa(test): improve code formatting and remove debug prints (#12739 ) ### What problem does this PR solve? - Improving code formatting and consistency - Removing debug print statements ### Type of change - [x] Refactoring	2026-01-21 14:53:17 +08:00
6ba3i	aee9860970	Make document change-status idempotent for Infinity doc store (#12717 ) ### What problem does this PR solve? This PR makes the document change‑status endpoint idempotent under the Infinity doc store. If a document already has the requested status, the handler returns success without touching the engine, preventing unnecessary updates and avoiding missing‑table errors while keeping responses consistent. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-20 19:11:21 +08:00
qinling0210	b40d639fdb	Add dataset with table parser type for Infinity and answer question in chat using SQL (#12541 ) ### What problem does this PR solve? 1) Create dataset using table parser for infinity 2) Answer questions in chat using SQL ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-19 19:35:14 +08:00
Vedant Madane	ac936005e6	fix: ensure deleted chunks are not returned in retrieval (#12520 ) (#12546 ) ## Summary Fixes #12520 - Deleted chunks should not appear in retrieval/reference results. ## Changes ### Core Fix - api/apps/chunk_app.py: Include `doc_id` in delete condition to properly scope the delete operation ### Improved Error Handling - api/db/services/document_service.py: Better separation of concerns with individual try-catch blocks and proper logging for each cleanup operation ### Doc Store Updates - rag/utils/es_conn.py: Updated delete query construction to support compound conditions - rag/utils/opensearch_conn.py: Same updates for OpenSearch compatibility ### Tests - test/testcases/.../test_retrieval_chunks.py: Added `TestDeletedChunksNotRetrievable` class with regression tests - test/unit/test_delete_query_construction.py: Unit tests for delete query construction ## Testing - Added regression tests that verify deleted chunks are not returned by retrieval API - Tests cover single chunk deletion and batch deletion scenarios	2026-01-15 14:45:55 +08:00
6ba3i	ea619dba3b	Added to the HTTP API test suite (#12556 ) ### What problem does this PR solve? This PR adds missing HTTP API test coverage for dataset graph/GraphRAG/RAPTOR tasks, metadata summary, chat completions, agent sessions/completions, and related questions. It also introduces minimal HTTP test helpers to exercise these endpoints consistently with the existing suite. ### Type of change - [x] Other (please describe): Test coverage (HTTP API tests) --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-14 10:02:30 +08:00
6ba3i	0795616b34	Align p3 HTTP/SDK tests with current backend behavior (#12563 ) ### What problem does this PR solve? Updates pre-existing HTTP API and SDK tests to align with current backend behavior (validation errors, 404s, and schema defaults). This ensures p3 regression coverage is accurate without changing production code. ### Type of change - [x] Other (please describe): align p3 HTTP/SDK tests with current backend behavior --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-13 19:22:47 +08:00
Jin Hai	d1c4077a75	Fix directory name (#12195 ) ### What problem does this PR solve? as title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-25 14:24:13 +08:00
Jin Hai	30019dab9f	Change knowledge base to dataset (#11976 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-17 10:03:33 +08:00
Zhichang Yu	40e84ca41a	Use Infinity single-field-multi-index (#11444 ) ### What problem does this PR solve? Use Infinity single-field-multi-index ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-11-26 11:06:37 +08:00
Liu An	bfc84ba95b	Test: handle duplicate names by appending "(1)" (#11244 ) ### What problem does this PR solve? - Updated tests to reflect new behavior of handling duplicate dataset names - Instead of returning an error, the system now appends "(1)" to duplicate names - This problem was introduced by PR #10960 ### Type of change - [x] Testcase update	2025-11-13 15:18:32 +08:00
Billy Bao	19f71a961a	Fix: Create dataset performance unmatched between HTTP api and web ui (#10960 ) ### What problem does this PR solve? Fix: Create dataset performance unmatched between HTTP api and web ui #10925 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-11-04 13:45:14 +08:00
Zhichang Yu	73144e278b	Don't release full image (#10654 ) ### What problem does this PR solve? - Introduced gpu profile in .env - Added Dockerfile_tei - fix datrie - Removed LIGHTEN flag ### Type of change - [x] Documentation Update - [x] Refactoring	2025-10-23 23:02:27 +08:00
Liu An	594bf485d4	Test: update test cases for chunk retrieval pagination (#10694 ) ### What problem does this PR solve? Updated test cases in test_retrieval_chunks.py to: - Remove skip mark from page pagination test case (issues/6646 resolved) - Add skip marks for page_size=1 tests due to new issue (issues/10692) ### Type of change - [x] Test	2025-10-21 13:02:29 +08:00
writinwaters	6e862553cb	Docs: Deprecated 'Create session with agent' (#9464 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-08-14 12:13:11 +08:00
Liu An	b55c3d07dc	Test: Update error message assertions for chunk update tests (#9468 ) ### What problem does this PR solve? Modify test cases to accept additional error message format when updating chunks. fix actions: https://github.com/infiniflow/ragflow/actions/runs/16942741621/job/48015850297 ### Type of change - [x] Update test cases	2025-08-14 12:11:20 +08:00
Liu An	57b9f8cf52	Fix: Update test assertions and simplify test cases (#9400 ) ### What problem does this PR solve? - Fix error message assertion in test_update_chunk.py to match new ownership validation - Simplify dataset listing test cases by removing lambda assertions for sorting - Fix actions: https://github.com/infiniflow/ragflow/actions/runs/16885465524/job/47831942553 ### Type of change - [x] Fix test cases	2025-08-12 10:57:30 +08:00
Zhichang Yu	342a04ec8a	Added infinity rank_feature support (#9044 ) ### What problem does this PR solve? Added infinity rank_feature support ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-29 09:14:23 +08:00
Liu An	b5ffca332a	Refa: validation utils to use Pydantic v2 style models (#9037 ) ### What problem does this PR solve? - Update BaseModel to use model_config instead of Config class - Replace StrEnum with Literal types for method fields - Convert Field declarations to Annotated style ### Type of change - [x] Refactoring	2025-07-25 12:16:45 +08:00
Liu An	b4b6d296ea	Fix: Increase timeouts for document parsing and model checks (#8996 ) ### What problem does this PR solve? - Extended embedding model timeout from 3 to 10 seconds in api_utils.py - Added more time for large file batches and concurrent parsing operations to prevent test flakiness - Import from #8940 - https://github.com/infiniflow/ragflow/actions/runs/16422052652 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-23 15:08:36 +08:00
Liu An	0020c50000	Fix: Refactor parser config handling and add GraphRAG defaults (#8778 ) ### What problem does this PR solve? - Update `get_parser_config` to merge provided configs with defaults - Add GraphRAG configuration defaults for all chunk methods - Make raptor and graphrag fields non-nullable in ParserConfig schema - Update related test cases to reflect config changes - Ensure backward compatibility while adding new GraphRAG support - #8396 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-23 09:29:37 +08:00
Liu An	f8524462b0	Fix: Increase default `chunk_token_num` from 128 to 512 in parser config (#8753 ) ### What problem does this PR solve? Updated the default `chunk_token_num` value in `api_utils.py` and `validation_utils.py` to 512 to accommodate larger text chunks. Adjusted corresponding test cases in HTTP and SDK API tests to reflect this change. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-10 09:53:20 +08:00
Yongteng Lei	4d7bfd2ba3	Fix: typo process_duration (#8696 ) ### What problem does this PR solve? Fix typo process_duration. ### Type of change - [x] Documentation Update - [x] Refactoring	2025-07-07 14:11:47 +08:00
Liu An	0b40eb3e90	Test: Add tests for chunk API endpoints (#8616 ) ### What problem does this PR solve? - Add comprehensive test suite for chunk operations including: - Test files for create, list, retrieve, update, and delete chunks - Authorization tests - Batch operations tests - Update test configurations and common utilities - Validate `important_kwd` and `question_kwd` fields are lists in chunk_app.py - Reorganize imports and clean up duplicate code ### Type of change - [x] Add test cases	2025-07-02 09:49:08 +08:00
Liu An	dac5bcdf17	Fix: Enforce default embedding model in create_dataset / update_dataset (#8486 ) ### What problem does this PR solve? Previous: - Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI' - Did not respect user-configured default embedding_model Now: - Correctly prioritizes user-configured default embedding_model Other: - Make embedding_model optional in CreateDatasetReq with proper None handling - Add default embedding model fallback in dataset update when empty - Enhance validation utils to handle None values and string normalization - Update SDK default embedding model to None to match API changes - Adjust related test cases to reflect new validation rules ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-25 16:41:32 +08:00
Liu An	9f9acf0c49	Test: Add document app tests (#8456 ) ### What problem does this PR solve? - Add new test suite for document app with create/list/parse/upload/remove tests - Update API URLs to use version variable from config in HTTP and web API tests ### Type of change - [x] Add test cases	2025-06-24 17:26:16 +08:00
Jin Hai	e470645efd	Refactor code (#8341 ) ### What problem does this PR solve? 1. rename var 2. update if statement ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-18 16:40:30 +08:00
Liu An	a3bebeb599	Fix: Enforce 255-byte filename limit (#8290 ) ### What problem does this PR solve? - Add filename length validation (<=255 bytes) for document upload/rename in both HTTP and SDK APIs - Update error messages for consistency - Fix comparison operator in SDK from '>=' to '>' for filename length check ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-16 16:39:41 +08:00
Liu An	64af09ce7b	Test: Add web API test suite for knowledge base operations (#8254 ) ### What problem does this PR solve? - Implement RAGFlowWebApiAuth class for web API authentication - Add comprehensive test cases for KB CRUD operations - Set up common fixtures and utilities in conftest.py - Add helper functions in common.py for web API requests The changes establish a complete testing framework for knowledge base management via web API endpoints. ### Type of change - [x] Add test case	2025-06-13 16:39:10 +08:00
Liu An	86a1411b07	Refa: Test configs (#8220 ) ### What problem does this PR solve? - Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to configs.py - Update test imports to use centralized configs - Clean up duplicate constant definitions across test files This improves maintainability by centralizing configuration. ### Type of change - [x] Refactoring test case	2025-06-12 17:42:00 +08:00
Liu An	54a465f9e8	Test: fix chunk deletion test assertions (#8222 ) ### What problem does this PR solve? - Fix test assertions in test_delete_chunks.py to expect empty results after deletion Action 7619 ### Type of change - [x] Bug Fix test cases	2025-06-12 17:41:46 +08:00
Liu An	7fbbc9650d	Fix: Move pagerank field from create to update dataset API (#8217 ) ### What problem does this PR solve? - Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq - Add pagerank update logic in dataset update endpoint - Update API documentation to reflect changes - Modify related test cases and SDK references #8208 This change makes pagerank a mutable property that can only be set after dataset creation, and only when using elasticsearch as the doc engine. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 15:47:49 +08:00
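
Commit a3bebeb599 above describes enforcing a filename limit of at most 255 bytes and changing the SDK comparison from '>=' to '>'. A minimal sketch of that boundary check follows, assuming UTF-8 encoding; the helper name `filename_too_long` and the constant `MAX_FILENAME_BYTES` are hypothetical and are not taken from the RAGFlow code base.

```python
# Hypothetical illustration of a 255-byte filename limit; not RAGFlow's actual code.
MAX_FILENAME_BYTES = 255  # assumed constant name


def filename_too_long(name: str, limit: int = MAX_FILENAME_BYTES) -> bool:
    """Return True when the UTF-8 encoded filename exceeds the byte limit.

    The comparison is strict ('>'), so a name of exactly `limit` bytes is
    accepted; using '>=' would wrongly reject it.
    """
    return len(name.encode("utf-8")) > limit


if __name__ == "__main__":
    print(filename_too_long("a" * 255))  # exactly at the limit
    print(filename_too_long("a" * 256))  # one byte over the limit
```

The check counts bytes rather than characters, so a name made of multi-byte characters reaches the limit with far fewer characters than an ASCII name.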


58 Commits