58 Commits

Author SHA1 Message Date
7143954b48 Fix: chats_openai in non-stream condition (#13495)
### What problem does this PR solve?

Fix: chats_openai in non-stream condition #13453

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-10 13:44:17 +08:00
7c92f51133 Fix retrieval function when metadata_condition is specified in retrieval API (#13473)
### What problem does this PR solve?

Fix https://github.com/infiniflow/ragflow/issues/13388

The following command returns an empty result even when a document with
the matching metadata exists:
```
curl --request POST \
     --url http://localhost:9222/api/v1/retrieval \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ragflow-fO3mPFePfLgUYg8-9gjBVVXbvHqrvMPLGaW0P86PvAk' \
     --data '{
          "question": "any question",
          "dataset_ids": ["9bb4f0591b8811f18a4a84ba59049aa3"],
           "metadata_condition": {
            "logic": "and",
            "conditions": [
              {
                "name": "character",
                "comparison_operator": "is",
                "value": "刘备"
              }
            ]
          }
     }'
```

When metadata_condition is specified in the retrieval API, it is
converted to doc_ids, and doc_ids is passed to the retrieval function.
In the retrieval function, when doc_ids is explicitly provided, we
should bypass the similarity threshold.
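
The rule described above can be sketched roughly as follows. This is a
minimal illustration, not the actual retrieval code: `Chunk`,
`filter_chunks`, and the default threshold value are hypothetical names
and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical stand-in for a retrieved chunk.
    doc_id: str
    similarity: float


def filter_chunks(chunks, similarity_threshold=0.2, doc_ids=None):
    """Apply the similarity cut-off unless doc_ids was given explicitly."""
    if doc_ids:
        # doc_ids was resolved from metadata_condition (or passed directly):
        # keep every chunk from those documents and skip the threshold.
        wanted = set(doc_ids)
        return [c for c in chunks if c.doc_id in wanted]
    # No explicit doc_ids: fall back to the usual threshold filter.
    return [c for c in chunks if c.similarity >= similarity_threshold]
```

With this shape, a low-scoring chunk from a document selected by the
metadata filter is still returned instead of being dropped.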


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-10 11:57:32 +08:00
7166a7e50e Test: adjust test priority markers for API tests (#13450)
### What problem does this PR solve?

Changed test priority markers from p1/p2 to p3 in three test files:
- test_table_parser_dataset_chat.py: Adjusted priority for table parser
dataset chat test
- test_delete_chunks.py: Updated priority for chunk deletion test with
invalid IDs
- test_retrieval_chunks.py: Modified priority for chunks retrieval
pagination test

These changes demote the priority of specific test cases to p3,
indicating they are lower priority tests that can run later in the test
suite execution.

### Type of change

- [x] Test update
2026-03-06 20:17:39 +08:00
3ed91345aa fix(auth): return HTTP 401 for token-auth failures (#13420)
Follow-up to #12488 #13386

### What problem does this PR solve?

Previously, token authentication failures returned HTTP 200 with an
error code in the response body.

This PR updates `token_required` to raise `Unauthorized` and relies on
the global error handler to return a structured JSON response with HTTP
401 status.

The response body structure (`code`, `message`, `data`) remains
unchanged to preserve compatibility with the official SDK.

Frontend logic has been updated to handle HTTP 401 responses in addition
to checking `data.code`.
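
As a rough, framework-free sketch of the flow described above: the
decorator raises, and a global handler turns the exception into a
structured body with HTTP 401. The `Unauthorized` class here is a local
stand-in for the framework exception, and `AUTH_ERROR_CODE`, the error
message, and the header parsing are placeholders not taken from this PR.

```python
from http import HTTPStatus

AUTH_ERROR_CODE = 401  # placeholder value for the body's `code` field


class Unauthorized(Exception):
    """Local stand-in for the web framework's 401 exception."""


def token_required(handler):
    """Reject requests that carry no bearer token (simplified check)."""
    def wrapper(headers, *args, **kwargs):
        auth = headers.get("Authorization", "")
        token = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
        if not token:
            raise Unauthorized("Invalid or missing API token")
        return handler(headers, *args, **kwargs)
    return wrapper


def handle_error(exc):
    """Global handler: same body keys as before, but with HTTP 401."""
    if isinstance(exc, Unauthorized):
        body = {"code": AUTH_ERROR_CODE, "message": str(exc), "data": None}
        return body, HTTPStatus.UNAUTHORIZED
    raise exc
```

Clients that only inspect `code` in the body keep working, while
clients that check the HTTP status now see 401 as well.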

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-06 18:18:14 +08:00
51be1f1442 Refa: empty ids means no-op operation (#13439)
### What problem does this PR solve?

Empty ids means no-op operation.
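
A minimal sketch of the intended semantics, assuming a delete-style
endpoint backed by a dictionary; `delete_by_ids` and the response shape
are hypothetical and only illustrate the early return.

```python
def delete_by_ids(store, ids):
    """Delete the given ids; an empty or missing list changes nothing."""
    if not ids:
        # Empty ids: succeed without touching the store.
        return {"code": 0, "deleted": 0}
    deleted = 0
    for item_id in ids:
        if item_id in store:
            del store[item_id]
            deleted += 1
    return {"code": 0, "deleted": deleted}
```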

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring

---------

Co-authored-by: writinwaters <cai.keith@gmail.com>
2026-03-06 18:16:42 +08:00
62cb292635 Feat/tenant model (#13072)
### What problem does this PR solve?

Add id for table tenant_llm and apply in LLMBundle.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Liu An <asiro@qq.com>
2026-03-05 17:27:17 +08:00
8a7272f423 Test: add scenario for embedding_model update when chunk_count > 0 (#13351)
### What problem does this PR solve?

Guard embedding_model change when dataset has existing chunks. API must
return code 102 with message 'When chunk_num (N) > 0, embedding_model
must remain <current_model>' to prevent silent embedding drift.
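
The contract the new test scenario checks could look roughly like this.
The function name and the success code 0 are assumptions for the
example; the error code 102 and the message format come from the
description above.

```python
def check_embedding_model_update(chunk_num, current_model, requested_model):
    """Refuse to change embedding_model once the dataset has chunks."""
    if chunk_num > 0 and requested_model != current_model:
        return {
            "code": 102,
            "message": (
                f"When chunk_num ({chunk_num}) > 0, "
                f"embedding_model must remain {current_model}"
            ),
        }
    # Either no chunks exist yet or the model is unchanged.
    return {"code": 0, "message": ""}
```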

### Type of change

- [x] Add Testcases

Co-authored-by: Liu An <asiro@qq.com>
2026-03-04 17:41:35 +08:00
9d78d3ddb1 Tests: fix failing http in CI (#13301)
### What problem does this PR solve?
test_doc_sdk_routes_unit had two flaky/incorrect branch assumptions:

1. parse/stop_parsing production logic gates on doc.run, but tests used
progress, causing branch mismatch and unintended fallthrough into
mutation/DB paths.
2. stop_parsing invalid-state test asserted an outdated message
fragment, making the contract brittle.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-02 10:44:33 +08:00
983150b936 Fix (api): fix the document parsing status check logic (#12504)
### What problem does this PR solve?
With the original code, when a parsing task was terminated partway
through, the document's progress could be neither 0 nor 1, which made it
impossible to call the parse endpoint again.

- Change the document parsing progress check to a task status check
based on TaskStatus.RUNNING.value
- Update the stop-parsing condition to check whether the task is
running instead
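
The difference between the two checks can be sketched as below. The
enum members and their string values are assumptions for illustration,
as are the helper names and the `run`/`progress` dictionary keys.

```python
from enum import Enum


class TaskStatus(Enum):
    # Illustrative values; the real enum may differ.
    UNSTART = "0"
    RUNNING = "1"
    CANCEL = "2"
    DONE = "3"
    FAIL = "4"


def can_start_parsing(doc):
    """Parsing may start unless a task is currently running."""
    return doc["run"] != TaskStatus.RUNNING.value


def can_stop_parsing(doc):
    """Stopping only makes sense while a task is running."""
    return doc["run"] == TaskStatus.RUNNING.value
```

A document cancelled at 40% progress has `run` set to the cancelled
state, so it can be parsed again even though its progress is neither 0
nor 1.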


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-28 14:38:55 +08:00
8b6d363a98 Use pagination in _search_metadata (#13238)
### What problem does this PR solve?

Fix [#13210](https://github.com/infiniflow/ragflow/issues/13210)

Remove the fixed limit in _search_metadata and use pagination instead.
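
The general pattern of replacing a fixed limit with paging can be
sketched like this. `fetch_page` is a hypothetical callable standing in
for the doc store query, and the page size is arbitrary.

```python
def search_all(fetch_page, page_size=1000):
    """Collect every hit by requesting consecutive pages until one is short."""
    results = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        results.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        offset += page_size
    return results
```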

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-27 11:24:49 +08:00
22c4d72891 tests: improve RAGFlow coverage based on Codecov report (#13219)
### What problem does this PR solve?

Codecov’s coverage report shows that several RAGFlow code paths are
currently untested or under-tested. This makes it easier for regressions
to slip in during refactors and feature work.
This PR adds targeted automated tests to cover the files and branches
highlighted by Codecov, improving confidence in core behavior while
keeping runtime functionality unchanged.

### Type of change

- [x] Other (please describe): Test coverage improvement (adds/extends
unit and integration tests to address Codecov-reported gaps)
2026-02-26 19:03:26 +08:00
38011f2c16 tests: improve RAGFlow coverage based on Codecov report (#13200)
### What problem does this PR solve?

Codecov’s coverage report shows that several RAGFlow code paths are
currently untested or under-tested. This makes it easier for regressions
to slip in during refactors and feature work.
This PR adds targeted automated tests to cover the files and branches
highlighted by Codecov, improving confidence in core behavior while
keeping runtime functionality unchanged.

### Type of change

- [x] Other (please describe): Test coverage improvement (adds/extends
unit and integration tests to address Codecov-reported gaps)
2026-02-25 19:12:11 +08:00
fabbfcab90 Fix: failing p3 test for SDK/HTTP APIs (#13062)
### What problem does this PR solve?

Adjust highlight parsing, add row-count SQL override, tweak retrieval
thresholding, and update tests with engine-aware skips/utilities.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-09 14:56:10 +08:00
c4f60b349d Fix(test): downgrade test priorities (#12913)
### What problem does this PR solve?

Changed test priorities in multiple test files, downgrading from p1 to
p2 and p2 to p3.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-30 20:02:56 +08:00
4947e9473a Fix(test): Update error message assertions for unsupported content type tests (#12901)
### What problem does this PR solve?

This commit updates test cases for create, delete, and update dataset
endpoints to expect consistent error messages when an unsupported
content type is provided.

### Type of change

- [x] Bug Fix (test)
2026-01-30 09:45:04 +08:00
9a5208976c Put document metadata in ES/Infinity (#12826)
### What problem does this PR solve?

Put document metadata in ES/Infinity.

Index name of meta data: ragflow_doc_meta_{tenant_id}

### Type of change

- [x] Refactoring
2026-01-28 13:29:34 +08:00
52da81cf9e Fix: Redis configuration template error in v0.22.1 (#12685)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/12674

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-27 12:47:46 +08:00
6be197cbb6 Fix: Use tiktoken for proper token counting in OpenAI-compatible endpoint #7850 (#12760)
### What problem does this PR solve?
The OpenAI-compatible chat endpoint
(`/chats_openai/<chat_id>/chat/completions`) was not returning accurate
token usage in streaming responses. The token counts were either missing
or inaccurate because the underlying LLM API responses weren't being
properly parsed for usage data.

This PR adds proper token counting using tiktoken (cl100k_base encoding)
as a fallback when the LLM API doesn't provide usage data in streaming
chunks. This ensures clients always receive token usage information in
the response, which is essential for billing and quota management.

**Changes:**
- Add tiktoken-based token counting for streaming responses in
OpenAI-compatible endpoint
- Ensure `usage` field is always populated in the final streaming chunk
- Add unit tests for token usage calculation

Fixes #7850
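
The fallback logic can be sketched as follows. To keep the example
dependency-free, the token counter is injected; `build_usage` is a
hypothetical helper, and the default whitespace counter is only a crude
placeholder, not what the PR uses.

```python
def build_usage(prompt, completion, api_usage=None, count_tokens=None):
    """Return usage from the LLM API if present, else count tokens locally."""
    if api_usage:
        # The upstream API reported usage: pass it through unchanged.
        return api_usage
    # Fallback counter. With tiktoken installed one could pass, for example:
    #   enc = tiktoken.get_encoding("cl100k_base")
    #   count_tokens = lambda text: len(enc.encode(text))
    count = count_tokens or (lambda text: len(text.split()))
    prompt_tokens = count(prompt)
    completion_tokens = count(completion)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```

The result would be attached as the `usage` field of the final
streaming chunk, so the field is populated either way.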

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-23 09:36:21 +08:00
3beb85efa0 Feat: enhance metadata arranging. (#12745)
### What problem does this PR solve?
#11564

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-22 15:34:08 +08:00
f98abf14a8 Refa(test): improve code formatting and remove debug prints (#12739)
### What problem does this PR solve?

- Improving code formatting and consistency
- Removing debug print statements

### Type of change

- [x] Refactoring
2026-01-21 14:53:17 +08:00
aee9860970 Make document change-status idempotent for Infinity doc store (#12717)
### What problem does this PR solve?

This PR makes the document change‑status endpoint idempotent under the
Infinity doc store. If a document already has the requested status, the
handler returns success without touching the engine, preventing
unnecessary updates and avoiding missing‑table errors while keeping
responses consistent.
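
A minimal sketch of the idempotent handler, assuming a document
represented as a dictionary and a hypothetical `update_engine` callback
standing in for the Infinity update call:

```python
def change_status(doc, new_status, update_engine):
    """Set the document status; do nothing if it already matches."""
    if doc["status"] == new_status:
        # Already in the requested state: report success, skip the engine.
        return {"code": 0, "data": True}
    update_engine(doc["id"], new_status)
    doc["status"] = new_status
    return {"code": 0, "data": True}
```

Repeating the same request therefore returns the same response without
issuing a second engine update.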

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-20 19:11:21 +08:00
b40d639fdb Add dataset with table parser type for Infinity and answer question in chat using SQL (#12541)
### What problem does this PR solve?

1) Create a dataset using the table parser for Infinity
2) Answer questions in chat using SQL

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-19 19:35:14 +08:00
ac936005e6 fix: ensure deleted chunks are not returned in retrieval (#12520) (#12546)
## Summary
Fixes #12520 - Deleted chunks should not appear in retrieval/reference
results.

## Changes

### Core Fix
- **api/apps/chunk_app.py**: Include `doc_id` in delete condition to
properly scope the delete operation

### Improved Error Handling
- **api/db/services/document_service.py**: Better separation of concerns
with individual try-catch blocks and proper logging for each cleanup
operation

### Doc Store Updates
- **rag/utils/es_conn.py**: Updated delete query construction to support
compound conditions
- **rag/utils/opensearch_conn.py**: Same updates for OpenSearch
compatibility
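
As an illustration of a compound delete condition, the sketch below
turns a condition dictionary into an Elasticsearch-style bool filter.
`build_delete_query` and the field names in the usage are assumptions
for the example, not the code in `es_conn.py`.

```python
def build_delete_query(condition):
    """Translate {field: value-or-list} into a bool/filter query body."""
    filters = []
    for field, value in condition.items():
        if isinstance(value, (list, tuple, set)):
            # Several accepted values for one field: use a `terms` clause.
            filters.append({"terms": {field: list(value)}})
        else:
            # A single value: use a `term` clause.
            filters.append({"term": {field: value}})
    # All clauses must match, so chunk ids are scoped to the given doc_id.
    return {"query": {"bool": {"filter": filters}}}
```

For example, `build_delete_query({"id": ["c1", "c2"], "doc_id": "d1"})`
produces a query that matches only those chunk ids inside document
`d1`.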

### Tests
- **test/testcases/.../test_retrieval_chunks.py**: Added
`TestDeletedChunksNotRetrievable` class with regression tests
- **test/unit/test_delete_query_construction.py**: Unit tests for delete
query construction

## Testing
- Added regression tests that verify deleted chunks are not returned by
retrieval API
- Tests cover single chunk deletion and batch deletion scenarios
2026-01-15 14:45:55 +08:00
ea619dba3b Added to the HTTP API test suite (#12556)
### What problem does this PR solve?

This PR adds missing HTTP API test coverage for dataset
graph/GraphRAG/RAPTOR tasks, metadata summary, chat completions, agent
sessions/completions, and related questions. It also introduces minimal
HTTP test helpers to exercise these endpoints consistently with the
existing suite.

### Type of change

- [x]  Other (please describe): Test coverage (HTTP API tests)

---------

Co-authored-by: Liu An <asiro@qq.com>
2026-01-14 10:02:30 +08:00
0795616b34 Align p3 HTTP/SDK tests with current backend behavior (#12563)
### What problem does this PR solve?

Updates pre-existing HTTP API and SDK tests to align with current
backend behavior (validation errors, 404s, and schema defaults). This
ensures p3 regression coverage is accurate without changing production
code.

### Type of change

- [x] Other (please describe): align p3 HTTP/SDK tests with current
backend behavior

---------

Co-authored-by: Liu An <asiro@qq.com>
2026-01-13 19:22:47 +08:00
d1c4077a75 Fix directory name (#12195)
### What problem does this PR solve?

as title.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-25 14:24:13 +08:00
30019dab9f Change knowledge base to dataset (#11976)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 10:03:33 +08:00
40e84ca41a Use Infinity single-field-multi-index (#11444)
### What problem does this PR solve?

Use Infinity single-field-multi-index

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-11-26 11:06:37 +08:00
bfc84ba95b Test: handle duplicate names by appending "(1)" (#11244)
### What problem does this PR solve?

- Updated tests to reflect new behavior of handling duplicate dataset
names
- Instead of returning an error, the system now appends "(1)" to
duplicate names
- This problem was introduced by PR #10960
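
The new behavior can be approximated by the sketch below. Only the
"(1)" suffix is confirmed by the description; continuing with "(2)",
"(3)" when "(1)" is also taken, and the exact suffix format, are
assumptions, as is the helper name.

```python
def dedupe_name(name, existing):
    """Return name unchanged if free, else append the first free "(n)"."""
    if name not in existing:
        return name
    n = 1
    # Assumption: keep counting up while the suffixed name is also taken.
    while f"{name}({n})" in existing:
        n += 1
    return f"{name}({n})"
```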

### Type of change

- [x] Testcase update
2025-11-13 15:18:32 +08:00
19f71a961a Fix: Create dataset performance unmatched between HTTP api and web ui (#10960)
### What problem does this PR solve?

Fix: Create dataset performance unmatched between HTTP api and web ui
#10925

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-04 13:45:14 +08:00
73144e278b Don't release full image (#10654)
### What problem does this PR solve?

- Introduced gpu profile in .env
- Added Dockerfile_tei
- Fixed datrie
- Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
594bf485d4 Test: update test cases for chunk retrieval pagination (#10694)
### What problem does this PR solve?

Updated test cases in test_retrieval_chunks.py to:
- Remove skip mark from page pagination test case (issues/6646 resolved)
- Add skip marks for page_size=1 tests due to new issue (issues/10692)

### Type of change

- [x] Test
2025-10-21 13:02:29 +08:00
6e862553cb Docs: Deprecated 'Create session with agent' (#9464)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-14 12:13:11 +08:00
b55c3d07dc Test: Update error message assertions for chunk update tests (#9468)
### What problem does this PR solve?

Modify test cases to accept additional error message format when
updating chunks.
fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16942741621/job/48015850297

### Type of change

- [x] Update test cases
2025-08-14 12:11:20 +08:00
57b9f8cf52 Fix: Update test assertions and simplify test cases (#9400)
### What problem does this PR solve?

- Fix error message assertion in test_update_chunk.py to match new
ownership validation
- Simplify dataset listing test cases by removing lambda assertions for
sorting
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16885465524/job/47831942553

### Type of change

- [x] Fix test cases
2025-08-12 10:57:30 +08:00
342a04ec8a Added infinity rank_feature support (#9044)
### What problem does this PR solve?

Added infinity rank_feature support

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-29 09:14:23 +08:00
b5ffca332a Refa: validation utils to use Pydantic v2 style models (#9037)
### What problem does this PR solve?

- Update BaseModel to use model_config instead of Config class
- Replace StrEnum with Literal types for method fields
- Convert Field declarations to Annotated style

### Type of change

- [x] Refactoring
2025-07-25 12:16:45 +08:00
b4b6d296ea Fix: Increase timeouts for document parsing and model checks (#8996)
### What problem does this PR solve?

- Extended embedding model timeout from 3 to 10 seconds in api_utils.py
- Added more time for large file batches and concurrent parsing
operations to prevent test flakiness
- Import from #8940
- https://github.com/infiniflow/ragflow/actions/runs/16422052652

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-23 15:08:36 +08:00
0020c50000 Fix: Refactor parser config handling and add GraphRAG defaults (#8778)
### What problem does this PR solve?

- Update `get_parser_config` to merge provided configs with defaults
- Add GraphRAG configuration defaults for all chunk methods
- Make raptor and graphrag fields non-nullable in ParserConfig schema
- Update related test cases to reflect config changes
- Ensure backward compatibility while adding new GraphRAG support
- #8396
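
Merging a provided config with defaults might look like the sketch
below. `merge_parser_config`, `DEFAULT_PARSER_CONFIG`, and the nested
key names are illustrative assumptions; the real defaults vary by chunk
method.

```python
import copy

# Illustrative defaults only; raptor and graphrag are always present.
DEFAULT_PARSER_CONFIG = {
    "chunk_token_num": 512,
    "raptor": {"use_raptor": False},
    "graphrag": {"use_graphrag": False},
}


def merge_parser_config(provided, defaults=DEFAULT_PARSER_CONFIG):
    """Recursively overlay the provided config on a copy of the defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in (provided or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = merge_parser_config(value, merged[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

A caller that sets only `graphrag` still receives the default `raptor`
section, which is what lets those fields be non-nullable.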

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-23 09:29:37 +08:00
f8524462b0 Fix: Increase default chunk_token_num from 128 to 512 in parser config (#8753)
### What problem does this PR solve?

Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-10 09:53:20 +08:00
4d7bfd2ba3 Fix: typo process_duration (#8696)
### What problem does this PR solve?

Fix typo process_duration.

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-07-07 14:11:47 +08:00
0b40eb3e90 Test: Add tests for chunk API endpoints (#8616)
### What problem does this PR solve?

- Add comprehensive test suite for chunk operations including:
  - Test files for create, list, retrieve, update, and delete chunks
  - Authorization tests
  - Batch operations tests
- Update test configurations and common utilities
- Validate `important_kwd` and `question_kwd` fields are lists in
chunk_app.py
- Reorganize imports and clean up duplicate code

### Type of change

- [x] Add test cases
2025-07-02 09:49:08 +08:00
dac5bcdf17 Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?

Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model

Now:
- Correctly prioritizes user-configured default embedding_model

Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
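
The fallback order described above can be sketched as follows;
`resolve_embedding_model` and its arguments are hypothetical names, and
treating a whitespace-only string as empty is an assumption about the
normalization step.

```python
def resolve_embedding_model(requested, tenant_default):
    """Prefer the requested model; fall back to the user-configured default."""
    if requested is None:
        return tenant_default
    # Normalize the string; an empty value also falls back to the default.
    requested = requested.strip()
    return requested or tenant_default
```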

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 16:41:32 +08:00
9f9acf0c49 Test: Add document app tests (#8456)
### What problem does this PR solve?

- Add new test suite for document app with
create/list/parse/upload/remove tests
- Update API URLs to use version variable from config in HTTP and web
API tests

### Type of change

- [x] Add test cases
2025-06-24 17:26:16 +08:00
e470645efd Refactor code (#8341)
### What problem does this PR solve?

1. rename var
2. update if statement

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-18 16:40:30 +08:00
a3bebeb599 Fix: Enforce 255-byte filename limit (#8290)
### What problem does this PR solve?

- Add filename length validation (<=255 bytes) for document
upload/rename in both HTTP and SDK APIs
- Update error messages for consistency
- Fix comparison operator in SDK from '>=' to '>' for filename length
check
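
A byte-length check of this kind can be sketched as below. The helper
name, the error text, and the use of UTF-8 for measuring bytes are
assumptions; the limit and the `>` comparison follow the description.

```python
FILE_NAME_LEN_LIMIT = 255  # bytes


def validate_filename(name):
    """Return an error message if the name exceeds 255 bytes, else None."""
    size = len(name.encode("utf-8"))
    # Use '>' rather than '>=' so that exactly 255 bytes is still accepted.
    if size > FILE_NAME_LEN_LIMIT:
        return f"File name must be {FILE_NAME_LEN_LIMIT} bytes or less."
    return None
```

Counting bytes rather than characters matters for non-ASCII names: a
name made of 86 three-byte characters is 258 bytes and is rejected,
even though it is well under 255 characters.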

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-16 16:39:41 +08:00
64af09ce7b Test: Add web API test suite for knowledge base operations (#8254)
### What problem does this PR solve?

- Implement RAGFlowWebApiAuth class for web API authentication
- Add comprehensive test cases for KB CRUD operations
- Set up common fixtures and utilities in conftest.py
- Add helper functions in common.py for web API requests

The changes establish a complete testing framework for knowledge base
management via web API endpoints.

### Type of change

- [x] Add test case
2025-06-13 16:39:10 +08:00
86a1411b07 Refa: Test configs (#8220)
### What problem does this PR solve?

- Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to
configs.py
- Update test imports to use centralized configs
- Clean up duplicate constant definitions across test files

This improves maintainability by centralizing configuration.

### Type of change

- [x] Refactoring test case
2025-06-12 17:42:00 +08:00
54a465f9e8 Test: fix chunk deletion test assertions (#8222)
### What problem does this PR solve?

- Fix test assertions in test_delete_chunks.py to expect empty results
after deletion

Action 7619

### Type of change

- [x] Bug Fix test cases
2025-06-12 17:41:46 +08:00
7fbbc9650d Fix: Move pagerank field from create to update dataset API (#8217)
### What problem does this PR solve?

- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references

#8208

This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using Elasticsearch as the doc engine.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 15:47:49 +08:00