Commit Graph

164 Commits

Author SHA1 Message Date
49912a156e Refactor: migrate document run api (#14351)
### What problem does this PR solve?

Before migration: POST /v1/document/run
After migration: POST /api/v1/documents/ingest/

### Type of change

- [x] Refactoring
2026-04-27 21:25:58 +08:00
4dcc42e0e1 feat(api): add unified index API and dataset management endpoints (#14222)
### What problem does this PR solve?

## Summary

Refactor the dataset API layer into a clean service/REST separation
pattern, add a unified `/index` API for graph/raptor/mindmap operations,
and introduce several new dataset management endpoints with full test
coverage.

## Changes

### Service Layer (`dataset_api_service.py`)

- Added `trace_index(dataset_id, tenant_id, index_type)` — unified trace
function for all index types
- Added `run_index`, `delete_index` service functions
- Added `get_dataset`, `get_ingestion_summary`, `list_ingestion_logs`,
`get_ingestion_log`
- Added `run_embedding`, `list_tags`, `aggregate_tags`, `delete_tags`,
`rename_tag`
- Added `get_flattened_metadata`, `get_auto_metadata`,
`update_auto_metadata`

### REST API Layer (`dataset_api.py`)

**New unified routes:**

| Method | Route | Description |
|--------|-------|-------------|
| POST | `/datasets/<id>/index?type=graph\|raptor\|mindmap` | Run index
task |
| GET | `/datasets/<id>/index?type=graph\|raptor\|mindmap` | Trace index
task |
| DELETE | `/datasets/<id>/<index_type>` | Delete index |
| GET | `/datasets/<id>` | Get dataset details |
| GET | `/datasets/<id>/ingestions/summary` | Ingestion summary |
| GET | `/datasets/<id>/ingestions` | List ingestion logs |
| GET | `/datasets/<id>/ingestions/<log_id>` | Get single ingestion log
|
| POST | `/datasets/<id>/embedding` | Run embedding |
| GET | `/datasets/<id>/tags` | List tags |
| GET | `/datasets/tags/aggregation` | Aggregate tags across datasets |
| DELETE | `/datasets/<id>/tags` | Delete tags |
| PUT | `/datasets/<id>/tags` | Rename tag |
| GET | `/datasets/metadata/flattened` | Get flattened metadata |
| GET/PUT | `/datasets/<id>/metadata/config` | New metadata config path
|

**Removed routes (replaced by unified `/index`):**

- `POST /datasets/<id>/mindmap`
- `GET /datasets/<id>/mindmap`

**Preserved legacy routes (backward compatibility):**

- `/run_graphrag`, `/trace_graphrag`, `/run_raptor`, `/trace_raptor`
- `/auto_metadata` GET/PUT

### Test Suite

- Updated `common.py` helpers: added `trace_index`, removed
`run_mindmap`/`trace_mindmap`
- Added 7 new test files with 39 test cases total:

| Test File | Cases |
|-----------|-------|
| `test_get_dataset.py` | 4 |
| `test_ingestion_summary.py` | 2 |
| `test_ingestion_logs.py` | 5 |
| `test_index_api.py` | 14 |
| `test_embedding.py` | 2 |
| `test_tags.py` | 8 |
| `test_flattened_metadata.py` | 4 |

- Deleted `test_mindmap_tasks.py` (covered by unified index tests)

## Design Decisions

1. **Unified `/index?type=...`** — single endpoint replaces 3 separate
route pairs for graph/raptor/mindmap
2. **Backward compatibility** — old routes (`/run_graphrag`,
`/run_raptor`, `/auto_metadata`) preserved alongside new paths
3. **`_VALID_INDEX_TYPES = {"graph", "raptor", "mindmap"}`** — input
validation via constant set
4. **`_INDEX_TYPE_TO_TASK_ID_FIELD`** — maps index type to KB model task
ID field for clean dispatch

## Files Changed

- `api/apps/restful_apis/dataset_api.py`
- `api/apps/services/dataset_api_service.py`
- `sdk/python/ragflow_sdk/modules/dataset.py`
- `test/testcases/test_http_api/common.py`
- `test/testcases/test_http_api/test_dataset_management/` (7 new files)
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: noob <yixiao121314@outlook.com>
2026-04-27 09:38:01 +08:00
199fbceb72 Refactor user REST API (#14334)
### What problem does this PR solve?
Refactor user REST API

### Type of change
- [x] Refactoring
2026-04-24 10:25:15 +08:00
2d05475693 Refactor: Consolidation WEB API & HTTP API for document infos (#14239)
### What problem does this PR solve?

Before consolidation
Web API: POST /v1/document/infos
Http API - GET /api/v1/datasets/<dataset_id>/documents

After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2

### Type of change

- [ ] Refactoring
2026-04-21 19:35:11 +08:00
fa75aee3b9 Refactor system API (#13958)
### What problem does this PR solve?

- ping
- token
- log level

### Type of change

- [x] Refactoring


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* System endpoints consolidated under /api/v1/system: ping, health
check, and token management moved to the centralized API surface.
* Token management unified at /api/v1/system/tokens with
list/create/delete behavior.

* **Documentation**
  * API reference updated to reflect the new /api/v1/system paths.

* **Tests**
* Client fixtures and test utilities updated to use
/api/v1/system/tokens; one unit test for health/oceanbase status
removed.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-08 15:26:18 +08:00
ad789f5c43 Fix list files (#13960)
### What problem does this PR solve?

As title.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Standardized the query parameter used when listing documents so
listings behave consistently across the web and client interfaces.
* Clarified the error message shown when a required dataset ID is
missing to give clearer guidance to users.

* **Tests**
* Updated test coverage to reflect the standardized dataset identifier
usage.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-08 13:38:30 +08:00
983150b936 Fix (api): fix the document parsing status check logic (#12504)
### What problem does this PR solve?
When the original code terminates the parsing task halfway, the progress
may not be 0 or 1, which will result in the inability to call the
interface to parse again

-Change the document parsing progress check to task status check, and
use TaskStatus.RUNNING.value to judge
-Update the condition judgment for stopping parsing documents, and check
whether the task is running instead


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-28 14:38:55 +08:00
6f1a555d5f Refa(sdk/python/test): remove unused testcases and utilities (#12505)
### What problem does this PR solve?

Removed the following dir:
- sdk/python/test/libs/
- sdk/python/test/test_http_api/
- sdk/python/test/test_sdk_api/

### Type of change

- [x] Refactoring
2026-01-08 16:11:35 +08:00
30019dab9f Change knowledge base to dataset (#11976)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 10:03:33 +08:00
d1716d865a Feat: Alter flask to Quart for async API serving. (#11275)
### What problem does this PR solve?

#11277

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-18 17:05:16 +08:00
0569b50fed Fix: create dataset return type inconsistent (#11272)
### What problem does this PR solve?

Fix: create dataset return type inconsistent #11167 
 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-17 15:27:19 +08:00
19f71a961a Fix: Create dataset performance unmatched between HTTP api and web ui (#10960)
### What problem does this PR solve?

Fix: Create dataset performance unmatched between HTTP api and web ui
#10925

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-04 13:45:14 +08:00
73144e278b Don't release full image (#10654)
### What problem does this PR solve?

Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
6e862553cb Docs: Deprecated 'Create session with agent' (#9464)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-14 12:13:11 +08:00
342a04ec8a Added infinity rank_feature support (#9044)
### What problem does this PR solve?

Added infinity rank_feature support

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-29 09:14:23 +08:00
4d7bfd2ba3 Fix: typo process_duration (#8696)
### What problem does this PR solve?

Fix typo process_duration.

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-07-07 14:11:47 +08:00
e470645efd Refactor code (#8341)
### What problem does this PR solve?

1. rename var
2. update if statement

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-18 16:40:30 +08:00
4f3abb855a Fix: remove zhipu ai api key (#8066)
### What problem does this PR solve?

- Removed hardcoded Zhipu API key from codebase
- New requirement: Tests now require ZHIPU_AI_API_KEY environment
variable
  Example: export ZHIPU_AI_API_KEY=your_api_key_here

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 12:04:09 +08:00
31f4d44c73 Update upload filename length limit from 128 to 256, which is aligned with os (#7971)
### What problem does this PR solve?

Change filename length limit from 128 to 256

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-05-30 14:25:59 +08:00
5d6bf2224a Fix: Opensearch chunk management (#7802)
### What problem does this PR solve?

This PR solve the problems metioned in the
pr(https://github.com/infiniflow/ragflow/pull/7140) which is also
submitted by me

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):


### Introduction
I fixed the problems when using OpenSearch as the DOC_ENGINE, the
failures of pytest and the wrong API's return.
Mainly about delete chunk, list chunks, update chunk, retrieval chunk.
The pytest comand "cd sdk/python && uv sync --python 3.10 --group test
--frozen && source .venv/bin/activate && cd test/test_http_api &&
DOC_ENGINE=opensearch pytest test_chunk_management_within_dataset -s
--tb=short " is finally successful.

###Others
As some changes between Elasticsearch And Opensearch differ, some pytest
results about OpenSearch are correct and resonable. However, some pytest
params (skipif params) are incompatible. So I changed some pytest params
about skipif.

As a search engine programmer, I will still focus on the usage of vector
databases (especially OpenSearch) for the RAG stuff.
Thanks for your review
2025-05-26 16:57:58 +08:00
e166f132b3 Feat: change default models (#7777)
### What problem does this PR solve?

change default models to buildin models
https://github.com/infiniflow/ragflow/issues/7774

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:21:25 +08:00
fed1221302 Refa: HTTP API list datasets / test cases / docs (#7720)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:

Pydantic Validation
Error Handling
Test Updates
Documentation Updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-20 09:58:26 +08:00
59705a1c1d Test: change variable for ZHIPU_AI_API_KEY (#7684)
### What problem does this PR solve?

change variable for ZHIPU_AI_API_KEY

### Type of change

- [x] Update test case
2025-05-16 15:58:54 +08:00
04edf9729f Test: use environment variable for ZHIPU_AI_API_KEY (#7680)
### What problem does this PR solve?

use environment variable for ZHIPU_AI_API_KEY

### Type of change

- [x] Test update
2025-05-16 13:51:21 +08:00
ae8b628f0a Refa: HTTP API delete dataset / test cases / docs (#7657)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:

1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-16 10:16:43 +08:00
f8cc557892 Fix(api): correct default value handling in dataset parser config (#7589)
### What problem does this PR solve?

Fix  HTTP API Create/Update dataset parser config default value error

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-12 19:39:18 +08:00
ef0c4b134d Test: skip unstable test cases (#7578)
### What problem does this PR solve?

Skip unstable test cases to ensure daily testing stability

### Type of change

- [x] Update test cases
2025-05-12 09:49:14 +08:00
35e36cb945 Refa: HTTP API update dataset / test cases / docs (#7564)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the update dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. ​​Error Handling
3. Test Updates
4. Documentation Updates
5. fix bug: #5915

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
2025-05-09 19:17:08 +08:00
0fbca63e9d Test: Configure test case priorities to reduce CI execution time (#7532)
### What problem does this PR solve?

Configure test case priorities to reduce CI execution time

### Type of change

- [x] Test cases update
2025-05-08 19:22:52 +08:00
c98933499a refa: Optimize create dataset validation (#7451)
### What problem does this PR solve?

Optimize dataset validation and add function docs

### Type of change

- [x] Refactoring
2025-05-06 17:38:06 +08:00
fc379e90d1 Fix: change create dataset htto api delimiter default value to r'\n' (#7434)
### What problem does this PR solve?

change create dataset delimiter default value to r'\n'

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 17:43:42 +08:00
1f82889001 Fix: create dataset remove unnecessary parameter constraints (#7432)
### What problem does this PR solve?

Remove unnecessary parameter restrictions in dataset creation API

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:50:23 +08:00
e6c824e606 Test: Update tests to use new fixture instead of deprecated one (#7431)
### What problem does this PR solve?

Deprecate get_dataset_id_and_document_id fixture, use add_document
instead

### Type of change

- [x] Update test cases
2025-04-30 14:49:26 +08:00
78380fa181 Refa: http API create dataset and test cases (#7393)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. ​​Error Handling
3. Test Updates
4. Documentation

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
2025-04-29 16:53:57 +08:00
c7310f7fb2 Refa: similarity calculations. (#7381)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-04-28 19:17:11 +08:00
a4be6c50cf [BREAKING CHANGE] GET to POST: enhance document list capability (#7349)
### What problem does this PR solve?

Enhance capability of `list_docs`.

Breaking change: change method from `GET` to `POST`.

### Type of change

- [x] Refactoring
- [x] Enhancement with breaking change
2025-04-27 16:48:27 +08:00
94181a990b Refa: knowledge_graph chunk method is deprecated (#7220)
### What problem does this PR solve?

The knowledge_graph chunk method is deprecated and should no longer be
used. #7184.

### Type of change

- [x] Refactoring
2025-04-23 13:01:46 +08:00
f35ff65c36 [BREAKING CHANGE] GET to POST: enhance kb list capability (#7205)
### What problem does this PR solve?

Enhance capability of `list_kbs`.

Breaking change: change method from `GET` to `POST`.

### Type of change

- [x] Refactoring
- [x] Enhancement with breaking change
2025-04-22 17:54:12 +08:00
67dee2d74e Fix: fix retrieval tesing wrong pagination (#7174)
### What problem does this PR solve?

Fix retrieval testing wrong pagination. #7171 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-22 15:16:04 +08:00
e5f9d148e7 Test: Added test cases for Delete Sessions With Chat Assistant HTTP API (#7025)
### What problem does this PR solve?

cover [Delete chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#delete-chat-assistants-sessions)
endpoints

### Type of change

- [x] Add test cases
2025-04-15 14:54:26 +08:00
9b789c2ae9 Test: Added test cases for Update Session With Chat Assistant HTTP API (#6968)
### What problem does this PR solve?

cover [Update chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#update-chat-assistants-session)
endpoints

### Type of change

- [x] Update test cases
2025-04-11 20:10:24 +08:00
ffb9f01bea Test: Update test cases for PR 6906 ISSUE 6875 (#6971)
### What problem does this PR solve?

PR #6906 ISSUE #6875

### Type of change

- [ ] Update test cases
2025-04-11 20:09:44 +08:00
dc59aba132 Test: Added test cases for List Sessions With Chat Assistant HTTP API (#6938)
### What problem does this PR solve?

cover [List chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#list-chat-assistants-sessions)
endpoints

### Type of change

- [x] Update test cases
2025-04-10 17:31:01 +08:00
8fb5edd927 Test: Update test cases for PR 6906 (#6929)
### What problem does this PR solve?

PR #6906

### Type of change

- [x] Update test cases
2025-04-10 12:28:56 +08:00
3bb1e012e6 Fix: assistant deleteion issue. (#6906)
### What problem does this PR solve?

#6875

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-09 20:29:40 +08:00
22758a2763 Test: Update test cases for PR 6888 ISSUE 6876 (#6907)
### What problem does this PR solve?

PR #6888 ISSUE #6876

### Type of change

- [x] Update test case
2025-04-09 20:29:29 +08:00
ae107f31d9 Test: Added test cases for Create Session With Chat Assistant HTTP API (#6902)
### What problem does this PR solve?

cover [create session with chat
assistant](https://ragflow.io/docs/dev/http_api_reference#create-session-with-chat-assistant)
endpoints

### Type of change

- [x] add test cases
2025-04-09 17:21:48 +08:00
c26c38ee12 Test: Added test cases for Delete Chat Assistants HTTP API (#6879)
### What problem does this PR solve?

cover [delete chat
assistants](https://ragflow.io/docs/dev/http_api_reference#delete-chat-assistants)
endpoints

### Type of change

- [x] add test cases
2025-04-08 18:53:02 +08:00
a1fb32908d Fix: Error message is incorrect when updating chat name #6850 (#6851)
### What problem does this PR solve?

#6850 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-07 17:13:17 +08:00
0b89458eb8 Test: Added test cases for Update Chat Assistant HTTP API (#6843)
### What problem does this PR solve?

cover [update chat
assistant](https://ragflow.io/docs/v0.17.2/http_api_reference#update-chat-assistant)
endpoints

### Type of change

- [x] add test cases
2025-04-07 15:04:23 +08:00