Commit Graph

98 Commits

Author SHA1 Message Date
46897d6fa4 Fix: bind memory message user_id to authenticated user for JWT auth (#14745)
### Related issues

Closes #14744

### What problem does this PR solve?

The Memory REST endpoint `POST /api/v1/messages` previously persisted
whatever `user_id` the client sent in the JSON body. Memory rows were
therefore attributed to an arbitrary string, even when the caller
authenticated as a normal workspace user via JWT (browser/session-style
bearer token decoded into an access token). That broke attribution and
audit semantics for shared memories (team visibility): any authorized
writer could spoof another subject id.

The Python SDK already sends an optional `user_id` for integrations
using **API keys** (`APIToken`) to tag an external subject distinct from
the tenant owner user.

### Solution

- Record **`g.auth_via_api_token`** in `_load_user`
(`api/apps/__init__.py`): set `True` only when authentication resolves
via `APIToken`, otherwise `False` after JWT-based login succeeds.
- In **`POST /messages`** (`memory_api.add_message`): if the request was
authenticated with an API key, keep accepting optional `user_id` from
the body (default empty string). For JWT-authenticated users, **always**
set stored `user_id` to **`current_user.id`** and ignore the client
field.
- Guard reads of `g` with **`RuntimeError`** handling so isolated
imports or tests without a Quart application context do not fail when
resolving `user_id`.
- Document on **`RAGFlow.add_message`** that `user_id` is only
meaningful for API-key authentication.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Testing

- `python -m py_compile` on modified modules (`api/apps/__init__.py`,
`api/apps/restful_apis/memory_api.py`).
- Recommended: run web/SDK memory message tests (`test_add_message`,
`test_message_routes_unit`) against a full environment with `quart` and
configured services.

### Notes for reviewers

- Behavior change **only** for callers using JWT-style authorization on
`POST /messages`; API-key callers keep prior optional `user_id`
semantics.

Co-authored-by: jony376 <jony376@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 13:26:05 +08:00
f32034e83e Refactor: completion -> completions (#14584)
### What problem does this PR solve?

Keep only /completions, deprecated /completion

### Type of change

- [x] Refactoring
2026-05-06 17:19:22 +08:00
24af0875e5 Feat/configurable metadata display (#13464)
### What problem does this PR solve?

Currently, RAGFlow's Search and Chat interfaces display only raw
vectorized text chunks during retrieval, without contextual information
about their source documents. Users cannot see document titles, page
numbers, upload dates, or custom metadata fields that would help them
understand and trust the retrieved results.

This PR introduces an **optional metadata display feature** that
enriches retrieved chunks with document-level metadata in both the
Search tab and Chatbot interface.

**Key improvements:**
- **Search results**: Display document metadata as styled badges beneath
chunk snippets
- **Chat citations**: Show metadata in citation popovers and reference
lists for better source context
- **LLM context**: Metadata is injected into the LLM prompt to enable
more accurate, citation-aware responses
- **External API support**: Applications using RAGFlow's SDK retrieval
endpoints (`/v1/retrieval`, `/v1/searchbots/retrieval_test`) can opt-in
via request parameters
- **User control**: Multi-select dropdown UI allows users to choose
which metadata fields to display

**Implementation approach:**
-  Reuses existing `DocMetadataService` infrastructure (no new database
tables or indices)
-  Settings stored in existing JSON configuration fields
(`search_config.reference_metadata`, `prompt_config.reference_metadata`)
-  No database migrations required
-  Disabled by default (fully opt-in and backward-compatible)
-  Dynamic metadata field selection populated from actual document
metadata keys
-  Fixed critical bug where Python's builtin `set()` was shadowed by a
route handler function

**Modified endpoints (all backward-compatible):**
- `POST /v1/retrieval` (Public SDK)
- `POST /v1/searchbots/retrieval_test` (Searchbots)
- `POST /v1/chunk/retrieval_test` (UI/Internal)
- Chat completions endpoints (via `extra_body.reference_metadata` or
`prompt_config`)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)


###Images
-
<img width="879" height="1275" alt="image"
src="https://github.com/user-attachments/assets/95b2d731-31ae-45a1-b081-bf5893f52aeb"
/>
<br><br>
<br><br>

<img width="1532" height="362" alt="image"
src="https://github.com/user-attachments/assets/9cebc65b-b7a7-459f-b25e-3b13fa9b638e"
/>
<br><br>
<br><br>

<img width="2586" height="1320" alt="image"
src="https://github.com/user-attachments/assets/2153d493-d899-461f-a7a9-041391e07776"
/>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Attili-sys <Attili-sys@users.noreply.github.com>
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
2026-04-30 23:13:27 +08:00
4dcc42e0e1 feat(api): add unified index API and dataset management endpoints (#14222)
### What problem does this PR solve?

## Summary

Refactor the dataset API layer into a clean service/REST separation
pattern, add a unified `/index` API for graph/raptor/mindmap operations,
and introduce several new dataset management endpoints with full test
coverage.

## Changes

### Service Layer (`dataset_api_service.py`)

- Added `trace_index(dataset_id, tenant_id, index_type)` — unified trace
function for all index types
- Added `run_index`, `delete_index` service functions
- Added `get_dataset`, `get_ingestion_summary`, `list_ingestion_logs`,
`get_ingestion_log`
- Added `run_embedding`, `list_tags`, `aggregate_tags`, `delete_tags`,
`rename_tag`
- Added `get_flattened_metadata`, `get_auto_metadata`,
`update_auto_metadata`

### REST API Layer (`dataset_api.py`)

**New unified routes:**

| Method | Route | Description |
|--------|-------|-------------|
| POST | `/datasets/<id>/index?type=graph\|raptor\|mindmap` | Run index
task |
| GET | `/datasets/<id>/index?type=graph\|raptor\|mindmap` | Trace index
task |
| DELETE | `/datasets/<id>/<index_type>` | Delete index |
| GET | `/datasets/<id>` | Get dataset details |
| GET | `/datasets/<id>/ingestions/summary` | Ingestion summary |
| GET | `/datasets/<id>/ingestions` | List ingestion logs |
| GET | `/datasets/<id>/ingestions/<log_id>` | Get single ingestion log
|
| POST | `/datasets/<id>/embedding` | Run embedding |
| GET | `/datasets/<id>/tags` | List tags |
| GET | `/datasets/tags/aggregation` | Aggregate tags across datasets |
| DELETE | `/datasets/<id>/tags` | Delete tags |
| PUT | `/datasets/<id>/tags` | Rename tag |
| GET | `/datasets/metadata/flattened` | Get flattened metadata |
| GET/PUT | `/datasets/<id>/metadata/config` | New metadata config path
|

**Removed routes (replaced by unified `/index`):**

- `POST /datasets/<id>/mindmap`
- `GET /datasets/<id>/mindmap`

**Preserved legacy routes (backward compatibility):**

- `/run_graphrag`, `/trace_graphrag`, `/run_raptor`, `/trace_raptor`
- `/auto_metadata` GET/PUT

### Test Suite

- Updated `common.py` helpers: added `trace_index`, removed
`run_mindmap`/`trace_mindmap`
- Added 7 new test files with 39 test cases total:

| Test File | Cases |
|-----------|-------|
| `test_get_dataset.py` | 4 |
| `test_ingestion_summary.py` | 2 |
| `test_ingestion_logs.py` | 5 |
| `test_index_api.py` | 14 |
| `test_embedding.py` | 2 |
| `test_tags.py` | 8 |
| `test_flattened_metadata.py` | 4 |

- Deleted `test_mindmap_tasks.py` (covered by unified index tests)

## Design Decisions

1. **Unified `/index?type=...`** — single endpoint replaces 3 separate
route pairs for graph/raptor/mindmap
2. **Backward compatibility** — old routes (`/run_graphrag`,
`/run_raptor`, `/auto_metadata`) preserved alongside new paths
3. **`_VALID_INDEX_TYPES = {"graph", "raptor", "mindmap"}`** — input
validation via constant set
4. **`_INDEX_TYPE_TO_TASK_ID_FIELD`** — maps index type to KB model task
ID field for clean dispatch

## Files Changed

- `api/apps/restful_apis/dataset_api.py`
- `api/apps/services/dataset_api_service.py`
- `sdk/python/ragflow_sdk/modules/dataset.py`
- `test/testcases/test_http_api/common.py`
- `test/testcases/test_http_api/test_dataset_management/` (7 new files)
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: noob <yixiao121314@outlook.com>
2026-04-27 09:38:01 +08:00
c74aece63c Feat: Agent api (#14157)
### What problem does this PR solve?

1. **List agents**  
   **Prev API**:  
   - `/v1/canvas/list GET`  
   - `/api/v1/agents GET`  
   **Current API**: `/api/v2/agents GET`

2. **Get canvas template**  
   **Prev API**: `/v1/canvas/templates GET`  
   **Current API**: `/api/v2/agents/templates GET`

3. **Delete an agent**  
   **Prev API**: 
    - `/v1/canvas/rm POST`  
    - `/api/v1/agents/<agent_id> DELETE`
   **Current API**: `/api/v2/agents/<agent_id> DELETE`

4. **Update an agent**  
   **Prev API**: 
    - `/api/v1/agents/<agent_id> PUT`   
    - `/v1/canvas/setting POST `
   **Current API**: `/api/v2/agents/<agent_id> PATCH`


5. **Create an agent**  
   **Prev API**: 
    - `/v1/canvas/set POST`  
    - `/api/v1/agents POST`
   **Current API**: `/api/v2/agents POST`


6. **Get an agent**  
   **Prev API**: 
    - `/v1/canvas/get/<canvas_id> GET `  
   **Current API**: `/api/v2/agents/<agent_id> GET`


7. **Reset an agent**  
   **Prev API**: 
    - `/v1/canvas/reset POST`  
   **Current API**: `/api/v2/agents/<agent_id>/reset POST`


8. **Upload a file to an agent**  
   **Prev API**: 
    - `/v1/canvas/upload/<canvas_id> POST`  
   **Current API**: `/api/v2/agents/<agent_id>/upload POST`


9. **Input form**  
   **Prev API**: 
    - `/v1/canvas/input_form GET`  
**Current API**:
`/api/v2/agents/<agent_id>/components/<component_id>/input-form GET`


10. **Debug an agent**  
   **Prev API**: 
    - `/v1/canvas/debug POST`  
**Current API**:
`/api/v2/agents/<agent_id>/components/<component_id>/debug POST`


11. **Trace an agent**  
   **Prev API**: 
    - `/v1/canvas/trace GET`  
   **Current API**: `/api/v2/agents/<agent_id>/logs/<message_id> GET`


12. **Get an agent version list**  
   **Prev API**: 
    - `/v1/canvas/getlistversion/<canvas_id>`  
   **Current API**: `/api/v2/agents/<agent_id>/versions GET`


13. **Get a version of agent**  
   **Prev API**: 
    - `/v1/canvas/getversion/<version_id>`  
**Current API**: `/api/v2/agents/<agent_id>/versions/<version_id> GET`


14. **Test db connection**  
   **Prev API**: 
    - `/v1/canvas/test_db_connect POST`  
   **Current API**: `/api/v2/agents/test_db_connection`


15. **Rerun the agent**  
   **Prev API**: 
    - `/v1/canvas/rerun POST`  
   **Current API**: `/api/v2/agents/rerun POST`


16. **Get prompts**  
   **Prev API**: 
    - `/v1/canvas/prompts GET`  
   **Current API**: `/api/v2/agents/prompts GET`

### Type of change
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: chanx <1243304602@qq.com>
2026-04-24 10:02:22 +08:00
7817b0d779 Refa: migrate chunk APIs to RESTful routes (#14291)
### What problem does this PR solve?

migrate chunk APIs to RESTful routes

### Type of change
- [x] Refactoring
2026-04-23 14:17:23 +08:00
3ce1e44b2d Fix: document and sdk support of searching message with user_id (#14283)
### What problem does this PR solve?

Add document of search message with user_id, add sdk support.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-04-22 14:43:38 +08:00
6baf74afc1 Refa: align chat and search restful APIs (#14229)
### What problem does this PR solve?

Refactor /api/v1/chats to be more RESTful.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 10:49:11 +08:00
2d05475693 Refactor: Consolidation WEB API & HTTP API for document infos (#14239)
### What problem does this PR solve?

Before consolidation
Web API: POST /v1/document/infos
Http API - GET /api/v1/datasets/<dataset_id>/documents

After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2

### Type of change

- [ ] Refactoring
2026-04-21 19:35:11 +08:00
576431de99 Refactor: Change update doc from PUT to patch (#14067)
### What problem does this PR solve?

Before change, update_document in api/apps/restful_apis/document_api.py
is using "PUT".
After change, it will use "PATCH" which is more suitable.

### Type of change

- [x] Refactoring
2026-04-14 17:12:23 +08:00
b7daf6285b Refa: Chat conversations /convsersation API to RESTFul (#13893)
### What problem does this PR solve?

Chat conversations /convsersation API to RESTFul.

### Type of change

- [x] Refactoring
2026-04-02 20:49:23 +08:00
b622c47ed6 Refa: Chats /chat API to RESTFul (#13881)
### What problem does this PR solve?

 Refactor Chats /chat API to RESTFul.

### Type of change

- [x] Refactoring
2026-04-01 20:10:37 +08:00
b1d28b5898 Revert "Refa: Chats /chat API to RESTFul (#13871)" (#13877)
### What problem does this PR solve?

This reverts commit 1a608ac411.

### Type of change

- [x] Other (please describe):
2026-04-01 11:05:29 +08:00
1a608ac411 Refa: Chats /chat API to RESTFul (#13871)
### What problem does this PR solve?

Chats /chat API to RESTFul.

### Type of change

- [x] Refactoring
2026-04-01 10:50:22 +08:00
641b319647 feat: support reading tags via API (#12891) (#13732)
### What problem does this PR solve?

Enable reading Tag Set tags via API (expose tag_kwd field). The result
of the queried list chunks is as shown below:

<img width="1422" height="818" alt="image"
src="https://github.com/user-attachments/assets/abd1960a-fe34-489e-9d72-525f8e574938"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: heyang.why <heyang.why@alibaba-inc.com>
2026-03-29 20:17:01 +08:00
4bb1acaa5b Refactor: dataset / kb API to RESTFul style (#13690)
### What problem does this PR solve?

1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB releated APIs are commented.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-03-19 14:41:36 +08:00
986dcf1cc8 Revert "Refactor: dataset / kb API to RESTFul style" (#13646)
Reverts infiniflow/ragflow#13619
2026-03-17 12:09:48 +08:00
1db5409d82 Refactor: dataset / kb API to RESTFul style (#13619)
### What problem does this PR solve?

1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB releated APIs are commented.

### Type of change

- [x] Refactoring
2026-03-16 22:51:34 +08:00
af7e24ba8c Feat: add_chunk supports add image (#13629)
### What problem does this PR solve?

Add_chunk supports add image.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-03-16 20:15:36 +08:00
a2d72202cf Revert "Refactor dataset / kb API to RESTFul style" (#13614)
Reverts infiniflow/ragflow#13263
2026-03-16 10:44:38 +08:00
7c32e206be Refactor dataset / kb API to RESTFul style (#13263)
### What problem does this PR solve?

1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB releated APIs are commented.

### Type of change

- [x] Refactoring
2026-03-13 20:02:35 +08:00
227c852e67 Fix typo: documnet_keyword -> document_keyword in Chunk class (#13531)
### What problem does this PR solve?
The Chunk class had a typo in the attribute name 'documnet_keyword',
which caused the document_name field to remain empty when retrieving
chunks via the SDK. This fix corrects the spelling to
'document_keyword'.

Changes:
- Line 36: Changed self.documnet_keyword to self.document_keyword
- Line 52: Updated backward compatibility code to use
self.document_keyword


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-12 15:23:55 +08:00
e1b632a7bb Feat: add delete all support for delete operations (#13530)
### What problem does this PR solve?

Add delete all support for delete operations.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

---------

Co-authored-by: writinwaters <cai.keith@gmail.com>
2026-03-12 09:47:42 +08:00
d43aebe701 Fix/13142 auto metadata (#13217)
### What problem does this PR solve?

Close #13142

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-26 10:25:48 +08:00
2c4499ec45 Fix: key error "content" #12844 (#12847)
### What problem does this PR solve?

Fix: key error "content" #12844

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-01-28 14:39:34 +08:00
59075a0b58 Fix : p3 level sdk test error for update chat (#12654)
### What problem does this PR solve?

fix for update chat failing

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-16 17:47:12 +08:00
2b20d0b3bb Fix : Web API tests by normalizing errors, validation, and uploads (#12620)
### What problem does this PR solve?

Fixes web API behavior mismatches that caused test failures by
normalizing error responses, tightening validations, correcting error
messages, and closing upload file handles.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-16 11:09:22 +08:00
f9d4179bf2 Feat:memory sdk (#12538)
### What problem does this PR solve?

Move memory and message apis to /api, and add sdk support.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:45:58 +08:00
c7cf7aad4e Fix: update RAGFlow SDK for consistency (#12065)
### What problem does this PR solve?

Fix: update RAGFlow SDK for consistency #12059

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-22 11:09:56 +08:00
2118bc2556 Fix: Python SDK retrieve document_name is empty (#12062)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/12056

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-22 11:08:39 +08:00
5e05f43c3d Update default prompt (#11984)
### What problem does this PR solve?

New default prompt:

```
You are an intelligent assistant. Your primary function is to answer questions based strictly on the provided knowledge base.

**Essential Rules:**
- Your answer must be derived **solely** from this knowledge base: `{knowledge}`.
- **When information is available**: Summarize the content to give a detailed answer.
- **When information is unavailable**: Your response must contain this exact sentence: "The answer you are looking for is not found in the knowledge base!"
- **Always consider** the entire conversation history.
```

Also fix some grammar errors.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 12:57:24 +08:00
5c81e01de5 Fix: incorrect async chat streamly output (#11679)
### What problem does this PR solve?

Incorrect async chat streamly output. #11677.

Disable beartype for #11666.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-03 11:15:45 +08:00
d2b1da0e26 Fix: Optimize edge check & incorrect parameter usage (#11396)
### What problem does this PR solve?

Fix: incorrect parameter usage #8084
Fix: Optimize edge check #10851

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-20 12:49:47 +08:00
de53498b39 Fix: Update env to support PPTX and update README for version changes (#11167)
### What problem does this PR solve?

Fix: Update env to support PPTX
Fix: update README for version changes #11138

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update

---------

Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
2025-11-11 19:56:54 +08:00
c7bd0a755c Fix: python api streaming structure (#11105)
### What problem does this PR solve?

Fix: python api streaming structure

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-07 16:50:58 +08:00
95fad5d523 Fix: Chat session completion (#10851)
### What problem does this PR solve?
Fix: Chat session completion #10791
### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-10-29 09:44:02 +08:00
16ec6ad346 Fix: Pass kwargs in python api #10699 (#10808)
### What problem does this PR solve?

Fix: Pass kwargs in python api #10699

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-27 15:18:56 +08:00
68e47c81d4 Feat: Add parse_document with feed back (#10523)
### What problem does this PR solve?

Solved: Sync Parse Document API #5635
Feat: Add parse_document with feed back, user can view the status of
each document after parsing finished.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-10-14 09:31:19 +08:00
534fa60b2a Fix: Agent.reset() argument wrong #10463 & Unable to converse with agent through Python API. #10415 (#10472)
### What problem does this PR solve?
Fix: Agent.reset() argument wrong #10463 & Unable to converse with agent
through Python API. #10415

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-10 20:44:05 +08:00
677c99b090 Feat: Add metadata filtering function for /api/v1/retrieval (#9877)
-Added the metadata_dedition parameter in the document retrieval
interface to filter document metadata -Updated the API documentation and
added explanations for the metadata_dedition parameter

### What problem does this PR solve?

Make /api/v1/retrieval api also can use metadata filter

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-05 11:12:15 +08:00
6e862553cb Docs: Deprecated 'Create session with agent' (#9464)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-14 12:13:11 +08:00
5c3577c4c9 Python SDK: add meta_fields to Document class (#9387)
### What problem does this PR solve?

Python class Document was missing "meta_fields", e.g. when querying, the
document instances came without meta_fields

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 10:16:12 +08:00
c9b156fa6d Fix: Remove default dataset_ids from Chat class initialization (#9381)
### What problem does this PR solve?

- The default dataset_ids "kb1" was removed from the Chat class. 
- The HTTP API response does not include the dataset_ids field.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:15:30 +08:00
60d652d2e1 Feat: list documents supports range filtering (#9214)
### What problem does this PR solve?

list_document supports range filtering.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-04 16:35:35 +08:00
0b487dee43 Fix: support cross language for API. (#8946)
### What problem does this PR solve?

Close #8943

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-21 17:25:28 +08:00
4d7bfd2ba3 Fix: typo process_duration (#8696)
### What problem does this PR solve?

Fix typo process_duration.

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-07-07 14:11:47 +08:00
7353070f49 Adds retrieval result fields to Chunk (#8478)
### What problem does this PR solve?

This PR adds fields to the `Chunk` class to store retrieval results like
similarity scores, term similarity, vector similarity, positions, and
document type. This allows the chunk object to hold all the information
needed when returning search results from the vector database.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-25 16:53:15 +08:00
dac5bcdf17 Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?

Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model

Now:
- Correctly prioritizes user-configured default embedding_model

Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 16:41:32 +08:00
545ea229b6 Refa: Structure Ask Message (#8276)
### What problem does this PR solve?

Refactoring codes for SDK

### Type of change

- [x] Refactoring
2025-06-16 10:17:21 +08:00
7fbbc9650d Fix: Move pagerank field from create to update dataset API (#8217)
### What problem does this PR solve?

- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references

#8208

This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 15:47:49 +08:00