### Related issues
Closes#14744
### What problem does this PR solve?
The Memory REST endpoint `POST /api/v1/messages` previously persisted
whatever `user_id` the client sent in the JSON body. Memory rows were
therefore attributed to an arbitrary string, even when the caller
authenticated as a normal workspace user via JWT (browser/session-style
bearer token decoded into an access token). That broke attribution and
audit semantics for shared memories (team visibility): any authorized
writer could spoof another subject id.
The Python SDK already sends an optional `user_id` for integrations
using **API keys** (`APIToken`) to tag an external subject distinct from
the tenant owner user.
### Solution
- Record **`g.auth_via_api_token`** in `_load_user`
(`api/apps/__init__.py`): set `True` only when authentication resolves
via `APIToken`, otherwise `False` after JWT-based login succeeds.
- In **`POST /messages`** (`memory_api.add_message`): if the request was
authenticated with an API key, keep accepting optional `user_id` from
the body (default empty string). For JWT-authenticated users, **always**
set stored `user_id` to **`current_user.id`** and ignore the client
field.
- Guard reads of `g` with **`RuntimeError`** handling so isolated
imports or tests without a Quart application context do not fail when
resolving `user_id`.
- Document on **`RAGFlow.add_message`** that `user_id` is only
meaningful for API-key authentication.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Testing
- `python -m py_compile` on modified modules (`api/apps/__init__.py`,
`api/apps/restful_apis/memory_api.py`).
- Recommended: run web/SDK memory message tests (`test_add_message`,
`test_message_routes_unit`) against a full environment with `quart` and
configured services.
### Notes for reviewers
- Behavior change **only** for callers using JWT-style authorization on
`POST /messages`; API-key callers keep prior optional `user_id`
semantics.
Co-authored-by: jony376 <jony376@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
### What problem does this PR solve?
Currently, RAGFlow's Search and Chat interfaces display only raw
vectorized text chunks during retrieval, without contextual information
about their source documents. Users cannot see document titles, page
numbers, upload dates, or custom metadata fields that would help them
understand and trust the retrieved results.
This PR introduces an **optional metadata display feature** that
enriches retrieved chunks with document-level metadata in both the
Search tab and Chatbot interface.
**Key improvements:**
- **Search results**: Display document metadata as styled badges beneath
chunk snippets
- **Chat citations**: Show metadata in citation popovers and reference
lists for better source context
- **LLM context**: Metadata is injected into the LLM prompt to enable
more accurate, citation-aware responses
- **External API support**: Applications using RAGFlow's SDK retrieval
endpoints (`/v1/retrieval`, `/v1/searchbots/retrieval_test`) can opt-in
via request parameters
- **User control**: Multi-select dropdown UI allows users to choose
which metadata fields to display
**Implementation approach:**
- ✅ Reuses existing `DocMetadataService` infrastructure (no new database
tables or indices)
- ✅ Settings stored in existing JSON configuration fields
(`search_config.reference_metadata`, `prompt_config.reference_metadata`)
- ✅ No database migrations required
- ✅ Disabled by default (fully opt-in and backward-compatible)
- ✅ Dynamic metadata field selection populated from actual document
metadata keys
- ✅ Fixed critical bug where Python's builtin `set()` was shadowed by a
route handler function
**Modified endpoints (all backward-compatible):**
- `POST /v1/retrieval` (Public SDK)
- `POST /v1/searchbots/retrieval_test` (Searchbots)
- `POST /v1/chunk/retrieval_test` (UI/Internal)
- Chat completions endpoints (via `extra_body.reference_metadata` or
`prompt_config`)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
###Images
-
<img width="879" height="1275" alt="image"
src="https://github.com/user-attachments/assets/95b2d731-31ae-45a1-b081-bf5893f52aeb"
/>
<br><br>
<br><br>
<img width="1532" height="362" alt="image"
src="https://github.com/user-attachments/assets/9cebc65b-b7a7-459f-b25e-3b13fa9b638e"
/>
<br><br>
<br><br>
<img width="2586" height="1320" alt="image"
src="https://github.com/user-attachments/assets/2153d493-d899-461f-a7a9-041391e07776"
/>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Attili-sys <Attili-sys@users.noreply.github.com>
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
### What problem does this PR solve?
Add document of search message with user_id, add sdk support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Refactor /api/v1/chats to be more RESTful.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Before consolidation
Web API: POST /v1/document/infos
Http API - GET /api/v1/datasets/<dataset_id>/documents
After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2
### Type of change
- [ ] Refactoring
### What problem does this PR solve?
Before change, update_document in api/apps/restful_apis/document_api.py
is using "PUT".
After change, it will use "PATCH" which is more suitable.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Enable reading Tag Set tags via API (expose tag_kwd field). The result
of the queried list chunks is as shown below:
<img width="1422" height="818" alt="image"
src="https://github.com/user-attachments/assets/abd1960a-fe34-489e-9d72-525f8e574938"
/>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: heyang.why <heyang.why@alibaba-inc.com>
### What problem does this PR solve?
1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB releated APIs are commented.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB releated APIs are commented.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add_chunk supports add image.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB releated APIs are commented.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
The Chunk class had a typo in the attribute name 'documnet_keyword',
which caused the document_name field to remain empty when retrieving
chunks via the SDK. This fix corrects the spelling to
'document_keyword'.
Changes:
- Line 36: Changed self.documnet_keyword to self.document_keyword
- Line 52: Updated backward compatibility code to use
self.document_keyword
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add delete all support for delete operations.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
Fix: key error "content" #12844
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fixes web API behavior mismatches that caused test failures by
normalizing error responses, tightening validations, correcting error
messages, and closing upload file handles.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Move memory and message apis to /api, and add sdk support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: update RAGFlow SDK for consistency #12059
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
New default prompt:
```
You are an intelligent assistant. Your primary function is to answer questions based strictly on the provided knowledge base.
**Essential Rules:**
- Your answer must be derived **solely** from this knowledge base: `{knowledge}`.
- **When information is available**: Summarize the content to give a detailed answer.
- **When information is unavailable**: Your response must contain this exact sentence: "The answer you are looking for is not found in the knowledge base!"
- **Always consider** the entire conversation history.
```
Also fix some grammar errors.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Incorrect async chat streamly output. #11677.
Disable beartype for #11666.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: incorrect parameter usage #8084
Fix: Optimize edge check #10851
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Update env to support PPTX
Fix: update README for version changes #11138
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Solved: Sync Parse Document API #5635
Feat: Add parse_document with feed back, user can view the status of
each document after parsing finished.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Fix: Agent.reset() argument wrong #10463 & Unable to converse with agent
through Python API. #10415
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
-Added the metadata_dedition parameter in the document retrieval
interface to filter document metadata -Updated the API documentation and
added explanations for the metadata_dedition parameter
### What problem does this PR solve?
Make /api/v1/retrieval api also can use metadata filter
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Python class Document was missing "meta_fields", e.g. when querying, the
document instances came without meta_fields
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- The default dataset_ids "kb1" was removed from the Chat class.
- The HTTP API response does not include the dataset_ids field.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
list_document supports range filtering.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds fields to the `Chunk` class to store retrieval results like
similarity scores, term similarity, vector similarity, positions, and
document type. This allows the chunk object to hold all the information
needed when returning search results from the vector database.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model
Now:
- Correctly prioritizes user-configured default embedding_model
Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references
#8208
This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)