### What problem does this PR solve?
Fix: Remove duplicate text output from the thought model on the chat
page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Before migration
Web API: POST /v1/document/update_metadata_setting
After consolidation, Restful API
PUT
/api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This PR fixes the merge-phase crash reported in #14236 during GraphRAG
entity resolution.
The issue happens after candidate pair resolution completes, when
multiple merge coroutines mutate the same shared `networkx` graph
concurrently. In `_merge_graph_nodes`, the code iterates over
`graph.neighbors(node1)` and also awaits during edge/description
merging. That allows another coroutine to modify the graph adjacency
structure in between, which can trigger `RuntimeError: dictionary keys
changed during iteration` and can also lead to unsafe shared-graph
mutation.
This change keeps the PR scoped to that single issue by:
- serializing merge-time graph mutations with a dedicated merge lock
- snapshotting `graph.neighbors(node1)` with `list(...)` before
iteration
Together, these changes prevent concurrent mutation of the shared graph
during the merge phase and make the merge loop safe against live-view
invalidation.
Fixes#14236
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Replace single `Read()` call in Go upload service with `io.ReadAll()`.
- Prevent potential truncated/corrupted file content during multipart
upload.
- Keep existing API behavior unchanged while fixing data integrity risk.
## Root Cause
`io.Reader.Read()` may return fewer bytes than requested without an
error. The previous implementation read once into a full buffer and
assumed all bytes were populated.
## Test plan
- Upload files of multiple sizes and verify uploaded content integrity.
- Confirm upload endpoint still returns successful responses.
- Verify downstream document parsing works on uploaded files.
## Issues
Closes#14266
### What problem does this PR solve?
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Add Astraflow Provider Support
This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud
/ 优刻得) as a new AI model provider in RAGFlow, with support for both
global and China endpoints.
### About Astraflow
Astraflow is an OpenAI-compatible AI model aggregation platform
supporting 200+ models from major providers including DeepSeek, Qwen,
GPT, Claude, Gemini, Llama, Mistral, and more.
| Variant | Factory Name | Endpoint | Env Var |
|---------|-------------|----------|---------|
| Global | `Astraflow` | `https://api-us-ca.umodelverse.ai/v1` |
`ASTRAFLOW_API_KEY` |
| China | `Astraflow-CN` | `https://api.modelverse.cn/v1` |
`ASTRAFLOW_CN_API_KEY` |
- **API key signup**: https://astraflow.ucloud.cn/
---
### Files Changed
| File | Change |
|------|--------|
| `rag/llm/__init__.py` | Register `Astraflow` and `Astraflow-CN` in
`SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and
`LITELLM_PROVIDER_PREFIX` |
| `rag/llm/chat_model.py` | Add `AstraflowChat` and `AstraflowCNChat`
(OpenAI-compatible `Base` subclass) |
| `rag/llm/embedding_model.py` | Add `AstraflowEmbed` and
`AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) |
| `rag/llm/rerank_model.py` | Add `AstraflowRerank` and
`AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) |
| `rag/llm/cv_model.py` | Add `AstraflowCV` and `AstraflowCNCV`
(subclasses of `GptV4`) |
| `rag/llm/tts_model.py` | Add `AstraflowTTS` and `AstraflowCNTTS`
(subclasses of `OpenAITTS`) |
| `rag/llm/sequence2txt_model.py` | Add `AstraflowSeq2txt` and
`AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) |
| `conf/llm_factories.json` | Register `Astraflow` and `Astraflow-CN`
factories with a curated list of popular models |
---
### Supported Model Types
- ✅ **Chat / LLM** — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7,
Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more
- ✅ **Text Embedding** — text-embedding-3-small/large
- ✅ **Image / Vision (IMAGE2TEXT)** — GPT-4o, GPT-4.1, Claude, Gemini,
Llama-4, etc.
- ✅ **Text Re-Rank**
- ✅ **TTS** — tts-1
- ✅ **Speech-to-Text (SPEECH2TEXT)** — whisper-1
### Implementation Notes
- Uses the `openai/` LiteLLM prefix — consistent with other
OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI,
OpenRouter, n1n, Avian, etc.)
- `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249)
are separate factory entries, allowing users to choose the optimal
endpoint based on their region.
- All model classes cleanly subclass existing base classes (`Base`,
`OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`)
with no custom logic needed — the provider is fully OpenAI-compatible.
---------
Co-authored-by: user <user@xzaaaMacBook-Air.local>
### What problem does this PR solve?
update MinerU parser to most recent minerU v3 logic
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add document of search message with user_id, add sdk support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
As title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
update MinerU endpoint to /pdf_parse which has been exposed since v3.x.
fixes#14263
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
normalize think tags in final chat answer
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Before consolidation
Web API: POST /v1/document/rm
Http API - DELETE /api/v1/datasets/<dataset_id>/documents
After consolidation, Restful API -- DELETE
/api/v1/datasets/<dataset_id>/documents
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Refactor /api/v1/chats to be more RESTful.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Before consolidation
Web API: POST /v1/document/infos
Http API - GET /api/v1/datasets/<dataset_id>/documents
After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2
### Type of change
- [ ] Refactoring
Closes#14165
Add a short documentation page under Developer Guides introducing
DeepWiki as a resource for developers doing secondary development or
exploring RAGFlow's codebase internals.
---------
Co-authored-by: hyl64 <hyl64@users.noreply.github.com>
### What problem does this PR solve?
Before consolidation
Web API: POST /v1/document/filter
Http API - GET /api/v1/datasets/<dataset_id>/documents
After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?type=filter
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.24.0 to v0.25.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Get metadata configuration from union of custom metadata and
built_in_metadata.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Component definition is missing display name.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Supports stream and non-stream chat
2. Supports think and non-think chat
3. List supported models from DeepSeek service. (This command can be
used to verify the API validity)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Editing an empty response in the retrieval operator will cause the
focus to shift to the metadata input box.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The minimum value for the "Suggested text block size" input box is
set to 1.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
OpenSource Resume is supported only with Elasticsearch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The number of chunks in the file list is not displayed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The mind map on the search page does not display completely upon
initial loading.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
In order to attach the debugger to a running docker container it has to
be inside the docker image
### What problem does this PR solve?
[#14224](https://github.com/infiniflow/ragflow/issues/14224)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes#14206.
This issue is a regression. PR #9520 previously changed Gemini models
from `image2text` to `chat` to fix chat-side resolution, but PR #13073
later restored those Gemini entries to `image2text` during model-list
updates, which reintroduced the bug.
The underlying problem is that Gemini models are multimodal and
advertise both `CHAT` and `IMAGE2TEXT`, while tenant model resolution
still depends on a single stored `model_type`. That makes chat-only
flows such as memory extraction fragile when a compatible model is
stored as `image2text`.
This PR fixes the issue at the model resolution layer instead of
changing `llm_factories.json` again:
- keep the stored tenant model type unchanged
- try exact `model_type` lookup first
- if no exact match is found, fall back only when the model metadata
shows the requested capability is supported
- coerce the runtime config to the requested type for chat callers
- fail fast in memory creation instead of silently persisting
`tenant_llm_id=0`
This preserves existing multimodal and `image2text` behavior while
restoring chat compatibility for memory-related flows.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Testing
- Re-checked the current memory creation and memory message extraction
paths against the updated resolution logic
- Verified locally that a Gemini-style tenant model stored as
`image2text` but tagged with `CHAT` can still be resolved for `chat`
- Verified `get_model_config_by_type_and_name(..., CHAT, ...)` returns a
chat-compatible runtime config
- Verified `get_model_config_by_id(..., CHAT)` also returns a
chat-compatible runtime config
- Verified strict resolution still fails when the model metadata does
not advertise chat capability
### What problem does this PR solve?
Now each model support region with different URL
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Before consolidation
Web API: POST /v1/document/list
Http API - GET /api/v1/datasets/<dataset_id>/documents
After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add tips for installing Chinse fonts under code sandbox. Otherwise,
`matplotlib `won't render Chinese correctly.
<img width="2082" height="1186" alt="sales_analysis"
src="https://github.com/user-attachments/assets/57e675ab-1e92-4662-9aeb-ad72a6121eb5"
/>
### Type of change
- [x] Documentation Update
https://bailian.console.aliyun.com/cn-beijing?tab=api#/api/?type=model&url=2780056
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Other (please describe): add gte-rerank-v2、qwen3-rerank