5852 Commits

Author SHA1 Message Date
ffa8738a78 Fix: Remove duplicate text output from the thought model on the chat page. (#14301)
### What problem does this PR solve?

Fix: Remove duplicate text output from the thought model on the chat
page.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-22 23:22:51 +08:00
01753b8f31 Refactor: API connectors (#14228)
### What problem does this PR solve?

Refactor /api/v1/connectors to be more RESTful.

### Type of change
- [x] Refactoring
nightly
2026-04-22 20:42:41 +08:00
c08cd8e090 Refactor: Migrate document metadata config update API (#14286)
### What problem does this PR solve?

Before migration
Web API: POST /v1/document/update_metadata_setting

After consolidation, Restful API
PUT
/api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config

### Type of change

- [x] Refactoring
2026-04-22 20:01:31 +08:00
d1c62fc19d Refact: Tenant api (#14288)
### What problem does this PR solve?

Refact: Tenant api

### Type of change

- [x] Refactoring
2026-04-22 20:00:32 +08:00
1434f8ade8 Doc: two PDF parser optimizers are supported as of v0.25.0. (#14261)
### What problem does this PR solve?

Multi-column layout detection is supported in v0.25.0

### Type of change


- [x] Documentation Update
2026-04-22 20:00:06 +08:00
b52c518ec9 Set image tag v0.25.0 (#14299)
### What problem does this PR solve?

AD

### Type of change
2026-04-22 19:12:21 +08:00
38e45a1117 Fix: serialize GraphRAG entity resolution merges to avoid graph mutation races (#14237)
### What problem does this PR solve?

This PR fixes the merge-phase crash reported in #14236 during GraphRAG
entity resolution.

The issue happens after candidate pair resolution completes, when
multiple merge coroutines mutate the same shared `networkx` graph
concurrently. In `_merge_graph_nodes`, the code iterates over
`graph.neighbors(node1)` and also awaits during edge/description
merging. That allows another coroutine to modify the graph adjacency
structure in between, which can trigger `RuntimeError: dictionary keys
changed during iteration` and can also lead to unsafe shared-graph
mutation.

This change keeps the PR scoped to that single issue by:
- serializing merge-time graph mutations with a dedicated merge lock
- snapshotting `graph.neighbors(node1)` with `list(...)` before
iteration

Together, these changes prevent concurrent mutation of the shared graph
during the merge phase and make the merge loop safe against live-view
invalidation.

Fixes #14236

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-22 16:42:53 +08:00
e0f0eb277d Fix upload stream handling to prevent truncated files (#14267)
## Summary
- Replace single `Read()` call in Go upload service with `io.ReadAll()`.
- Prevent potential truncated/corrupted file content during multipart
upload.
- Keep existing API behavior unchanged while fixing data integrity risk.

## Root Cause
`io.Reader.Read()` may return fewer bytes than requested without an
error. The previous implementation read once into a full buffer and
assumed all bytes were populated.

## Test plan
- Upload files of multiple sizes and verify uploaded content integrity.
- Confirm upload endpoint still returns successful responses.
- Verify downstream document parsing works on uploaded files.

## Issues
Closes #14266
2026-04-22 16:32:38 +08:00
b8660b9919 Add deepseek and moonshot model json (#14290)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 15:59:41 +08:00
f853a39b40 feat: Add Astraflow provider support (global + China endpoints) (#14270)
## Add Astraflow Provider Support

This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud
/ 优刻得) as a new AI model provider in RAGFlow, with support for both
global and China endpoints.

### About Astraflow
Astraflow is an OpenAI-compatible AI model aggregation platform
supporting 200+ models from major providers including DeepSeek, Qwen,
GPT, Claude, Gemini, Llama, Mistral, and more.

| Variant | Factory Name | Endpoint | Env Var |
|---------|-------------|----------|---------|
| Global | `Astraflow` | `https://api-us-ca.umodelverse.ai/v1` |
`ASTRAFLOW_API_KEY` |
| China | `Astraflow-CN` | `https://api.modelverse.cn/v1` |
`ASTRAFLOW_CN_API_KEY` |

- **API key signup**: https://astraflow.ucloud.cn/

---

### Files Changed

| File | Change |
|------|--------|
| `rag/llm/__init__.py` | Register `Astraflow` and `Astraflow-CN` in
`SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and
`LITELLM_PROVIDER_PREFIX` |
| `rag/llm/chat_model.py` | Add `AstraflowChat` and `AstraflowCNChat`
(OpenAI-compatible `Base` subclass) |
| `rag/llm/embedding_model.py` | Add `AstraflowEmbed` and
`AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) |
| `rag/llm/rerank_model.py` | Add `AstraflowRerank` and
`AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) |
| `rag/llm/cv_model.py` | Add `AstraflowCV` and `AstraflowCNCV`
(subclasses of `GptV4`) |
| `rag/llm/tts_model.py` | Add `AstraflowTTS` and `AstraflowCNTTS`
(subclasses of `OpenAITTS`) |
| `rag/llm/sequence2txt_model.py` | Add `AstraflowSeq2txt` and
`AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) |
| `conf/llm_factories.json` | Register `Astraflow` and `Astraflow-CN`
factories with a curated list of popular models |

---

### Supported Model Types
-  **Chat / LLM** — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7,
Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more
-  **Text Embedding** — text-embedding-3-small/large
-  **Image / Vision (IMAGE2TEXT)** — GPT-4o, GPT-4.1, Claude, Gemini,
Llama-4, etc.
-  **Text Re-Rank**
-  **TTS** — tts-1
-  **Speech-to-Text (SPEECH2TEXT)** — whisper-1

### Implementation Notes
- Uses the `openai/` LiteLLM prefix — consistent with other
OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI,
OpenRouter, n1n, Avian, etc.)
- `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249)
are separate factory entries, allowing users to choose the optimal
endpoint based on their region.
- All model classes cleanly subclass existing base classes (`Base`,
`OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`)
with no custom logic needed — the provider is fully OpenAI-compatible.

---------

Co-authored-by: user <user@xzaaaMacBook-Air.local>
2026-04-22 15:38:34 +08:00
d5d162b374 Fix: MinerU 3.x output discovery and API contract (#14282)
### What problem does this PR solve?

update MinerU parser to most recent minerU v3 logic

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-22 14:44:41 +08:00
3ce1e44b2d Fix: document and sdk support of searching message with user_id (#14283)
### What problem does this PR solve?

Add document of search message with user_id, add sdk support.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-04-22 14:43:38 +08:00
01c5437fdf Fix uv.lock (#14285)
### What problem does this PR solve?

As title.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 13:09:21 +08:00
61d756e1b5 Fix #14213 create folder does not accept FOLDER (#14276)
### What problem does this PR solve?

As description.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-22 11:55:10 +08:00
69d8aed792 Doc: v0.25.0 release notes. (#14284)
### What problem does this PR solve?

Added v0.25.0 release notes

### Type of change


- [x] Documentation Update
2026-04-22 11:48:28 +08:00
77a843503d Fix: switch MinerU API endpoint to /pdf_parse (#14272)
### What problem does this PR solve?

update MinerU endpoint to /pdf_parse which has been exposed since v3.x.
fixes #14263

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-22 11:15:46 +08:00
ff29484d42 fix: normalize think tags in final chat answer (#14271)
### What problem does this PR solve?

normalize think tags in final chat answer

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-22 11:15:08 +08:00
3d8a82c0aa Refactor: Consolidation WEB API & HTTP API for document delete api (#14254)
### What problem does this PR solve?

Before consolidation
Web API: POST /v1/document/rm
Http API - DELETE /api/v1/datasets/<dataset_id>/documents

After consolidation, Restful API -- DELETE
/api/v1/datasets/<dataset_id>/documents

### Type of change

- [x] Refactoring
2026-04-22 10:49:52 +08:00
6baf74afc1 Refa: align chat and search restful APIs (#14229)
### What problem does this PR solve?

Refactor /api/v1/chats to be more RESTful.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 10:49:11 +08:00
bfac0195df Update release note (#14275)
### What problem does this PR solve?

As title.

### Type of change

- [x] Documentation Update

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 10:47:43 +08:00
74b44e1aa3 Go: add balance command (#14262)
### What problem does this PR solve?

```
RAGFlow(user)> list supported models from 'moonshot' 'test';
+---------------------------------+
| model_name                      |
+---------------------------------+
| moonshot-v1-32k-vision-preview  |
| kimi-k2.6                       |
| moonshot-v1-8k                  |
| moonshot-v1-auto                |
| moonshot-v1-128k                |
| moonshot-v1-32k                 |
| kimi-k2.5                       |
| moonshot-v1-8k-vision-preview   |
| moonshot-v1-128k-vision-preview |
+---------------------------------+
RAGFlow(user)> show balance from 'moonshot' 'test';
+---------+----------+
| balance | currency |
+---------+----------+
| 0       | CNY      |
+---------+----------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-21 21:31:50 +08:00
2d05475693 Refactor: Consolidation WEB API & HTTP API for document infos (#14239)
### What problem does this PR solve?

Before consolidation
Web API: POST /v1/document/infos
Http API - GET /api/v1/datasets/<dataset_id>/documents

After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2

### Type of change

- [ ] Refactoring
2026-04-21 19:35:11 +08:00
779deadf76 Docs: User-level memory is supported in v0.25.0 (#14259)
### What problem does this PR solve?

v0.25.0 supports linking User ID with conversations.

### Type of change


- [x] Documentation Update
2026-04-21 18:59:00 +08:00
b439f8a74d docs: add DeepWiki developer guide page (#14244)
Closes #14165

Add a short documentation page under Developer Guides introducing
DeepWiki as a resource for developers doing secondary development or
exploring RAGFlow's codebase internals.

---------

Co-authored-by: hyl64 <hyl64@users.noreply.github.com>
2026-04-21 18:57:20 +08:00
009e538a4e Refactor: Consolidation WEB API & HTTP API for document get_filter (#14248)
### What problem does this PR solve?

Before consolidation
Web API: POST /v1/document/filter
Http API - GET /api/v1/datasets/<dataset_id>/documents

After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents?type=filter
### Type of change

- [x] Refactoring
2026-04-21 18:55:30 +08:00
a33d0737cd Docs: Update version references to v0.25.0 in READMEs and docs (#14257)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.24.0 to v0.25.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
v0.25.0
2026-04-21 17:26:50 +08:00
afdf0814d7 Fix: get metadata conf (#14250)
### What problem does this PR solve?

Get metadata configuration from union of custom metadata and
built_in_metadata.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 17:22:42 +08:00
0db2d544a9 Docs: 0.25.0 agent apps can be published. (#14252)
### What problem does this PR solve?

Agent apps can be published.

### Type of change

- [x] Documentation Update
2026-04-21 16:56:11 +08:00
4841ce4239 Fix: Component definition is missing display name. (#14255)
### What problem does this PR solve?

Fix: Component definition is missing display name.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 16:53:08 +08:00
e48d75987c Go: add stream / think chat (#14242)
### What problem does this PR solve?

1. Supports stream and non-stream chat
2. Supports think and non-think chat
3. List supported models from DeepSeek service. (This command can be
used to verify the API validity)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-21 16:52:32 +08:00
a2bea30749 Fix: Editing an empty response in the retrieval operator will cause the focus to shift to the metadata input box. (#14253)
### What problem does this PR solve?

Fix: Editing an empty response in the retrieval operator will cause the
focus to shift to the metadata input box.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 16:19:55 +08:00
05c39b90a8 Fix: pipeline parser log not display (#14251)
### What problem does this PR solve?

Fix: pipeline parser log not display

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 15:24:13 +08:00
6e33d8722f Revert "Fix: forwarding highlight param" (#14249)
Reverts infiniflow/ragflow#14112
2026-04-21 15:23:18 +08:00
78b800e685 Fix: Fix: The minimum value for the "Suggested text block size" input box is set to 1. (#14246)
### What problem does this PR solve?

Fix: The minimum value for the "Suggested text block size" input box is
set to 1.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 14:06:36 +08:00
b3891ba6a4 Fix audio/video in pipeline (#14241)
### What problem does this PR solve?

Fix audio/video in pipeline

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 12:17:57 +08:00
8aab158942 OpenSource Resume is supported only with Elasticsearch. (#14233)
### What problem does this PR solve?

OpenSource Resume is supported only with Elasticsearch.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-21 10:05:47 +08:00
f269ee9739 Go: add thinking features to zhipu-ai (#14234)
### What problem does this PR solve?

```
RAGFlow(user)> list models from 'zhipu-ai';
+------------+------------+---------------+----------------+
| features   | max_tokens | model_types   | name           |
+------------+------------+---------------+----------------+
| [thinking] | 128000     | [chat]        | glm-4.7        |
| [thinking] | 128000     | [chat]        | glm-4.5        |
| [thinking] | 128000     | [chat vision] | glm-4.6v-Flash |
| [thinking] | 128000     | [chat]        | glm-4.5-x      |
| [thinking] | 128000     | [chat]        | glm-4.5-air    |
| [thinking] | 128000     | [chat]        | glm-4.5-airx   |
| [thinking] | 128000     | [chat]        | glm-4.5-flash  |
| [thinking] | 64000      | [vision]      | glm-4.5v       |
|            | 128000     | [chat]        | glm-4-plus     |
|            | 128000     | [chat]        | glm-4-0520     |
|            | 128000     | [chat]        | glm-4          |
|            | 8000       | [chat]        | glm-4-airx     |
|            | 128000     | [chat]        | glm-4-air      |
|            | 128000     | [chat]        | glm-4-flash    |
|            | 128000     | [chat]        | glm-4-flashx   |
|            | 1000000    | [chat]        | glm-4-long     |
|            | 128000     | [chat]        | glm-3-turbo    |
|            | 2000       | [vision]      | glm-4v         |
|            | 8192       | [chat]        | glm-4-9b       |
|            | 512        | [embedding]   | embedding-2    |
|            | 512        | [embedding]   | embedding-3    |
|            | 4096       | [asr]         | glm-asr        |
|            | 0          | [tts]         | glm-tts        |
|            | 0          | [ocr]         | glm-ocr        |
|            | 0          | [rerank]      | glm-rerank     |
+------------+------------+---------------+----------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-20 21:53:27 +08:00
c43367eca3 Fix: The number of chunks in the file list is not displayed. (#14232)
### What problem does this PR solve?

Fix: The number of chunks in the file list is not displayed.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-20 19:24:20 +08:00
5265def967 Fix: The mind map on the search page does not display completely upon initial loading. (#14226)
### What problem does this PR solve?

Fix: The mind map on the search page does not display completely upon
initial loading.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-20 19:24:13 +08:00
ba7d3f6c31 Add debugpy dependency to pyproject.toml (#14225)
In order to attach the debugger to a running docker container it has to
be inside the docker image

### What problem does this PR solve?

[#14224](https://github.com/infiniflow/ragflow/issues/14224)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-20 18:05:17 +08:00
78c3583964 Fix memory resolution regression for multimodal Gemini models (#14209)
### What problem does this PR solve?

Fixes #14206.

This issue is a regression. PR #9520 previously changed Gemini models
from `image2text` to `chat` to fix chat-side resolution, but PR #13073
later restored those Gemini entries to `image2text` during model-list
updates, which reintroduced the bug.

The underlying problem is that Gemini models are multimodal and
advertise both `CHAT` and `IMAGE2TEXT`, while tenant model resolution
still depends on a single stored `model_type`. That makes chat-only
flows such as memory extraction fragile when a compatible model is
stored as `image2text`.

This PR fixes the issue at the model resolution layer instead of
changing `llm_factories.json` again:
- keep the stored tenant model type unchanged
- try exact `model_type` lookup first
- if no exact match is found, fall back only when the model metadata
shows the requested capability is supported
- coerce the runtime config to the requested type for chat callers
- fail fast in memory creation instead of silently persisting
`tenant_llm_id=0`

This preserves existing multimodal and `image2text` behavior while
restoring chat compatibility for memory-related flows.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Testing

- Re-checked the current memory creation and memory message extraction
paths against the updated resolution logic
- Verified locally that a Gemini-style tenant model stored as
`image2text` but tagged with `CHAT` can still be resolved for `chat`
- Verified `get_model_config_by_type_and_name(..., CHAT, ...)` returns a
chat-compatible runtime config
- Verified `get_model_config_by_id(..., CHAT)` also returns a
chat-compatible runtime config
- Verified strict resolution still fails when the model metadata does
not advertise chat capability
2026-04-20 16:37:36 +08:00
9c7c105007 Fix: Doc generator (#14223)
### What problem does this PR solve?

Doc generator

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-20 16:37:33 +08:00
af2ed416a7 Add extra field to model instance (#14203)
### What problem does this PR solve?

Now each model support region with different URL

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-20 15:31:12 +08:00
939933649a Refactor: Consolidation WEB API & HTTP API for document list_docs (#14176)
### What problem does this PR solve?

Before consolidation
Web API: POST /v1/document/list
Http API - GET /api/v1/datasets/<dataset_id>/documents

After consolidation, Restful API -- GET
/api/v1/datasets/<dataset_id>/documents

### Type of change

- [x] Refactoring
2026-04-20 14:54:40 +08:00
d053317c4d Fix: variable in doc generator (#14180)
### What problem does this PR solve?

Fix: variable in doc generator

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-20 14:19:42 +08:00
19eedeec61 Fix: accept empty value as 0 chunk (#14220)
### What problem does this PR solve?

Fix: accept empty value as 0 chunk
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-20 12:53:47 +08:00
f554f6ae85 chore(docs): tips for installing CN fonts (#14189)
### What problem does this PR solve?
Add tips for installing Chinse fonts under code sandbox. Otherwise,
`matplotlib `won't render Chinese correctly.

<img width="2082" height="1186" alt="sales_analysis"
src="https://github.com/user-attachments/assets/57e675ab-1e92-4662-9aeb-ad72a6121eb5"
/>



### Type of change

- [x] Documentation Update
2026-04-20 12:11:23 +08:00
0f806dc3ca Feat: mysql sync (#14200)
### What problem does this PR solve?

Add a script to sync db schema with peewee_migrate.

### Type of change

- [x] Other (please describe): tool script
2026-04-20 11:40:01 +08:00
4e992de91f Add tongyi gte-rerank-v2 (#14215)
https://bailian.console.aliyun.com/cn-beijing?tab=api#/api/?type=model&url=2780056

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change
- [x] Other (please describe): add gte-rerank-v2、qwen3-rerank
2026-04-20 11:39:17 +08:00
d5c306de30 Fix: remove unit test checkpoint resume (#14216)
### What problem does this PR solve?

remove unit test checkpoint resume

### Type of change

- [x] Performance Improvement
2026-04-20 11:27:40 +08:00