Commit Graph

274 Commits

Author SHA1 Message Date
cd54c08e84 Go: implement provider: Ollama (#14580)
### What problem does this PR solve?

implement `Ollama` provider

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-06 12:03:58 +08:00
4ee0702aed Feat: add skills space to context engine (#13908)
### What problem does this PR solve?

issue #13714

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-30 12:36:03 +08:00
2548c28d65 feat: add FuturMix as model provider (#14419)
## Summary

Add [FuturMix](https://futurmix.ai) as a new model provider. FuturMix is
an OpenAI-compatible unified AI gateway that provides access to 22+
models (GPT, Claude, Gemini, DeepSeek, and more) through a single API
endpoint and key.

- **API Base**: `https://futurmix.ai/v1` (OpenAI-compatible)
- **Supported capabilities**: Chat, Embedding, Image2Text, TTS,
Speech2Text, Rerank

### Changes

| File | Change |
|------|--------|
| `rag/llm/__init__.py` | Add `FuturMix` to `SupportedLiteLLMProvider`
enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` |
| `rag/llm/chat_model.py` | Add `FuturMixChat(Base)` — follows
Astraflow/Avian pattern |
| `rag/llm/embedding_model.py` | Add `FuturMixEmbed(OpenAIEmbed)` —
follows Astraflow pattern |
| `rag/llm/cv_model.py` | Add `FuturMixCV(GptV4)` — follows
SILICONFLOW/OpenRouter pattern |
| `rag/llm/tts_model.py` | Add `FuturMixTTS(OpenAITTS)` — follows
CometAPI/DeerAPI pattern |
| `rag/llm/sequence2txt_model.py` | Add `FuturMixSeq2txt(GPTSeq2txt)` —
follows StepFun pattern |
| `rag/llm/rerank_model.py` | Add `FuturMixRerank(OpenAI_APIRerank)` |
| `conf/llm_factories.json` | Add factory config with 8 chat, 2
embedding, 1 image2text, 2 TTS, 1 speech2text models |
| `docs/guides/models/supported_models.mdx` | Add FuturMix to supported
models table |

### Models included

- **Chat**: claude-sonnet-4-20250514, claude-3.5-haiku, gpt-4o,
gpt-4o-mini, gemini-2.5-flash, gemini-2.0-flash, deepseek-chat,
deepseek-reasoner
- **Embedding**: text-embedding-3-small, text-embedding-3-large
- **Image2Text**: gpt-4o
- **TTS**: tts-1, tts-1-hd
- **Speech2Text**: whisper-1

## Test plan

- [ ] Verify FuturMix appears in the model provider list in RAGFlow UI
- [ ] Configure FuturMix with API key and test chat completion
- [ ] Test embedding model with document indexing
- [ ] Test image2text with a sample image

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-30 10:59:37 +08:00
0e1477eb23 Go: implement provider: MiniMax (#14478)
### What problem does this PR solve?

implement MiniMax provider

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-04-29 19:06:40 +08:00
bb05a8bd7e Update create model instance command (#14441)
### What problem does this PR solve?

1. support command:

```
RAGFlow(user)> create provider 'vllm' instance 'test' key 'test-key' url 'base-url' region 'abc';
SUCCESS
RAGFlow(user)> list instances from 'vllm';
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| apiKey   | extra                                  | id                               | instanceName | providerID                       | status |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| test-key | {"base_url":"base-url","region":"abc"} | 40213c89430311f1a7cf38a74640adcc | test         | b4d40e6142d311f1a4f938a74640adcc | enable |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
```
2. support add vllm model
```
RAGFlow(user)> add model 'Qwen/Qwen2-0.5B' to provider 'vllm' instance 'test' with tokens 131072 chat;
SUCCESS
```
3. add vllm chat

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-29 17:05:08 +08:00
decf673049 Go: implement provider: volcengine (#14460)
### What problem does this PR solve?

implement `volcengine` provider 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-29 15:45:08 +08:00
f670913bb4 Refactor model type to model class (#14426)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-28 16:05:15 +08:00
effc84a042 Refactor model in GO (#14398)
### What problem does this PR solve?

Refactor model in GO

### Type of change

- [x] Refactoring
2026-04-28 12:59:01 +08:00
819257f257 Go: add volcengine (#14409)
### What problem does this PR solve?

1. Refactor server_main
2. Add volcengine

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-28 12:12:58 +08:00
965717c4fb Go: add new provider: google (#14395)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-27 20:35:47 +08:00
c3eac4103a Go: aliyun model provider (#14379)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-27 14:53:33 +08:00
3ad3241ae0 feat: persist RAPTOR layer metadata on summary chunks (#13286)
## Summary

RAPTOR's recursive clustering builds a `layers` list tracking
`(start_idx, end_idx)` boundaries per level, but currently discards this
information — only the flat `chunks` list is returned. This makes it
impossible to distinguish leaf-level summaries from top-level ones.

This PR:
- Returns `(chunks, layers)` tuple from `raptor.py`'s `__call__`
- Annotates each RAPTOR summary chunk with `raptor_layer_int` (1 = first
summary level, 2 = summary-of-summaries, etc.)
- Adds `raptor_layer_int` to `infinity_mapping.json` (Elasticsearch
handles it via existing `*_int` dynamic template)

### Why this matters

Downstream features need to know which RAPTOR layer a summary belongs
to:
- **Retrieving the top-level document summary** for entity extraction,
search snippets, or document comparison
- **Filtering by abstraction level** — users may want only high-level
summaries or only leaf-level cluster summaries
- **RAPTOR recall quality** — #10951 reports summaries not being
recalled for definition queries; layer metadata enables targeted
retrieval

### Changes

| File | Change | LOC |
|------|--------|-----|
| `rag/raptor.py` | Return `(chunks, layers)` tuple | ~3 |
| `rag/svr/task_executor.py` | Build `chunk_layer` mapping, set
`raptor_layer_int` | ~12 |
| `conf/infinity_mapping.json` | Add `raptor_layer_int` integer field |
~1 |

### Backward compatibility

- **Additive only** — no existing fields or behavior changed
- Existing RAPTOR chunks continue to work (they'll have
`raptor_layer_int = 0` by default)
- New RAPTOR chunks get layer metadata automatically

## Test plan

- [ ] Parse a document with RAPTOR enabled, verify `raptor_layer_int` is
set on indexed chunks
- [ ] Verify `raptor_layer_int` values increase with abstraction level
(layer 1 < layer 2 < ...)
- [ ] Verify existing RAPTOR deletion (`delete by raptor_kwd`) still
works
- [ ] Verify Infinity backend accepts the new field

Fixes #7488
Related: #4104, #11191, #10951

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: yuch85 <yuch85.1@gmail.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-04-27 10:20:46 +08:00
78188ce9e9 Feat: add OpenDataLoader PDF parser backend (#14058) (#14097)
### What problem does this PR solve?

Closes #14058.

RAGFlow supports multiple PDF parsing backends (DeepDOC, MinerU,
Docling, TCADP, PaddleOCR). This PR adds **OpenDataLoader**
([opendataloader-project/opendataloader-pdf](https://github.com/opendataloader-project/opendataloader-pdf))
as a new optional backend, giving users a deterministic, local-first
alternative with competitive table extraction accuracy.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

---

### Changes

#### Backend
- `deepdoc/parser/opendataloader_parser.py` — new `OpenDataLoaderParser`
class inheriting `RAGFlowPdfParser`. Implements `check_installation()`
(guards Python package + Java 11+ runtime), `parse_pdf()` with
JSON-first extraction (heading/paragraph/table/list/image/formula) and
Markdown fallback, position-tag generation compatible with the shared
`@@page\tx0\tx1\ty0\ty1##` format, and temp-dir lifecycle with cleanup.
- `rag/app/naive.py` — new `by_opendataloader()` wrapper, registered in
`PARSERS` dict, added to `chunk_token_num=0` override list.
- `rag/flow/parser/parser.py` — `"opendataloader"` branch in the
pipeline PDF handler + check validation list.

#### Infrastructure
- `docker/entrypoint.sh` — `ensure_opendataloader()` function: opt-in
via `USE_OPENDATALOADER=true`, skips gracefully if Java is not on PATH.

#### Frontend
- `web/src/components/layout-recognize-form-field.tsx` —
`OpenDataLoader` added to `ParseDocumentType` enum and parser dropdown.
Cascades automatically to the pipeline editor's Parser component.

#### Docs
- `docs/guides/dataset/select_pdf_parser.md` — added OpenDataLoader
entry and full env-var reference.

---

### Environment variables

| Variable | Default | Description |
|---|---|---|
| `USE_OPENDATALOADER` | `false` | Set `true` to install
`opendataloader-pdf` on container startup |
| `OPENDATALOADER_VERSION` | latest | Pin the PyPI release (e.g.
`==2.2.1`) |
| `OPENDATALOADER_HYBRID` | _(unset)_ | Enable hybrid AI mode (e.g.
`docling-fast`) |
| `OPENDATALOADER_IMAGE_OUTPUT` | _(unset)_ | `off` / `embedded` /
`external` |
| `OPENDATALOADER_OUTPUT_DIR` | _(tmp)_ | Persistent output dir; temp
dir used + cleaned if unset |
| `OPENDATALOADER_DELETE_OUTPUT` | `1` | `0` to retain intermediate
files for debugging |
| `OPENDATALOADER_SANITIZE` | _(unset)_ | `1` to filter prompt-injection
patterns from output |

---

### Dependencies

- **Runtime**: `opendataloader-pdf` (PyPI, Apache 2.0) — opt-in, not
added to `pyproject.toml` core deps. Installed by
`ensure_opendataloader()` at container startup when
`USE_OPENDATALOADER=true`.
- **System**: Java 11+ on PATH (JVM is the underlying engine). The
installer skips with a warning if `java` is not found.

---

### How to test

**Standalone parser:**
```bash
source .venv/bin/activate
uv pip install opendataloader-pdf
python3 -c "
import sys; sys.path.insert(0, '.')
from deepdoc.parser.opendataloader_parser import OpenDataLoaderParser
p = OpenDataLoaderParser()
print('available:', p.check_installation())
s, t = p.parse_pdf('path/to/test.pdf', parse_method='pipeline')
print(f'sections={len(s)} tables={len(t)}')
"

```
### Benchmark vs Docling
```
file                      parser            secs  sections  tables
----------------------------------------------------------------------
text-heavy.pdf            docling           45.29       148      10
text-heavy.pdf            opendataloader     3.14       559       0
table-heavy.pdf           docling           7.05        76       3
table-heavy.pdf           opendataloader     3.71        90       0
complex.pdf               docling            42.67       114       8
complex.pdf               opendataloader     3.51       180       0
```
2026-04-25 00:33:02 +08:00
1c244df90d Go: add gitee and siliconflow as model provider (#14336)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-24 20:59:30 +08:00
1473000135 Implement retrieval_test in GO (#14231)
### What problem does this PR solve?

Implement retrieval_test in GO

### Type of change

- [x] Refactoring
2026-04-24 15:30:14 +08:00
aadd9a333f Feat: deepseek v4 (#14346)
### What problem does this PR solve?

Feat: deepseek v4
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-24 13:07:59 +08:00
2b029882d7 Go: add new provider minimax (#14296)
### What problem does this PR solve?

1. Add new provider minimax
2. Add new command: CHECK INSTANCE 'instance_name' FROM 'provider_name';
```
RAGFlow(user)> check instance 'test' from 'minimax';
SUCCESS
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-23 10:16:20 +08:00
b8660b9919 Add deepseek and moonshot model json (#14290)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 15:59:41 +08:00
f853a39b40 feat: Add Astraflow provider support (global + China endpoints) (#14270)
## Add Astraflow Provider Support

This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud
/ 优刻得) as a new AI model provider in RAGFlow, with support for both
global and China endpoints.

### About Astraflow
Astraflow is an OpenAI-compatible AI model aggregation platform
supporting 200+ models from major providers including DeepSeek, Qwen,
GPT, Claude, Gemini, Llama, Mistral, and more.

| Variant | Factory Name | Endpoint | Env Var |
|---------|-------------|----------|---------|
| Global | `Astraflow` | `https://api-us-ca.umodelverse.ai/v1` |
`ASTRAFLOW_API_KEY` |
| China | `Astraflow-CN` | `https://api.modelverse.cn/v1` |
`ASTRAFLOW_CN_API_KEY` |

- **API key signup**: https://astraflow.ucloud.cn/

---

### Files Changed

| File | Change |
|------|--------|
| `rag/llm/__init__.py` | Register `Astraflow` and `Astraflow-CN` in
`SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and
`LITELLM_PROVIDER_PREFIX` |
| `rag/llm/chat_model.py` | Add `AstraflowChat` and `AstraflowCNChat`
(OpenAI-compatible `Base` subclass) |
| `rag/llm/embedding_model.py` | Add `AstraflowEmbed` and
`AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) |
| `rag/llm/rerank_model.py` | Add `AstraflowRerank` and
`AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) |
| `rag/llm/cv_model.py` | Add `AstraflowCV` and `AstraflowCNCV`
(subclasses of `GptV4`) |
| `rag/llm/tts_model.py` | Add `AstraflowTTS` and `AstraflowCNTTS`
(subclasses of `OpenAITTS`) |
| `rag/llm/sequence2txt_model.py` | Add `AstraflowSeq2txt` and
`AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) |
| `conf/llm_factories.json` | Register `Astraflow` and `Astraflow-CN`
factories with a curated list of popular models |

---

### Supported Model Types
-  **Chat / LLM** — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7,
Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more
-  **Text Embedding** — text-embedding-3-small/large
-  **Image / Vision (IMAGE2TEXT)** — GPT-4o, GPT-4.1, Claude, Gemini,
Llama-4, etc.
-  **Text Re-Rank**
-  **TTS** — tts-1
-  **Speech-to-Text (SPEECH2TEXT)** — whisper-1

### Implementation Notes
- Uses the `openai/` LiteLLM prefix — consistent with other
OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI,
OpenRouter, n1n, Avian, etc.)
- `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249)
are separate factory entries, allowing users to choose the optimal
endpoint based on their region.
- All model classes cleanly subclass existing base classes (`Base`,
`OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`)
with no custom logic needed — the provider is fully OpenAI-compatible.

---------

Co-authored-by: user <user@xzaaaMacBook-Air.local>
2026-04-22 15:38:34 +08:00
e48d75987c Go: add stream / think chat (#14242)
### What problem does this PR solve?

1. Supports stream and non-stream chat
2. Supports think and non-think chat
3. List supported models from DeepSeek service. (This command can be
used to verify the API validity)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-21 16:52:32 +08:00
f269ee9739 Go: add thinking features to zhipu-ai (#14234)
### What problem does this PR solve?

```
RAGFlow(user)> list models from 'zhipu-ai';
+------------+------------+---------------+----------------+
| features   | max_tokens | model_types   | name           |
+------------+------------+---------------+----------------+
| [thinking] | 128000     | [chat]        | glm-4.7        |
| [thinking] | 128000     | [chat]        | glm-4.5        |
| [thinking] | 128000     | [chat vision] | glm-4.6v-Flash |
| [thinking] | 128000     | [chat]        | glm-4.5-x      |
| [thinking] | 128000     | [chat]        | glm-4.5-air    |
| [thinking] | 128000     | [chat]        | glm-4.5-airx   |
| [thinking] | 128000     | [chat]        | glm-4.5-flash  |
| [thinking] | 64000      | [vision]      | glm-4.5v       |
|            | 128000     | [chat]        | glm-4-plus     |
|            | 128000     | [chat]        | glm-4-0520     |
|            | 128000     | [chat]        | glm-4          |
|            | 8000       | [chat]        | glm-4-airx     |
|            | 128000     | [chat]        | glm-4-air      |
|            | 128000     | [chat]        | glm-4-flash    |
|            | 128000     | [chat]        | glm-4-flashx   |
|            | 1000000    | [chat]        | glm-4-long     |
|            | 128000     | [chat]        | glm-3-turbo    |
|            | 2000       | [vision]      | glm-4v         |
|            | 8192       | [chat]        | glm-4-9b       |
|            | 512        | [embedding]   | embedding-2    |
|            | 512        | [embedding]   | embedding-3    |
|            | 4096       | [asr]         | glm-asr        |
|            | 0          | [tts]         | glm-tts        |
|            | 0          | [ocr]         | glm-ocr        |
|            | 0          | [rerank]      | glm-rerank     |
+------------+------------+---------------+----------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-20 21:53:27 +08:00
af2ed416a7 Add extra field to model instance (#14203)
### What problem does this PR solve?

Now each model support region with different URL

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-20 15:31:12 +08:00
4e992de91f Add tongyi gte-rerank-v2 (#14215)
https://bailian.console.aliyun.com/cn-beijing?tab=api#/api/?type=model&url=2780056

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change
- [x] Other (please describe): add gte-rerank-v2、qwen3-rerank
2026-04-20 11:39:17 +08:00
94106646e7 Go: set and list default models (#14191)
### What problem does this PR solve?

```
RAGFlow(user)> set default vlm "zhipu-ai" "ccc" "glm-4.6v-flash";
SUCCESS
RAGFlow(user)> list default models;
+--------+----------------+----------------+----------------+------------+
| enable | model_instance | model_name     | model_provider | model_type |
+--------+----------------+----------------+----------------+------------+
| true   | ccc            | glm-4.6v-flash | zhipu-ai       | llm        |
| true   | ccc            | glm-4.6v-flash | zhipu-ai       | image2text |
+--------+----------------+----------------+----------------+------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-17 18:05:33 +08:00
69264b3a70 Feat: Refact pipeline (#13826)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 19:26:45 +08:00
6c29128de1 Refactor model provider and command (#13887)
### What problem does this PR solve?

Introduce 5 new tables, including model groups and provider instance.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-02 20:20:35 +08:00
e20cf39735 Refactor Go server model provider reading and access (#13831)
### What problem does this PR solve?

1. Refactor model provider json file format
2. Use memory data structure to replace database
3. Add CLI command to access

```
RAGFlow(user)> list pool models from 'xai';
+-------------------------------------------------------------------------------------+------------+-------------+-----------------------+
| features                                                                            | max_tokens | model_types | name                  |
+-------------------------------------------------------------------------------------+------------+-------------+-----------------------+
| map[]                                                                               | 256000     | [llm]       | grok-4                |
| map[]                                                                               | 131072     | [llm]       | grok-3                |
| map[]                                                                               | 131072     | [llm]       | grok-3-fast           |
| map[]                                                                               | 131072     | [llm]       | grok-3-mini           |
| map[]                                                                               | 131072     | [llm]       | grok-3-mini-mini-fast |
| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] | 32768      | [vlm]       | grok-2-vision         |
+-------------------------------------------------------------------------------------+------------+-------------+-----------------------+
RAGFlow(user)> show pool model 'grok-2-vision' from 'xai';
+-------------------------------------------------------------------------------------+------------+-------------+---------------+
| features                                                                            | max_tokens | model_types | name          |
+-------------------------------------------------------------------------------------+------------+-------------+---------------+
| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] | 32768      | [vlm]       | grok-2-vision |
+-------------------------------------------------------------------------------------+------------+-------------+---------------+
RAGFlow(user)> list pool providers;
+--------+------------------------------------------------------------+---------------------------+
| name   | tags                                                       | url                       |
+--------+------------------------------------------------------------+---------------------------+
| OpenAI | LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION | https://api.openai.com/v1 |
| xAI    | LLM                                                        | https://api.x.ai/v1       |
+--------+------------------------------------------------------------+---------------------------+
RAGFlow(user)> show pool provider 'openai';
+---------------------------+--------+------------------------------------------------------------+--------------+
| base_url                  | name   | tags                                                       | total_models |
+---------------------------+--------+------------------------------------------------------------+--------------+
| https://api.openai.com/v1 | OpenAI | LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION | 27           |
+---------------------------+--------+------------------------------------------------------------+--------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-30 12:00:49 +08:00
2240fc778c Fix: add missing "mom" field to infinity_mapping.json for parent-child chunker (#13821)
### What problem does this PR solve?

When using Infinity as DOC_ENGINE with parent-child chunker enabled,
vector insertion fails because the "mom" field is missing from the index
mapping. This fix adds the required field to resolve the issue.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-27 13:06:18 +08:00
b308cd3a02 Update go cli (#13717)
### What problem does this PR solve?

Go cli

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-24 20:08:36 +08:00
dd839f30e8 Fix: code supports matplotlib (#13724)
### What problem does this PR solve?

Code as "final" node: 

![img_v3_02vs_aece4caf-8403-4939-9e68-9845a22c2cfg](https://github.com/user-attachments/assets/9d87b8df-da6b-401c-bf6d-8b807fe92c22)

Code as "mid" node:

![img_v3_02vv_f74f331f-d755-44ab-a18c-96fff8cbd34g](https://github.com/user-attachments/assets/c94ef3f9-2a6c-47cb-9d2b-19703d2752e4)


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-20 20:32:00 +08:00
13d0df1562 feat: add Perplexity contextualized embeddings API as a new model provider (#13709)
### What problem does this PR solve?

Adds Perplexity contextualized embeddings API as a new model provider,
as requested in #13610.

- `PerplexityEmbed` provider in `rag/llm/embedding_model.py` supporting
both standard (`/v1/embeddings`) and contextualized
(`/v1/contextualizedembeddings`) endpoints
- All 4 Perplexity embedding models registered in
`conf/llm_factories.json`: `pplx-embed-v1-0.6b`, `pplx-embed-v1-4b`,
`pplx-embed-context-v1-0.6b`, `pplx-embed-context-v1-4b`
- Frontend entries (enum, icon mapping, API key URL) in
`web/src/constants/llm.ts`
- Updated `docs/guides/models/supported_models.mdx`
- 22 unit tests in `test/unit_test/rag/llm/test_perplexity_embed.py`

Perplexity's API returns `base64_int8` encoded embeddings (not
OpenAI-compatible), so this uses a custom `requests`-based
implementation. Contextualized vs standard model is auto-detected from
the model name.

Closes #13610

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-03-20 10:47:48 +08:00
f171554c0a feat: upgrade MiniMax default model to M2.7 (#13676)
## Summary
Upgrade MiniMax model configuration to include the latest M2.7 model.

## Changes
- Add `MiniMax-M2.7` and `MiniMax-M2.7-highspeed` to the model selection
list in `conf/llm_factories.json`
- Place M2.7 models at the top of the list as the recommended default
- Retain all previous models (M2.5, M2.5-highspeed, M2.1, M2) as
available alternatives

## Why
MiniMax-M2.7 is the latest flagship model with enhanced reasoning and
coding capabilities. This update ensures RAGFlow users can access the
newest model while maintaining backward compatibility with existing
configurations.

## Testing
- JSON config validated (well-formed)
- No existing MiniMax-specific unit tests affected
- Model entries follow the same structure as existing entries

Co-authored-by: PR Bot <pr-bot@minimaxi.com>
2026-03-18 19:20:10 +08:00
74866371ef Fix compatiblity issue (#13667)
### What problem does this PR solve?

1. Change go admin server port from 9385 to 9383 to avoid conflicts
2. Start go server after python servers are started completely, in
entrypoint.sh
3. Fix some database migration issue
4. Add more API routes in web to compliant with EE.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-18 11:51:03 +08:00
35cd56f990 feat: add MiniMax-M2.5 and M2.5-highspeed models (#13557)
## Summary

Add MiniMax's latest M2.5 model family to the model registry and update
the default API base URL to the international endpoint for broader
accessibility.

## Changes

- **Add MiniMax-M2.5 models** to `conf/llm_factories.json`:
- `MiniMax-M2.5` — Peak Performance. Ultimate Value. Master the Complex.
  - `MiniMax-M2.5-highspeed` — Same performance, faster and more agile.
- Both support 204,800 token context window and tool calling (`is_tools:
true`).
- **Update default MiniMax API base URL** in `rag/llm/__init__.py`:
- From `https://api.minimaxi.com/v1` (domestic) to
`https://api.minimax.io/v1` (international).
- Chinese users can still override via the Base URL field in the UI
settings (as documented in existing i18n strings).

## Supported Models

| Model | Context Window | Tool Calling | Description |
|-------|---------------|-------------|-------------|
| `MiniMax-M2.5` | 204,800 tokens | Yes | Peak Performance. Ultimate
Value. |
| `MiniMax-M2.5-highspeed` | 204,800 tokens | Yes | Same performance,
faster and more agile. |

## API Documentation

- OpenAI Compatible API:
https://platform.minimax.io/docs/api-reference/text-openai-api

## Testing

- [x] JSON validation passes
- [x] Python syntax validation passes
- [x] Ruff lint passes
- [x] MiniMax-M2.5 API call verified (returns valid response)
- [x] MiniMax-M2.5-highspeed API call verified (returns valid response)

Co-authored-by: PR Bot <pr-bot@minimaxi.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-03-12 20:41:46 +08:00
5cbdfc5f17 Fix Gitee embedding model URL error (#13553)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-12 13:13:06 +08:00
6023eb27ac feat: add Ragcon provider (#13425)
### What problem does this PR solve?

This PR aims to extend the list of possible providers. Adds new Provider
"RAGcon" within the Ollama Modal. It provides all model types except OCR
via Openai-compatible endpoints.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Jakob <16180662+hauberj@users.noreply.github.com>
2026-03-06 09:37:27 +08:00
5fc3bd38b0 Feat: Support siliconflow.com (#13308)
### What problem does this PR solve?

Feat: Support siliconflow.com

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-02 15:37:42 +08:00
1db221f19e Feat: add more models for siliconflow and tongyi-qwen (#13311)
### What problem does this PR solve?

Feat: add more models for siliconflow and tongyi-qwen

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-03-02 15:37:08 +08:00
5f53fbe0f1 feat: Add Avian as an LLM provider (#13256)
### What problem does this PR solve?

This PR adds [Avian](https://avian.io) as a new LLM provider to RAGFlow.
Avian provides an OpenAI-compatible API with competitive pricing,
offering access to models like DeepSeek V3.2, Kimi K2.5, GLM-5, and
MiniMax M2.5.

**Provider details:**
- API Base URL: `https://api.avian.io/v1`
- Auth: Bearer token via API key
- OpenAI-compatible (chat completions, streaming, function calling)
- Models:
  - `deepseek/deepseek-v3.2` — 164K context, $0.26/$0.38 per 1M tokens
  - `moonshotai/kimi-k2.5` — 131K context, $0.45/$2.20 per 1M tokens
  - `z-ai/glm-5` — 131K context, $0.30/$2.55 per 1M tokens
  - `minimax/minimax-m2.5` — 1M context, $0.30/$1.10 per 1M tokens

**Changes:**
- `rag/llm/chat_model.py` — Add `AvianChat` class extending `Base`
- `rag/llm/__init__.py` — Register in `SupportedLiteLLMProvider`,
`FACTORY_DEFAULT_BASE_URL`, `LITELLM_PROVIDER_PREFIX`
- `conf/llm_factories.json` — Add Avian factory with model definitions
- `web/src/constants/llm.ts` — Add to `LLMFactory` enum, `IconMap`,
`APIMapUrl`
- `web/src/components/svg-icon.tsx` — Register SVG icon
- `web/src/assets/svg/llm/avian.svg` — Provider icon
- `docs/references/supported_models.mdx` — Add to supported models table

This follows the same pattern as other OpenAI-compatible providers
(e.g., n1n #12680, TokenPony).

cc @KevinHuSh @JinHai-CN

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-02-27 17:36:55 +08:00
25a32c198d Fix: gemini model names (#13073)
### What problem does this PR solve?

Fix: gemini model names #13053


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-09 17:51:22 +08:00
6361fc4b33 Feat: update stepfun list (#12991)
### What problem does this PR solve?

Update stepfun list.

Add TTS and Sequence2Text functionalities.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-02-05 12:47:04 +08:00
8fc3986f70 fix(llm): Fix Gitee AI links and update the reranker model configuration (#12916)
### What problem does this PR solve?

Fix Gitee AI links and update the reranker model configuration

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: franco <1787003204@q.comq>
2026-02-02 13:43:16 +08:00
51210a1762 Add secondary index to infinity (#12825)
Add secondary index:
1. kb_id
2. available_int

---------

Signed-off-by: zpf121 <1219290549@qq.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
2026-02-02 13:22:29 +08:00
9a5208976c Put document metadata in ES/Infinity (#12826)
### What problem does this PR solve?

Put document metadata in ES/Infinity.

Index name of meta data: ragflow_doc_meta_{tenant_id}

### Type of change

- [x] Refactoring
2026-01-28 13:29:34 +08:00
fd11aca8e5 feat: Implement pluggable multi-provider sandbox architecture (#12820)
## Summary

Implement a flexible sandbox provider system supporting both
self-managed (Docker) and SaaS (Aliyun Code Interpreter) backends for
secure code execution in agent workflows.

**Key Changes:**
-  Aliyun Code Interpreter provider using official
`agentrun-sdk>=0.0.16`
-  Self-managed provider with gVisor (runsc) security
-  Arguments parameter support for dynamic code execution
-  Database-only configuration (removed fallback logic)
-  Configuration scripts for quick setup

Issue #12479

## Features

### 🔌 Provider Abstraction Layer

**1. Self-Managed Provider** (`agent/sandbox/providers/self_managed.py`)
- Wraps existing executor_manager HTTP API
- gVisor (runsc) for secure container isolation
- Configurable pool size, timeout, retry logic
- Languages: Python, Node.js, JavaScript
- ⚠️ **Requires**: gVisor installation, Docker, base images

**2. Aliyun Code Interpreter**
(`agent/sandbox/providers/aliyun_codeinterpreter.py`)
- SaaS integration using official agentrun-sdk
- Serverless microVM execution with auto-authentication
- Hard timeout: 30 seconds max
- Credentials: `AGENTRUN_ACCESS_KEY_ID`, `AGENTRUN_ACCESS_KEY_SECRET`,
`AGENTRUN_ACCOUNT_ID`, `AGENTRUN_REGION`
- Automatically wraps code to call `main()` function

**3. E2B Provider** (`agent/sandbox/providers/e2b.py`)
- Placeholder for future integration

### ⚙️ Configuration System

- `conf/system_settings.json`: Default provider =
`aliyun_codeinterpreter`
- `agent/sandbox/client.py`: Enforces database-only configuration
- Admin UI: `/admin/sandbox-settings`
- Configuration validation via `validate_config()` method
- Health checks for all providers

### 🎯 Key Capabilities

**Arguments Parameter Support:**
All providers support passing arguments to `main()` function:
```python
# User code
def main(name: str, count: int) -> dict:
    return {"message": f"Hello {name}!" * count}

# Executed with: arguments={"name": "World", "count": 3}
# Result: {"message": "Hello World!Hello World!Hello World!"}
```

**Self-Describing Providers:**
Each provider implements `get_config_schema()` returning form
configuration for Admin UI

**Error Handling:**
Structured `ExecutionResult` with stdout, stderr, exit_code,
execution_time

## Configuration Scripts

Two scripts for quick Aliyun sandbox setup:

**Shell Script (requires jq):**
```bash
source scripts/configure_aliyun_sandbox.sh
```

**Python Script (interactive):**
```bash
python3 scripts/configure_aliyun_sandbox.py
```

## Testing

```bash
# Unit tests
uv run pytest agent/sandbox/tests/test_providers.py -v

# Aliyun provider tests
uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter.py -v

# Integration tests (requires credentials)
uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py -v

# Quick SDK validation
python3 agent/sandbox/tests/verify_sdk.py
```

**Test Coverage:**
- 30 unit tests for provider abstraction
- Provider-specific tests for Aliyun
- Integration tests with real API
- Security tests for executor_manager

## Documentation

- `docs/develop/sandbox_spec.md` - Complete architecture specification
- `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration from legacy
sandbox
- `agent/sandbox/tests/QUICKSTART.md` - Quick start guide
- `agent/sandbox/tests/README.md` - Testing documentation

## Breaking Changes

⚠️ **Migration Required:**

1. **Directory Move**: `sandbox/` → `agent/sandbox/`
   - Update imports: `from sandbox.` → `from agent.sandbox.`

2. **Mandatory Configuration**: 
   - SystemSettings must have `sandbox.provider_type` configured
   - Removed fallback default values
- Configuration must exist in database (from
`conf/system_settings.json`)

3. **Aliyun Credentials**:
   - Requires `AGENTRUN_*` environment variables (not `ALIYUN_*`)
   - `AGENTRUN_ACCOUNT_ID` is now required (Aliyun primary account ID)

4. **Self-Managed Provider**:
   - gVisor (runsc) must be installed for security
   - Install: `go install gvisor.dev/gvisor/runsc@latest`

## Database Schema Changes

```python
# SystemSettings.value: CharField → TextField
api/db/db_models.py: Changed for unlimited config length

# SystemSettingsService.get_by_name(): Fixed query precision
api/db/services/system_settings_service.py: startswith → exact match
```

## Files Changed

### Backend (Python)
- `agent/sandbox/providers/base.py` - SandboxProvider ABC interface
- `agent/sandbox/providers/manager.py` - ProviderManager
- `agent/sandbox/providers/self_managed.py` - Self-managed provider
- `agent/sandbox/providers/aliyun_codeinterpreter.py` - Aliyun provider
- `agent/sandbox/providers/e2b.py` - E2B provider (placeholder)
- `agent/sandbox/client.py` - Unified client (enforces DB-only config)
- `agent/tools/code_exec.py` - Updated to use provider system
- `admin/server/services.py` - SandboxMgr with registry & validation
- `admin/server/routes.py` - 5 sandbox API endpoints
- `conf/system_settings.json` - Default: aliyun_codeinterpreter
- `api/db/db_models.py` - TextField for SystemSettings.value
- `api/db/services/system_settings_service.py` - Exact match query

### Frontend (TypeScript/React)
- `web/src/pages/admin/sandbox-settings.tsx` - Settings UI
- `web/src/services/admin-service.ts` - Sandbox service functions
- `web/src/services/admin.service.d.ts` - Type definitions
- `web/src/utils/api.ts` - Sandbox API endpoints

### Documentation
- `docs/develop/sandbox_spec.md` - Architecture spec
- `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration guide
- `agent/sandbox/tests/QUICKSTART.md` - Quick start
- `agent/sandbox/tests/README.md` - Testing guide

### Configuration Scripts
- `scripts/configure_aliyun_sandbox.sh` - Shell script (jq)
- `scripts/configure_aliyun_sandbox.py` - Python script

### Tests
- `agent/sandbox/tests/test_providers.py` - 30 unit tests
- `agent/sandbox/tests/test_aliyun_codeinterpreter.py` - Provider tests
- `agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py` -
Integration tests
- `agent/sandbox/tests/verify_sdk.py` - SDK validation

## Architecture

```
Admin UI → Admin API → SandboxMgr → ProviderManager → [SelfManaged|Aliyun|E2B]
                                      ↓
                                  SystemSettings
```

## Usage

### 1. Configure Provider

**Via Admin UI:**
1. Navigate to `/admin/sandbox-settings`
2. Select provider (Aliyun Code Interpreter / Self-Managed)
3. Fill in configuration
4. Click "Test Connection" to verify
5. Click "Save" to apply

**Via Configuration Scripts:**
```bash
# Aliyun provider
export AGENTRUN_ACCESS_KEY_ID="xxx"
export AGENTRUN_ACCESS_KEY_SECRET="yyy"
export AGENTRUN_ACCOUNT_ID="zzz"
export AGENTRUN_REGION="cn-shanghai"
source scripts/configure_aliyun_sandbox.sh
```

### 2. Restart Service

```bash
cd docker
docker compose restart ragflow-server
```

### 3. Execute Code in Agent

```python
from agent.sandbox.client import execute_code

result = execute_code(
    code='def main(name: str) -> dict: return {"message": f"Hello {name}!"}',
    language="python",
    timeout=30,
    arguments={"name": "World"}
)

print(result.stdout)  # {"message": "Hello World!"}
```

## Troubleshooting

### "Container pool is busy" (Self-Managed)
- **Cause**: Pool exhausted (default: 1 container in `.env`)
- **Fix**: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` to 5+

### "Sandbox provider type not configured"
- **Cause**: Database missing configuration
- **Fix**: Run config script or set via Admin UI

### "gVisor not found"
- **Cause**: runsc not installed
- **Fix**: `go install gvisor.dev/gvisor/runsc@latest && sudo cp
~/go/bin/runsc /usr/local/bin/`

### Aliyun authentication errors
- **Cause**: Wrong environment variable names
- **Fix**: Use `AGENTRUN_*` prefix (not `ALIYUN_*`)

## Checklist

- [x] All tests passing (30 unit tests + integration tests)
- [x] Documentation updated (spec, migration guide, quickstart)
- [x] Type definitions added (TypeScript)
- [x] Admin UI implemented
- [x] Configuration validation
- [x] Health checks implemented
- [x] Error handling with structured results
- [x] Breaking changes documented
- [x] Configuration scripts created
- [x] gVisor requirements documented

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-28 13:28:21 +08:00
b57c82b122 Feat: add kimi-k2.5 (#12852)
### What problem does this PR solve?

Add kimi-k2.5

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-28 12:41:20 +08:00
c2e8f90023 feat(ci): Add Redis service port configuration to test environment (#12855)
### What problem does this PR solve?

Added Redis port calculation and environment variable export to support
Redis service in test environment. The port is dynamically assigned
based on runner number to prevent conflicts during parallel test
execution. Removed by #12685

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-28 09:27:47 +08:00
52da81cf9e Fix:Redis configuration template error in v0.22.1 (#12685)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/12674

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-27 12:47:46 +08:00
2d9e7b4acd Fix: aliyun oss need to use s3 signature_version (#12766)
### What problem does this PR solve?

Aliyun OSS do not support boto s4 signature_version which will lead to
an error:

```
botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the PutObject operation: aws-chunked encoding is not supported with the specified x-amz-content-sha256 value
```

According to aliyun oss docs, oss_conn need to use s3 signature_version.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-22 11:43:55 +08:00
b40d639fdb Add dataset with table parser type for Infinity and answer question in chat using SQL (#12541)
### What problem does this PR solve?

1) Create  dataset using table parser for infinity
2) Answer questions in chat using SQL

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-19 19:35:14 +08:00