Commit Graph

6211 Commits

Author SHA1 Message Date
1c0eaa504b Docs: Finalized v0.25.3 release notes (#14913)
### What problem does this PR solve?

0.25.3 release notes, final.

### Type of change

- [x] Documentation Update
2026-05-14 10:57:43 +08:00
b2b63600f1 Adds gpt-5.4-mini and gpt-5.4-nano (#14908)
### What problem does this PR solve?
Includes gpt-5.4-mini and gpt-5.4-nano to the OpenAI model list

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-14 10:16:24 +08:00
e577901388 Fix doc format (#14909)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-14 09:49:45 +08:00
cb01529d8b Go: implement provider: Voyage AI (#14811)
### What problem does this PR solve?

Add a Go driver for Voyage AI (https://voyageai.com), one of the
unchecked providers on the umbrella tracking issue #14736. Voyage AI is
**embed + rerank only** — no chat, no streaming, no `/v1/models`
endpoint. It's the first provider in the Go layer of this shape.

Until this PR, a tenant who configured `voyage` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.

### What this PR includes

- New `internal/entity/models/voyage.go` with a `VoyageModel`
implementing the `ModelDriver` interface.
- New `conf/models/voyage.json` with 6 embedding models (`voyage-3.5`,
`voyage-3.5-lite`, `voyage-3-large`, `voyage-code-3`, `voyage-law-2`,
`voyage-finance-2`) and 2 rerank models (`rerank-2`, `rerank-2-lite`).
- `factory.go`: route `"voyage"` to `NewVoyageModel`.
- `internal/entity/models/voyage_test.go`: 19 unit tests.

### How the driver works

- **Embed**: `POST /v1/embeddings`. Response is OpenAI-shaped (`{data:
[{embedding, index, object, text}], model, usage}`). Driver reorders by
`index`, rejects duplicate / out-of-range / missing slots, and
short-circuits empty input without an HTTP call.
- **Rerank**: `POST /v1/rerank`. Voyage uses **`top_k`** as the request
param name (not `top_n` like Aliyun/SiliconFlow); the driver translates
`RerankConfig.TopN` → `top_k`. Response is Cohere-shaped (`{data:
[{relevance_score, index}], model}`), so the existing
`RerankResponse{Data: []RerankResult{Index, RelevanceScore}}` shape fits
cleanly.
- **`ListModels`**: returns a hardcoded list of `voyageKnownModels`.
Voyage does **not** expose `/v1/models` (probed live, returns 404), so
the driver synthesizes the list from the same set the config ships. New
upstream models are added by extending one slice.
- **`CheckConnection`**: pings a 1-input embed call against
`voyage-3.5`. Without `/v1/models`, this is the cheapest way to verify
the API key + network path before a tenant tries a real workload.
- **`ChatWithMessages` / `ChatStreamlyWithSender` / `Balance` /
`TranscribeAudio` / `AudioSpeech` / `OCRFile`**: all return `"no such
method"`. Voyage does not host any of these surfaces.

No interface change. No new dependencies.

### How was this tested?

**19 unit tests** in `internal/entity/models/voyage_test.go` — all pass
on go 1.25:

```
$ go test -vet=off -run TestVoyage -count=1 ./internal/entity/models/...
ok      ragflow/internal/entity/models  0.036s
```

Coverage: Name; Embed (happy path, reorder, empty-input, missing
key/model, duplicate index, out-of-range index, missing slot); Rerank
(happy path with `top_k` assertion, default-to-len-documents, empty
documents, out-of-range index); ListModels (static list, missing key);
CheckConnection (happy, 401); chat methods sentinels; Balance sentinel;
audio/OCR sentinels.

`go build ./internal/entity/models/...` exits 0.

**Live integration test** against `api.voyageai.com`:

```
=== RUN   TestVoyageLiveSmoke
    [OK] Name() = "voyage"
    [OK] ListModels (static): 8 models -> [voyage-3.5 voyage-3.5-lite voyage-3-large voyage-code-3 voyage-law-2 voyage-finance-2 rerank-2 rerank-2-lite]
    [OK] CheckConnection
    [OK] Embed vectors=3 dim=1024 indices=[0 1 2]
    [OK] Embed(empty) -> 0 vectors
    [OK] Rerank results=3 scores=[0.8125 0.59765625 0.39453125]
    [OK] ChatWithMessages returns voyage, no such method
    [OK] Balance returns voyage, no such method

VOYAGE LIVE SMOKE PASSED
--- PASS: TestVoyageLiveSmoke (0.81s)
```

What the live run proves on the wire:

- Auth (`Bearer <key>`) accepted by `api.voyageai.com`.
- Embed `voyage-3.5` on 3 inputs returns 3 vectors at dim 1024 with
`index` field preserved as `[0, 1, 2]` — the reorder-by-index code is
exercised on real data.
- Empty input short-circuits without an HTTP call (mock server would
have been hit if it did).
- Rerank `rerank-2` on 3 docs returns 3 real `relevance_score` floats
`[0.8125, 0.598, 0.395]`. The `top_k` translation works on the live
wire.
- All sentinel methods return the documented `"no such method"` strings.

### Note on PR history

This branch was previously named for LocalAI Embed work which is now
consolidated into PR #14813. The branch was reset to `upstream/main` and
rebuilt for Voyage. Diff against `main` is a clean +838 lines across 4
files.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Tracking: #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-14 09:46:54 +08:00
dd76653dc1 feat: add tag management for Agents with filtering and sorting (#14774) (#14799)
## Summary
Closes #14774.

Adds free-form tags on agents (UserCanvas) with full UI + API:
- Stored as comma-separated `tags` column on `UserCanvas` with online
migration.
- New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT
/v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query.
- "Edit tags" item in agent dropdown opens a chip-style editor dialog;
tags render as badges on each agent card.
- New "Tags" facet in the agents filter bar, with counts.

## Implementation notes
- **Tag matching is exact-token**: the SQL filter wraps stored tags as
`,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`.
- **Server-side normalization** in `UserCanvasService.update_tags`:
dedup (case-insensitive), per-tag cap of 64 chars, total length capped
at 512 chars to fit the column, commas inside tag values are replaced
with spaces.
- **Tenant authorization**: `PUT /v1/agent/<id>/tags` gates on
`UserCanvasService.accessible(canvas_id, tenant_id)`.
- **Tag listing scope**: `UserCanvasService.list_tags` follows the same
own + team-shared rule as `get_by_tenant_ids`.
- **i18n**: keys added to `en.ts` and `zh.ts` only (per project
convention; other locales fall back).
- **`HomeCard`** gets a non-breaking `extra?: ReactNode` slot for the
chip row; no `src/components/ui/` files modified.

## Test plan
- [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column
exists (`DESCRIBE user_canvas`).
- [ ] Agents page renders cards normally (no console error from missing
field).
- [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog
was unmounting with the dropdown).
- [ ] Typing a tag without pressing Enter and clicking Save persists it
(regression: last typed tag was being dropped).
- [ ] Chip input supports Enter/comma to commit, Backspace on empty to
remove, `×` to remove individual chip.
- [ ] Tag containing a comma sent via API is stored with the comma
replaced by a space.
- [ ] 20 long tags sent via API does not error (length cap silently
truncates).
- [ ] "Tags" filter in the filter bar shows counts and narrows the list.
- [ ] Filtering by `ml` does **not** return agents tagged `ml-ops`.
- [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc.
- [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or
no permission.`
2026-05-13 21:41:32 +08:00
cb49f47c38 Docs: Editorial updates to the v0.25.3 release notes draft. (#14903)
### What problem does this PR solve?


v0.25.3 release notes. To be continued.

### Type of change


- [x] Documentation Update
2026-05-13 21:36:34 +08:00
9e0f976729 Add widget customization and persistence (#14603)
Introduce comprehensive floating widget customization: add new widget
settings (title, subtitle, footer, colors, mute, streaming) with types
and defaults, and expose them via EmbedDialog UI (split into Embed Setup
and Widget Customization tabs). Persist and load settings through Agent
page by reading/writing globals and wiring an onSaveWidgetSettings
handler to setAgent; show a loading ButtonLoading for saving. Update
embed iframe query params and FloatingChatWidget to honor URL params
(colors, text, mute/streaming) with validation/normalization, color
darkening for gradients, footer link normalization, and improved
styling. Also add copy-to-clipboard in message toolbar, adjust syntax
highlighter layout and Copy button, and add i18n key for muteWidget.

### What problem does this PR solve?

Adds a few fields to the embed widget modal to customize the appearance
of the floating widget when embedded into a page.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Noah <Noah.Thompson@ecn.forces.gc.ca>
2026-05-13 21:13:11 +08:00
8c5845f6ca fix: use context manager for pdfplumber to prevent resource leak (#13512)
## Summary
- Convert `pdfplumber.open()` to use `with` context manager in
`api/utils/file_utils.py` (`thumbnail_img` function)
- If any exception occurs between `open()` and `close()`, the PDF file
handle leaks
- The rest of the codebase (e.g. `read_potential_broken_pdf` in the same
file) already uses `with pdfplumber.open(...)` correctly

## Test plan
- [x] PDF thumbnail generation works correctly with context manager
- [x] Resources properly cleaned up on exceptions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-13 21:09:51 +08:00
e994051eb9 Feature/generic api connector (#13545)
# feat: Add Generic REST API Connector

## What problem does this PR solve?

RAGFlow supports many specific data source connectors (MySQL, Slack,
Google Drive, etc.), but there was no way to connect an arbitrary REST
API as a data source. Users with custom or third-party APIs had to write
a new connector class for each one.

This PR adds a **generic, configuration-driven REST API connector** that
lets users connect any REST API as a data source entirely through the UI
— no code changes needed per API.

---

## Features

### Core Connector (`common/data_source/rest_api_connector.py`)

- Implements `LoadConnector` and `PollConnector` interfaces for full and
incremental sync
- **Configurable authentication:** None, API Key (custom header), Bearer
Token, Basic Auth
- **Pluggable pagination:** Page-based, Offset-based, Cursor-based, or
None
- Smart page-size inference from user's query parameters to avoid
duplicate/conflicting params
- Configurable request delay between pages to prevent API rate limiting
- Auto-detection of the items array in JSON responses (`items`,
`results`, `data`, `records`, or first list found)
- **Advanced field mapping** with dot-notation (`country.name`), array
wildcards (`newsType[*].name`), type hints, and default values
- Optional content template rendering (`"Title: {title}\nBody: {body}"`)
- HTML stripping for content fields
- Stable document IDs via `hash128` from a configurable ID field or
auto-generated from item content
- Pydantic configuration schema with automatic coercion of UI string
inputs to dicts/lists

### Backend Registration (`rag/svr/sync_data_source.py`,
`common/constants.py`, `common/data_source/config.py`)

- `REST_API` sync class wired into RAGFlow's `func_factory`
- Full sync (`load_from_state`) and incremental polling (`poll_source`)
support
- Credentials and config passed from task to connector following
existing patterns (MySQL, SeaFile, etc.)

### Test Connection Endpoint (`api/apps/connector_app.py`)

- `POST /v1/connector/<id>/test` validates config schema,
authentication, and API connectivity without triggering a sync
- Clear error messages for auth failures vs. config issues

### Frontend UI (`web/src/pages/user-setting/data-source/constant/`)

- **Postman-style configuration:** Base URL, Query Parameters (key=value
per line), Auth, Content Fields, Metadata Fields, Pagination Type
- Auth-type-aware form: fields for API key header/value, Bearer token,
or Basic username/password appear only when relevant
- **Advanced Settings** toggle for: Custom Headers, Max Pages, Request
Delay, Poll Timestamp Field, Request Body (POST)
- Connector icon (SVG) and i18n strings (English)
- **"Test Connection"** button to validate before syncing

---

## Controls & Safety

- Configurable max pages safety cap (default: 1000, adjustable in UI)
- Configurable request delay between pages (default: 0.5s, adjustable in
UI)
- Auth errors (401/403) fail immediately without retries; transient
errors retry with exponential backoff
- Diagnostic logging: auth setup confirmation, request details on
failure, content field extraction status

---

## Type of change

- [x] New Feature (non-breaking change which adds functionality)


##Visual Screenshots of Features
<img width="482" height="510" alt="Screenshot 2026-03-11 at 5 19 52 PM"
src="https://github.com/user-attachments/assets/dcb7ab4a-1622-44f3-bb02-d6f0527314c4"
/>
(Connector can be configured within the external data sources tab)

Configuration Parameters:
<img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 46 PM"
src="https://github.com/user-attachments/assets/5e154e71-4ab5-4872-bfb2-04f02b73c18a"
/>
<img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 54 PM"
src="https://github.com/user-attachments/assets/00cb14b7-0bcf-4b94-9d71-34e93369ecb2"
/>

Connection can be tested before attaching to dataset:
<img width="981" height="681" alt="Screenshot 2026-03-11 at 5 21 40 PM"
src="https://github.com/user-attachments/assets/aaa6eeeb-89a7-4349-bc34-2423bf8be9ee"
/>

Ingestion tested with API connector (works perfectly fine):
<img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 22 30 PM"
src="https://github.com/user-attachments/assets/afcd0d58-cadd-4152-badc-d2f14d96fbec"
/>

Search & Retrieval works as well with metadata flow:
<img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 23 05 PM"
src="https://github.com/user-attachments/assets/d41ee935-dcf7-4456-b317-22a76ca032c0"
/>

---------

Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-13 20:35:01 +08:00
30d1c1dc28 Fix go compilation (#14900)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
v0.25.3
2026-05-13 20:05:56 +08:00
7f699d1202 Fix: enforce tenant authorization for tenant_rerank_id in retrieval flows (#14782)
### Related issues

Closes #14781 

### What problem does this PR solve?

Some retrieval endpoints accepted caller-supplied `tenant_rerank_id` and
resolved it through `get_model_config_by_id(...)`. That helper loaded
`TenantLLM` rows by global database id and returned decoded model
configuration without checking whether the model belonged to the
authenticated tenant or the dataset owner tenant.

This meant dataset access was validated, but rerank-model selection was
not. A caller who knew or could guess another tenant's
`tenant_rerank_id` could attempt retrieval with a foreign rerank model
config, creating a cross-tenant authorization gap for model usage.

This PR closes that gap by making `tenant_rerank_id` resolution
tenant-aware across the retrieval paths that accept it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Solution

- Extend `get_model_config_by_id(...)` to accept an optional
`allowed_tenant_ids` set and reject `TenantLLM` rows whose `tenant_id`
is outside that set.
- Pass the allowed tenant scope from retrieval endpoints that accept
`tenant_rerank_id`:
  - `api/apps/sdk/doc.py`
  - `api/apps/sdk/session.py`
  - `api/apps/services/dataset_api_service.py`
- Use the authenticated tenant plus dataset-owner tenant ids already
derived by each retrieval flow as the authorization boundary for rerank
model selection.
- Add focused unit coverage to assert unauthorized `tenant_rerank_id`
values are rejected and that the allowed tenant set is propagated
correctly.

### Testing

- `python -m py_compile` on:
  - `api/db/joint_services/tenant_model_service.py`
  - `api/apps/services/dataset_api_service.py`
  - `api/apps/sdk/doc.py`
  - `api/apps/sdk/session.py`
- Added unit tests in:
-
`test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py`
-
`test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`

### Notes for reviewers

- This change is intentionally narrow: it affects only the
`tenant_rerank_id` path, not the normal `rerank_id` name-based
resolution path.
- Local lint/syntax checks passed.
- Full pytest execution could not be completed in this environment
because the local test runtime is missing `strenum`, so the route-test
files fail during collection before exercising the updated cases.

---------

Co-authored-by: jony376 <jony376@gmail.com>
2026-05-13 19:53:08 +08:00
bb1a6259d3 Docs: Updated v0.25.3 release notes draft (#14899)
### What problem does this PR solve?

Draft v0.25.3 release notes

### Type of change


- [x] Documentation Update
2026-05-13 19:49:07 +08:00
87516edadf Bump to infinity v0.7.0-dev7 (#14897)
### What problem does this PR solve?

Upgrade infinity

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 19:42:50 +08:00
9ed4da74b8 Docs: Draft 0.25.3 release notes (#14898)
### What problem does this PR solve?

A draft 0.25.3 release note.

### Type of change


- [x] Documentation Update

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 19:35:55 +08:00
0a4b733b2a Go: implement Rerank in LocalAI driver (#14813)
### What problem does this PR solve?

The LocalAI Go driver landed in #14809 and Embed landed in #14811.
`Rerank` was left as a stub that returns `"not implemented"`. This PR
fills the gap.

LocalAI exposes a public rerank endpoint at `<tenant-url>/v1/rerank`
with a Cohere-shaped request and response (`{model, query, documents,
top_n}` → `{results: [{index, relevance_score}]}`). The Python side has
had `LocalAIRerank` in `rag/llm/rerank_model.py` for a long time. Until
this PR, a tenant who wanted to use LocalAI for reranking in the Go
layer got `"not implemented"`.

### What this PR includes

- `conf/models/localai.json`: add `"rerank": "rerank"` under
`url_suffix` so the driver can build the URL from config. This matches
the `URLSuffix.Rerank` field already used by aliyun and siliconflow.
- `internal/entity/models/localai.go`: replace the `Rerank` stub with a
real implementation that POSTs to `/v1/rerank`. Adds local
request/response types `localAIRerankRequest` and
`localAIRerankResponse`.

No factory change. No interface change.

### How the implementation works

- Validate the model name and resolve the tenant-supplied base URL with
the existing `resolveBaseURL` helper.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. Same pattern `ChatWithMessages`,
`ListModels`, and `Embed` already use in this file.
- Only set the `Authorization` header when a non-empty API key was
supplied. LocalAI accepts an empty key by default, so this preserves the
optional-auth contract.
- Default `top_n` to `len(documents)` when `rerankConfig.TopN == 0`,
matching the existing Aliyun and SiliconFlow rerank implementations.
- Validate every `results[].index` against `len(documents)`. If the
upstream returns an out-of-range index, fail clearly instead of silently
writing past the slice.
- An empty `documents` slice returns `&RerankResponse{}` with no HTTP
call.
- Non-200 responses propagate the upstream status line and body.

### Note on stacking

This PR builds on #14809 (LocalAI driver) and #14811 (LocalAI Embed).
Until both merge, this PR's diff on GitHub will include all three
commits. After #14809 and #14811 land on `main`, GitHub will auto-reduce
this PR to only the `Rerank` changes (one commit, ~99 line diff in
`localai.go` plus 1 line in `localai.json`).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the
`go.mod` minimum).
- The full method set on `LocalAIModel` still matches the `ModelDriver`
interface.
- Pattern parity with the existing Aliyun Rerank
(`internal/entity/models/aliyun.go`) and SiliconFlow Rerank
(`internal/entity/models/siliconflow.go`) implementations.

Closes #14812
Depends on #14809, #14811
Tracking: #14736

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 19:35:19 +08:00
3182fd0789 Docs: Update version references to v0.25.3 in READMEs and docs (#14896)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.25.2 to v0.25.3
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2026-05-13 18:42:42 +08:00
f3b3596c29 Speed up ragflow server (#14894)
### What problem does this PR solve?

Speed up ragflow server

### Type of change

- [ ] Refactoring
2026-05-13 18:01:33 +08:00
b18640d228 Go: fix OCR command (#14891)
### What problem does this PR solve?

RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。

——佚名                                                       |
+----------------------------------------------------------+

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 17:29:53 +08:00
8cb2bf04fb Fix: llm add api key overridden (#14885)
### What problem does this PR solve?

llm add api key overridden

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 17:15:32 +08:00
bf90c8948a Go: implement ListModels in ZhipuAI driver (#14886)
### What problem does this PR solve?

Fixes #14884

The ZhipuAI Go driver in `internal/entity/models/zhipu-ai.go` had a stub
`ListModels` method that always returned `"zhipu-ai, no such method"`.
The DeepSeek, Gitee, NVIDIA, OpenAI, SiliconFlow, and OpenRouter drivers
in the same package already implement `ListModels` against the
OpenAI-compatible `/models` endpoint, and the model picker UI relies on
it. This PR brings ZhipuAI in line with that pattern.

### Changes

- `internal/entity/models/zhipu-ai.go`: implement
`ZhipuAIModel.ListModels`.
  - Resolve region with default fallback.
- GET `${BaseURL[region]}/${URLSuffix.Models}` (resolves to
`https://open.bigmodel.cn/api/paas/v4/models` with the default region).
- Send `Authorization: Bearer <api_key>` when an API key is configured.
Omit the header when the key is empty, so an unauthenticated caller gets
a clear `401` from upstream.
- Surface non-200 responses with the upstream status line and body,
matching the other Go drivers.
- Parse the response via the package-level `DSModelList` / `DSModel`
types already used by DeepSeek, Gitee, and SiliconFlow.
- When the response includes `owned_by`, render the entry as
`id@owned_by`, matching the convention of Gitee and SiliconFlow.
- `conf/models/zhipu-ai.json`: add `"models": "models"` to `url_suffix`.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-13 16:39:14 +08:00
bbb0798c41 Fix: Set embedded models during form initialization. (#14889)
### What problem does this PR solve?

Fix: Set embedded models during form initialization.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 16:34:25 +08:00
8b53960819 Go: implement provider: LongCat (#14809)
### What problem does this PR solve?

Add a Go driver for LongCat (Meituan, https://longcat.chat), one of the
unchecked providers on the umbrella tracking issue #14736. LongCat
exposes an OpenAI-compatible REST API at
`https://api.longcat.chat/openai/v1` with three public chat models
including `LongCat-Flash-Thinking`, a reasoning model that returns
chain-of-thought in `reasoning_content` (OpenAI o-series shape).

Until this PR, a tenant who configured `longcat` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.

### What this PR includes

- New `internal/entity/models/longcat.go` with a `LongCatModel`
implementing the `ModelDriver` interface.
- New `conf/models/longcat.json` with the 3 public chat models
(Flash-Chat, Flash-Lite, Flash-Thinking) and `url_suffix` for `chat` and
`models`.
- `factory.go`: route `"longcat"` to `NewLongCatModel`.

Method coverage:

- `ChatWithMessages`: `POST /openai/v1/chat/completions`, non-streaming
- `ChatStreamlyWithSender`: SSE stream against the same endpoint
- `ListModels` / `CheckConnection`: `GET /openai/v1/models`
- **Reasoning extraction**: `message.reasoning_content` (non-stream) and
`delta.reasoning_content` (stream) flow into
`ChatResponse.ReasonContent` / the sender's second arg. Matches the
OpenAI o-series convention also used by kimi-k2.6 and DeepSeek-R1.
- **`reasoning_effort` propagation**: `ChatConfig.Effort` → request body
`reasoning_effort` (LongCat-Flash-Thinking honors it; non-reasoning
models ignore it).
- `Embed` / `Rerank` / `Balance` / `TranscribeAudio` / `AudioSpeech` /
`OCRFile` return `"no such method"` (LongCat does not expose any of
these surfaces).

No interface change. No new dependencies.

### How was this tested?

**21 unit tests** in `internal/entity/models/longcat_test.go` — all
pass:

```
$ go test -vet=off -run TestLongCat -count=1 -v ./internal/entity/models/...
=== RUN   TestLongCatName
--- PASS: TestLongCatName (0.00s)
=== RUN   TestLongCatChatHappyPath
--- PASS: TestLongCatChatHappyPath (0.00s)
=== RUN   TestLongCatChatExtractsReasoningContent
--- PASS: TestLongCatChatExtractsReasoningContent (0.00s)
=== RUN   TestLongCatChatPropagatesReasoningEffort
--- PASS: TestLongCatChatPropagatesReasoningEffort (0.00s)
=== RUN   TestLongCatChatOmitsReasoningEffortWhenUnset
--- PASS: TestLongCatChatOmitsReasoningEffortWhenUnset (0.00s)
=== RUN   TestLongCatChatRequiresAPIKey
--- PASS: TestLongCatChatRequiresAPIKey (0.00s)
=== RUN   TestLongCatChatRequiresMessages
--- PASS: TestLongCatChatRequiresMessages (0.00s)
=== RUN   TestLongCatChatRejectsHTTPError
--- PASS: TestLongCatChatRejectsHTTPError (0.00s)
=== RUN   TestLongCatStreamHappyPath
--- PASS: TestLongCatStreamHappyPath (0.00s)
=== RUN   TestLongCatStreamExtractsReasoningContent
--- PASS: TestLongCatStreamExtractsReasoningContent (0.00s)
=== RUN   TestLongCatStreamRejectsExplicitFalse
--- PASS: TestLongCatStreamRejectsExplicitFalse (0.00s)
=== RUN   TestLongCatStreamRequiresSender
--- PASS: TestLongCatStreamRequiresSender (0.00s)
=== RUN   TestLongCatStreamFailsWithoutTerminal
--- PASS: TestLongCatStreamFailsWithoutTerminal (0.00s)
=== RUN   TestLongCatListModelsHappyPath
--- PASS: TestLongCatListModelsHappyPath (0.00s)
=== RUN   TestLongCatListModelsRequiresAPIKey
--- PASS: TestLongCatListModelsRequiresAPIKey (0.00s)
=== RUN   TestLongCatCheckConnectionDelegatesToListModels
--- PASS: TestLongCatCheckConnectionDelegatesToListModels (0.00s)
=== RUN   TestLongCatEmbedReturnsNoSuchMethod
--- PASS: TestLongCatEmbedReturnsNoSuchMethod (0.00s)
=== RUN   TestLongCatRerankReturnsNoSuchMethod
--- PASS: TestLongCatRerankReturnsNoSuchMethod (0.00s)
=== RUN   TestLongCatBalanceReturnsNoSuchMethod
--- PASS: TestLongCatBalanceReturnsNoSuchMethod (0.00s)
=== RUN   TestLongCatAudioOCRReturnNoSuchMethod
--- PASS: TestLongCatAudioOCRReturnNoSuchMethod (0.00s)
PASS
ok      ragflow/internal/entity/models  0.020s
```

`go build ./internal/entity/models/...` exits 0 on go 1.25.

**Live integration test** against `api.longcat.chat`:

```
=== RUN   TestLongCatLiveSmoke
    [OK] Name() = "longcat"
    [OK] CheckConnection
    [OK] ListModels: 5 models -> [LongCat-Flash-Lite LongCat-Flash-Chat LongCat-Flash-Thinking-2601 LongCat-Flash-Omni-2603 LongCat-2.0-Preview]
    [OK] Chat (Flash-Chat) answer="Got it! Let me know if you" reason=""
    [OK] Chat (Flash-Thinking) answer len=443 head="To find 15 % of 80, follow these steps:\n\n1. **Convert the percentage to a frac..."
                  ReasonContent len=557 head="The user asks: \"15% of 80?\" They want step by step reasoning and final answer in \\boxed{}. So we need to compute 15% of ..."
    [OK] Stream content: 78 chunks, 351 chars
    [OK] Stream reasoning: 107 chunks, 537 chars
    [OK] Balance returns longcat, no such method
    [OK] Embed returns longcat, no such method
    [OK] Rerank returns longcat, no such method

LONGCAT LIVE SMOKE PASSED
--- PASS: TestLongCatLiveSmoke (31.01s)
```

What the live run proves on the wire:

- Auth header (`Bearer <key>`) is accepted by `api.longcat.chat`.
- `/openai/v1/models` parser handles the real 5-model response (note:
live API returns versioned aliases `LongCat-Flash-Thinking-2601`,
`LongCat-Flash-Omni-2603`, `LongCat-2.0-Preview` plus the un-versioned
`LongCat-Flash-Chat` and `LongCat-Flash-Lite`).
- Non-stream chat against `LongCat-Flash-Chat`: visible answer parses
correctly, `ReasonContent` correctly empty.
- Non-stream chat against `LongCat-Flash-Thinking`: 443-char answer
flows into `Answer`, 557-char chain-of-thought flows into
`ReasonContent` via the new `message.reasoning_content` extraction.
- Streaming chat against `LongCat-Flash-Thinking`: 107 reasoning chunks
(537 chars) reach the sender's second arg via `delta.reasoning_content`;
78 content chunks (351 chars) reach the first arg. Before this code, the
reasoning chunks would have been silently dropped.
- All sentinel methods (Balance, Embed, Rerank, audio/OCR) return the
documented `"no such method"` strings.

### Note on PR history

This branch was previously named for LocalAI work which is now
consolidated into PR #14813. The branch was reset to `upstream/main` and
rebuilt for LongCat. The diff against `main` is a clean +969 lines
across 4 files.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Tracking: #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 16:27:56 +08:00
ff685d3131 Delete duplicate route (#14883)
### What problem does this PR solve?

The delete /graph is duplicated of
`/datasets/<dataset_id>/<index_type>`, delete it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 15:57:44 +08:00
09e1fd290a Chore: migrate tests to restful api (#14871)
### What problem does this PR solve?

add new testing suite for the new restful api endpoints meant to replace
http and web api tests

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-13 15:07:23 +08:00
d63d3bb7d2 Go: implement provider: Novita.ai (#14850)
### What problem does this PR solve?

Add a Go driver for Novita.ai (https://novita.ai), one of the unchecked
providers on the umbrella tracking issue #14736. Novita exposes an
OpenAI-compatible REST API at `https://api.novita.ai/v3/openai` and
proxies a large catalog of third-party models (DeepSeek, Llama, Qwen3,
Kimi, Gemma, Mistral, MiniMax, GLM, etc.) behind a single OpenAI-shaped
surface — 102 models live at the time of writing.

Until this PR, a tenant who configured `novita` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.

### What this PR includes

- New `internal/entity/models/novita.go` with a `NovitaModel`
implementing the `ModelDriver` interface (~520 lines).
- New `conf/models/novita.json` with 7 representative chat models
(DeepSeek-V4, Llama-3.3-70B, Qwen3-30B/235B reasoning, Kimi-K2,
Gemma-3-27B, Mistral-Nemo).
- `factory.go`: route `"novita"` to `NewNovitaModel`.
- `internal/entity/models/novita_test.go`: 23 unit tests.

### Notable design point: `<think>...</think>` reasoning extraction

Novita-routed reasoning models like `qwen3-*` and `deepseek-r1-*` embed
their chain-of-thought **inline inside content as `<think>...</think>`
tags**, rather than in a separate `reasoning_content` field. Verified
live by probing `api.novita.ai`:

```
content head 200: <think>
Okay, let's see. I need to find 15% of 80. Hmm, percentages can sometimes be tricky, but I think
content tail 100: h, that works.
Alternatively, 0.15 × 80. If I move the decimal two places to the left for </think>
```

Without handling, a tenant picking qwen3 via Novita would see raw
`<think>` tags in their UI answer — different from every other reasoning
provider in the Go layer.

The driver detects those tags and routes the inner text to
`ChatResponse.ReasonContent` (non-stream) or the sender's second arg
(stream), keeping the visible answer clean of tag clutter:

- **`splitNovitaThink`** — scans a complete content string. Used by the
non-streaming path. Handles multiple `<think>` blocks, unclosed tags
(the model got cut off mid-reasoning), pure-text content with no tags.
- **`novitaThinkExtractor`** — stateful streaming version. Buffers
trailing bytes that might be the start of a tag (e.g. `<thi` held back
when the next chunk completes `nk>`), then emits segments in routing
order so callers can pipe them to a UI. Tested with byte-level chunk
boundaries and tag-spanning scenarios.

### Method coverage

| Method | Behavior |
|---|---|
| `ChatWithMessages` | `POST /v3/openai/chat/completions`, `<think>`
extraction on response |
| `ChatStreamlyWithSender` | SSE stream, stateful `<think>` extraction
across deltas |
| `ListModels` / `CheckConnection` | `GET /v3/openai/models` (102 live)
|
| `Embed` / `Rerank` / `Balance` / `TranscribeAudio` / `AudioSpeech` /
`OCRFile` | `"no such method"` — Novita's OpenAI-compatible surface does
not expose any |

No interface change. No new dependencies.

### How was this tested?

**23 unit tests** in `internal/entity/models/novita_test.go` — all pass:

```
$ go test -vet=off -run "TestNovita|TestSplitNovita" -count=1 ./internal/entity/models/...
ok      ragflow/internal/entity/models  0.020s
```

Coverage:

- `splitNovitaThink` (5 cases: pure text, single block, leading text,
multiple blocks, unclosed tag)
- `novitaThinkExtractor` (6 cases: single-chunk, opening tag span,
closing tag span, byte-level chunking, no tags, lone `<` not as tag
start)
- `ChatWithMessages`: pure text, with `<think>` tags, missing API key,
empty messages, HTTP error
- `ChatStreamlyWithSender`: tag-stripping with spanning deltas, pure
content, sender-required, stream-true-required
- `ListModels` / `CheckConnection` (happy paths)
- All sentinel methods

`go build ./internal/entity/models/...` exits 0 on go 1.25.

**Live integration test** against `api.novita.ai/v3/openai`:

```
=== RUN   TestNovitaLiveSmoke
    [OK] Name() = "novita"
    [OK] CheckConnection
    [OK] ListModels: 102 models (showing first 6) [deepseek/deepseek-v4-pro deepseek/deepseek-v4-flash deepseek/deepseek-v3.2 xiaomimimo/mimo-v2.5-pro moonshotai/kimi-k2.6 zai-org/glm-5.1]
    [OK] Chat (llama-3.3) answer="ok" reason=""
    [OK] Chat (qwen3) answer len=0 head=""
                  ReasonContent len=1657 head="Okay, so I need to figure out what 15% of 80 is. Hmm, percentages can sometimes trip me up, but let ..."
    [OK] Stream content: 0 chunks, 0 chars; reasoning: 600 chunks, 1667 chars
    [OK] Embed/Rerank/Balance/TranscribeAudio/AudioSpeech/OCRFile all return "novita, no such method"

NOVITA LIVE SMOKE PASSED
--- PASS: TestNovitaLiveSmoke (26.18s)
```

What the live run proves on the wire:

- Auth (`Bearer <key>`) accepted by `api.novita.ai`.
- `/v3/openai/models` parser handles the real 102-model response.
- Non-stream chat against `meta-llama/llama-3.3-70b-instruct`: clean
string answer, empty ReasonContent (non-reasoning model, pure-text
path).
- Non-stream chat against `qwen/qwen3-30b-a3b-fp8`: 1657-char reasoning
extracted from `<think>...</think>` and routed to
`ChatResponse.ReasonContent`. Visible answer is 0 chars in this run
because qwen3 spent its 600-token budget entirely on reasoning before
reaching the answer phase — that's the model's behavior, not a driver
bug. The important thing: **no `<think>` tags leaked into the visible
Answer field**.
- Streaming against qwen3: 600 reasoning chunks (1667 chars) emitted via
the sender's 2nd arg across SSE deltas; **no `<think>` tag fragments
leaked into the content channel** despite tag boundaries crossing chunk
boundaries on the wire.
- All 6 sentinel methods return the documented `"no such method"`
strings.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Tracking: #14736
2026-05-13 14:10:50 +08:00
71d327b11c Fix: The text field resizing function in the knowledge block creation… (#14212)
… modal

- Add vertical resizing functionality for the text field

### What problem does this PR solve?

_Fix the issue where the text content of the knowledge base editing
parsing block is too long to scroll._

<img width="701" height="775" alt="image"
src="https://github.com/user-attachments/assets/b258422e-fbc1-466d-abab-062e642c21d5"
/>

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: chenyun <chenyun@chenyundemacbook-pro.local>
2026-05-13 13:57:05 +08:00
45d676bc05 Fix delete graphrag not take effect in UI (#14879)
### What problem does this PR solve?

Fix delete graphrag not take effect in UI

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 13:49:16 +08:00
733d75d6a7 Fix(Go): make Baidu Encode fail loudly on malformed responses (#14721)
### What problem does this PR solve?

The Baidu (Qianfan) `Encode` method silently swallowed malformed
responses. If a `data[]` item from the API was missing a field (`index`,
`embedding`, or unexpected shape), the loop did `continue` instead of
returning an error, leaving `nil` entries in the result slice. Callers
got back partial results with no indication anything went wrong, which
then crashes downstream consumers when they try to use a `nil` vector.

Concrete gaps fixed:

- No count-mismatch check between `data` length and input texts (only
checked for empty)
- No duplicate-index detection (a duplicate would silently overwrite)
- No missing-index final scan
- No empty-embedding rejection
- No per-call context timeout
- `EmbeddingConfig.Dimension` (added in #14735) was not propagated

This PR replaces `map[string]interface{}` parsing with a typed
`baiduEmbeddingResponse` struct, applies the standard four-layer
validation (count → out-of-range → duplicate → empty → final
missing-index scan), adds `context.WithTimeout(nonStreamCallTimeout)`,
and forwards `embeddingConfig.Dimension` as the `dimensions` parameter
(Baidu Qianfan v2 uses an OpenAI-compatible API).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 12:54:00 +08:00
8b6dd6a5c2 fix: guard whitespace-only chunks before embedding (#13938)
## Problem

When parsing DOCX files with many tables, DeepDOC generates chunks
containing only empty HTML table tags, such as:

```html
<table><tr><td></td></tr><tr><td></td></tr><tr><td></td></tr><tr><td></td></tr></table>
```

After the regex cleanup at `task_executor.py:584`, this becomes `" "`
(whitespace only).

The guard at line 585 (`if not c`) only catches empty strings `""`, but
whitespace strings are truthy in Python and pass through. When sent to
Zhipu `embedding-3` API, it rejects them with error 1213:
`未正常接收到prompt参数`.

## Root Cause

```python
c = re.sub(r"</?(table|td|caption|tr|th)( [^<>]{0,12})?>", " ", c)
if not c:       # ← only catches "", not "   " / "\n" / "\t"
    c = "None"
```

Verified with Zhipu `embedding-3`:
| Input | Result |
|---|---|
| `""` | error 1213 |
| `" "` | error 1213 |
| `"\n"` | error 1213 |
| `"None"` | OK |

## Fix

```diff
- if not c:
+ if not c.strip():
      c = "None"
```

## Testing

Reproduced with a 678KB DOCX file (166 tables, 270 chunks). Chunk #89 is
the empty table above. After fix, `"None"` is sent instead and embedding
succeeds.

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-05-13 11:47:50 +08:00
64bd0130d3 Add REST API backward compatibility (#14872)
### What problem does this PR solve?

Add REST API backward compatibility

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 11:44:40 +08:00
5a5e766386 fix(api): authorize owner_ids for list chats and search apps (#14775)
Closes #14768
### What problem does this PR solve?
The `list_chats` and `list_searches` REST API endpoints did not enforce
authorization on the `owner_ids` query parameter. Any authenticated user
could pass arbitrary tenant IDs to `owner_ids` and retrieve chats or
search apps belonging to other tenants they are not a member of.

This PR resolves the issue by:
1. Looking up the current user's authorized tenants via
`TenantService.get_joined_tenants_by_user_id` and rejecting any
`owner_ids` that fall outside that set.
2. When no `owner_ids` are provided, scoping the query to only the
user's authorized tenants instead of returning an unfiltered result.
3. Adding unit tests that verify unauthorized `owner_ids` are rejected
with `OPERATING_ERROR`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-13 09:43:44 +08:00
c34c81e8e6 fix: remove duplicate .wav and .aac in audio supported extensions list (#14791)
What problem does this PR solve?

In rag/app/audio.py, the supported audio extensions list contains
duplicate entries: .wav appears twice (positions 3 and 5) and .aac
appears twice (positions 6 and 14). While this does not affect runtime
behavior, it is redundant and makes the code harder to maintain.

This PR removes the duplicate entries to keep the list clean and
consistent.

Type of change

 - [X]  Bug Fix (non-breaking change which fixes an issue)
2026-05-13 09:42:31 +08:00
5e46457c28 Docs: How to add Bitbucket as data source. (#14846)
### What problem does this PR solve?

Added a guide on integrating Bitbucket as an external data source.

### Type of change

- [x] Documentation Update
2026-05-12 20:48:30 +08:00
ad4717f40a Go: fix model type check when use the model (#14843)
### What problem does this PR solve?
```
RAGFlow(user)> chat with 'glm-ocr@test@zhipu-ai' message 'what is this'
CLI error: expect  model glm-ocr@zhipu-ai is a chat or multimodal model
```
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-12 19:44:01 +08:00
76d5240fb5 Fix #14801 to allow search dataset list when add (#14841)
### What problem does this PR solve?

Fix #14801 to allow search dataset list when add, following on #14825

<img width="2172" height="857" alt="image"
src="https://github.com/user-attachments/assets/65ea7647-56f4-4c16-8437-121b834811f0"
/>


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-12 19:36:23 +08:00
3f41f8cfae Feat: When a Wait Node precedes a Message Node within a Loop Node, the outgoing message is split into two separate messages. (#14839)
### What problem does this PR solve?

Feat: When a Wait Node precedes a Message Node within a Loop Node, the
outgoing message is split into two separate messages.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-05-12 18:48:44 +08:00
127aeac4aa fix: expose gpt-5.5 and gpt-5.4 in OpenAI model list (#14828)
### What problem does this PR solve?

OpenAI model catalogs used in provider selection flows were missing the
latest GPT models (`gpt-5.5` and `gpt-5.4`).
Because model availability is driven by seeded catalog data
(`conf/llm_factories.json` → DB seed → API response), these models were
not selectable in the UI or `/llm/list` responses.

This PR updates and synchronizes the OpenAI catalog definitions across
configuration sources and ensures the new models are correctly exposed
through the API layer and validated in tests.

---

### Type of change

* [x] New Feature (non-breaking change which adds functionality)

---

### Changes Made

* Added `gpt-5.5` and `gpt-5.4` to OpenAI catalog definitions in:

  * `conf/llm_factories.json`
  * `conf/models/openai.json` (chat + vision support)
* Ensured consistency between DB-seeded factory config and provider
model configuration
* Updated test coverage in:

  * `test_llm_list_unit.py`

    * seeded OpenAI catalog entries
* added response-level assertion validating `/llm/list` includes both
new model IDs under OpenAI grouping

---

### Root Cause

OpenAI model listings in selection flows are generated from catalog data
seeded via `conf/llm_factories.json`.
The catalog had not been updated to include the latest GPT models,
resulting in missing availability in UI and API responses.

---

### Testing

* Created isolated test environment:

  * `python -m venv .venv-review`
  * installed `pytest`
* Ran targeted and full test suite:

  * `test_list_app_grouping_availability_and_merge`:  passed
  * Full `test_llm_list_unit.py`:  10 passed

---

### Risks / Limitations

* Adding models to the catalog does not guarantee upstream provider
availability or account entitlement.
* Environments with pre-seeded DB catalogs may require reseed or refresh
to reflect updated configuration.

---

### Notes

* Changes are minimal and scoped strictly to catalog configuration and
related test coverage.
* Ensures `/llm/list` API remains aligned with expected latest OpenAI
model availability.
* Closes #14827
2026-05-12 18:03:47 +08:00
45ee5ca9cd Go: implement provider: Jina (#14838)
### What problem does this PR solve?

This PR completes the Jina provider

**The following functionalities are now supported:**

**Jina:**
- [ ] Chat / Stream Chat (Not available for now: [(Jina chat API
docs)](https://api.jina.ai/docs#/Search%20Foundation%20Models/chat_completions_v1_chat_completions_post))
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [ ] ~~Balance~~

**Verified examples from the CLI:**

```plaintext
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'jina-embeddings-v2-base-en@test@jina' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 768       | 0     |
| 768       | 1     |
+-----------+-------+

RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'jina-reranker-v2-base-multilingual@test@jina' top 3;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0     | 0.74316794      |
| 2     | 0.18713269      |
| 1     | 0.15817434      |
+-------+-----------------+

RAGFlow(user)> list supported models from 'jina' 'test'
+---------------------------------------------+
| model_name                                  |
+---------------------------------------------+
| Jina AI: Jina VLM                           |
| Jina AI: Jina Reranker v3                   |
| Jina AI: Jina Code Embeddings 0.5b          |
| Jina AI: Jina Code Embeddings 1.5b          |
| Jina AI: Jina Embeddings v4                 |
| Jina AI: Jina Reranker M0                   |
| Jina AI: ReaderLM v2                        |
| Jina AI: Jina Clip v2                       |
| Jina AI: Jina Embeddings v3                 |
| Jina AI: Jina Colbert v2                    |
| Jina AI: Reader LM 0.5b                     |
| Jina AI: Reader LM 1.5b                     |
| Jina AI: Jina Reranker v2 Base Multilingual |
| Jina AI: Jina Clip v1                       |
| Jina AI: Jina Reranker v1 Tiny EN           |
| Jina AI: Jina Reranker v1 Turbo EN          |
| Jina AI: Jina Reranker v1 Base EN           |
| Jina AI: Jina Colbert v1 EN                 |
| Jina AI: Jina Embeddings v2 Base ES         |
| Jina AI: Jina Embeddings v2 Base Code       |
| Jina AI: Jina Embeddings v2 Base DE         |
| Jina AI: Jina Embeddings v2 Base ZH         |
| Jina AI: Jina Embeddings v2 Base EN         |
| Jina AI: Jina Embedding B EN v1             |
| Jina AI: Jina Embeddings v5 Text Small      |
| Jina AI: Jina Embeddings v5 Omni Small      |
| Jina AI: Jina Embeddings v5 Omni Nano       |
| Jina AI: Jina Embeddings v5 Text Nano       |
+---------------------------------------------+

RAGFlow(user)> check instance 'test' from 'jina'
SUCCESS
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-12 18:03:05 +08:00
7d3836907a Go: implement Embed (embeddings) in Mistral driver (#14807)
### What problem does this PR solve?

The Mistral Go driver landed in #14805 with chat, list models, and check
connection. `Embed` was left as a stub that returns `"not implemented"`.
This PR fills the gap.

`conf/models/mistral.json` did not list any embedding model out of the
box, so a tenant who wanted to use Mistral end to end (chat +
embeddings) could not run an embedding call. This PR adds
`mistral-embed` to the config and a real `/v1/embeddings`
implementation.

### What this PR includes

- `conf/models/mistral.json`: add `"embedding": "embeddings"` under
`url_suffix` so the driver can build the URL from config (matches the
`URLSuffix.Embedding` field already used by openai, siliconflow,
zhipu-ai), and add a `mistral-embed` entry under `models`
(1024-dimensional vectors, 8192 max input tokens).
- `internal/entity/models/mistral.go`: replace the `Embed` stub with a
real implementation that POSTs to `/v1/embeddings`. Adds local response
types `mistralEmbeddingData` and `mistralEmbeddingResponse`.

No factory change. No interface change.

### How the implementation works

- Validate `apiConfig`, the API key, and the model name. Use the
existing `baseURLForRegion` helper so an unknown region fails fast with
a clear error.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. Same pattern as `ChatWithMessages` and
`ListModels` already use in this file.
- Send all input texts in one request. The Mistral API accepts the
`input` field as an array.
- Parse `data[*].embedding` and copy each slice into a `[]EmbeddingData`
indexed by `data[*].index` so the output order matches the input order
even if the API returns items in a different order.
- An empty input slice returns `[]EmbeddingData{}` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks that every input slot got a vector. If any slot is
still empty, return a clear error so the caller does not silently use a
zero vector.

### Note on stacking

This PR builds on #14805 (the Mistral driver). Until #14805 merges, this
PR's diff on GitHub will include both that PR's commits and this one.
After #14805 lands on `main`, GitHub will auto-reduce this PR to only
the `Embed` changes (one commit, ~111 line diff in `mistral.go` plus 8
lines in `mistral.json`).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the
`go.mod` minimum).
- The full method set on `MistralModel` still matches the `ModelDriver`
interface.
- Pattern parity with the existing OpenAI Embed implementation
(`internal/entity/models/openai.go`).

Closes #14806
Depends on #14805
Tracking: #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-12 17:45:48 +08:00
14332dd75c Go: fix dataset time unit (#14837)
### What problem does this PR solve?

fix dataset time unit

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-12 17:22:16 +08:00
d08bf02d9b Go: add ASR, TTS, OCR command (#14836)
### What problem does this PR solve?

```
RAGFlow(user)> asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method
RAGFlow(user)> stream asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method

RAGFlow(user)> tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method

RAGFlow(user)> stream tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method

RAGFlow(user)> ocr with 'glm-ocr@test@zhipu-ai' file './test.log';
CLI error: zhipu, no such method
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-12 17:17:44 +08:00
9ee481807f GO: implement GET /api/v1/datasets/:dataset_id (#14834)
### What problem does this PR solve?

implement GET /api/v1/datasets/:dataset_id

### Type of change

- [x] Refactoring
2026-05-12 17:16:48 +08:00
4374e07a29 Speed up start time (#14833)
### What problem does this PR solve?

Speed up start time

### Type of change
- [x] Refactoring
2026-05-12 17:00:45 +08:00
eaa2e46b1e Go: implement Embed (embeddings) in Upstage driver (#14819)
### What problem does this PR solve?

The Upstage Go driver landed in #14817 with chat, list models, and check
connection. `Embed` was left as a stub that returns `"not implemented"`.
This PR fills the gap.

Upstage exposes an OpenAI-compatible embeddings endpoint at
`https://api.upstage.ai/v1/solar/embeddings` via the
`solar-embedding-1-large` family (`solar-embedding-1-large-query` for
queries, `solar-embedding-1-large-passage` for passages), and the Python
side has had `UpstageEmbed(OpenAIEmbed)` in `rag/llm/embedding_model.py`
for a long time targeting this same path. The existing
`conf/models/upstage.json` did not list any embedding model out of the
box, so a tenant who wanted to use Upstage end to end could not run an
embedding call. This PR fills the gap.

### What this PR includes

- `conf/models/upstage.json`: add `"embedding": "embeddings"` under
`url_suffix` so the driver can build the URL from config (matches the
`URLSuffix.Embedding` field already used by openai, mistral,
siliconflow, zhipu-ai), and add `solar-embedding-1-large-query` and
`solar-embedding-1-large-passage` entries under `models`.
- `internal/entity/models/upstage.go`: replace the `Embed` stub with a
real implementation that POSTs to `/v1/solar/embeddings`. Adds local
response types `upstageEmbeddingData` and `upstageEmbeddingResponse`.

No factory change. No interface change.

### How the implementation works

- Validate `apiConfig`, the API key, and the model name. Use the
existing `baseURLForRegion` helper so an unknown region fails fast with
a clear error.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. Same pattern as `ChatWithMessages` and
`ListModels` already use in this file.
- Send all input texts in one request. The Upstage API accepts the
`input` field as an array.
- Parse `data[*].embedding` and copy each slice into a `[]EmbeddingData`
indexed by `data[*].index` so the output order matches the input order
even if the API returns items in a different order.
- An empty input slice returns `[]EmbeddingData{}` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks that every input slot got a vector. If any slot is
still empty, return a clear error so the caller does not silently use a
zero vector.

### Note on stacking

This PR builds on #14817 (the Upstage driver). Until #14817 merges, this
PR's diff on GitHub will include both that PR's commits and this one.
After #14817 lands on `main`, GitHub will auto-reduce this PR to only
the `Embed` changes (one commit, ~119 line diff in `upstage.go` plus ~15
lines in `upstage.json`).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the
`go.mod` minimum).
- The full method set on `UpstageModel` still matches the `ModelDriver`
interface.
- Pattern parity with the existing Mistral Embed
(`internal/entity/models/mistral.go`) and OpenAI Embed
(`internal/entity/models/openai.go`) implementations.

Closes #14818
Depends on #14817
Tracking: #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-12 16:11:06 +08:00
ebab3513c4 Go: implement provider: Baichuan (#14832)
### What problem does this PR solve?

This PR completes the Baichuan provider

**The following functionalities are now supported:**

**Baichuan:**
- [x] Chat / Stream Chat
- [x] Embedding
- [ ] ~~Rerank~~
- [ ] ~~Model listing~~
- [ ] ~~Provider connection checking~~
- [ ] ~~Balance~~

**Verified examples from the CLI:**

```plaintext
# Baichuan

RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'Baichuan-Text-Embedding@test@baichuan' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 1024      | 0     |
| 1024      | 1     |
+-----------+-------+

AGFlow(user)> chat with 'Baichuan-M2@test@baichuan' message 'who r u'
Answer: I'm BaiChuan, a helpful AI assistant created by Baichuan-AI. I'm designed to be a knowledgeable, friendly, and reliable assistant for various tasks like answering questions, explaining concepts, writing content, and more. Feel free to ask me anything! 😊
Time: 1.637975

RAGFlow(user)> stream chat with 'Baichuan-M2@test@baichuan' message 'who r u'
Answer: I'm BaiChuan-m2, an AI assistant developed by Baichuan-AI. My purpose is to help you with a wide range of tasks by providing information, answering questions, solving problems, and assisting with creative projects. Think of me as a helpful digital companion! If you have any questions or need assistance, just let me know.😊
Time: 1.692321
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-12 16:10:32 +08:00
2cc206ee85 Test : aggregation edge cases for list and scalar values (#14170)
This PR adds focused unit tests for aggregate_by_field in OceanBase
memory utilities to improve behavior coverage for real-world input
shapes.

- Adds test coverage for list-valued aggregation fields, including
whitespace trimming and skipping invalid list entries.
- Adds test coverage for scalar field values to ensure blank/non-string
values are ignored.
- Confirms aggregation output remains correct and stable for
mixed-quality message payloads.

### Why this helps
It strengthens regression protection for aggregation logic used by
memory retrieval flows, with no production code changes and minimal
review risk.
2026-05-12 15:53:35 +08:00
f85e18afbc Refact: sandbox quickstart.md & add tutorial for code exec component (#14786)
### What problem does this PR solve?

Refact: sandbox quickstart.md && add tutorial for code exec component

### Type of change

- [x] Refactoring


<img width="700" alt="img_v3_0211j_dcff835b-e3bb-4c77-9bc5-3b31a983229g"
src="https://github.com/user-attachments/assets/7842fc0f-639a-458f-b164-bc81a99ce4a5"
/>

---------

Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
2026-05-12 14:42:20 +08:00
e8adc977bd Fix: some agent bug (#14829)
### What problem does this PR solve?

fix: 
update null checks to use 'is None' for better clarity
replace RAGFlowSelect with SelectWithSearch in DebugContent
add max height and overflow to DialogContent in ParameterDialog
 remove unused types from DataOperationsForm

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-12 14:41:49 +08:00
lif
a02b456720 fix(docs): correct broken knowledge graph construction link (#13838)
Fixes #13817

### What problem does this PR solve?

The "knowledge graph construction" link on line 21 of
`docs/guides/dataset/run_retrieval_test.md` points to
`./construct_knowledge_graph.md`, which doesn't exist. The actual file
is at `./advanced/construct_knowledge_graph.md`.

### Type of change

- [x] Documentation Update

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-05-12 14:27:56 +08:00
558ea51a0f Go: implement provider: StepFun (#14815)
### What problem does this PR solve?

Add a Go driver for StepFun (阶跃星辰), one of the unchecked providers on
the umbrella tracking issue #14736.

Until this PR, a tenant who configured `stepfun` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver. Chat, list
models, and check connection all returned `"not implemented"` instead of
reaching the StepFun API.

The Python side has had StepFun registered in `rag/llm/__init__.py` as a
`SupportedLiteLLMProvider` with base URL `https://api.stepfun.com/v1`,
plus `StepFunCV` for vision and `StepFunSeq2txt` for ASR, but no Go
path. StepFun's chat API is OpenAI-compatible, so the implementation
pattern is the same as the merged Moonshot driver (#14433) and OpenAI
driver (#14605).

### What this PR includes

- New file `internal/entity/models/stepfun.go` with a `StepFunModel`
that implements the `ModelDriver` interface.
- `factory.go`: route the `"stepfun"` provider name to
`NewStepFunModel`.
- New `conf/models/stepfun.json` with the public StepFun chat models
(step-2-16k, step-1 family in 8k/32k/128k/256k context lengths,
step-1-flash, and the step-1v / step-1o vision models) and `url_suffix`
entries for `chat` and `models`.

### How the driver works

- StepFun exposes the OpenAI-compatible API at
`https://api.stepfun.com/v1`.
- `ChatWithMessages` and `ChatStreamlyWithSender` post to
`/chat/completions` in the same shape as the merged moonshot,
openrouter, and openai drivers.
- `ListModels` and `CheckConnection` call `/models` to list available
ids and confirm the API key works.
- `Embed` is left as `"not implemented"`. StepFun has not advertised a
public embeddings endpoint in the API reference linked from the umbrella
issue
(`https://platform.stepfun.com/docs/en/api-reference/chat/chat-completion-create`
is the chat endpoint), so any real implementation belongs in a separate
follow-up only after the endpoint is verified.
- `Rerank` and `Balance` return `"no such method"` because StepFun does
not expose either.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- `go build ./internal/entity/models/...` returns exit 0 with no errors
on go 1.25 (the `go.mod` minimum).
- Method set of `StepFunModel` matches the `ModelDriver` interface:
`NewInstance`, `Name`, `ChatWithMessages`, `ChatStreamlyWithSender`,
`Embed`, `Rerank`, `ListModels`, `Balance`, `CheckConnection`.
- Pattern parity with the merged moonshot (#14433), openai (#14605),
openrouter (#14652), and xai (#14550) drivers.

Closes #14814
Tracking: #14736
2026-05-12 13:49:35 +08:00