ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-05-28 11:43:06 +08:00

Author	SHA1	Message	Date
sxxtony	67f7d87dff	Go: implement provider: FuturMix (#15013 ) ### What problem does this PR solve? Add a Go driver for FuturMix (https://futurmix.ai/docs), one of the unchecked providers on the umbrella tracking issue #14736. FuturMix is documented as an "OpenAI-compatible API" aggregator over Claude / GPT / Gemini / DeepSeek (~22 models per their `/models` page). Until this PR, a tenant who configured `futurmix` as a model provider in the Go layer fell through to the default branch of `internal/entity/models/factory.go` and got the dummy driver. --------- Co-authored-by: sxxtony <sxxtony@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 10:51:29 +08:00
Renzo	806414df43	Go: validate Baidu OCR inputs (#15168 ) ### What problem does this PR solve? Closes #15167. The Baidu Go provider advertises OCR support through `paddleocr-vl-0.9b`, but `BaiduModel.OCRFile` dereferenced required inputs before validating them. Calling OCR with a missing API config, API key, or model name could panic instead of returning a normal error. This PR adds explicit input validation for those required values. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 10:51:05 +08:00
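The input validation described in the Baidu OCR entry above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `APIConfig`, its `APIKey` field, `validateOCRInputs`, and the error messages are hypothetical names chosen for the example, since the commit message does not show the real types.

```go
package main

import (
	"errors"
	"fmt"
)

// APIConfig is a hypothetical stand-in for the provider configuration;
// the real RAGFlow type is not shown in the commit message.
type APIConfig struct {
	APIKey *string
}

// validateOCRInputs checks every required value before anything is
// dereferenced, so a missing input yields an error instead of a panic.
func validateOCRInputs(cfg *APIConfig, modelName *string) error {
	if cfg == nil {
		return errors.New("baidu: api config is required")
	}
	if cfg.APIKey == nil || *cfg.APIKey == "" {
		return errors.New("baidu: api key is required")
	}
	if modelName == nil || *modelName == "" {
		return errors.New("baidu: model name is required")
	}
	return nil
}

func main() {
	model := "paddleocr-vl-0.9b"
	key := "example-key"

	// Missing config: an ordinary error, no nil-pointer panic.
	fmt.Println(validateOCRInputs(nil, &model))

	// All required values present: validation passes.
	fmt.Println(validateOCRInputs(&APIConfig{APIKey: &key}, &model))
}
```

Running the checks at the top of the method keeps the rest of the OCR path free of nil guards.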
Jake Armstrong	b961810e79	Go: implement OCR in ZhipuAI driver (#15143 ) ### What problem does this PR solve? Closes #15142. ZhipuAI lists `glm-ocr` as an OCR model, but the Go driver still returned `no such method` from `OCRFile`. This wires the advertised model to Z.AI's documented `layout_parsing` endpoint and returns the `md_results` Markdown output through the existing `OCRFileResponse.Text` field. This PR also adds focused tests for URL input, raw file-content base64 input, and validation errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Test - [x] `go test -vet=off ./internal/entity/models -run 'TestZhipuAIOCRFile'`	2026-05-26 10:50:06 +08:00
qinling0210	af85aa9c7b	Implement Elasticsearch functions in Go (#15160 ) ### What problem does this PR solve? Implement Elasticsearch functions in Go (except for Search) ### Type of change - [x] Refactoring	2026-05-25 19:15:07 +08:00
Haruko386	4783ce9951	fix(Go): rewrite chat, listmodels, embed for Ollama (#15213 ) ### What problem does this PR solve? The `Ollama` implementation added in #14580 is incorrect. This PR rewrites it. Verified from CLI: ``` # Embed RAGFlow(user)> embed text 'what is rag' 'who are you' with 'nomic-embed-text:latest@test12@ollama' dimension 1024; +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 768 \| 0 \| \| 768 \| 1 \| +-----------+-------+ # Chat RAGFlow(user)> think chat with 'qwen3:0.6b@test12@ollama' message 'who r u' Thinking: Okay, the user asked, "Who r u?" I need to respond appropriately. First, I should acknowledge their question. Since I'm an AI, I don't have a physical form, but I can confirm that I'm a large language model. I should keep the response friendly and offer help. Let me make sure I'm not making up any information and that the response is natural. Also, I should check for any typos and ensure clarity. Alright, that should cover it. Answer: I'm an AI language model, and I don't have a physical form. However, I can tell you that I'm designed to assist with questions and tasks. How can I help you today? Time: 2.914285 RAGFlow(user)> stream think chat with 'qwen3:0.6b@test12@ollama' message 'who r u' Thinking: , the user asked, "Who are you?" I need to respond appropriately. Since I'm an AI assistant, I should mention that I don't have a physical form or a mind. I should also clarify that I can help with various tasks like answering questions or providing information. It's important to keep the response friendly and informative while maintaining the correct tone. Answer: don't have a physical form or a mind, but I'm here to help with your questions or tasks! What can I do for you today? 
Time: 1.740047 # ListModels RAGFlow(user)> list supported models from 'ollama' 'test12' +-------------------------+ \| model_name \| +-------------------------+ \| nomic-embed-text:latest \| \| qwen3:0.6b \| +-------------------------+ ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-25 18:55:03 +08:00
Haruko386	69f301b84a	Go: implement embed for Tencent Hunyuan (#15207 ) ### What problem does this PR solve? Implement embed for Tencent Hunyuan Verified from CLI ``` RAGFlow(user)> embed text 'what is rag' 'who are you' with 'hunyuan-embedding@test1@hunyuan' dimension 16; +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 1024 \| 0 \| \| 1024 \| 1 \| +-----------+-------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-25 16:04:17 +08:00
ちー	bb6cfc14e6	feat[go]: implement provider: TokenHub (#15159 ) ### What problem does this PR solve? implement provider TokenHub ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-25 16:02:50 +08:00
Jin Hai	f8c626bbc8	Go: add ingestion server (#15094 ) ### What problem does this PR solve? 1. The Go ingestion server connects to the admin server over a gRPC stream 2. The Go ingestion server is responsible for ingestion tasks ``` RAGFlow(admin)> list ingestors; +-----------------+-----------+----------------------------------+---------------------------+----------+------------+--------------+--------+------------+---------------+ \| address \| cpu_usage \| id \| last_heartbeat \| name \| process_id \| rss_usage \| status \| task_count \| vms_usage \| +-----------------+-----------+----------------------------------+---------------------------+----------+------------+--------------+--------+------------+---------------+ \| 127.0.0.1:58564 \| 0 \| bdd1870eea2646e0aacb8a2cd3307aa2 \| 2026-05-24T18:16:17+08:00 \| ingestor \| 680152 \| 212.72265625 \| active \| 0 \| 2589.12109375 \| +-----------------+-----------+----------------------------------+---------------------------+----------+------------+--------------+--------+------------+---------------+ RAGFlow(admin)> start ingestion 'abc'; +----------------------------------+ \| task_id \| +----------------------------------+ \| e714777639ca4760ab427b5f211e81ad \| +----------------------------------+ RAGFlow(admin)> stop ingestion 'f7bd39d0a724457eb5fdce6d81699776'; +----------------------------------+ \| task_id \| +----------------------------------+ \| f7bd39d0a724457eb5fdce6d81699776 \| +----------------------------------+ RAGFlow(admin)> list tasks; +-----+----------------------------------+-------+------+----------------------------------+---------------------------+------------+------------+ \| ETA \| assign_to \| error \| from \| id \| last_update \| start_time \| status \| +-----+----------------------------------+-------+------+----------------------------------+---------------------------+------------+------------+ \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| eae6431da72a40e796cff3a03008091b \| 
2026-05-24T19:46:03+08:00 \| \| COMPLETED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| 6cccdd174bd049ecb05a774bbb47593f \| 2026-05-24T19:46:03+08:00 \| \| COMPLETED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| ef360d777e57485799adb96b30f2b4b8 \| 2026-05-24T19:46:03+08:00 \| \| CANCELED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| bcc5c5448cb64de48b6b6171c36fb790 \| 2026-05-24T19:46:03+08:00 \| \| CANCELED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| bfc25384c43a443294fe2da979a38ac2 \| 2026-05-24T19:46:03+08:00 \| \| DISPATCHED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| 84960537b85d413b8990a9efd5952d67 \| 2026-05-24T19:46:04+08:00 \| \| DISPATCHED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| 3d223c1b51e24b36861a3bfb2f1d58d4 \| 2026-05-24T19:46:03+08:00 \| \| CANCELED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| e433b0e356b846c89c301621a3c54494 \| 2026-05-24T19:46:03+08:00 \| \| COMPLETED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| 7c93a3880f074ebd8eca14e6b51bb7ef \| 2026-05-24T19:46:03+08:00 \| \| COMPLETED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| df2e4ef51aaf4390bff9a23f2692486e \| 2026-05-24T19:46:04+08:00 \| \| DISPATCHED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| 7377c53010194ef7a83aa206698d66ff \| 2026-05-24T19:46:05+08:00 \| \| DISPATCHED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| df64d1a1f9d348e3a2f174c4d7d69e73 \| 2026-05-24T19:46:05+08:00 \| \| DISPATCHED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| b59834512e2847e1bdf13ace04b8a456 \| 2026-05-24T19:46:06+08:00 \| \| DISPATCHED \| \| 0 \| 17937da188b84f23a5c10bb87588944b \| \| CLI \| 0064bb0ab69344028d1ecfda053826f4 \| 2026-05-24T19:46:03+08:00 \| \| QUEUED \| +-----+----------------------------------+-------+------+----------------------------------+---------------------------+------------+------------+ ``` ### Type of change - [x] New Feature 
(non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-25 14:00:08 +08:00
Haruko386	5d022d83e8	Go: implement provider: PaddleOCR_Local (#15158 ) ### What problem does this PR solve? Go: implement provider: PaddleOCR_Local Verified from CLI ``` RAGFlow(user)> ocr with 'PaddleOCR-VL@test@paddleocr_local' file './internal/test1.jpg' +----------------------+ \| text \| +----------------------+ \| ## Parallel to these \| +----------------------+ ``` ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) - [X] New Feature (non-breaking change which adds functionality) - [X] Refactoring	2026-05-25 12:12:57 +08:00
dripsmvcp	8d8ea71877	Go: implement provider: Tencent Hunyuan (#15092 ) ## Summary - Adds a `Hunyuan` Go driver so the new API server can route Tencent Hunyuan chat instances (registered in `conf/llm_factories.json:3830` as `Tencent Hunyuan`). Follows the same SaaS-driver shape used for Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and LongCat. Closes #15087 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-25 11:04:39 +08:00
bitloi	432e966414	fix(go): support OpenAI audio endpoints (#15104 ) ### What problem does this PR solve? Closes #15102. OpenAI's Go provider config advertises `whisper-1` as ASR and `tts-1` as TTS, but the Go driver returned `openai, no such method` for both audio paths and did not define `url_suffix.asr` / `url_suffix.tts`. This PR: - adds OpenAI audio URL suffixes for `audio/transcriptions` and `audio/speech` - implements non-streaming `TranscribeAudio` using multipart form uploads - implements non-streaming `AudioSpeech` using the OpenAI speech JSON request shape - keeps streaming TTS explicitly unsupported instead of sending binary audio through the text SSE sender - adds focused tests for config coverage, ASR/TTS request shape, required TTS voice validation, and unsupported streaming TTS ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-25 10:25:53 +08:00
Tohka	302f97de50	Go: implement reasoning_chat, TTS, ASR for Groq (#15153 ) ### What problem does this PR solve? Go: implement reasoning_chat, TTS, ASR for Groq Verify from CLI ``` RAGFlow(user)> think chat with 'qwen/qwen3-32b@test@groq' message 'who r u' Thinking: Okay, the user asked, who r u. I need to determine what the user is asking. They may be asking about my identity. I should introduce my name and basic functions. The user might want to know what I can do, so I should list some common use cases, such as answering questions, creating writing, coding, and expressing opinions. The user may be curious about how they can interact with me, so they can be advised to ask any questions or provide instructions. Keep your answers conversational, avoid overly technical terms, keep answers concise, and encourage further interaction. Check if there's any ambiguity in the answer and make sure it's accurate and meets the user's needs. Also consider if there are other aspects the user may be interested in, such as my training data or performance. But since the question is basic, I'll focus on the essentials first and invite the user to ask more. In summary, respond to the user's questions by introducing yourself, your functions, and encouraging further interaction. Answer: Hello! I'm Qwen. I am a large-scale language model developed by Tongyi Lab, designed to assist you in various ways, such as answering questions, creating text, logical reasoning, programming, and more. I aim to provide clear, accurate, and helpful information and support. How can I assist you today? Feel free to ask any questions or give me tasks! 😊 Time: 2.199908 RAGFlow(user)> stream think chat with 'openai/gpt-oss-20b@test@groq' message 'who r u' Thinking: to respond politely. Answer: ’m ChatGPT—an AI language model created by OpenAI. I’m here to answer questions, offer explanations, and help with a wide range of topics. How can I assist you today? 
RAGFlow(user)> tts with 'canopylabs/orpheus-arabic-saudi@test@groq' text 'hello? show yourself' play format 'wav' param '{"voice": "fahad"}' SUCCESS RAGFlow(user)> asr with 'whisper-large-v3-turbo@test@groq' audio './internal/test.wav' param '{"language": "en"}' +----------------------------------------------------------------------------------------------------------------------+ \| text \| +----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired \| +----------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-22 18:02:30 +08:00
Haruko386	3f02ca7ba1	Go: implement embed, rerank, tts for AstraFlow (#15135 ) ### What problem does this PR solve? implement embed, rerank, tts for AstraFlow Verify from CLI ``` # Astraflow RAGFlow(user)> tts with 'IndexTeam/IndexTTS-2@test3@astraflow' text 'hello? show yourself' play format 'wav' param '{"voice": "jack_cheng"}' SUCCESS RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'bge-reranker-v2-m3@test3@astraflow' top 3; +-------+---------------------+ \| index \| relevance_score \| +-------+---------------------+ \| 0 \| 0.9837390184402466 \| \| 2 \| 0.06322699040174484 \| \| 1 \| 0.04663187265396118 \| +-------+---------------------+ RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test3@astraflow' dimension 16 +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 3072 \| 0 \| \| 3072 \| 1 \| +-----------+-------+ # Xinference ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-22 18:02:01 +08:00
ghost	f9ce07ced1	feat(go-models): add Groq provider driver (#15097 ) ### What problem does this PR solve? Closes #15088. Adds Groq support to the Go model-provider layer so Groq instances can be routed through the Go API server with the same OpenAI-compatible chat, streaming, model listing, and connection-check flow used by other SaaS providers. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary - Added a Groq Go model driver. - Added the Groq provider catalog and default OpenAI-compatible API URL. - Registered Groq in the model factory. - Added focused provider tests. ## What changed - Implemented chat completions, SSE streaming, ListModels, and CheckConnection for Groq. - Covered request shape, stream termination, reasoning fallback, model listing, custom base URLs, safe transport setup, and unsupported methods. - Kept the provider catalog scoped to current Groq chat-capable model IDs. - Cleaned up pre-existing Go model package validation blockers so the package can be tested normally with vet enabled. ## Why The existing Python/provider catalog path includes Groq, but the Go model-provider layer did not have a Groq driver, so the Go API server could not instantiate or use Groq as requested in #15088. ## Notes The model package now validates without disabling vet. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-22 15:24:52 +08:00
dripsmvcp	ed04893415	Go: implement provider: TokenPony (#15091 ) ## Summary - Adds a `TokenPony` Go driver so the new API server can route TokenPony chat instances, matching the existing Python `TokenPonyChat` (`rag/llm/chat_model.py:1210`). Follows the same SaaS-driver shape used for Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and LongCat. Closes #15086 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-22 15:21:45 +08:00
Jake Armstrong	b1ef5d365f	Go: implement ASR in OpenRouter driver (#15067 ) ### What problem does this PR solve? Fixes #15066 OpenRouter now exposes an official speech-to-text endpoint at `POST /api/v1/audio/transcriptions`, but the Go model driver still returned `openrouter, no such method` from `TranscribeAudio`. This left OpenRouter ASR models unavailable through the Go API server even though the provider already has OpenRouter audio support for TTS. Related provider-tracking context: #14736 ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-22 15:19:38 +08:00
Jake Armstrong	b2bf9155ed	Go: implement ASR in ZhipuAI driver (#15134 ) ### What problem does this PR solve? This PR implements ASR and TTS support for the ZhipuAI Go driver. The ZhipuAI model config already advertises `glm-asr-2512` as an ASR model, but the Go driver returned `zhipu, no such method` from `TranscribeAudio`. This adds the documented audio transcription endpoint suffix and sends multipart transcription requests with `model`, `stream=false`, and `file` fields. Per maintainer review, this also adds the ZhipuAI TTS endpoint suffix and implements `AudioSpeech` / `AudioSpeechWithSender` for `glm-tts`. Closes #15133 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-22 11:53:18 +08:00
ghost	b2053cc3c7	feat(go-models): add PPIO provider driver (#15099 ) ### What problem does this PR solve? Closes #15089. Adds PPIO support to the Go model-provider layer so PPIO instances can be routed through the Go API server with the same OpenAI-compatible chat, streaming, model listing, and connection-check flow used by other SaaS providers. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary - Added a PPIO Go model driver. - Added the PPIO provider catalog and default OpenAI-compatible API URL. - Registered PPIO in the model factory. - Added focused provider and provider-manager tests. ## What changed - Implemented chat completions, SSE streaming, ListModels, and CheckConnection for PPIO. - Covered request shape, stream termination, reasoning fallback, model listing, custom base URLs, safe transport setup, unsupported methods, and provider config loading. - Kept the provider catalog aligned with the existing RAGFlow PPIO factory model set. - Cleaned up pre-existing Go model package validation blockers so the scoped provider tests can run normally with vet enabled. ## Why The existing Python/provider catalog path includes PPIO, but the Go model-provider layer did not have a PPIO driver, so the Go API server could not instantiate or use PPIO as requested in #15089.	2026-05-22 11:52:18 +08:00
Haruko386	1ece1c81da	Go: implement rerank, asr, tts for TogetherAI (#15107 ) ### What problem does this PR solve? implement rerank, asr, tts for TogetherAI ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-21 20:57:04 +08:00
Haruko386	a725e114f9	Go: implement ASR and TTS for Xinference (#15096 ) ### What problem does this PR solve? implement ASR and TTS for Xinference ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-21 18:28:06 +08:00
tmimmanuel	38a8bc3dab	fix(upstage): extract reasoning delta from streaming responses (#14817 ) ### What problem does this PR solve? `UpstageModel.ChatStreamlyWithSender` (in the driver merged via #14819) only extracted `delta.content` from each SSE event. For the `solar-pro3` reasoning family (and any future Upstage model that follows the same wire shape), the chain-of-thought is streamed in a separate `delta.reasoning` field, and the driver was silently dropping all of it. The non-streaming path already extracts `message.reasoning` into `ChatResponse.ReasonContent` (added earlier in this PR's history), so the same model produced inconsistent behavior between streaming and non-streaming: a tenant calling `solar-pro3` with `reasoning_effort: high` would see the reasoning trace if they used `ChatWithMessages` but not if they used `ChatStreamlyWithSender`. ### Live evidence Probed against `api.upstage.ai/v1/chat/completions` with `solar-pro3` + `reasoning_effort: high` + `stream: true` (8000-token budget so the reasoning has room to finish): ``` $ curl -sN -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \ -X POST https://api.upstage.ai/v1/chat/completions \ -d '{"model":"solar-pro3","messages":[{"role":"user","content":"Compute 15% of 80."}], "max_tokens":8000,"stream":true,"reasoning_effort":"high"}' # across 168 SSE events: # delta keys seen: [content reasoning role] # delta.content total len: 121 chars (the visible answer) # delta.reasoning total len: 159 chars (the chain-of-thought) <- driver dropped this ``` A representative event showing both fields side by side: ```json data: {"choices":[{"index":0,"delta":{"reasoning":"15% = 0.15."}}]} data: {"choices":[{"index":0,"delta":{"content":"15% of 80 is "}}]} ``` The 159 chars of reasoning were arriving on the wire and being thrown away. 
`solar-pro2` was also probed (625 events); it does not emit `delta.reasoning` — its reasoning is inlined into `delta.content` — so this change is a no-op for it and for `solar-mini`. ### What this PR includes - `internal/entity/models/upstage.go`: in the SSE scanner loop, extract `delta.reasoning` before `delta.content` and forward each non-empty chunk via the sender's second arg (the existing `reasonContent` channel the non-stream path already populates). The ordering contract is documented inline: reasoning chunks within a single SSE event are emitted before content chunks, so a UI that pipes both sees the chain-of-thought start before the answer for that token, matching the wire order Upstage emits. - `internal/entity/models/upstage_test.go`: three new tests pinning the new behavior: - `TestUpstageStreamExtractsReasoningDelta` — reasoning + content forwarded to the right sender args; one-of invariant per call - `TestUpstageStreamReasoningChunksArriveBeforeContent` — ordering pinned within a single SSE event that carries both fields - `TestUpstageStreamWithoutReasoningStillWorks` — regression net: non-reasoning models (`solar-mini`, `solar-pro2`) continue to work; the reason callback never fires No interface change. No factory change. No config change. ### How was this tested? ``` $ go test -vet=off -run TestUpstage -count=1 -v ./internal/entity/models/... ... (existing tests 1..9 still pass) ... === RUN TestUpstageStreamExtractsReasoningDelta --- PASS: TestUpstageStreamExtractsReasoningDelta (0.01s) === RUN TestUpstageStreamReasoningChunksArriveBeforeContent --- PASS: TestUpstageStreamReasoningChunksArriveBeforeContent (0.01s) === RUN TestUpstageStreamWithoutReasoningStillWorks --- PASS: TestUpstageStreamWithoutReasoningStillWorks (0.00s) PASS ok ragflow/internal/entity/models 0.034s ``` 12/12 Upstage tests pass on go 1.25. `go build ./internal/entity/models/...` exits 0. 
Live integration test (smoke test not committed) — the patched driver was run directly against `api.upstage.ai/v1` with the same prompt that produced the curl evidence above: ``` === RUN TestUpstageStreamReasoningLiveSmoke [OK] visible content: 50 chunks, 84 chars [OK] reasoning: 39 chunks, 90 chars content head 200: "\$15\\% = \\frac{15}{100}=0.15\$.\n\n\\[\n0.15 \\times 80 = 12.\n\\]\n\n15 % of 80 is 12." reasoning head 200: "We need to compute 15% of 80. That's 0.15 * 80 = 12. So answer is 12. Provide explanation." UPSTAGE STREAM REASONING SMOKE PASSED --- PASS: TestUpstageStreamReasoningLiveSmoke (1.97s) ``` Before this fix, the same call would have produced 0 reasoning chunks. The 90 chars of reasoning that the patched driver now surfaces are the chain-of-thought solar-pro3 emits when reasoning_effort is high. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-21 15:33:21 +08:00
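The streaming change in the Upstage entry above (extract `delta.reasoning` before `delta.content`, and forward each non-empty chunk through its own sender argument) can be illustrated with a small standalone scanner. This is a sketch under assumptions, not the driver's code: `forwardDeltas`, `streamEvent`, and the sender signature `func(content, reasoning string)` are hypothetical, and the `[DONE]` terminator follows the common OpenAI-style convention rather than anything stated in the commit message.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// streamEvent models only the fields of an SSE chunk that this sketch reads.
type streamEvent struct {
	Choices []struct {
		Delta struct {
			Reasoning string `json:"reasoning"`
			Content   string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// forwardDeltas scans SSE lines and calls send once per non-empty field.
// Within one event, reasoning is sent before content, and each call carries
// exactly one of the two arguments.
func forwardDeltas(stream string, send func(content, reasoning string)) error {
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank lines and non-data fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" { // assumed OpenAI-style stream terminator
			break
		}
		var ev streamEvent
		if err := json.Unmarshal([]byte(payload), &ev); err != nil {
			return err
		}
		if len(ev.Choices) == 0 {
			continue
		}
		d := ev.Choices[0].Delta
		if d.Reasoning != "" {
			send("", d.Reasoning)
		}
		if d.Content != "" {
			send(d.Content, "")
		}
	}
	return sc.Err()
}

func main() {
	// The two representative events quoted in the commit message.
	stream := `data: {"choices":[{"index":0,"delta":{"reasoning":"15% = 0.15."}}]}
data: {"choices":[{"index":0,"delta":{"content":"15% of 80 is "}}]}
data: [DONE]
`
	err := forwardDeltas(stream, func(content, reasoning string) {
		fmt.Printf("content=%q reasoning=%q\n", content, reasoning)
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

For a model that never emits `delta.reasoning`, the reasoning branch simply never fires, which matches the no-op behavior the commit message describes for `solar-pro2` and `solar-mini`.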
tmimmanuel	85d0b46d8e	fix(mistral): handle structured content from magistral reasoning models (#14805 ) ### What problem does this PR solve? `MistralModel.ChatWithMessages` (in the driver merged via #14807) assumes that `choices[0].message.content` from `/v1/chat/completions` is always a string and falls through to `return nil, fmt.Errorf("invalid content format")` on anything else. That assumption breaks for the magistral reasoning family (`magistral-small-`, `magistral-medium-`). When the model needs a chain-of-thought to answer, Mistral returns `content` as a structured array of typed parts: ```json "content": [ {"type": "thinking", "thinking": [{"type": "text", "text": "Combined speed is 150 mph. 300 / 150 = 2 hours."}], "closed": true}, {"type": "text", "text": "They will meet after 2 hours."} ] ``` Concretely, this is what the live API returns today (probed against `api.mistral.ai/v1`): ``` $ curl -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \ -X POST https://api.mistral.ai/v1/chat/completions \ -d '{"model":"magistral-medium-latest", "messages":[{"role":"user","content":"two trains 60mph and 90mph, 300mi apart, when do they meet? step by step."}], "max_tokens":1024}' HTTP 200 { "choices":[{"message":{ "role":"assistant", "content":[ {"type":"thinking","thinking":[{"type":"text","text":"Okay, let's see..."}],"closed":true}, {"type":"text","text":"To determine when the two trains meet..."} ]}}] } ``` With the current driver, every call like that returns the generic `"invalid content format"` error. Trivial prompts that happen to fit in a string answer still succeed, so the breakage is non-deterministic from the tenant's POV: same model, same provider, sometimes works, sometimes 500s with no useful error. A secondary issue: `conf/models/mistral.json` does not include any magistral model. The picker hid the broken path, which is why this wasn't caught during #14807's review. 
### What this PR includes - New helper `extractMistralContent(raw interface{}) (answer, reasonContent string, err error)` in `internal/entity/models/mistral.go`, which normalizes both shapes Mistral can return: - `string` → historical path. `Answer = content`, `ReasonContent = ""`. Preserves behavior for every non-reasoning model (`mistral-large-`, `mistral-small-`, `ministral-`, `codestral-`, `pixtral-`, `open-mistral-nemo`). - `[]interface{}` → walk the parts. Concatenate every `{"type":"text", "text":...}` part into `Answer`; concatenate the inner text inside every `{"type":"thinking", "thinking":[...]}` part into `ReasonContent`. - `ChatWithMessages` now calls the helper instead of doing the raw `.(string)` cast. - Unknown part types are skipped, not failed. Mistral has been adding new content variants quickly (audio chunks, citations, etc.); this driver should not 500 every call when a new part type appears. - `conf/models/mistral.json`: add `magistral-medium-latest` and `magistral-small-latest`. Both are visible in `/v1/models` today. No interface change. No factory change. No new dependencies. ### How was this tested? 
Unit tests* — 5 new tests in `internal/entity/models/mistral_test.go` on top of the 27 already shipped via #14807: - `TestMistralChatHandlesStringContent` — regression net for the historical path - `TestMistralChatExtractsReasoningFromStructuredContent` — the fixture body is a trimmed copy of the actual `magistral-medium-latest` response captured above; asserts both `Answer` and `ReasonContent` are populated correctly - `TestMistralChatHandlesStructuredContentWithoutThinking` — `magistral-` with a trivial answer returns a structured shape that has only a `text` part; `ReasonContent` must stay empty - `TestMistralChatIgnoresUnknownContentPartTypes` — `audio_url` and `future_part_type` parts are skipped, `text` parts still flow through - `TestExtractMistralContent` — table-driven unit coverage of the helper for string, empty string, nil, empty array, text-only, thinking+text, unsupported root type ``` $ go test -vet=off -run "TestMistral\|TestExtractMistralContent" -count=1 -v ./internal/entity/models/... === RUN TestMistralChatHandlesStringContent --- PASS: TestMistralChatHandlesStringContent (0.00s) === RUN TestMistralChatExtractsReasoningFromStructuredContent --- PASS: TestMistralChatExtractsReasoningFromStructuredContent (0.00s) === RUN TestMistralChatHandlesStructuredContentWithoutThinking --- PASS: TestMistralChatHandlesStructuredContentWithoutThinking (0.00s) === RUN TestMistralChatIgnoresUnknownContentPartTypes --- PASS: TestMistralChatIgnoresUnknownContentPartTypes (0.00s) === RUN TestExtractMistralContent === RUN TestExtractMistralContent/plain_string === RUN TestExtractMistralContent/empty_string === RUN TestExtractMistralContent/nil === RUN TestExtractMistralContent/empty_array === RUN TestExtractMistralContent/text_only === RUN TestExtractMistralContent/thinking_then_text === RUN TestExtractMistralContent/unknown_root_type --- PASS: TestExtractMistralContent (0.00s) PASS ok ragflow/internal/entity/models 0.046s ``` All 32 Mistral tests pass on go 1.25. 
`go build ./internal/entity/models/...` exits 0. Live integration test* — driver exercised against `api.mistral.ai/v1` with the patched code: ``` === RUN TestMistralMagistralSmoke [OK] "magistral-small-latest" present upstream [OK] "magistral-medium-latest" present upstream [OK trivial] Answer="7" ReasonContent="" [OK reasoning] Answer len=797 head="To determine when the two trains meet, we can follow these steps:\n\n1. **Identify..." ReasonContent len=1069 head="Okay, let's see. There are two trains, one going 60 mph and the other going 90 mph. They're moving towards each other, s..." MAGISTRAL SMOKE PASSED --- PASS: TestMistralMagistralSmoke (18.09s) PASS ok ragflow/internal/entity/models 18.112s ``` What the live run proves on the wire: - `magistral-small-latest` with a trivial prompt still uses the string-content shape; the regression-net path is exercised against the real server, not just the mock. - `magistral-medium-latest` with a reasoning prompt uses the structured-array shape; the new code path extracts a 1069-character reasoning trace into `ChatResponse.ReasonContent` and a 797-character visible answer into `ChatResponse.Answer`. Before this fix, the same call returned `"invalid content format"` and the caller saw nothing. The smoke-test file itself is not committed (live tests live outside the PR diff, same convention used for prior provider PRs). ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-21 15:33:14 +08:00
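The normalization performed by `extractMistralContent` in the Mistral entry above can be approximated as below. This is an independent sketch written from the description in the commit message, not a copy of the helper: the function is renamed `extractContent` to make that clear, and treating a `nil` input as an unsupported root type is an assumption, since the message lists a `nil` test case without stating its expected result.

```go
package main

import (
	"fmt"
	"strings"
)

// extractContent accepts either shape of message.content:
//   - a plain string, returned as the answer with empty reasoning
//   - an array of typed parts, where "text" parts build the answer and the
//     inner text of "thinking" parts builds the reasoning
//
// Unknown part types are skipped rather than treated as errors.
func extractContent(raw interface{}) (answer, reasonContent string, err error) {
	switch v := raw.(type) {
	case string:
		return v, "", nil
	case []interface{}:
		var ans, reason strings.Builder
		for _, p := range v {
			part, ok := p.(map[string]interface{})
			if !ok {
				continue
			}
			switch part["type"] {
			case "text":
				if s, ok := part["text"].(string); ok {
					ans.WriteString(s)
				}
			case "thinking":
				inner, _ := part["thinking"].([]interface{})
				for _, ip := range inner {
					m, ok := ip.(map[string]interface{})
					if !ok {
						continue
					}
					if s, ok := m["text"].(string); ok {
						reason.WriteString(s)
					}
				}
			}
		}
		return ans.String(), reason.String(), nil
	default:
		// Assumption: nil and any other root type are rejected.
		return "", "", fmt.Errorf("invalid content format: %T", raw)
	}
}

func main() {
	// Same shape as the structured response quoted in the commit message,
	// as it would look after decoding JSON into interface{} values.
	structured := []interface{}{
		map[string]interface{}{
			"type": "thinking",
			"thinking": []interface{}{
				map[string]interface{}{"type": "text", "text": "300 / 150 = 2 hours."},
			},
			"closed": true,
		},
		map[string]interface{}{"type": "text", "text": "They will meet after 2 hours."},
	}
	answer, reason, err := extractContent(structured)
	fmt.Printf("answer=%q reason=%q err=%v\n", answer, reason, err)

	answer, reason, err = extractContent("7")
	fmt.Printf("answer=%q reason=%q err=%v\n", answer, reason, err)
}
```

Skipping unknown part types keeps the call working when the provider adds a new content variant, which is the design choice the commit message argues for.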
sapienza yoan	9d37234953	build(go): make `bash build.sh` work on macOS arm64 (Homebrew) (#15009 ) ## Problem The Go server build pipeline (`build.sh` + CMake + CGO bindings) was tested on Ubuntu only. On macOS arm64 with Homebrew it fails in five orthogonal places. None of these require platform-specific code paths — the same source builds on both Linux and Darwin after these fixes. ## Reproduction (before) ``` $ uname -a Darwin … 25.4.0 arm64 $ brew install cmake pcre2 simde $ bash build.sh … error: 'simde/x86/sse4.1.h' file not found error: implicit instantiation of undefined template 'std::basic_istringstream<char>' error: no matching function for call to 'Join' … clang: error: no such file or directory: '/usr/local/lib/libpcre2-8.a' ``` ## Fix (5 small, orthogonal changes) ### 1. `internal/cpp/CMakeLists.txt` — find Homebrew + libpcre2-8 portably - Detect Apple platforms via `if(APPLE)`, call `brew --prefix` once, add `${HOMEBREW_PREFIX}/include` and `${HOMEBREW_PREFIX}/lib`. No effect on Linux. - Replace the literal `libpcre2-8.a` link token (which only the Linux linker finds in `/usr/local/lib` by default) with `find_library(PCRE2_LIB NAMES pcre2-8 REQUIRED)`. Works on `/usr/lib/x86_64-linux-gnu` (Debian/Ubuntu), `/usr/local/lib` (Intel Mac & legacy Linux), `/opt/homebrew/lib` (Apple Silicon). ### 2. `internal/cpp/wordnet_lemmatizer.cpp` + `internal/cpp/rag_analyzer.cpp` — explicit `#include <sstream>` libstdc++ (Linux) pulls `<sstream>` in transitively via `<fstream>`; libc++ (Apple Clang) doesn't, so the existing `std::istringstream` / `std::ostringstream` uses fail to compile on macOS. One-line include in each file. ### 3. `internal/cpp/rag_analyzer.cpp` — `Join` template overload fix `Join(tokens, start, tokens.size(), delim)` at line 146 passes `size_t` to an `int` parameter. 
C++23 strict mode in Apple Clang refuses the implicit narrowing and reports the 4-arg overload as a substitution failure, leaving the call ambiguous between the 3-arg and 4-arg templates. Fix: explicit `static_cast<int>(tokens.size())`. Behaviour identical on libstdc++ — the narrowing was always intentional. ### 4. `internal/binding/rag_analyzer.go` — split darwin CGO LDFLAGS The existing `#cgo darwin LDFLAGS: ... /usr/local/lib/libpcre2-8.a` only matches Intel Macs. Apple Silicon Homebrew installs to `/opt/homebrew`. Split into `darwin,arm64` and `darwin,amd64` build constraints with the right absolute path on each. ### 5. `build.sh` — accept Homebrew path in the pcre2 sanity check The sanity check looked at two Linux paths only and then fell through to `sudo apt -y install libpcre2-dev` on failure. Added `/opt/homebrew/lib/libpcre2-8.a`, and on Darwin failure now exits cleanly with the right `brew install pcre2` hint instead of trying `apt`. ## Verified - `bash build.sh` now completes on macOS arm64 (Apple Silicon, brew 4.x, cmake 4.x, Apple Clang 17, Go 1.25, pcre2 10.x, simde 0.8.x). - Produced binaries: `bin/server_main`, `bin/admin_server`, `bin/ragflow_cli`. - `bin/server_main` boots, connects MySQL, runs migrations, loads the 64 model provider configs cleanly. - Still builds on Linux — the CMake additions are inside an `if(APPLE)` guard, the `find_library` call matches Linux paths too, the build.sh check still tries `apt` when not on Darwin. ## Out of scope The Go server itself currently fails at runtime when not pointing at Elasticsearch (`Failed to initialize doc engine: failed to ping Elasticsearch`), but that's the placeholder Infinity engine documented in `internal/engine/README.md` — unrelated to this build patchset. --- Happy to split this into smaller PRs if you'd prefer (one per file). The five changes are independent.	2026-05-21 15:33:09 +08:00
BitToby	bd4ce39038	Go: implement provider: Perplexity (#15008 ) ## What - Add Perplexity as a chat and embedding provider backed by its OpenAI-compatible `/chat/completions` and `/v1/embeddings` APIs - Register Perplexity in the Go model factory and provider config - Support non-streaming chat, SSE streaming chat, embeddings, model listing, and connection checks Refs #14736 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:33:02 +08:00
dripsmvcp	d5ba14a128	feat(go): implement provider Astraflow (#15062 ) (#15064 ) - Adds an `Astraflow` Go driver so the new API server can route Astraflow (UCloud ModelVerse) chat instances, matching the existing Python `AstraflowChat` (`rag/llm/chat_model.py:1237`). Follows the same SaaS-driver shape used for Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and LongCat. Closes #15062 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:32:56 +08:00
dripsmvcp	5a18df0fd0	Go: implement provider: Avian (#15045 ) Closes #15044. Avian was listed unchecked in the Go-rewrite tracker #14736 and already had an llm_factories.json entry with 4 preconfigured chat models (deepseek-v3.2, kimi-k2.5, glm-5, minimax-m2.5), but the Go API server had no driver to route them. The Python side has supported Avian at rag/llm/chat_model.py:1220 (AvianChat) via the LiteLLM openai/ provider with default base https://api.avian.io/v1. Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:32:49 +08:00
sxxtony	7740ec6c95	Go: implement Embed (embeddings) in Replicate driver (#15073 ) ### What problem does this PR solve? `ReplicateModel.Embed` in `internal/entity/models/replicate.go` was a `"replicate, no such method"` stub. Tracking issue #14736 lists Replicate's embedding surface as not implemented. This PR wires it up against Replicate's documented embedding schema. Until this PR, a tenant who selected a Replicate embedding model got the sentinel error on every embed call. Co-authored-by: sxxtony <sxxtony@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:32:41 +08:00
web-dev0521	2d3a1a4483	feat(go-models): add Azure OpenAI model driver (#15022 ) ## What problem does this PR solve? Closes #15021. The Go model-provider layer had no support for Azure OpenAI. Azure OpenAI is not a drop-in base-URL swap of the OpenAI driver — it differs in authentication, endpoint structure, and how models are listed — so it needs its own `ModelDriver` implementation. ## Type of change - [x] New feature (non-breaking change which adds functionality) Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 11:52:56 +08:00
Renzo	c7ac9b7171	Go: implement provider: GPUStack (chat) (#15024 ) ### What problem does this PR solve? Fixes #15023 GPUStack is listed as unchecked in the Go-rewrite tracker #14736, and `internal/service/llm.go:171` already classifies it as a self-deployed provider alongside Ollama, Xinference, LocalAI, and LM Studio — but `internal/entity/models/` had no `gpustack.go` driver, so the new Go API server could not route GPUStack instances. This PR adds the chat surface for GPUStack so it lines up with the existing self-hosted Go drivers. Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 11:49:18 +08:00
Renzo	394cd5d116	Go: implement Embed in Xinference driver (#14932 ) ## Summary - Replaces the `"no such method"` stub on `XinferenceModel.Embed` (`internal/entity/models/xinference.go`) with a real implementation against Xinference's OpenAI-compatible `/v1/embeddings` endpoint. - Adds the `"embedding": "v1/embeddings"` URL suffix to `conf/models/xinference.json`. - Mirrors the Python `XinferenceEmbed` class in `rag/llm/embedding_model.py:407` for payload shape (OpenAI-compatible `model + input` → `data[].index + data[].embedding`) and tolerates the same no-auth default Xinference deployments use. Authorization is only sent when a non-empty API key is configured, via the existing `setXinferenceAuth` helper. - Reuses the existing `normalizeXinferenceBaseURL` + `baseURLForRegion` helpers so both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1` resolve to the same `/v1/embeddings` target without doubled `/v1`. - Validates response indices — duplicate, missing, or out-of-range `data[*].index` values fail with a clear error rather than silently producing misaligned vectors. - Returns `[]EmbeddingData` in original input order (placed by `Index`) so downstream callers can index positionally without re-sorting. - Forwards `EmbeddingConfig.Dimension` as `dimensions` when `> 0`, matching the OpenAI cluster pattern. Closes #14810 Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 11:47:30 +08:00
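The index validation described in the Xinference Embed commit above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the `embeddingItem` type and the `orderEmbeddings` function are hypothetical names, and only the `index` / `embedding` field names come from the OpenAI-compatible response shape quoted in the commit message.

```go
package main

import "fmt"

// embeddingItem mirrors one entry of an OpenAI-compatible `data[]` array.
// The type name is hypothetical; only the JSON field names come from the log.
type embeddingItem struct {
	Index     int       `json:"index"`
	Embedding []float64 `json:"embedding"`
}

// orderEmbeddings places each vector at the slot named by its index, so the
// result lines up with the original input order. It rejects out-of-range,
// duplicate, and missing indices instead of returning misaligned vectors.
func orderEmbeddings(items []embeddingItem, inputCount int) ([][]float64, error) {
	out := make([][]float64, inputCount)
	seen := make([]bool, inputCount)
	for _, it := range items {
		if it.Index < 0 || it.Index >= inputCount {
			return nil, fmt.Errorf("embedding index %d out of range [0,%d)", it.Index, inputCount)
		}
		if seen[it.Index] {
			return nil, fmt.Errorf("duplicate embedding index %d", it.Index)
		}
		seen[it.Index] = true
		out[it.Index] = it.Embedding
	}
	for i, ok := range seen {
		if !ok {
			return nil, fmt.Errorf("missing embedding for input %d", i)
		}
	}
	return out, nil
}
```

Placing vectors by index rather than appending them in arrival order is what lets callers index the result positionally without re-sorting.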
Renzo	fec0b968e7	Go: implement Rerank in Novita driver (#15014 ) ### What problem does this PR solve? Fixes #15012 The Novita Go driver landed in #14850 and shipped a stub `Rerank` method that returned `"novita, no such method"`, so Novita could not be used as a rerank provider in RAGFlow. This PR fills that gap, in the same way #14895 filled the Embed gap on the same driver. Novita exposes a public rerank endpoint at `POST https://api.novita.ai/openai/v1/rerank` that accepts the Cohere-compatible request shape (`{model, query, documents, top_n}`) with `Authorization: Bearer <api_key>`. `baai/bge-reranker-v2-m3` is documented in Novita's model library with a 1024-token limit.	2026-05-21 10:19:17 +08:00
Renzo	536ed07d27	Go: implement Rerank in Xinference driver (#15032 ) ### What problem does this PR solve? Fixes #14816 The Xinference Go driver landed chat in #14938 and Embed is in review in #14932, but `Rerank` shipped as a stub that returns `"xinference, no such method"`. Tenants who launch a rerank model with `--model-type rerank` on their Xinference instance cannot route it through the Go API server. This PR fills the gap. Xinference exposes an OpenAI-compatible REST API. The rerank endpoint is at `POST <base>/v1/rerank` and accepts the Cohere-shaped body `{model, query, documents, top_n}`, returning `{results: [{index, relevance_score}]}` — the same wire shape used by the merged NVIDIA (#14778), Aliyun (#14676), Gitee (#14656), ZhipuAI (#14608), Novita (#15014), and LocalAI (#14813) Rerank implementations. Documented in [Xinference rerank docs](https://inference.readthedocs.io/en/v1.6.1/models/model_abilities/rerank.html); the [builtin rerank model catalog](https://inference.readthedocs.io/en/stable/models/builtin/rerank/) lists `bge-reranker-base`, `bge-reranker-large`, `bge-reranker-v2-m3`, and others.	2026-05-21 10:14:30 +08:00
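The Cohere-shaped rerank wire format named in the two Rerank commits above (`{model, query, documents, top_n}` in, `{results: [{index, relevance_score}]}` out) could be modelled along these lines. The struct and function names are assumptions made for illustration; only the JSON field names are taken from the commit messages.

```go
package main

import "encoding/json"

// rerankRequest is the Cohere-compatible request body; names are illustrative.
type rerankRequest struct {
	Model     string   `json:"model"`
	Query     string   `json:"query"`
	Documents []string `json:"documents"`
	TopN      int      `json:"top_n,omitempty"` // omitted when zero
}

// rerankResult is one scored document, referring back to its input position.
type rerankResult struct {
	Index          int     `json:"index"`
	RelevanceScore float64 `json:"relevance_score"`
}

type rerankResponse struct {
	Results []rerankResult `json:"results"`
}

// encodeRerankRequest serializes the request body to JSON.
func encodeRerankRequest(model, query string, docs []string, topN int) ([]byte, error) {
	return json.Marshal(rerankRequest{Model: model, Query: query, Documents: docs, TopN: topN})
}

// decodeRerankResponse extracts the scored results from a response body.
func decodeRerankResponse(body []byte) ([]rerankResult, error) {
	var resp rerankResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return resp.Results, nil
}
```

Because several providers share this shape, a single pair of request and response types like this can in principle be reused across drivers; whether the repository does so is not stated in the log.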
sxxtony	63db30f0d9	Go: implement provider: n1n.ai (#15010 ) ### What problem does this PR solve? Add a Go driver for n1n.ai (https://docs.n1n.ai), one of the unchecked providers on the umbrella tracking issue #14736. n1n.ai is an OpenAI-compatible aggregator hosting a 450+ model catalog (GPT, Claude, Gemini, DeepSeek, Kimi, Qwen, embedding + reranker families) under `https://api.n1n.ai/v1`. Until this PR, a tenant who configured `n1n` as a model provider in the Go layer fell through to the default branch of `internal/entity/models/factory.go` and got the dummy driver. --------- Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>	2026-05-21 10:13:15 +08:00
Jack Storment	dc01e0e51c	Go: implement Embed (embeddings) in TogetherAI driver (#15017 ) ### What problem does this PR solve? Fixes #15015 The TogetherAI Go driver in `internal/entity/models/togetherai.go` shipped a stub `Embed` method that returned `"TogetherAI, no such method"`, so TogetherAI could not be used as an embedding provider in RAGFlow. This PR fills that gap. TogetherAI exposes a public OpenAI-compatible embeddings endpoint at `POST https://api.together.ai/v1/embeddings` that accepts the standard `{model, input}` shape with `Authorization: Bearer <api_key>` (confirmed in TogetherAI's official docs: https://docs.together.ai/docs/embeddings-overview). Documented embedding models include `intfloat/multilingual-e5-large-instruct`, `BAAI/bge-large-en-v1.5`, and `BAAI/bge-base-en-v1.5`. ### Changes - `internal/entity/models/togetherai.go`: implement `TogetherAIModel.Embed`. - Validate inputs (api key, model name) and short-circuit on empty texts. - Resolve region with the existing `baseURLForRegion` helper. - Build URL from `URLSuffix.Embedding`. - Send `{model, input}` POST body, add `dimensions` when `embeddingConfig.Dimension > 0` (matches the pattern in #14735). - Bearer auth + JSON content type, mirroring the chat path. - Parse `{data: [{embedding, index}]}` and reorder by `index`, rejecting out-of-range indices, duplicates, and missing entries so the output always lines up with the input. Same shape as the merged Mistral, Upstage, and Novita Embed implementations. - `conf/models/togetherai.json`: - Add `"embedding": "embeddings"` to `url_suffix`. - Add default embedding model entries for `intfloat/multilingual-e5-large-instruct`, `BAAI/bge-large-en-v1.5`, and `BAAI/bge-base-en-v1.5`. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-20 20:48:44 +08:00
qinling0210	dbef3e361f	Update chunk/metadata cli (#15055 ) ### What problem does this PR solve? Update chunk/metadata cli ### Type of change - [ ] Refactoring	2026-05-20 20:32:06 +08:00
Haruko386	4a91ca5349	Go: implement provider: MinerU_Local (#15051 ) ### What problem does this PR solve? 1. Add model types when add model --- ``` RAGFlow(user)> add model 'pipeline' to provider 'mineru_local' instance 'test' with tokens 131072 doc_parse; SUCCESS ``` 2. implement provider: MinerU_Local --- Verified from CLI ``` RAGFlow(user)> parse with 'pipeline@test@mineru_local' file './internal/test.pdf' +--------------------------------------+ \| task_id \| +--------------------------------------+ \| c7260e31-b6e2-4b36-955d-e9c60510c669 \| +--------------------------------------+ RAGFlow(user)> show 'test@mineru_local' task 'c7260e31-b6e2-4b36-955d-e9c60510c669' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ \| content \| index \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ \| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Bingxin Ke Anton Obukhov Shengyu Huang Nando Metzger Rodrigo Caye Daudt Konrad Schindler Photogrammetry and Remote Sensing, ETH Zurich ¨ ![](images/ae256101419715b544d13722... \| 1 \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-20 19:21:57 +08:00
bitloi	d69518ea42	fix(go): guard custom base URL driver creation (#15030 ) ### What problem does this PR solve? Closes #15029. Some custom `base_url` paths in `ModelProviderService` call `NewInstance(newURL)` and then immediately invoke methods on the returned driver. Several real Go model drivers still return `nil` from `NewInstance`, so those paths can panic instead of returning a normal error. This PR: - centralizes custom base URL driver creation in `model_service.go` - skips request-local driver creation when `base_url` is blank or whitespace - preserves the existing region key behavior when building the request-local base URL map - returns a clear error when the provider driver is missing or `NewInstance` returns `nil` - routes list/check/task and active model paths through the guarded helper - adds focused unit coverage for empty-region preservation, regional base URLs, blank base URLs, nil drivers, and nil `NewInstance` results ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Test plan - [x] `git diff --check upstream/main...HEAD` - [x] `/root/go/bin/gofmt -w internal/service/model_service.go internal/service/model_service_test.go` - [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go test ./internal/service -run TestNewModelDriverForBaseURL -count=1 -vet=off` - [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go build ./internal/service/... ./internal/entity/models/...` Note: the same targeted `go test` command without `-vet=off` is currently blocked by an existing unrelated vet finding in `internal/service/llm.go:355` (`non-constant format string in call to fmt.Errorf`).	2026-05-20 14:58:20 +08:00
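The guard described in the commit above (skip blank `base_url`, return an error for a missing driver or a `nil` `NewInstance` result) might look something like the sketch below. The `modelDriver` interface, the `stubDriver` type, and the `newDriverForBaseURL` signature are all hypothetical stand-ins; the real `ModelDriver` interface and helper in `model_service.go` are not shown in this log. Returning the shared driver unchanged for a blank URL is likewise an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// modelDriver is a hypothetical, reduced stand-in for the real driver interface.
type modelDriver interface {
	NewInstance(baseURL map[string]string) modelDriver
}

// stubDriver is an example implementation used only to exercise the guard.
type stubDriver struct {
	urls      map[string]string
	returnNil bool
}

func (s *stubDriver) NewInstance(baseURL map[string]string) modelDriver {
	if s.returnNil {
		return nil // mimics drivers that do not implement NewInstance yet
	}
	return &stubDriver{urls: baseURL}
}

// newDriverForBaseURL returns a request-local driver for a custom base URL,
// turning every nil case into an ordinary error instead of a later panic.
func newDriverForBaseURL(provider string, base modelDriver, region, baseURL string) (modelDriver, error) {
	if base == nil {
		return nil, errors.New("model driver not found for provider " + provider)
	}
	trimmed := strings.TrimSpace(baseURL)
	if trimmed == "" {
		return base, nil // assumption: blank URL means "use the shared driver"
	}
	// The region key is kept as-is, including the empty-string region.
	inst := base.NewInstance(map[string]string{region: trimmed})
	if inst == nil {
		return nil, fmt.Errorf("provider %s does not support a custom base URL", provider)
	}
	return inst, nil
}
```

Note that a plain `== nil` check does not catch an interface value holding a typed nil pointer, so a production helper may need a stricter check.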
Haruko386	2836a934b5	Go: implement provider: 302.AI and JieKou-AI (#15034 ) ### What problem does this PR solve? This PR implements providers 302.AI and JieKouAI. The following functionalities are now supported: 302.ai - [x] chat / think chat / stream chat / stream think chat - [x] Embedding - [x] ASR - [x] ListModels - [x] Provider connection checking - [x] Balance - [x] Rerank - [x] OCR - [x] Doc Parse - [x] Show task - [ ] ~~List Tasks!~~ - [ ] TTS JieKouAI - [x] chat / think chat / stream chat / stream think chat - [x] Embedding - [x] Rerank - [x] ListModels Verified examples from the CLI: ```plaintext # jiekouAI RAGFlow(user)> stream think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi' Thinking: Let me think about how to respond to this simple greeting. The user just said "Hi", which is a basic and friendly way to start a conversation. I should respond in a similarly warm and welcoming manner.First, I need to acknowledge their greeting and reciprocate with enthusiasm. Something like "Hello!" or "Hi there!" would work well to create a positive atmosphere right from the start.Next, I should make it clear that I'm ready to help. Since they haven't asked anything specific yet, I'll keep it open-ended and inviting. Perhaps offering assistance with a question or task would encourage them to engage further.I should also maintain a professional yet approachable tone. Being an AI assistant, I want to convey that I'm knowledgeable and capable, but also friendly and easy to talk to.Let me put this all together into a concise response. I'll start with a cheerful greeting, express my readiness to help, and finish with an open invitation for them to share what's on their mind. This should create a welcoming environment for whatever they want to discuss next. Answer: ! I'm Claude, an AI assistant created by Anthropic. I'm here to help you with information, answer questions, or assist you with tasks. What can I help you with today? 
RAGFlow(user)> think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi' Thinking: Let me consider how to respond to this greeting. The user initiated with a simple "Hi," so a friendly and open response would be most appropriate to encourage further conversation. I should maintain a welcoming tone while offering assistance. The response should accomplish a few key things: return the greeting warmly, show openness to conversation, and offer specific ways I can help. This approach demonstrates both approachability and usefulness. I'll start with a greeting in return, then express my availability to help, and finish by suggesting some areas where I can provide assistance. This creates a natural flow from acknowledgment to support. It's important to keep the response concise but inviting. Since the user hasn't specified their needs yet, I'll present a few broad categories of assistance to spark their thinking about what they might want to discuss or ask about. The response should end with an encouraging note that prompts them to share what's on their mind, keeping the conversational ball in their court while making it clear I'm ready to engage with whatever they need. Answer: Hello! How can I help you today? Whether you have questions, need information, or just want to chat, I'm here to assist. 
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test@jiekouai' dimension 16 +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 3072 \| 0 \| \| 3072 \| 1 \| +-----------+-------+ RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'baai/bge-reranker-v2-m3@test@jiekouai' top 3 +-------+-----------------+ \| index \| relevance_score \| +-------+-----------------+ \| 0 \| 0.9830034 \| \| 2 \| 0.06399203 \| \| 1 \| 0.04665664 \| +-------+-----------------+ # 302.ai RAGFlow(user)> think chat with 'kimi-k2.6@test@302.ai' message 'who r u' Thinking: The user is asking "who r u" which is a casual way of asking "who are you." I need to identify myself as an AI assistant created by Moonshot AI. I should be friendly, concise, and helpful. Key points to include: - I am Kimi, an AI assistant made by Moonshot AI - I can help with various tasks like answering questions, writing, analysis, coding, etc. - Keep it casual but informative since the user used "r u" (text speak) I should not: - Pretend to be human - Claim to have personal experiences or emotions - Be overly formal or robotic Simple, friendly response is best. Answer: I'm Kimi, an AI assistant made by Moonshot AI. I can help you with answering questions, writing, coding, analysis, or just chatting. What can I do for you? Time: 17.687750 RAGFlow(user)> stream think chat with 'kimi-k2.6@test@302.ai' message 'who r u' Thinking: user asked "who r u" which is a casual way of asking "who are you." I should introduce myself as Kimi, an AI assistant developed by Moonshot AI. I need to be friendly, concise, and accurate. I should mention my capabilities briefly and keep the tone helpful. 
Since the user used casual text speak ("r u"), I can match that energy with a friendly but still informative tone.Key points:- I'm Kimi, an AI assistant made by Moonshot AI- I can help with various tasks like answering questions, writing, coding, analysis, etc.- Keep it brief but warm- Don't claim to be human- Don't over-explainDraft:"I'm Kimi, an AI assistant created by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other tasks. What can I do for you?"This is good - direct, accurate, and inviting. Answer: Kimi, an AI assistant made by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other stuff. What can I do for you? Time: 14.912576 RAGFlow(user)> asr with 'whisper-v3-turbo@test@302.ai' audio './internal/test.wav' param '' +---------------------------------------------------------------------------------------------------------------------+ \| text \| +---------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired \| +---------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> ocr with 'mistral-ocr-latest@test@302.ai' file './internal/test.pdf' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| # Repurposing Diffusion-Based Image 
Generators for Monocular Depth Estimation Bingxin Ke Nando Metzger Anton Obukhov Rodrigo Caye Daudt Shengyu Huang Konrad Schindler Photogrammetry and Remote Sensing, ETH Zürich ![img-0.jpeg](img-0.jpeg) Figur... \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> parse with 'vlm@test@302.ai' file 'https://arxiv.org/pdf/2505.09358' +--------------------------------------+ \| task_id \| +--------------------------------------+ \| 6de6eae6-c122-4b67-91e8-b061a0b8c087 \| +--------------------------------------+ RAGFlow(user)> show 'test@302.ai' task '6de6eae6-c122-4b67-91e8-b061a0b8c087' +----------------------------------------------------------------------------+-------+ \| content \| index \| +----------------------------------------------------------------------------+-------+ \| https://file.302.ai/gpt/imgs/20260519/b340fdff4774699c287fe4ee4658b317.zip \| 0 \| +----------------------------------------------------------------------------+-------+ RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'jina-embeddings-v3@test@302.ai' dimension 16 +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 1024 \| 0 \| \| 1024 \| 1 \| +-----------+-------+ RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'jina-reranker-v2-base-multilingual@test@302.ai' top 3; +-------+-----------------+ \| index \| relevance_score \| +-------+-----------------+ \| 0 \| 0.74167407 \| \| 2 \| 0.18832397 \| \| 1 \| 0.15713684 \| +-------+-----------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-20 14:10:15 +08:00
qinling0210	77834870fc	Refactor engine functions in Go (#14981 ) ### What problem does this PR solve? Refactor engine functions in Go ### Type of change - [x] Refactoring	2026-05-19 17:34:59 +08:00
tmimmanuel	243d9ed281	Add TogetherAI chat provider (#14957 ) ## What - Add TogetherAI as a chat provider backed by its OpenAI-compatible `/v1/chat/completions` API - Register TogetherAI in the Go model factory and provider config - Support non-streaming chat, SSE streaming chat, model listing, and connection checks ## Notes - Uses the current TogetherAI OpenAI-compatible base URL `https://api.together.ai/v1` - Forwards documented chat parameters from `ChatConfig`: `max_tokens`, `temperature`, `top_p`, `stop`, and GPT-OSS `reasoning_effort` - Routes Together reasoning traces from `reasoning` / `reasoning_content` into `ReasonContent` ## Tests - `go test -vet=off -run TestTogetherAI -count=1 ./internal/entity/models` - `go test -vet=off -count=1 ./internal/entity/models` Refs #14736	2026-05-19 15:10:42 +08:00
tmimmanuel	09a06f1b00	Go: implement provider: Xinference (#14938 ) ### What problem does this PR solve? Closes #14808. Adds a Go model driver for Xinference so self-hosted Xinference chat models can be used through the Go provider layer instead of falling through to the dummy driver. Xinference exposes an OpenAI-compatible API under `/v1`; the driver accepts either a root endpoint such as `http://127.0.0.1:9997` or an OpenAI-compatible endpoint such as `http://127.0.0.1:9997/v1` and normalizes it before calling chat or model-listing routes. ### What is changed? - Add `internal/entity/models/xinference.go` implementing `ModelDriver` for Xinference chat. - Route provider name `xinference` in `internal/entity/models/factory.go`. - Add `conf/models/xinference.json` as a local provider config. - Add focused unit tests in `internal/entity/models/xinference_test.go`. Initial method coverage: - `ChatWithMessages`: POST `/v1/chat/completions`. - `ChatStreamlyWithSender`: SSE streaming from `/v1/chat/completions`. - `ListModels`: GET `/v1/models`. - `CheckConnection`: lightweight `ListModels` probe. - Optional auth: send `Authorization: Bearer <api_key>` only when a non-empty key is configured, matching Xinference no-auth and auth-enabled deployments. - `Balance`, `Embed`, `Rerank`, ASR, TTS, and OCR return `no such method` for this initial chat-provider PR. 
### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Bug Fix (non-breaking change which fixes an issue) ### Tests - `go test -vet=off -run TestXinference -count=1 ./internal/entity/models/...` - `go test -vet=off -count=1 ./internal/entity/models/...` ### References - Xinference docs: https://inference.readthedocs.io/zh-cn/latest/index.html - OpenAI-compatible chat usage: https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html - API key auth: https://inference.readthedocs.io/zh-cn/latest/user_guide/auth_system.html --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-19 15:10:13 +08:00
OrbisAI Security	f17a66d4f0	fix: the opencc c library uses fgets() to read dicti... in text.c (#13970 ) ## Summary Fix critical severity security issue in `internal/cpp/opencc/dictionary/text.c`. ## Vulnerability \| Field \| Value \| \|-------\|-------\| \| ID \| V-001 \| \| Severity \| CRITICAL \| \| Scanner \| multi_agent_ai \| \| Rule \| `V-001` \| \| File \| `internal/cpp/opencc/dictionary/text.c:107` \| Description: The OpenCC C library uses fgets() to read dictionary and configuration files without proper bounds validation on subsequent buffer operations. While fgets() itself is bounds-checked, the sprintf() call at config_reader.c:174 constructs file paths by concatenating home_path and filename without verifying the result fits in pkg_filename buffer. An attacker providing malformed OpenCC configuration files with excessively long path components can overflow the fixed-size buffer, overwriting adjacent memory including return addresses and function pointers. ## Changes - `internal/cpp/opencc/config_reader.c` - `internal/cpp/opencc/dictionary/text.c` - `internal/cpp/opencc/utils.c` ## Verification - [x] Build passes - [x] Scanner re-scan confirms fix - [x] LLM code review passed --- Automated security fix by [OrbisAI Security](https://orbisappsec.com) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Improved error detection and handling for malformed configuration and dictionary entries during file parsing. * Enhanced memory cleanup in error recovery paths to prevent potential issues. * Strengthened robustness of string operations and buffer handling throughout the library. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-15.us-west-2.compute.internal>	2026-05-19 13:55:33 +08:00
tmimmanuel	4c9529ef36	Add Replicate chat provider (#14958 ) ## What - Add Replicate as a chat provider backed by the documented predictions API - Register Replicate in the Go model factory and provider config - Support non-streaming chat through sync predictions, polling fallback, streaming through `urls.stream`, model listing, and connection checks ## Notes - Uses `POST /v1/predictions` with Replicate model identifiers in `version`, which supports official and community model identifiers - Maps RAGFlow messages into Replicate prompt-shaped inputs (`prompt`, optional `system_prompt`) and forwards common documented LLM inputs: `max_new_tokens`, `temperature`, `top_p` - Preserves whitespace in SSE output chunks and emits RAGFlow `[DONE]` at stream completion ## Tests - `go test -vet=off -run TestReplicate -count=1 ./internal/entity/models` - `go test -vet=off -count=1 ./internal/entity/models` Refs #14736	2026-05-19 11:10:36 +08:00
Haruko386	db9e782747	Go: implement provider: MinerU (#14990 ) ### What problem does this PR solve? Implement MinerU Provider The following functionalities are now supported: MinerU ---- - [x] Parse file - [x] Show task - [ ] ~~List tasks~~ Verified examples from the CLI: ```plaintext RAGFlow(user)> parse with 'vlm@test@mineru' file 'https://arxiv.org/pdf/2505.09358' +--------------------------------------+ \| task_id \| +--------------------------------------+ \| 142ac8ea-d9d0-4a68-a2d1-d3af67635dc9 \| +--------------------------------------+ RAGFlow(user)> show 'test@mineru' task '142ac8ea-d9d0-4a68-a2d1-d3af67635dc9' +--------------------------------------------+-------+ \| content \| index \| +--------------------------------------------+-------+ \| Task is running... Progress: 17 / 18 pages \| 0 \| +--------------------------------------------+-------+ RAGFlow(user)> show 'test@mineru' task '142ac8ea-d9d0-4a68-a2d1-d3af67635dc9' +--------------------------------------------------------------------------------------------+-------+ \| content \| index \| +--------------------------------------------------------------------------------------------+-------+ \| https://cdn-mineru.openxlab.org.cn/pdf/2026-05-18/142ac8ea-d9d0-4a68-a2d1-d3af67635dc9.zip \| 0 \| +--------------------------------------------------------------------------------------------+-------+ ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-19 10:49:33 +08:00
buua436	41a9fc0030	Go: add dataset graph api (#14984 ) ### What problem does this PR solve? add dataset graph api ### Type of change - [x] Refactoring	2026-05-18 20:02:53 +08:00
buua436	d7fb4bdb4e	Go: align document list response (#14982 ) ### What problem does this PR solve? align document list response ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 20:00:11 +08:00
buua436	3290257014	Go: fix forgetting policy validation and fix memory update diff checks (#14976 ) ### What problem does this PR solve? fix forgetting policy validation and fix memory update diff checks ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 19:21:47 +08:00
Jake Armstrong	93d3deb5e4	Fix admin CLI system variable commands (#14956 ) ## What Fixes #12409. Implements admin CLI support for: - `list vars;` - `show var <name-or-prefix>;` - `set var <name> <value>;` ## Changes - Wire Go CLI variable commands to the admin API. - Support integer and quoted string values in `SET VAR`. - Return variable rows as `data_type`, `name`, `setting_type`, and `value`. - Add exact-name lookup with prefix fallback for `SHOW VAR`. - Validate values by stored data type: `string`, `integer`, `bool`, and `json`. - Keep the legacy Python admin CLI/server behavior aligned. - Update admin CLI docs and add focused tests. ## Verification - `go test -count=1 ./internal/cli` - `python3.12 -m py_compile admin/server/services.py admin/server/routes.py api/db/services/system_settings_service.py admin/client/parser.py admin/client/ragflow_client.py` - Python admin CLI parser smoke test for `SET VAR`, quoted values, `SHOW VAR`, and `LIST VARS`. - Attempted `./run_go_tests.sh`; local environment is missing native tokenizer/linker artifacts: - `internal/cpp/cmake-build-release/librag_tokenizer_c_api.a` - `-lstdc++` Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 19:08:45 +08:00
Haruko386	92145dc764	Go: implement provider: DeepInfra, XunFei (#14978 ) ### What problem does this PR solve? This PR implements providers Mistral, DeepInfra, and XunFei. The following functionalities are now supported: DeepInfra - [x] chat / think chat / stream chat / stream think chat - [x] Embedding - [x] ASR - [x] TTS - [x] ListModels - [x] Provider connection checking - [x] Balance - [ ] ~~Rerank~~ XunFei - [x] chat / think chat / stream chat / stream think chat ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-18 16:57:42 +08:00
buua436	b8ac997606	Go: add restful api route aliases (#14977 ) ### What problem does this PR solve? add restful api route aliases ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-18 16:57:14 +08:00

235 Commits