ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-08 08:07:21 +08:00

Author	SHA1	Message	Date
Joseff	2ad854c586	Go: implement Rerank in Aliyun driver (#14676 ) ### What problem does this PR solve? The Aliyun Go driver has a stub `Rerank` method that always returns `"Aliyun, Rerank not implemented"`. DashScope exposes an OpenAI-compatible rerank endpoint (`compatible-mode/v1/rerank`) and hosts dedicated bilingual rerankers (`gte-rerank-v2`, `gte-rerank`) that are a natural pairing with the embedding models already in `aliyun.json`. Without this, Aliyun users cannot use reranking within RAGFlow. Closes #14675 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-08 20:21:04 +08:00
Haruko386	94f82acd03	Fix(Go): prevent global state pollution in local model connection check (#14669 ) ### What problem does this PR solve? 1. Fix Global State Pollution in Local Providers (Critical Bug): - Resolved a severe concurrency and architecture issue in `model_service.go`. Previously, `ListSupportedModels` would permanently overwrite the global provider singleton with a localized URL instance (`driver.NewInstance`). This caused cross-request contamination in multi-tenant environments. - Fixed `CheckProviderConnection` for local models (LM Studio, vLLM, Ollama). It now properly creates a localized driver copy and injects the `base_url` before testing the connection, entirely eliminating the false-positive `missing base URL` error without polluting the global state. 2. Implement `VolcEngine` Embeddings: - Fully implemented the `Encode` method for the `volcengine` provider, enabling text embedding capabilities for VolcEngine models. 3. Enhance Region Validation in `SiliconFlow`: - Added a strict empty string check (`*apiConfig.Region != ""`) alongside the existing `nil` check when parsing regions. This ensures that if an empty string is passed, the system safely falls back to the `"default"` region, preventing malformed URL requests and `unsupported protocol scheme` errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-08 15:54:27 +08:00
Panda Dev	a82ae4a991	Go: implement Encode (embeddings) in Aliyun driver (#14647 ) ### What problem does this PR solve? The Aliyun Go driver shipped with a stub `Encode` method that returned `no such method`, even though `conf/models/aliyun.json` already wires the OpenAI-compatible embeddings URL suffix at `compatible-mode/v1/embeddings`. The same config also did not list any embedding models, so the picker had nothing to select. So an Aliyun tenant who wanted to use Tongyi text-embedding-v3 or v4 in the Go layer could not, even though the upstream endpoint is public and uses the standard `POST /v1/embeddings` shape that the SiliconFlow and ZhipuAI drivers already support. This PR fills the gap. ### What this PR includes - `conf/models/aliyun.json`: add `text-embedding-v4` and `text-embedding-v3` to the `models` array. - `internal/entity/models/aliyun.go`: replace the `Encode` stub with a real implementation. Adds a small local response type that matches the OpenAI-compatible shape. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, validate the model name, resolve the region with a default fallback, build the URL from `BaseURL[region] + URLSuffix.Embedding`. - Send all input texts in one request as the `input` array, the same OpenAI-compatible shape the SiliconFlow `Encode` uses. - Parse `data[].embedding` and copy each slice into a `[][]float64` indexed by `data[].index` so the output order matches the input order even if the API returns items in a different order. - Handle both `float64` and `float32` element types. - Empty input returns `[][]float64{}` with no HTTP call. - Non-200 responses propagate the upstream status line and body. - A final pass checks every input slot got a vector and returns a clear error if any slot is still nil. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? 
- `go build ./internal/entity/models/...` in a clean go 1.25 image returns exit 0. - The full method set on `AliyunModel` still matches the `ModelDriver` interface. - Pattern parity with the existing SiliconFlow Encode implementation. Closes #14646 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-08 13:58:25 +08:00
Haruko386	d13a240dc0	Go: implement remaining interface for OpenRouter (#14657 ) ### What problem does this PR solve? 1. Implement the `rerank`, `embedding`, `balance`, and `checkConnet` methods for `OpenRouter`. 2. Delete the `chat` method in `internal/entity/models/volcengine.go`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-08 13:56:45 +08:00
Panda Dev	d8d49df35e	Go: implement Rerank in Gitee AI driver (#14656 ) ### What problem does this PR solve? The Gitee AI Go driver shipped with a stub `Rerank` method that returned `Rerank not implemented`, even though `conf/models/gitee.json` already wires the rerank URL suffix at `"rerank": "rerank"`. The same config did not list any rerank model, so the picker had nothing to select. So a Gitee tenant could not use BAAI/bge-reranker-v2-m3 as a reranker through the Go layer today, even though the infrastructure was one config entry and one method body away. ### What this PR includes - `conf/models/gitee.json`: add `BAAI/bge-reranker-v2-m3` to the `models` array. - `internal/entity/models/gitee.go`: replace the `Rerank` stub with a real implementation. Adds two small local types that match the OpenAI-compatible `/rerank` shape already used by the SiliconFlow and ZhipuAI drivers. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, validate the model name, resolve the region with a default fallback, build the URL from `BaseURL[region] + URLSuffix.Rerank`. - Use a per-call `context.WithTimeout(30s)` and `http.NewRequestWithContext`, matching the pattern the recently merged Aliyun Encode and the OpenAI driver already use. - Send `{model, query, documents, top_n, return_documents:false}` in the body. - Parse `results[].relevance_score` and copy each score into the output slice indexed by `results[].index`, so the output order matches the input order even if the API returns items in a different order. - Empty input returns `[]float64{}` with no HTTP call. - An out-of-range result index returns a clear error rather than silently skipping the entry. - Non-200 responses propagate the upstream status line and body. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? 
- `go build ./internal/entity/models/...` in a clean go 1.25 image returns exit 0. - The full method set on `GiteeModel` still matches the `ModelDriver` interface. - Pattern parity with the existing SiliconFlow Rerank and the recently merged ZhipuAI Rerank (#14608). Closes #14655	2026-05-08 13:08:22 +08:00
Panda Dev	c7ddc8c039	fix(go): implement ListModels and CheckConnection in NVIDIA driver (#14636 ) ### What problem does this PR solve? The NVIDIA Go driver added in #14623 has a real chat path, but `ListModels` and `CheckConnection` are stubs that always return `no such method`. So: - The model picker cannot auto-populate available NVIDIA NIM model ids. Users have to type the full id by hand (e.g. `abacusai/dracarys-llama-3.1-70b-instruct`). - The "Check connection" button always fails for NVIDIA, even when the base URL is reachable and the API key is accepted. NVIDIA NIM is OpenAI-compatible. `/v1/models` works with the same Bearer token used for chat. The `conf/models/nvidia.json` file already wires the `models` url_suffix, so no config change is needed. ### What this PR includes - `internal/entity/models/nvidia.go`: - `ListModels` now calls `GET ${BaseURL}/${URLSuffix.Models}`, parses `response.data[*].id`, and returns the list. Same shape as the moonshot, xai, and openai drivers. - `CheckConnection` now calls `ListModels` and returns its error. Same pattern xai, moonshot, deepseek, aliyun, and gitee already use. `Balance`, `Encode`, and `Rerank` are still stubs in this PR and can be added in follow-ups. No JSON change. No factory change. No interface change. ### How the implementation works - Region resolution falls back to `default` when the supplied region is unknown, so a stray region value does not break a valid request. - The Authorization header is only set when `apiConfig` and `ApiKey` are non-nil and non-empty. This avoids a nil-pointer dereference and lets self-hosted NIM deployments without a key still work. - Non-200 responses propagate the upstream status line and body so the user sees a real error message. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image (the go.mod minimum) returns exit 0. 
- The full method set on `NvidiaModel` still matches the `ModelDriver` interface. - Pattern parity with the existing xai, moonshot, deepseek, aliyun, gitee, and openai drivers. Closes #14635	2026-05-08 12:04:28 +08:00
Panda Dev	e729eced45	Go: implement Balance in DeepSeek driver (#14632 ) Closes #14631 ### What problem does this PR solve? The DeepSeek Go driver shipped with a stub `Balance` method that returned `no such method`, even though DeepSeek exposes a public `GET /user/balance` endpoint that works with the same Bearer token used for chat. So the "Balance" panel in the model provider UI always shows an error for DeepSeek tenants, while it already works for Moonshot and Gitee. This PR fills the gap. ### What this PR includes - `conf/models/deepseek.json`: add `"balance": "user/balance"` under `url_suffix` so the driver can build the URL from config the same way the other endpoints do. - `internal/entity/models/deepseek.go`: replace the `Balance` stub with a real implementation. Adds a small local response type `deepseekBalanceResponse` that matches the upstream shape. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, resolve the region (with a `default` fallback), and build the URL from `BaseURL[region] + URLSuffix.Balance`. - GET the URL with `Authorization: Bearer <api_key>`. - Parse the upstream response: ```json { "is_available": true, "balance_infos": [ {"currency": "USD", "total_balance": "10.00", ...}, {"currency": "CNY", "total_balance": "70.00", ...} ] } ``` `total_balance` is a string in the upstream API, so the driver parses it with `strconv.ParseFloat`. - Return the first balance entry as `{"balance": <float>, "currency": <string>}`, the same shape the Moonshot driver returns. The UI can render it with no provider-specific code. ### Edge cases - Missing or empty API key returns a clear local error before any HTTP call. - Empty `balance_infos` returns a clear "no balance info in response" error rather than a zero-value silent success. - Non-numeric `total_balance` returns a clear parse error. 
- Non-200 responses propagate the upstream status line and body so the user can see why the call failed. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image (the go.mod minimum) returns exit 0. - The full method set on `DeepSeekModel` still matches the `ModelDriver` interface. - Pattern parity with the existing Moonshot and Gitee Balance implementations.	2026-05-08 12:03:39 +08:00
Haruko386	a377512110	Go: implement provider: OpenRouter (#14652 ) ### What problem does this PR solve? 1. Implement `OpenRouter` Provider: Fully support OpenRouter AI models (e.g., `gemma`, `minimax`). Includes robust handling of Server-Sent Events (SSE) streams, error event interception, and proper parsing of both `reasoning_content` and standard `content`. 2. Fix BaseURL Resolution Bug: Fixed a critical edge case in region configuration parsing. Added a strict empty string check (`*apiConfig.Region != ""`) alongside the `nil` check. This ensures that if the UI passes an empty string, the system correctly falls back to the `"default"` region, preventing `unsupported protocol scheme ""` errors during HTTP requests. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-08 12:02:37 +08:00
Panda Dev	a86e0ca0ca	Go: implement Balance in SiliconFlow driver (#14643 ) ### What problem does this PR solve? The SiliconFlow Go driver shipped with a stub `Balance` method that returned `no such method`, even though SiliconFlow exposes a public `GET /v1/user/info` endpoint that returns the account balance per currency. So the "Balance" panel in the model provider UI always shows an error for SiliconFlow tenants, while it already works for Moonshot and Gitee. This PR fills the gap. ### What this PR includes - `conf/models/siliconflow.json`: add `"balance": "user/info"` under `url_suffix` so the driver builds the URL from config. - `internal/entity/models/siliconflow.go`: replace the `Balance` stub with a real implementation. Adds a small local response type that matches the upstream shape. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, resolve the region with a default fallback, and build the URL from `BaseURL[region] + URLSuffix.Balance`. - GET the URL with `Authorization: Bearer <api_key>`. - Parse the upstream response. SiliconFlow returns balance fields as strings, so the driver parses them with `strconv.ParseFloat`. It prefers `totalBalance` over `balance` when both are present. - Return `{"balance": <float>, "currency": "CNY"}`, the same shape the Moonshot driver returns. The UI can render it with no provider-specific code. ### Edge cases - Missing or empty API key returns a clear local error before any HTTP call. - An unknown region falls back to the default base URL. - Empty `balance` and `totalBalance` returns a clear "no balance info in response" error rather than a zero-value silent success. - Non-numeric balance string returns a clear parse error. - Non-200 responses propagate the upstream status line and body. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? 
- `go build ./internal/entity/models/...` in a clean go 1.25 image returns exit 0. - The full method set on `SiliconflowModel` still matches the `ModelDriver` interface. - Pattern parity with the existing Moonshot and Gitee Balance implementations. Closes #14642	2026-05-08 12:01:10 +08:00
Panda Dev	2fd8cdc3cc	fix(go): wire CheckConnection to ListModels in ollama, lm-studio, and vllm (#14614 ) ### What problem does this PR solve? Three Go drivers had `CheckConnection` returning a hardcoded `no such method` error, even though each one already has a working `ListModels` that hits the configured base URL with the configured API key. So the "Check connection" button in the model provider UI always failed for these three providers, even when the underlying setup was fine. Affected drivers: - `internal/entity/models/ollama.go` - `internal/entity/models/lmstudio.go` - `internal/entity/models/vllm.go` This is a real user-facing gap because Ollama and LM Studio are two of the most popular local LLM runners, and vLLM is widely used for self-hosted deployments. ### What this PR includes For each of the three drivers, replace the stub with a small implementation that calls `ListModels` and returns its error:
```go
func (o OllamaModel) CheckConnection(apiConfig APIConfig) error {
	_, err := o.ListModels(apiConfig)
	return err
}
```
This is the exact pattern that xai, moonshot, deepseek, aliyun, and gitee already use for the same method. No JSON change. No factory change. No interface change. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image (the go.mod minimum) returns exit 0. - The full ModelDriver interface still resolves on each driver (NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode, Rerank, ListModels, Balance, CheckConnection). - Pattern parity with the existing xai, moonshot, deepseek, aliyun, and gitee CheckConnection methods. Closes #14609	2026-05-08 12:00:10 +08:00
Panda Dev	bb10b83e61	Go: implement Rerank in ZhipuAI driver (#14608 ) ### What problem does this PR solve? The ZhipuAI Go driver had a stub Rerank method that returned "not implemented", even though conf/models/zhipu-ai.json already ships glm-rerank as a rerank model and the rerank URL suffix is already wired in url_suffix: ```json "url_suffix": { ... "rerank": "rerank" }, "models": [ {"name": "glm-rerank", "model_types": ["rerank"]}, ... ] ``` So the config was ready but the driver was not. A tenant who picked glm-rerank in the Go layer could not actually run a rerank call. This PR fills the gap so the listed model works end to end. ### What this PR includes - `internal/entity/models/zhipu-ai.go`: real implementation of `ZhipuAIModel.Rerank`, plus two small local types (`zhipuRerankRequest`, `zhipuRerankResponse`) that mirror the standard OpenAI-compatible rerank shape used by SiliconFlow. No factory change. No JSON change. No interface change. ### How the driver works - POST to `${BaseURL}/${URLSuffix.Rerank}` (resolves to `https://open.bigmodel.cn/api/paas/v4/rerank` with the default config), reusing the existing httpClient on the driver. - Validate apiConfig and the API key, validate the model name, and resolve the region. Return a clear local error before any HTTP call when something is missing. - Send `{model, query, documents, top_n, return_documents: false}` in the body, the same shape the SiliconFlow driver already uses. - Walk `results[].relevance_score` and copy each score into the output slice indexed by `results[].index`, so the output order matches the input order even if the API returns results in a different order. - Empty `texts` input returns an empty `[]float64` with no HTTP call. - Non-200 responses propagate the upstream status line and body. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? 
- `go build ./internal/entity/models/...` in a clean go 1.25 image (the go.mod minimum) returns exit 0. - The full method set on `ZhipuAIModel` still matches the `ModelDriver` interface (NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode, ListModels, Balance, CheckConnection, Rerank). - Pattern parity with the existing SiliconFlow Rerank implementation (`internal/entity/models/siliconflow.go`). Closes #14607	2026-05-07 17:56:30 +08:00
Haruko386	078ea3bf4a	Go: implement provider: Nvidia (#14623 ) ### What problem does this PR solve? 1. Implement `Nvidia` Provider: Fully support NVIDIA NIM APIs with robust parameter handling (including the `thinking` parameter) and safe URL merging in `NewInstance`. 2. Fix Misleading CLI Errors: Corrected a bug in `common_command.go` where failed chat requests inaccurately reported `failed to list instance models`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-07 14:17:57 +08:00
Panda Dev	b8b741555f	Go: implement provider: OpenAI (#14605 ) ### What problem does this PR solve? Add a Go driver for OpenAI (GPT models). The config file conf/models/openai.json has been in the repo for a while with the full GPT-5 model list, but internal/entity/models/factory.go had no case for "openai". So any tenant that configured OpenAI as a model provider in the Go layer fell through to the default branch and got the dummy driver. Chat, list models, and check connection all returned dummy responses instead of reaching the API. OpenAI is the most commonly requested provider and the JSON config already ships with the repo, so this gap is high impact even though the JSON has been there for some time. ### What this PR includes - New file internal/entity/models/openai.go with an OpenAIModel that implements the ModelDriver interface. - factory.go: route the "openai" provider name to NewOpenAIModel. - conf/models/openai.json: add "models": "models" under url_suffix so ListModels can hit /v1/models with no hardcoded fallback. ### How the driver works - OpenAI exposes the canonical OpenAI-compatible API at https://api.openai.com/v1. - ChatWithMessages and ChatStreamlyWithSender post to /chat/completions in the same shape the moonshot, vllm, and xai drivers use. - ListModels and CheckConnection call /models to list available ids and confirm the API key works. - reasoning_content is passed through for the o-series and other reasoning models, in both the non-stream and stream paths. - Encode (embeddings) is left as "not implemented" for now, the same way the other recent provider drivers do it. Rerank and Balance are not part of OpenAI's public API surface in this layer and return a clear "not implemented" or "no such method" error. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - go build ./internal/entity/models/... in a clean go 1.25 image (the go.mod minimum) returns exit 0 with no errors. 
- Method set of OpenAIModel matches the ModelDriver interface: NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode, Rerank, ListModels, Balance, CheckConnection. - Pattern parity with the merged moonshot (#14433), volcengine (#14460), minimax (#14478), vllm (#14532), xai (#14550), and lm-studio (#14586) PRs. Closes #14604	2026-05-07 13:09:51 +08:00
Haruko386	dd7a0ce1d3	Go: implement provider: lm-studio (#14586 ) ### What problem does this PR solve? implement `lm-studio` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-06 19:23:11 +08:00
Jack Storment	c2ad672c09	Go: implement provider: xAI (#14550 ) Closes #14552 ### What problem does this PR solve? Add a Go driver for xAI (Grok models). The config file conf/models/xai.json has been in the repo since the early Go provider work, but internal/entity/models/factory.go had no case for "xai". So any xAI request fell through to the dummy driver and never reached the API. This PR adds the missing driver and wires it up. ### What this PR includes - New file internal/entity/models/xai.go with an XAIModel that implements the ModelDriver interface. - factory.go: route the "xai" provider name to NewXAIModel. ### How the driver works - xAI exposes an OpenAI-compatible API at https://api.x.ai/v1. - ChatWithMessages and ChatStreamlyWithSender post to /chat/completions in the same shape the moonshot and deepseek drivers use. - ListModels and CheckConnection call /models to confirm the API key works and to list available model ids. - reasoning_content is passed through for grok-3-mini and other xAI reasoning models, both in the non-stream and stream paths. - Encode, Rerank, and Balance are not part of the public xAI API at the moment, so they return a clear "not implemented" or "no such method" error. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - go build ./internal/entity/models/... in a clean go 1.25 image (the go.mod minimum) returns exit 0 with no errors. - Method set of XAIModel matches the ModelDriver interface: NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode, Rerank, ListModels, Balance, CheckConnection. - Pattern parity with the merged moonshot (#14433), volcengine (#14460), minimax (#14478), and vllm (#14532) PRs. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-06 12:16:37 +08:00
Haruko386	cd54c08e84	Go: implement provider: Ollama (#14580 ) ### What problem does this PR solve? implement `Ollama` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-06 12:03:58 +08:00
qinling0210	7335916868	Use GetChatModel, remove duplicate functions in model_service.go (#14546 ) ### What problem does this PR solve? Use GetChatModel, remove duplicate functions in model_service.go ### Type of change - [x] Refactoring Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-06 11:33:32 +08:00
Jin Hai	aa57b5bd8b	Go: move logger to common module (#14545 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-06 10:41:58 +08:00
qinling0210	12af73f2ca	Support stream for multimodal chat (#14537 ) ### What problem does this PR solve? Support stream for multimodal chat ### Type of change - [x] Refactoring	2026-04-30 19:33:57 +08:00
Haruko386	93f3b90121	Go: implement provider: Vllm (#14532 ) ### What problem does this PR solve? Implement the vLLM model provider for RAGFlow to fully support local and self-hosted open-source models (e.g., Qwen, GLM, Llama) via the vLLM framework, and fix several critical bugs related to model instance management and API requests. Key changes and fixes: 1. Added Standard vLLM Provider (`vllm.go`, `vllm.json`): - Implemented `VllmModel` driver strictly adhering to the OpenAI API specification. - Removed hardcoded and dangerous routing logic (e.g., forcing `AsyncChat` for Qwen/GLM prefixes), ensuring standard `/v1/chat/completions` compatibility. - Refactored `ListModels` to use safe JSON parsing (resolving nil pointer panics) and standard `GET` requests without bodies. - Added `APIConfig.Region` fallback logic to prevent empty `base_url` fetching when checking models. 2. Fixed `ChatToModelStreamWithSender` Bug (`model_service.go`): - Resolved the `model is disabled` error when streaming chat with local database-saved models. - Added the missing `if modelInfo.Status == "active"` block to correctly invoke `NewInstance` and inject the dynamic `base_url` into the provider driver before starting the SSE stream. 3. Fixed `ListSupportedModels` Bug (`model_service.go`): - Added dynamic `NewInstance` injection for `base_url`. Previously, the list models function used the static JSON config without injecting user-configured dynamic URLs from the database, resulting in an `unsupported protocol scheme ""` error. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-04-30 16:30:14 +08:00
qinling0210	265f92c83e	Simplify chat and support multimodal chat (#14523 ) ### What problem does this PR solve? Simplify chat and support multimodal chat ### Type of change - [x] Refactoring	2026-04-30 15:25:01 +08:00
Yingfeng	4ee0702aed	Feat: add skills space to context engine (#13908 ) ### What problem does this PR solve? issue #13714 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-30 12:36:03 +08:00
Jin Hai	261be81127	Go: add drop instance models (#14485 ) ### What problem does this PR solve? 1. Drop instance models. 2. Fix an issue where dropping an instance did not drop its models. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-29 19:18:49 +08:00
Haruko386	0e1477eb23	Go: implement provider: MiniMax (#14478 ) ### What problem does this PR solve? implement MiniMax provider ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-04-29 19:06:40 +08:00
Jin Hai	bb05a8bd7e	Update create model instance command (#14441 ) ### What problem does this PR solve? 1. support command:
```
RAGFlow(user)> create provider 'vllm' instance 'test' key 'test-key' url 'base-url' region 'abc';
SUCCESS
RAGFlow(user)> list instances from 'vllm';
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| apiKey   | extra                                  | id                               | instanceName | providerID                       | status |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| test-key | {"base_url":"base-url","region":"abc"} | 40213c89430311f1a7cf38a74640adcc | test         | b4d40e6142d311f1a4f938a74640adcc | enable |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
```
2. support add vllm model
```
RAGFlow(user)> add model 'Qwen/Qwen2-0.5B' to provider 'vllm' instance 'test' with tokens 131072 chat;
SUCCESS
```
3. add vllm chat ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-29 17:05:08 +08:00
Haruko386	decf673049	Go: implement provider: volcengine (#14460 ) ### What problem does this PR solve? implement `volcengine` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-29 15:45:08 +08:00
qinling0210	f3c232cf47	Remove model_bundle.go, modify chat_session.go (#14458 ) ### What problem does this PR solve? Remove model_bundle.go, modify chat_session.go ### Type of change - [x] Refactoring	2026-04-29 14:44:12 +08:00
qinling0210	dcce864d4c	Simplify Encode (#14437 ) ### What problem does this PR solve? Simplify Encode ### Type of change - [x] Refactoring	2026-04-28 18:07:42 +08:00
Haruko386	4e5a093ac5	Go: implement provider: Moonshot (#14433 ) ### What problem does this PR solve? implement `Moonshot` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-28 18:06:25 +08:00
Jin Hai	f670913bb4	Refactor model type to model class (#14426 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-28 16:05:15 +08:00
Jin Hai	7c25870923	Go: update db model (#14423 ) ### What problem does this PR solve? As title. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-28 16:04:55 +08:00
Jin Hai	ae420f6358	Go: fix compilation (#14418 ) ### What problem does this PR solve? Add methods to volcengine ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-28 13:21:05 +08:00
qinling0210	effc84a042	Refactor model in GO (#14398 ) ### What problem does this PR solve? Refactor model in GO ### Type of change - [x] Refactoring	2026-04-28 12:59:01 +08:00
Jin Hai	819257f257	Go: add volcengine (#14409 ) ### What problem does this PR solve? 1. Refactor server_main 2. Add volcengine ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-28 12:12:58 +08:00
Jin Hai	965717c4fb	Go: add new provider: google (#14395 ) ### What problem does this PR solve? As title. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-27 20:35:47 +08:00
Jin Hai	c3eac4103a	Go: aliyun model provider (#14379 ) ### What problem does this PR solve? As title. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-27 14:53:33 +08:00
Jin Hai	1c244df90d	Go: add gitee and siliconflow as model provider (#14336 ) ### What problem does this PR solve? As title ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-24 20:59:30 +08:00
qinling0210	1473000135	Implement retrieval_test in GO (#14231 ) ### What problem does this PR solve? Implement retrieval_test in GO ### Type of change - [x] Refactoring	2026-04-24 15:30:14 +08:00
Jin Hai	2b029882d7	Go: add new provider minimax (#14296 ) ### What problem does this PR solve? 1. Add new provider minimax 2. Add new command: CHECK INSTANCE 'instance_name' FROM 'provider_name'; ``` RAGFlow(user)> check instance 'test' from 'minimax'; SUCCESS ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-23 10:16:20 +08:00
Jin Hai	74b44e1aa3	Go: add balance command (#14262 ) ### What problem does this PR solve? ``` RAGFlow(user)> list supported models from 'moonshot' 'test'; +---------------------------------+ \| model_name \| +---------------------------------+ \| moonshot-v1-32k-vision-preview \| \| kimi-k2.6 \| \| moonshot-v1-8k \| \| moonshot-v1-auto \| \| moonshot-v1-128k \| \| moonshot-v1-32k \| \| kimi-k2.5 \| \| moonshot-v1-8k-vision-preview \| \| moonshot-v1-128k-vision-preview \| +---------------------------------+ RAGFlow(user)> show balance from 'moonshot' 'test'; +---------+----------+ \| balance \| currency \| +---------+----------+ \| 0 \| CNY \| +---------+----------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-21 21:31:50 +08:00
Jin Hai	e48d75987c	Go: add stream / think chat (#14242 ) ### What problem does this PR solve? 1. Supports stream and non-stream chat 2. Supports think and non-think chat 3. List supported models from DeepSeek service. (This command can be used to verify the API validity) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-21 16:52:32 +08:00
Jin Hai	f269ee9739	Go: add thinking features to zhipu-ai (#14234 ) ### What problem does this PR solve? ``` RAGFlow(user)> list models from 'zhipu-ai'; +------------+------------+---------------+----------------+ \| features \| max_tokens \| model_types \| name \| +------------+------------+---------------+----------------+ \| [thinking] \| 128000 \| [chat] \| glm-4.7 \| \| [thinking] \| 128000 \| [chat] \| glm-4.5 \| \| [thinking] \| 128000 \| [chat vision] \| glm-4.6v-Flash \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-x \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-air \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-airx \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-flash \| \| [thinking] \| 64000 \| [vision] \| glm-4.5v \| \| \| 128000 \| [chat] \| glm-4-plus \| \| \| 128000 \| [chat] \| glm-4-0520 \| \| \| 128000 \| [chat] \| glm-4 \| \| \| 8000 \| [chat] \| glm-4-airx \| \| \| 128000 \| [chat] \| glm-4-air \| \| \| 128000 \| [chat] \| glm-4-flash \| \| \| 128000 \| [chat] \| glm-4-flashx \| \| \| 1000000 \| [chat] \| glm-4-long \| \| \| 128000 \| [chat] \| glm-3-turbo \| \| \| 2000 \| [vision] \| glm-4v \| \| \| 8192 \| [chat] \| glm-4-9b \| \| \| 512 \| [embedding] \| embedding-2 \| \| \| 512 \| [embedding] \| embedding-3 \| \| \| 4096 \| [asr] \| glm-asr \| \| \| 0 \| [tts] \| glm-tts \| \| \| 0 \| [ocr] \| glm-ocr \| \| \| 0 \| [rerank] \| glm-rerank \| +------------+------------+---------------+----------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-20 21:53:27 +08:00
Jin Hai	af2ed416a7	Add extra field to model instance (#14203 ) ### What problem does this PR solve? Each model now supports regions with different URLs. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-20 15:31:12 +08:00
Jin Hai	94106646e7	Go: set and list default models (#14191 ) ### What problem does this PR solve? ``` RAGFlow(user)> set default vlm "zhipu-ai" "ccc" "glm-4.6v-flash"; SUCCESS RAGFlow(user)> list default models; +--------+----------------+----------------+----------------+------------+ \| enable \| model_instance \| model_name \| model_provider \| model_type \| +--------+----------------+----------------+----------------+------------+ \| true \| ccc \| glm-4.6v-flash \| zhipu-ai \| llm \| \| true \| ccc \| glm-4.6v-flash \| zhipu-ai \| image2text \| +--------+----------------+----------------+----------------+------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-17 18:05:33 +08:00
Jin Hai	6d9430a125	Add think chat to CLI (#13922 ) ### What problem does this PR solve? Users can now chat with the LLM in 'think mode'. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-03 18:11:23 +08:00
Jin Hai	6c29128de1	Refactor model provider and command (#13887 ) ### What problem does this PR solve? Introduce 5 new tables, including model groups and provider instance. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-02 20:20:35 +08:00
Jin Hai	e20cf39735	Refactor Go server model provider reading and access (#13831 ) ### What problem does this PR solve? 1. Refactor model provider json file format 2. Use memory data structure to replace database 3. Add CLI command to access ``` RAGFlow(user)> list pool models from 'xai'; +-------------------------------------------------------------------------------------+------------+-------------+-----------------------+ \| features \| max_tokens \| model_types \| name \| +-------------------------------------------------------------------------------------+------------+-------------+-----------------------+ \| map[] \| 256000 \| [llm] \| grok-4 \| \| map[] \| 131072 \| [llm] \| grok-3 \| \| map[] \| 131072 \| [llm] \| grok-3-fast \| \| map[] \| 131072 \| [llm] \| grok-3-mini \| \| map[] \| 131072 \| [llm] \| grok-3-mini-mini-fast \| \| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] \| 32768 \| [vlm] \| grok-2-vision \| +-------------------------------------------------------------------------------------+------------+-------------+-----------------------+ RAGFlow(user)> show pool model 'grok-2-vision' from 'xai'; +-------------------------------------------------------------------------------------+------------+-------------+---------------+ \| features \| max_tokens \| model_types \| name \| +-------------------------------------------------------------------------------------+------------+-------------+---------------+ \| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] \| 32768 \| [vlm] \| grok-2-vision \| +-------------------------------------------------------------------------------------+------------+-------------+---------------+ RAGFlow(user)> list pool providers; +--------+------------------------------------------------------------+---------------------------+ \| name \| tags \| url \| 
+--------+------------------------------------------------------------+---------------------------+ \| OpenAI \| LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION \| https://api.openai.com/v1 \| \| xAI \| LLM \| https://api.x.ai/v1 \| +--------+------------------------------------------------------------+---------------------------+ RAGFlow(user)> show pool provider 'openai'; +---------------------------+--------+------------------------------------------------------------+--------------+ \| base_url \| name \| tags \| total_models \| +---------------------------+--------+------------------------------------------------------------+--------------+ \| https://api.openai.com/v1 \| OpenAI \| LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION \| 27 \| +---------------------------+--------+------------------------------------------------------------+--------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-03-30 12:00:49 +08:00
Jin Hai	f32a832f92	Rename model directory to entity to avoid naming confusion (#13829 ) ### What problem does this PR solve? Renames the model directory to entity. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-03-27 19:25:18 +08:00

48 Commits