ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-08 08:07:21 +08:00

Author	SHA1	Message	Date
Zhichang Yu	86fe78c73f	feat(llm): add MiniMax GroupId header support (#14610 ) ## Summary - Add MiniMax provider GroupId query parameter support in `LiteLLMBase` - Extract `group_id` from key configuration in `__init__` - Append `GroupId` as query parameter to `api_base` in `_construct_complete_args` ## Why this change is needed MiniMax provides an OpenAI-compatible API endpoint (`/v1/chat/completions`), but `GroupId` is a MiniMax-specific account identifier required for billing and rate limiting - it is not part of the OpenAI standard. Looking at LiteLLM's `MinimaxChatConfig`: - `get_complete_url()` only constructs the base URL (e.g., `https://api.minimaxi.com/v1/chat/completions`) - LiteLLM does not automatically inject `GroupId` into requests - This must be handled by the caller (ragflow's chat_model.py) The implementation appends `GroupId` as a query parameter to `api_base`: ```python api_base = completion_args.get("api_base", self.base_url) separator = "&" if "?" in api_base else "?" completion_args["api_base"] = f"{api_base}{separator}GroupId={self.group_id}" ``` This matches MiniMax's official API format (as documented by LlamaFactory): ```bash curl --location 'https://api.minimaxi.chat/v1/text/chatcompletion?GroupId=你的GroupId' \ --header 'Authorization: Bearer 你的API_Key' ``` ## Test plan - [ ] Verify MiniMax API calls work with GroupId query parameter - [ ] Verify backward compatibility for other providers 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 11:54:49 +08:00
euvre	8269fa01b4	Fix AttributeError when appending non-streaming tool calls to chat history in Agentic Agent (#14456 ) ### What problem does this PR solve? Fix #14340 ## Problem Description When using an Agentic Agent (not Workflow) with one or more Retrieval tools (e.g., Dataset Retrieval + Memory Retrieval), the agent silently returns an empty response (`agent_response: ""`) after hanging for several minutes. The server logs show: ``` AttributeError: 'ChatCompletionMessageToolCall' object has no attribute 'index' ``` This error propagates as a `GENERIC_ERROR`, causing the canvas to return an empty response. The subsequent Memory save task then receives the empty `agent_response` and logs: ``` Document for referred_document_id XXXX not found ``` ## Reproduction Steps 1. Set `DOC_ENGINE=infinity` (or `elasticsearch` — the engine itself is not the root cause). 2. Create a blank Agentic Agent (not a Workflow). 3. Add two Retrieval tools to the Agent node: - `Retrieval_DS` → Dataset (Knowledge Base) - `Retrieval_Mem` → Memory component 4. Add a Message node with Save to Memory enabled. 5. Launch the agent and send any message (e.g., "hola"). 6. The agent hangs and returns an empty response. ## Root Cause Analysis The crash occurs in `_append_history` and `_append_history_batch` inside `rag/llm/chat_model.py`. These methods directly access `.index` on tool call objects: ```python # _append_history_batch { "index": tc.index, # <-- crashes here ... } ``` However, non-streaming LLM responses (`stream=False`) return `ChatCompletionMessageToolCall` objects, which do not have an `index` field according to the OpenAI API specification. The `index` field only exists on `ChoiceDeltaToolCall` objects returned in streaming responses (`stream=True`). When the agentic agent triggers an internal `full_question` call (used to compress multi-turn conversation history), the request is incorrectly routed through `async_chat_with_tools` because `is_tools=True` is set at the `LLMBundle` level. If the LLM decides to emit `tool_calls` during this auxiliary request, the code enters the non-streaming tool loop and crashes when trying to append history. ## Fix Replaced all direct `.index` accesses with `getattr(..., "index", None)` for safe, backward-compatible access: \| Method \| File \| Line \| Change \| \|--------\|------\|------\|--------\| \| `_append_history` \| `rag/llm/chat_model.py` \| ~L304 \| `tool_call.index` → `getattr(tool_call, "index", None)` \| \| `_append_history_batch` \| `rag/llm/chat_model.py` \| ~L332 \| `tc.index` → `getattr(tc, "index", None)` \| \| `_append_history` \| `rag/llm/chat_model.py` \| ~L1467 \| `tool_call.index` → `getattr(tool_call, "index", None)` \| \| `_append_history_batch` \| `rag/llm/chat_model.py` \| ~L1496 \| `tc.index` → `getattr(tc, "index", None)` \| ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: noob <yixiao121314@outlook.com>	2026-05-06 14:39:40 +08:00
FuturMix	2548c28d65	feat: add FuturMix as model provider (#14419 ) ## Summary Add [FuturMix](https://futurmix.ai) as a new model provider. FuturMix is an OpenAI-compatible unified AI gateway that provides access to 22+ models (GPT, Claude, Gemini, DeepSeek, and more) through a single API endpoint and key. - API Base: `https://futurmix.ai/v1` (OpenAI-compatible) - Supported capabilities: Chat, Embedding, Image2Text, TTS, Speech2Text, Rerank ### Changes \| File \| Change \| \|------\|--------\| \| `rag/llm/__init__.py` \| Add `FuturMix` to `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` \| \| `rag/llm/chat_model.py` \| Add `FuturMixChat(Base)` — follows Astraflow/Avian pattern \| \| `rag/llm/embedding_model.py` \| Add `FuturMixEmbed(OpenAIEmbed)` — follows Astraflow pattern \| \| `rag/llm/cv_model.py` \| Add `FuturMixCV(GptV4)` — follows SILICONFLOW/OpenRouter pattern \| \| `rag/llm/tts_model.py` \| Add `FuturMixTTS(OpenAITTS)` — follows CometAPI/DeerAPI pattern \| \| `rag/llm/sequence2txt_model.py` \| Add `FuturMixSeq2txt(GPTSeq2txt)` — follows StepFun pattern \| \| `rag/llm/rerank_model.py` \| Add `FuturMixRerank(OpenAI_APIRerank)` \| \| `conf/llm_factories.json` \| Add factory config with 8 chat, 2 embedding, 1 image2text, 2 TTS, 1 speech2text models \| \| `docs/guides/models/supported_models.mdx` \| Add FuturMix to supported models table \| ### Models included - Chat: claude-sonnet-4-20250514, claude-3.5-haiku, gpt-4o, gpt-4o-mini, gemini-2.5-flash, gemini-2.0-flash, deepseek-chat, deepseek-reasoner - Embedding: text-embedding-3-small, text-embedding-3-large - Image2Text: gpt-4o - TTS: tts-1, tts-1-hd - Speech2Text: whisper-1 ## Test plan - [ ] Verify FuturMix appears in the model provider list in RAGFlow UI - [ ] Configure FuturMix with API key and test chat completion - [ ] Test embedding model with document indexing - [ ] Test image2text with a sample image 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-30 10:59:37 +08:00
buua436	e6e80041f5	Fix: agent toolcall null response & schema validation & DeepSeek think history (#14425 ) ### What problem does this PR solve? agent toolcall null response & schema validation & DeepSeek think history ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-28 17:09:08 +08:00
ucloudnb666	f853a39b40	feat: Add Astraflow provider support (global + China endpoints) (#14270 ) ## Add Astraflow Provider Support This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud / 优刻得) as a new AI model provider in RAGFlow, with support for both global and China endpoints. ### About Astraflow Astraflow is an OpenAI-compatible AI model aggregation platform supporting 200+ models from major providers including DeepSeek, Qwen, GPT, Claude, Gemini, Llama, Mistral, and more. \| Variant \| Factory Name \| Endpoint \| Env Var \| \|---------\|-------------\|----------\|---------\| \| Global \| `Astraflow` \| `https://api-us-ca.umodelverse.ai/v1` \| `ASTRAFLOW_API_KEY` \| \| China \| `Astraflow-CN` \| `https://api.modelverse.cn/v1` \| `ASTRAFLOW_CN_API_KEY` \| - API key signup: https://astraflow.ucloud.cn/ --- ### Files Changed \| File \| Change \| \|------\|--------\| \| `rag/llm/__init__.py` \| Register `Astraflow` and `Astraflow-CN` in `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` \| \| `rag/llm/chat_model.py` \| Add `AstraflowChat` and `AstraflowCNChat` (OpenAI-compatible `Base` subclass) \| \| `rag/llm/embedding_model.py` \| Add `AstraflowEmbed` and `AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) \| \| `rag/llm/rerank_model.py` \| Add `AstraflowRerank` and `AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) \| \| `rag/llm/cv_model.py` \| Add `AstraflowCV` and `AstraflowCNCV` (subclasses of `GptV4`) \| \| `rag/llm/tts_model.py` \| Add `AstraflowTTS` and `AstraflowCNTTS` (subclasses of `OpenAITTS`) \| \| `rag/llm/sequence2txt_model.py` \| Add `AstraflowSeq2txt` and `AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) \| \| `conf/llm_factories.json` \| Register `Astraflow` and `Astraflow-CN` factories with a curated list of popular models \| --- ### Supported Model Types - ✅ Chat / LLM — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7, Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more - ✅ Text Embedding — text-embedding-3-small/large - ✅ Image / Vision (IMAGE2TEXT) — GPT-4o, GPT-4.1, Claude, Gemini, Llama-4, etc. - ✅ Text Re-Rank - ✅ TTS — tts-1 - ✅ Speech-to-Text (SPEECH2TEXT) — whisper-1 ### Implementation Notes - Uses the `openai/` LiteLLM prefix — consistent with other OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI, OpenRouter, n1n, Avian, etc.) - `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249) are separate factory entries, allowing users to choose the optimal endpoint based on their region. - All model classes cleanly subclass existing base classes (`Base`, `OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`) with no custom logic needed — the provider is fully OpenAI-compatible. --------- Co-authored-by: user <user@xzaaaMacBook-Air.local>	2026-04-22 15:38:34 +08:00
Idriss Sbaaoui	ff27ce86d6	fix: gpt-5 name-based config clearing from base chat path (#13949 ) ### What problem does this PR solve? fix #13944 where OpenAI-compatible custom endpoints failed verification when model names contained `gpt-5` becauser of incorrect name-based handling in the Base/backend=`base` path. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-07 11:24:47 +08:00
Yongteng Lei	dd839f30e8	Fix: code supports matplotlib (#13724 ) ### What problem does this PR solve? Code as "final" node: ![img_v3_02vs_aece4caf-8403-4939-9e68-9845a22c2cfg](https://github.com/user-attachments/assets/9d87b8df-da6b-401c-bf6d-8b807fe92c22) Code as "mid" node: ![img_v3_02vv_f74f331f-d755-44ab-a18c-96fff8cbd34g](https://github.com/user-attachments/assets/c94ef3f9-2a6c-47cb-9d2b-19703d2752e4) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-20 20:32:00 +08:00
Idriss Sbaaoui	9070408b04	Fix : model-specific handling (#13675 ) ### What problem does this PR solve? add a handler for gpt 5 models that do not accept parameters by dropping them, and centralize all models with specific paramter handling function into a single helper. solves issue #13639 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-03-18 17:28:20 +08:00
Yongteng Lei	3c80a0ae09	Fix: support vLLM's new reasoning field (#13493 ) ### What problem does this PR solve? Support vLLM's new reasoning field ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-10 21:13:14 +08:00
Jonah Hartmann	6023eb27ac	feat: add Ragcon provider (#13425 ) ### What problem does this PR solve? This PR aims to extend the list of possible providers. Adds new Provider "RAGcon" within the Ollama Modal. It provides all model types except OCR via Openai-compatible endpoints. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jakob <16180662+hauberj@users.noreply.github.com>	2026-03-06 09:37:27 +08:00
Yuxing Deng	51b180d991	fix: adding GPUStack chat model requires v1 suffix (#13237 ) ### What problem does this PR solve? Refer to issue: #13236 The base url for GPUStack chat model requires `/v1` suffix. For the other model type like `Embedding` or `Rerank`, the `/v1` suffix is not required and will be appended in code. So keep the same logic for chat model as other model type. ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue)	2026-02-27 20:13:07 +08:00
avianion	5f53fbe0f1	feat: Add Avian as an LLM provider (#13256 ) ### What problem does this PR solve? This PR adds [Avian](https://avian.io) as a new LLM provider to RAGFlow. Avian provides an OpenAI-compatible API with competitive pricing, offering access to models like DeepSeek V3.2, Kimi K2.5, GLM-5, and MiniMax M2.5. Provider details: - API Base URL: `https://api.avian.io/v1` - Auth: Bearer token via API key - OpenAI-compatible (chat completions, streaming, function calling) - Models: - `deepseek/deepseek-v3.2` — 164K context, $0.26/$0.38 per 1M tokens - `moonshotai/kimi-k2.5` — 131K context, $0.45/$2.20 per 1M tokens - `z-ai/glm-5` — 131K context, $0.30/$2.55 per 1M tokens - `minimax/minimax-m2.5` — 1M context, $0.30/$1.10 per 1M tokens Changes: - `rag/llm/chat_model.py` — Add `AvianChat` class extending `Base` - `rag/llm/__init__.py` — Register in `SupportedLiteLLMProvider`, `FACTORY_DEFAULT_BASE_URL`, `LITELLM_PROVIDER_PREFIX` - `conf/llm_factories.json` — Add Avian factory with model definitions - `web/src/constants/llm.ts` — Add to `LLMFactory` enum, `IconMap`, `APIMapUrl` - `web/src/components/svg-icon.tsx` — Register SVG icon - `web/src/assets/svg/llm/avian.svg` — Provider icon - `docs/references/supported_models.mdx` — Add to supported models table This follows the same pattern as other OpenAI-compatible providers (e.g., n1n #12680, TokenPony). cc @KevinHuSh @JinHai-CN ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2026-02-27 17:36:55 +08:00
eviaaaaa	c59ae4c7c2	Fix: codeExec return types & error handling; Update Spark model mappings (#12896 ) ## What problem does this PR solve? This PR addresses three specific issues to improve agent reliability and model support: 1. `codeExec` Output Limitation: Previously, the `codeExec` tool was strictly limited to returning `string` types. I updated the output constraint to `object` to support structured data (Dicts, Lists, etc.) required for complex downstream tasks. 2. `codeExec` Error Handling: Improved the execution logic so that when runtime errors occur, the tool captures the exception and returns the error message as the output instead of causing the process to abort or fail silently. 3. Spark Model Configuration: - Added support for the `MAX-32k` model variant. - Fixed the `Spark-Lite` mapping from `general` to `lite` to match the latest API specifications. ## Type of change - [x] Bug Fix (fixes execution logic and model mapping) - [x] New Feature / Enhancement (adds model support and improves tool flexibility) ## Key Changes ### `agent/tools/code_exec.py` - Changed the output type definition from `string` to `object`. - Refactored the execution flow to gracefully catch exceptions and return error messages as part of the tool output. ### `rag/llm/chat_model.py` - Added `"Spark-Max-32K": "max-32k"` to the model list. - Updated `"Spark-Lite"` value from `"general"` to `"lite"`. ## Checklist - [x] My code follows the style guidelines of this project. - [x] I have performed a self-review of my own code. Signed-off-by: evilhero <2278596667@qq.com>	2026-01-29 19:22:35 +08:00
Yongteng Lei	b57c82b122	Feat: add kimi-k2.5 (#12852 ) ### What problem does this PR solve? Add kimi-k2.5 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-28 12:41:20 +08:00
Yongteng Lei	2a758402ad	Fix: Hunyuan cannot work properly (#12843 ) ### What problem does this PR solve? Hunyuan cannot work properly ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-27 17:04:53 +08:00
Kevin Hu	927db0b373	Refa: asyncio.to_thread to ThreadPoolExecutor to break thread limitat… (#12716 ) ### Type of change - [x] Refactoring	2026-01-20 13:29:37 +08:00
n1n.ai	f3d347f55f	feat: Add n1n provider (#12680 ) This PR adds n1n as an LLM provider to RAGFlow. Co-authored-by: Qun <qun@ip-10-5-5-38.us-west-2.compute.internal>	2026-01-19 13:12:42 +08:00
Pegasus	b091ff2730	Fix enable_thinking parameter for Qwen3 models (#12603 ) ### Issue When using Qwen3 models (`qwen3-32b`, `qwen3-max`) through the Tongyi-Qianwen provider for non-streaming calls (e.g., knowledge graph generation), the API fails with: Closes #12424 ``` parameter.enable_thinking must be set to false for non-streaming calls ``` ### Root Cause In `LiteLLMBase.async_chat()`, the `extra_body={"enable_thinking": False}` was set in `kwargs` but never forwarded to `_construct_completion_args()`. ### What problem does this PR solve? Pass merged kwargs to `_construct_completion_args()` using `{gen_conf, **kwargs}` to safely handle potential duplicate parameters. ### Changes - `rag/llm/chat_model.py`: Forward kwargs containing `extra_body` to `_construct_completion_args()` in `async_chat()` _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=42954461	2026-01-14 16:35:46 +08:00
Stephen Hu	f1dc2df23c	Fix:Bedrock assume_role auth mode fails with LiteLLM "Extra inputs are not permitted" error (#12495 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/12489 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-08 12:53:41 +08:00
Stephen Hu	6f2fc2f1cb	refactor:re order logics in clean_gen_conf (#12391 ) ### What problem does this PR solve? re order logics in clean_gen_conf #12388 ### Type of change - [x] Refactoring	2026-01-04 10:31:56 +08:00
Magicbook1108	f8fd1ea7e1	Feat: Further update Bedrock model configs (#12029 ) ### What problem does this PR solve? Feat: Further update Bedrock model configs #12020 #12008 <img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69" src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63" /> <img width="700" alt="afe88ec3c58f745f85c5c507b040c250" src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef" /> <img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69" src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 11:32:20 +08:00
Magicbook1108	e84d5412bc	Feat: bedrock iam authentication (#12020 ) ### What problem does this PR solve? Feat: bedrock iam authentication #12008 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-18 17:13:09 +08:00
Yongteng Lei	6be0338aa0	Fix: Asure-OpenAI resource not found (#11934 ) ### What problem does this PR solve? Asure-OpenAI resource not found. #11750 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-13 11:32:46 +08:00
Magicbook1108	948bc93786	Feat: Add GPT-5.2 & pro (#11929 ) ### What problem does this PR solve? Feat: Add GPT-5.2 & pro ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 17:35:08 +08:00
Magicbook1108	ca2d6f3301	Fix: duplicate output by async_chat_streamly (#11842 ) ### What problem does this PR solve? Fix: duplicate output by async_chat_streamly Refact: revert manual modification ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-09 19:21:52 +08:00
N0bodycan	9863862348	fix: prevent redundant retries in async_chat_streamly upon success (#11832 ) ## What changes were proposed in this pull request? Added a return statement after the successful completion of the async for loop in async_chat_streamly. ## Why are the changes needed? Previously, the code lacked a break/return mechanism inside the try block. This caused the retry loop (for attempt in range...) to continue executing even after the LLM response was successfully generated and yielded, resulting in duplicate requests (up to max_retries times). ## Does this PR introduce any user-facing change? No (it fixes an internal logic bug).	2025-12-09 17:14:30 +08:00
Yongteng Lei	c51e6b2a58	Refa: migrate CV model chat to Async (#11828 ) ### What problem does this PR solve? Migrate CV model chat to Async. #11750 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-12-09 13:08:37 +08:00
Yongteng Lei	51ec708c58	Refa: cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats (#11779 ) ### What problem does this PR solve? Cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats. ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-12-08 09:43:03 +08:00
Yongteng Lei	e3f40db963	Refa: make RAGFlow more asynchronous 2 (#11689 ) ### What problem does this PR solve? Make RAGFlow more asynchronous 2. #11551, #11579, #11619. ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-12-03 14:19:53 +08:00
Kevin Hu	a6681d6366	Revert "Refa: make RAGFlow more asynchronous 2" (#11669 ) Reverts infiniflow/ragflow#11664	2025-12-02 19:42:05 +08:00
Yongteng Lei	627c11c429	Refa: make RAGFlow more asynchronous 2 (#11664 ) ### What problem does this PR solve? Make RAGFlow more asynchronous 2. #11551, #11579, #11619. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring - [x] Performance Improvement	2025-12-02 18:57:07 +08:00
Yongteng Lei	a713f54732	Refa: add MiniMax-M2 and remove deprecated MiniMax models (#11642 ) ### What problem does this PR solve? Add MiniMax-M2 and remove deprecated models. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2025-12-02 14:43:44 +08:00
Yongteng Lei	b6c4722687	Refa: make RAGFlow more asynchronous (#11601 ) ### What problem does this PR solve? Try to make this more asynchronous. Verified in chat and agent scenarios, reducing blocking behavior. #11551, #11579. However, the impact of these changes still requires further investigation to ensure everything works as expected. ### Type of change - [x] Refactoring	2025-12-01 14:24:06 +08:00
Yongteng Lei	9d0309aedc	Fix: [MinerU] Missing output file (#11623 ) ### What problem does this PR solve? Add fallbacks for MinerU output path. #11613, #11620. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-01 12:17:43 +08:00
Yongteng Lei	8604c4f57c	Feat: add GPT-5.1, GPT‑5.1 Instant and Claude-Opus-4.5 (#11559 ) ### What problem does this PR solve? Add GPT-5.1, GPT‑5.1 Instant and Claude-Opus-4.5. #11548 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-11-27 17:59:17 +08:00
Yongteng Lei	174a2578e8	Feat: add auth header for Ollama chat model (#11452 ) ### What problem does this PR solve? Add auth header for Ollama chat model. #11350 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-11-21 19:47:06 +08:00
cnJasonZ	3fcf2ee54c	feat: add new LLM provider Jiekou.AI (#11300 ) ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Jason <ggbbddjm@gmail.com>	2025-11-17 19:47:46 +08:00
Jin Hai	bd4bc57009	Refactor: move mcp connection utilities to common (#11304 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-17 15:34:17 +08:00
Kevin Hu	f441f8ffc2	Fix: waitForResponse component. (#11172 ) ### What problem does this PR solve? #10056 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2025-11-11 16:58:47 +08:00
Jin Hai	360f5c1179	Move token related functions to common (#10942 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-03 08:50:05 +08:00
纷繁下的无奈	84d1ffe44c	Feature/add new models for token pony and bug fix for use llm (#10823 ) new models for token pony and bug fix for use llm Co-authored-by: huangzl <huangzl@shinemo.com>	2025-10-28 10:04:41 +08:00
Stephen Hu	b30f0be858	Refactor: How LiteLLMBase Calculate total count (#10532 ) ### What problem does this PR solve? How LiteLLMBase Calculate total count ### Type of change - [x] Refactoring	2025-10-22 12:25:31 +08:00
buua436	4e86ee4ff9	Feat: Support Specifying OpenRouter Model Provider (#10550 ) ### What problem does this PR solve? issue: [#5787](https://github.com/infiniflow/ragflow/issues/5787) change: Support Specifying OpenRouter Model Provider ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-16 09:39:59 +08:00
Günter Lukas	5037a28e4d	Fix problem with Google Cloud models with reasoning (like gemini) - Additional fix to issue #10474 (#10502 ) ### What problem does this PR solve? Issue #10474 - Update to PR #10477 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue)	2025-10-15 14:54:20 +08:00
Günter Lukas	fee757eb41	Fix: Disable reasoning on Gemini 2.5 Flash by default (#10477 ) ### What problem does this PR solve? Gemini 2.5 Flash Models use reasoning by default. There is currently no way to disable this behaviour. This leads to very long response times (> 1min). The default behaviour should be, that reasoning is disabled and configurable issue #10474 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue)	2025-10-11 10:22:51 +08:00
Günter Lukas	0283e4098f	Fix #10408 (#10471 ) ### What problem does this PR solve? Google Cloud model does not work correctly with gemini-2.5 models Close #10408 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-10-10 19:18:24 +08:00
Kevin Hu	0d8791936e	Feat: TOC retrieval (#10456 ) ### What problem does this PR solve? #10436 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 17:07:55 +08:00
buua436	5d167cd772	feat: support qwq reasoning models with non-stream output (#10468 ) ### What problem does this PR solve? issue: [#6193](https://github.com/infiniflow/ragflow/issues/6193) change: support qwq reasoning models with non-stream output ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 16:38:04 +08:00
Billy Bao	1a47e136e3	Feat: Adds a new feature that enables the LLM to extract a structured table of contents (TOC) directly from plain text. (#10428 ) ### What problem does this PR solve? Adds a new feature that enables the LLM to extract a structured table of contents (TOC) directly from plain text. _This implementation prioritizes efficiency over reasoning — the model runs in a strictly deterministic mode (thinking disabled) to minimize latency. As a result, overall performance may be less optimal, but the extraction speed and consistency are guaranteed._ ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-09 13:47:31 +08:00
DeerAPI	dfc5fa1f4d	Feat: add DeerAPI support (#10303 ) ### Related issues #10078 ### What problem does this PR solve? Integrate DeerAPI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update Co-authored-by: DeerAPI <tensor.null@gmail.com>	2025-10-09 11:14:49 +08:00

1 2 3 4 5

245 Commits