Fix enable_thinking parameter for Qwen3 models (#12603)

### Issue When using Qwen3 models (`qwen3-32b`, `qwen3-max`) through the Tongyi-Qianwen provider for non-streaming calls (e.g., knowledge graph generation), the API fails with: Closes #12424 ``` parameter.enable_thinking must be set to false for non-streaming calls ``` ### Root Cause In `LiteLLMBase.async_chat()`, the `extra_body={"enable_thinking": False}` was set in `kwargs` but never forwarded to `_construct_completion_args()`. ### What problem does this PR solve? Pass merged kwargs to `_construct_completion_args()` using `**{**gen_conf, **kwargs}` to safely handle potential duplicate parameters. ### Changes - `rag/llm/chat_model.py`: Forward kwargs containing `extra_body` to `_construct_completion_args()` in `async_chat()` _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=42954461
2026-01-19 03:35:11 +08:00 · 2026-01-14 03:35:46 -05:00
parent 5b22f94502
commit b091ff2730
1 changed files with 1 additions and 1 deletions
--- a/rag/llm/chat_model.py
+++ b/rag/llm/chat_model.py
@ -1263,7 +1263,7 @@ class LiteLLMBase(ABC):
        if self.model_name.lower().find("qwen3") >= 0:
            kwargs["extra_body"] = {"enable_thinking": False}

-        completion_args = self._construct_completion_args(history=hist, stream=False, tools=False, **gen_conf)
+        completion_args = self._construct_completion_args(history=hist, stream=False, tools=False, **{**gen_conf, **kwargs})

        for attempt in range(self.max_retries + 1):
            try: