Files
ragflow/api/apps/sdk/session.py
tmimmanuel 6ce014c23b fix: offload blocking DB/Redis calls to thread pool for high-concurrency support (#13825) (#13941)
### What problem does this PR solve?

Addresses event-loop blocking under high concurrency reported in #13825.
When multiple requests hit the API simultaneously, synchronous DB/Redis
calls block the async event loop, preventing Quart from handling other
requests and causing cascading 502/504 timeouts.

This PR wraps all remaining blocking DB/Redis calls in `canvas_app.py`,
`chat_api.py`, `session.py`, and `canvas_service.py` with `await
thread_pool_exec()`
- Offload all synchronous `Service.*`, `REDIS_CONN.*`, and
`APIToken.query` calls to the thread pool
- Convert sync endpoint handlers (`list_chats`, `get_chat`, `templates`,
`sessions`, etc.) to `async def`
- Convert sync helper functions (`_ensure_owned_chat`,
`_validate_llm_id`, `_validate_dataset_ids`, etc.) to async - no
duplicate sync/async pairs
- Wrap `CanvasReplicaService` Redis IO calls (`bootstrap`,
`replace_for_set`, `commit_after_run`)
- Use `asyncio.gather()` for concurrent file uploads and chat response
building

**Note:** This fixes the code-level event-loop blocking, which is a
prerequisite for handling concurrent requests. For the full "30
concurrent requests without 502/504" goal described in the issue, users
should also tune deployment config:
- `WS=4` or higher (HTTP worker processes, default 1)
- `MAX_CONCURRENT_CHATS=50` (default 10)
- `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` for workflow-heavy workloads

### Performance verification

Reviewer asked for a before-vs-after comparison
([comment](https://github.com/infiniflow/ragflow/pull/13941#issuecomment-4393667231)).
I built a self-contained microbenchmark that reproduces the exact
failure mode this PR targets: an async handler that performs blocking
DB/Redis-style calls (50 ms each, 3 per request, 30 concurrent requests)
is run twice — once with the pre-PR pattern (sync call directly inside
the async handler) and once with the post-PR pattern (`await
thread_pool_exec(...)`). The benchmark imports nothing from RAGFlow
except `thread_pool_exec` itself, so it is hermetic and reproducible
(`THREAD_POOL_MAX_WORKERS=128`, Python 3.13.12).

**Throughput — wall-clock for 30 concurrent requests (lower is better)**

| flavour | wall(s) | p50(s) | p95(s) | max(s) |
|---|---:|---:|---:|---:|
| before | 4.986 | 0.158 | 0.207 | 0.269 |
| after  | 0.248 | 0.181 | 0.230 | 0.231 |

The pre-PR handler serializes the entire load on the event-loop thread,
so 30 × 3 × 50 ms ≈ 4.5 s shows up as the wall time. The post-PR handler
parallelizes the blocking work across the thread pool and finishes the
same load in 248 ms — a **~20× speedup** on this workload.

**Event-loop responsiveness — latency of an unrelated probe coroutine
while the 30 slow requests are running (lower is better)**

| flavour | samples | probe p50 (ms) | probe p95 (ms) | probe max (ms) |
|---|---:|---:|---:|---:|
| before | 1 | 5442.26 | 5442.26 | 5442.26 |
| after  | 28 | 0.88 | 11.53 | 98.02 |

This is the metric that maps directly to "the API still answers other
requests while one is busy". A 5 ms-interval probe was scheduled while
the 30 slow handlers ran. With the pre-PR code the event loop was frozen
for the entire duration of the blocking work, so only one probe sample
was ever picked up and it waited **5,442 ms**. After the PR, 28 probe
samples landed with **p50 0.88 ms / p95 11.53 ms**, meaning unrelated
requests are no longer starved by the slow ones. That is the regression
mode behind the cascading 502/504s reported in #13825.

<details>
<summary>Raw benchmark output</summary>

```
config: 30 concurrent requests, 3 blocking calls of 50ms each per request, THREAD_POOL_MAX_WORKERS=128

=== Throughput (lower wall is better) ===
flavour       wall(s)     p50(s)     p95(s)     max(s)
before          4.986      0.158      0.207      0.269
after           0.248      0.181      0.230      0.231

=== Event-loop responsiveness (lower probe latency is better) ===
flavour       samples    probe p50(ms)    probe p95(ms)    probe max(ms)
before              1          5442.26          5442.26          5442.26
after              28             0.88            11.53            98.02
```

</details>

The benchmark script is included as a comment on the PR for
reproducibility.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement

Closes [#13825](https://github.com/infiniflow/ragflow/issues/13825)

---------

Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-05-11 15:08:55 +08:00

618 lines
26 KiB
Python

#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import json
import re
import logging
from quart import Response, request
from agent.canvas import Canvas
from api.db.db_models import APIToken
from api.db.services.api_service import API4ConversationService
from api.db.services.canvas_service import UserCanvasService
from api.db.services.canvas_service import completion as agent_completion
from api.db.services.user_canvas_version import UserCanvasVersionService
from api.db.services.conversation_service import async_iframe_completion as iframe_completion
from api.db.services.dialog_service import DialogService, async_ask, gen_mindmap
from api.db.services.doc_metadata_service import DocMetadataService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import LLMBundle
from common.metadata_utils import apply_meta_data_filter
from api.db.services.search_service import SearchService
from api.db.services.user_service import UserTenantService
from api.db.joint_services.tenant_model_service import get_tenant_default_model_by_type, get_model_config_by_id, \
get_model_config_by_type_and_name
from common.misc_utils import get_uuid, thread_pool_exec
from api.utils.api_utils import check_duplicate_ids, get_error_data_result, get_json_result, \
get_result, get_request_json, server_error_response, token_required, validate_request
from rag.app.tag import label_question
from rag.prompts.template import load_prompt
from rag.prompts.generator import cross_languages, keyword_extraction
from common.constants import RetCode, LLMType, StatusEnum
from common import settings
from api.utils.reference_metadata_utils import (
enrich_chunks_with_document_metadata,
resolve_reference_metadata_preferences,
)
logger = logging.getLogger(__name__)
@token_required
async def create_agent_session(tenant_id, agent_id):
req = await get_request_json()
user_id = req.get("user_id") or request.args.get("user_id", tenant_id)
release_mode = bool(req.get("release", request.args.get("release", False)))
if not await thread_pool_exec(UserCanvasService.query, user_id=tenant_id, id=agent_id):
return get_error_data_result("You cannot access the agent.")
try:
cvs, dsl = await thread_pool_exec(UserCanvasService.get_agent_dsl_with_release, agent_id, release_mode, tenant_id)
except LookupError:
return get_error_data_result("Agent not found.")
except PermissionError as e:
return get_error_data_result(str(e))
session_id = get_uuid()
canvas = Canvas(dsl, tenant_id, agent_id, canvas_id=cvs.id)
canvas.reset()
cvs.dsl = json.loads(str(canvas))
# Get the version title based on release_mode
version_title = await thread_pool_exec(UserCanvasVersionService.get_latest_version_title, cvs.id, release_mode=release_mode)
conv = {
"id": session_id,
"dialog_id": cvs.id,
"user_id": user_id,
"message": [{"role": "assistant", "content": canvas.get_prologue()}],
"source": "agent",
"dsl": cvs.dsl,
"version_title": version_title
}
await thread_pool_exec(API4ConversationService.save, **conv)
conv["agent_id"] = conv.pop("dialog_id")
return get_result(data=conv)
@manager.route("/agents/<agent_id>/sessions", methods=["DELETE"]) # noqa: F821
@token_required
async def delete_agent_session(tenant_id, agent_id):
errors = []
success_count = 0
req = await get_request_json()
cvs = await thread_pool_exec(UserCanvasService.query, user_id=tenant_id, id=agent_id)
if not cvs:
return get_error_data_result(f"You don't own the agent {agent_id}")
if not req:
return get_result()
ids = req.get("ids")
if not ids:
if req.get("delete_all") is True:
ids = [conv.id for conv in await thread_pool_exec(API4ConversationService.query, dialog_id=agent_id)]
if not ids:
return get_result()
else:
return get_result()
conv_list = ids
unique_conv_ids, duplicate_messages = check_duplicate_ids(conv_list, "session")
conv_list = unique_conv_ids
for session_id in conv_list:
conv = await thread_pool_exec(API4ConversationService.query, id=session_id, dialog_id=agent_id)
if not conv:
errors.append(f"The agent doesn't own the session {session_id}")
continue
await thread_pool_exec(API4ConversationService.delete_by_id, session_id)
success_count += 1
if errors:
if success_count > 0:
return get_result(data={"success_count": success_count, "errors": errors},
message=f"Partially deleted {success_count} sessions with {len(errors)} errors")
else:
return get_error_data_result(message="; ".join(errors))
if duplicate_messages:
if success_count > 0:
return get_result(
message=f"Partially deleted {success_count} sessions with {len(duplicate_messages)} errors",
data={"success_count": success_count, "errors": duplicate_messages})
else:
return get_error_data_result(message=";".join(duplicate_messages))
return get_result()
@manager.route("/chatbots/<dialog_id>/completions", methods=["POST"]) # noqa: F821
async def chatbot_completions(dialog_id):
req = await get_request_json()
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
tenant_id = objs[0].tenant_id
exists, dialog = DialogService.get_by_id(dialog_id)
if (not exists
or getattr(dialog, "tenant_id", None) != tenant_id
or str(getattr(dialog, "status", "")) != StatusEnum.VALID.value):
logger.warning(
"Denied chatbot access: reason=%s tenant_id=%s dialog_id=%s user_id=%s session_id=%s",
"no access to this chatbot",
tenant_id,
dialog_id,
req.get("user_id"),
req.get("session_id"),
)
return get_error_data_result(message="Authentication error: no access to this chatbot!")
if "quote" not in req:
req["quote"] = False
def _validate_iframe_access():
if req.get("session_id"):
exists, conv = API4ConversationService.get_by_id(req.get("session_id"))
if not exists:
raise AssertionError("Session not found!")
if conv.dialog_id != dialog_id:
raise AssertionError("Session does not belong to this dialog")
if tenant_id and conv.user_id and conv.user_id != tenant_id:
raise AssertionError("Session does not belong to this tenant")
if req.get("stream", True):
try:
_validate_iframe_access()
except AssertionError:
logger.warning(
"Denied chatbot completion stream: reason=%s tenant_id=%s dialog_id=%s user_id=%s session_id=%s",
"no access to this chatbot",
tenant_id,
dialog_id,
req.get("user_id"),
req.get("session_id"),
)
return get_error_data_result(message="Authentication error: no access to this chatbot!")
resp = Response(iframe_completion(dialog_id, tenant_id=tenant_id, **req), mimetype="text/event-stream")
resp.headers.add_header("Cache-control", "no-cache")
resp.headers.add_header("Connection", "keep-alive")
resp.headers.add_header("X-Accel-Buffering", "no")
resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
return resp
try:
_validate_iframe_access()
async for answer in iframe_completion(dialog_id, tenant_id=tenant_id, **req):
return get_result(data=answer)
except AssertionError:
logger.warning(
"Denied chatbot completion: reason=%s tenant_id=%s dialog_id=%s user_id=%s session_id=%s",
"no access to this chatbot",
tenant_id,
dialog_id,
req.get("user_id"),
req.get("session_id"),
)
return get_error_data_result(message="Authentication error: no access to this chatbot!")
return None
@manager.route("/chatbots/<dialog_id>/info", methods=["GET"]) # noqa: F821
async def chatbots_inputs(dialog_id):
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
tenant_id = objs[0].tenant_id
exists, dialog = await thread_pool_exec(DialogService.get_by_id, dialog_id)
if (not exists
or getattr(dialog, "tenant_id", None) != tenant_id
or str(getattr(dialog, "status", "")) != StatusEnum.VALID.value):
request_args = getattr(request, "args", {}) or {}
request_user_id = request_args.get("user_id") if hasattr(request_args, "get") else None
request_session_id = request_args.get("session_id") if hasattr(request_args, "get") else None
logger.warning(
"Denied chatbot access: reason=%s tenant_id=%s dialog_id=%s user_id=%s session_id=%s",
"no access to this chatbot",
tenant_id,
dialog_id,
request_user_id,
request_session_id,
)
return get_error_data_result(message="Authentication error: no access to this chatbot!")
return get_result(
data={
"title": dialog.name,
"avatar": dialog.icon,
"prologue": dialog.prompt_config.get("prologue", ""),
"has_tavily_key": bool(dialog.prompt_config.get("tavily_api_key", "").strip()),
}
)
@manager.route("/agentbots/<agent_id>/completions", methods=["POST"]) # noqa: F821
async def agent_bot_completions(agent_id):
req = await get_request_json()
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
if req.get("stream", True):
async def stream():
try:
async for answer in agent_completion(objs[0].tenant_id, agent_id, **req):
yield answer
except Exception as e:
logging.exception(e)
error_result = get_error_data_result(message=str(e) or "Unknown error")
yield "data:" + json.dumps(
{
"event": "message",
"data": {"content": f"Error {error_result['code']}: {error_result['message']}\n\n"},
**error_result,
},
ensure_ascii=False,
) + "\n\n"
resp = Response(stream(), mimetype="text/event-stream")
resp.headers.add_header("Cache-control", "no-cache")
resp.headers.add_header("Connection", "keep-alive")
resp.headers.add_header("X-Accel-Buffering", "no")
resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
return resp
try:
async for answer in agent_completion(objs[0].tenant_id, agent_id, **req):
return get_result(data=answer)
except Exception as e:
logging.exception(e)
return get_error_data_result(message=str(e) or "Unknown error")
return None
@manager.route("/agentbots/<agent_id>/inputs", methods=["GET"]) # noqa: F821
async def begin_inputs(agent_id):
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
e, cvs = await thread_pool_exec(UserCanvasService.get_by_id, agent_id)
if not e:
return get_error_data_result(f"Can't find agent by ID: {agent_id}")
canvas = Canvas(json.dumps(cvs.dsl), objs[0].tenant_id, canvas_id=cvs.id)
return get_result(
data={"title": cvs.title, "avatar": cvs.avatar, "inputs": canvas.get_component_input_form("begin"),
"prologue": canvas.get_prologue(), "mode": canvas.get_mode()})
@manager.route("/searchbots/ask", methods=["POST"]) # noqa: F821
@validate_request("question", "kb_ids")
async def ask_about_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
req = await get_request_json()
uid = objs[0].tenant_id
search_id = req.get("search_id", "")
search_config = {}
if search_id:
if search_app := await thread_pool_exec(SearchService.get_detail, search_id):
search_config = search_app.get("search_config", {})
async def stream():
nonlocal req, uid
try:
async for ans in async_ask(req["question"], req["kb_ids"], uid, search_config=search_config):
yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
except Exception as e:
yield "data:" + json.dumps(
{"code": 500, "message": str(e), "data": {"answer": "**ERROR**: " + str(e), "reference": []}},
ensure_ascii=False) + "\n\n"
yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
resp = Response(stream(), mimetype="text/event-stream")
resp.headers.add_header("Cache-control", "no-cache")
resp.headers.add_header("Connection", "keep-alive")
resp.headers.add_header("X-Accel-Buffering", "no")
resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
return resp
@manager.route("/searchbots/retrieval_test", methods=["POST"]) # noqa: F821
@validate_request("kb_id", "question")
async def retrieval_test_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
req = await get_request_json()
page = int(req.get("page", 1))
size = int(req.get("size", 30))
question = req["question"]
kb_ids = req["kb_id"]
if isinstance(kb_ids, str):
kb_ids = [kb_ids]
if not kb_ids:
return get_json_result(data=False, message='Please specify dataset firstly.',
code=RetCode.DATA_ERROR)
doc_ids = req.get("doc_ids", [])
similarity_threshold = float(req.get("similarity_threshold", 0.0))
vector_similarity_weight = float(req.get("vector_similarity_weight", 0.3))
use_kg = req.get("use_kg", False)
top = int(req.get("top_k", 1024))
if top <= 0:
return get_error_data_result("`top_k` must be greater than 0")
langs = req.get("cross_languages", [])
rerank_id = req.get("rerank_id", "")
tenant_rerank_id = req.get("tenant_rerank_id", "")
tenant_id = objs[0].tenant_id
if not tenant_id:
return get_error_data_result(message="permission denined.")
search_config = {}
async def _retrieval():
nonlocal similarity_threshold, vector_similarity_weight, top, rerank_id
local_doc_ids = list(doc_ids) if doc_ids else []
tenant_ids = []
_question = question
meta_data_filter = {}
chat_mdl = None
if req.get("search_id", ""):
nonlocal search_config
detail = await thread_pool_exec(SearchService.get_detail, req.get("search_id", ""))
if detail:
search_config = detail.get("search_config", {})
meta_data_filter = search_config.get("meta_data_filter", {})
if meta_data_filter.get("method") in ["auto", "semi_auto"]:
chat_id = search_config.get("chat_id", "")
if chat_id:
chat_model_config = await thread_pool_exec(get_model_config_by_type_and_name, tenant_id, LLMType.CHAT, chat_id)
else:
chat_model_config = await thread_pool_exec(get_tenant_default_model_by_type, tenant_id, LLMType.CHAT)
chat_mdl = LLMBundle(tenant_id, chat_model_config)
# Apply search_config settings if not explicitly provided in request
if not req.get("similarity_threshold"):
similarity_threshold = float(search_config.get("similarity_threshold", similarity_threshold))
if not req.get("vector_similarity_weight"):
vector_similarity_weight = float(search_config.get("vector_similarity_weight", vector_similarity_weight))
if not req.get("top_k"):
top = int(search_config.get("top_k", top))
if not req.get("rerank_id"):
rerank_id = search_config.get("rerank_id", "")
else:
meta_data_filter = req.get("meta_data_filter") or {}
if meta_data_filter.get("method") in ["auto", "semi_auto"]:
chat_model_config = await thread_pool_exec(get_tenant_default_model_by_type, tenant_id, LLMType.CHAT)
chat_mdl = LLMBundle(tenant_id, chat_model_config)
if meta_data_filter:
local_doc_ids = await apply_meta_data_filter(
meta_data_filter,
None,
_question,
chat_mdl,
local_doc_ids,
kb_ids=kb_ids,
metas_loader=lambda: DocMetadataService.get_flatted_meta_by_kbs(kb_ids),
)
tenants = await thread_pool_exec(UserTenantService.query, user_id=tenant_id)
for kb_id in kb_ids:
for tenant in tenants:
if await thread_pool_exec(KnowledgebaseService.query, tenant_id=tenant.tenant_id, id=kb_id):
tenant_ids.append(tenant.tenant_id)
break
else:
return get_json_result(data=False, message="Only owner of dataset authorized for this operation.",
code=RetCode.OPERATING_ERROR)
e, kb = await thread_pool_exec(KnowledgebaseService.get_by_id, kb_ids[0])
if not e:
return get_error_data_result(message="Knowledgebase not found!")
if langs:
_question = await cross_languages(kb.tenant_id, None, _question, langs)
if kb.tenant_embd_id:
embd_model_config = await thread_pool_exec(get_model_config_by_id, kb.tenant_embd_id)
else:
embd_model_config = await thread_pool_exec(get_model_config_by_type_and_name, kb.tenant_id, LLMType.EMBEDDING, kb.embd_id)
embd_mdl = LLMBundle(kb.tenant_id, embd_model_config)
rerank_mdl = None
if tenant_rerank_id:
rerank_model_config = await thread_pool_exec(get_model_config_by_id, tenant_rerank_id)
rerank_mdl = LLMBundle(kb.tenant_id, rerank_model_config)
elif rerank_id:
rerank_model_config = await thread_pool_exec(get_model_config_by_type_and_name, tenant_id, LLMType.RERANK, rerank_id)
rerank_mdl = LLMBundle(kb.tenant_id, rerank_model_config)
if req.get("keyword", False):
default_chat_model = await thread_pool_exec(get_tenant_default_model_by_type, kb.tenant_id, LLMType.CHAT)
chat_mdl = LLMBundle(kb.tenant_id, default_chat_model)
_question += await keyword_extraction(chat_mdl, _question)
labels = label_question(_question, [kb])
ranks = await settings.retriever.retrieval(
_question, embd_mdl, tenant_ids, kb_ids, page, size, similarity_threshold, vector_similarity_weight, top,
local_doc_ids, rerank_mdl=rerank_mdl, highlight=req.get("highlight"), rank_feature=labels
)
if use_kg:
default_chat_model = await thread_pool_exec(get_tenant_default_model_by_type, kb.tenant_id, LLMType.CHAT)
ck = await settings.kg_retriever.retrieval(_question, tenant_ids, kb_ids, embd_mdl,
LLMBundle(kb.tenant_id, default_chat_model))
if ck["content_with_weight"]:
ranks["chunks"].insert(0, ck)
for c in ranks["chunks"]:
c.pop("vector", None)
include_metadata, metadata_fields = _resolve_reference_metadata(req, search_config)
if include_metadata:
enrich_chunks_with_document_metadata(ranks["chunks"], metadata_fields)
ranks["labels"] = labels
return get_json_result(data=ranks)
try:
return await _retrieval()
except Exception as e:
if str(e).find("not_found") > 0:
return get_json_result(data=False, message="No chunk found! Check the chunk status please!",
code=RetCode.DATA_ERROR)
return server_error_response(e)
@manager.route("/searchbots/related_questions", methods=["POST"]) # noqa: F821
@validate_request("question")
async def related_questions_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
req = await get_request_json()
tenant_id = objs[0].tenant_id
if not tenant_id:
return get_error_data_result(message="permission denined.")
search_id = req.get("search_id", "")
search_config = {}
if search_id:
if search_app := await thread_pool_exec(SearchService.get_detail, search_id):
search_config = search_app.get("search_config", {})
question = req["question"]
chat_id = search_config.get("chat_id", "")
if chat_id:
chat_model_config = await thread_pool_exec(get_model_config_by_type_and_name, tenant_id, LLMType.CHAT, chat_id)
else:
chat_model_config = await thread_pool_exec(get_tenant_default_model_by_type, tenant_id, LLMType.CHAT)
chat_mdl = LLMBundle(tenant_id, chat_model_config)
gen_conf = search_config.get("llm_setting", {"temperature": 0.9})
prompt = load_prompt("related_question")
ans = await chat_mdl.async_chat(
prompt,
[
{
"role": "user",
"content": f"""
Keywords: {question}
Related search terms:
""",
}
],
gen_conf,
)
return get_json_result(data=[re.sub(r"^[0-9]\. ", "", a) for a in ans.split("\n") if re.match(r"^[0-9]\. ", a)])
@manager.route("/searchbots/detail", methods=["GET"]) # noqa: F821
async def detail_share_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
search_id = request.args["search_id"]
tenant_id = objs[0].tenant_id
if not tenant_id:
return get_error_data_result(message="permission denined.")
try:
tenants = await thread_pool_exec(UserTenantService.query, user_id=tenant_id)
for tenant in tenants:
if await thread_pool_exec(SearchService.query, tenant_id=tenant.tenant_id, id=search_id):
break
else:
return get_json_result(data=False, message="Has no permission for this operation.",
code=RetCode.OPERATING_ERROR)
search = await thread_pool_exec(SearchService.get_detail, search_id)
if not search:
return get_error_data_result(message="Can't find this Search App!")
return get_json_result(data=search)
except Exception as e:
return server_error_response(e)
@manager.route("/searchbots/mindmap", methods=["POST"]) # noqa: F821
@validate_request("question", "kb_ids")
async def mindmap():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!')
token = token[1]
objs = await thread_pool_exec(APIToken.query, beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
tenant_id = objs[0].tenant_id
req = await get_request_json()
search_id = req.get("search_id", "")
search_app = await thread_pool_exec(SearchService.get_detail, search_id) if search_id else {}
mind_map =await gen_mindmap(req["question"], req["kb_ids"], tenant_id, search_app.get("search_config", {}))
if "error" in mind_map:
return server_error_response(Exception(mind_map["error"]))
return get_json_result(data=mind_map)
def _resolve_reference_metadata(req, search_config=None):
return resolve_reference_metadata_preferences(req, search_config)