Fix: avoid empty doc filter in knowledge retrieval (#13484)

## Summary
Fix knowledge-base chat retrieval when no individual document IDs are
selected.

## Root Cause
`async_chat()` initialized `doc_ids` as an empty list when the request
did not explicitly select documents. That empty list was then forwarded
into retrieval as an active `doc_id` filter, effectively becoming
`doc_id IN []` and suppressing all chunk matches.

## Changes
- treat missing selected document IDs as `None` instead of `[]`
- keep explicit document filtering when IDs are actually provided
- add regression coverage for the shared chat retrieval path

## Validation
- `python3 -m py_compile api/db/services/dialog_service.py
test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py`
- `.venv/bin/python -m pytest
test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py`
- manually verified that chat completions again inject retrieved
knowledge into the prompt

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
This commit is contained in:
Josh
2026-03-12 09:03:30 +01:00
committed by GitHub
parent 227c852e67
commit a353c7bdd7
2 changed files with 100 additions and 3 deletions

View File

@ -494,12 +494,14 @@ async def async_chat(dialog, messages, stream=True, **kwargs):
retriever = settings.retriever
questions = [m["content"] for m in messages if m["role"] == "user"][-3:]
attachments = kwargs["doc_ids"].split(",") if "doc_ids" in kwargs else []
attachments = None
if "doc_ids" in kwargs:
attachments = [doc_id for doc_id in kwargs["doc_ids"].split(",") if doc_id]
attachments_= ""
image_attachments = []
image_files = []
if "doc_ids" in messages[-1]:
attachments = messages[-1]["doc_ids"]
attachments = [doc_id for doc_id in messages[-1]["doc_ids"] if doc_id]
if "files" in messages[-1]:
if llm_type == "chat":
text_attachments, image_attachments = split_file_attachments(messages[-1]["files"])
@ -559,7 +561,7 @@ async def async_chat(dialog, messages, stream=True, **kwargs):
kbinfos = {"total": 0, "chunks": [], "doc_aggs": []}
knowledges = []
if attachments is not None and "knowledge" in param_keys:
if "knowledge" in param_keys:
logging.debug("Proceeding with retrieval")
tenant_ids = list(set([kb.tenant_id for kb in kbs]))
knowledges = []