Fix: preserve field boundaries in chunked documents from MySQL… (#13369)

### What problem does this PR solve?

When multiple columns are used as content columns in the RDBMS connector,
the generated document text is chunked by TxtParser, which strips
newline delimiters when it merges segments. As a result, field names and
values from different columns are concatenated without any separator,
making the content unreadable.
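
As a rough illustration of the failure mode, the sketch below splits text on newlines and then merges the pieces by plain concatenation. It is a simplified stand-in, not the actual TxtParser code: `split_sections`, `merge_without_separator`, and the sample `[field]: value` layout are all hypothetical.

```python
import re


def split_sections(text: str) -> list[str]:
    """Split on newlines and drop the delimiter (simplified stand-in)."""
    return [s for s in re.split(r"\n+", text) if s]


def merge_without_separator(sections: list[str]) -> str:
    """Concatenate sections directly, with nothing between them."""
    chunk = ""
    for s in sections:
        chunk += s
    return chunk


doc = "[title]: Quarterly report\n[author]: Alice"
merged = merge_without_separator(split_sections(doc))
print(merged)  # [title]: Quarterly report[author]: Alice
```

The value of one field runs straight into the name of the next, which is the unreadable output described above.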

Changes:
- txt_parser.py: restore the newline separator when merging adjacent text
segments within a chunk, so that split sections are no longer
concatenated directly
- rdbms_connector.py: separate fields with a double newline and place each
field value on its own line after the bracketed field name, giving
TxtParser clearer boundaries to work with
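
The two changes can be sketched together as follows. This is a minimal approximation under stated assumptions, not the code from either file: `format_fields`, the `[name]:` bracket style, the `limit` parameter, the `tk_nums[-1] > limit` condition, and the whitespace-based token count are all illustrative. Only the newline-restoring branch mirrors the diff in this commit.

```python
def format_fields(row: dict[str, object]) -> str:
    """Render a row with the value on the line after the bracketed field
    name and a blank line between fields (hypothetical layout)."""
    return "\n\n".join(f"[{name}]:\n{value}" for name, value in row.items())


def add_chunk(cks: list[str], tk_nums: list[int], t: str, tnum: int, limit: int) -> None:
    """Merge segment t into the last chunk, or start a new chunk."""
    if tk_nums[-1] > limit:  # assumed condition for starting a new chunk
        cks.append(t)
        tk_nums.append(tnum)
    else:
        # Restore the newline only when the chunk already has content,
        # so the first segment does not begin with a stray "\n".
        if cks[-1]:
            cks[-1] += "\n" + t
        else:
            cks[-1] += t
        tk_nums[-1] += tnum


cks, tk_nums = [""], [0]
text = format_fields({"title": "Quarterly report", "author": "Alice"})
for seg in text.split("\n"):
    if seg:  # skip the empty piece produced by the double newline
        add_chunk(cks, tk_nums, seg, len(seg.split()), limit=128)
print(cks[0])
```

With the separator restored, each field name and each value stays on its own line inside the merged chunk.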

Closes #13001

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: tunsuytang <tunsuytang@tencent.com>
tunsuy
2026-03-04 21:42:02 +08:00
committed by GitHub
parent 9deb3a6249
commit 020068dd16
2 changed files with 8 additions and 5 deletions

@@ -40,7 +40,10 @@ class RAGFlowTxtParser:
                 cks.append(t)
                 tk_nums.append(tnum)
             else:
-                cks[-1] += t
+                if cks[-1]:
+                    cks[-1] += "\n" + t
+                else:
+                    cks[-1] += t
                 tk_nums[-1] += tnum
         dels = []