eb5522ff29
revert: api/core/rag/retrieval/dataset_retrieval.py, separate it into another PR
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
2026-01-09 16:23:42 +08:00
7e33faecfe
fix(api): switch dataset query created_by_role to CreatorUserRole enums
...
Note: `CreatorUserRole.END_USER` is `"end_user"` (underscore), matching the prior value.
Tests not run (not requested).
2026-01-09 16:23:42 +08:00
a015cad8b8
chore: run make lint (2 files reformatted, all checks passed)
...
chore: run make type-check (0 errors, 0 warnings, 0 notes)
2026-01-09 16:23:42 +08:00
27932cb669
Updated api/core/rag/retrieval/dataset_retrieval.py to set user_from via UserFrom, so dataset query attribution aligns with the enum used elsewhere in the codepath.
...
fix(api): use UserFrom enum for dataset retrieval user_from
Tests not run (not requested).
1) `make lint`
2) `make type-check`
3) `uv run --project api --dev dev/pytest/pytest_unit_tests.sh`
2026-01-09 16:23:41 +08:00
4f0fb6df2b
chore: use from __future__ import annotations (#30254)
...
Co-authored-by: Dev <dev@Devs-MacBook-Pro-4.local>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Asuka Minato <i@asukaminato.eu.org>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2026-01-06 23:57:20 +09:00
114a34e008
fix: correct docx hyperlink extraction (#30360)
2026-01-06 11:24:26 +08:00
615c313f80
fix(api): refactor SQL LIKE pattern escaping to use a centralized utility function, ensuring consistent and secure handling of special characters across all database queries (#30450)
...
Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-06 09:56:30 +08:00
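For illustration, a centralized LIKE-escaping helper of the kind the commit above describes might look like the sketch below. The names `escape_like_pattern` and `contains_pattern`, and the choice of backslash as the escape character, are assumptions and are not taken from the commit itself.

```python
def escape_like_pattern(value: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so user input is matched literally.

    Hypothetical helper: the name and default escape character are assumptions.
    """
    # Escape the escape character first so the replacements below
    # are not themselves escaped a second time.
    escaped = value.replace(escape_char, escape_char * 2)
    escaped = escaped.replace("%", escape_char + "%")
    escaped = escaped.replace("_", escape_char + "_")
    return escaped


def contains_pattern(value: str) -> str:
    """Build a 'contains' LIKE pattern around the escaped user input."""
    return f"%{escape_like_pattern(value)}%"
```

The query also has to declare the escape character, for example with SQLAlchemy's `column.like(pattern, escape="\\")`; otherwise the database may not treat the backslash as an escape.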
631f999f65
refactor: use contains_any instead of chaining where = where | f (#30559)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-05 15:48:31 +08:00
be3ef9f050
fix: #30511 [Bug] knowledge_retrieval_node fails when using Rerank Model: "Working outside of application context"; add regression test (#30549)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-05 15:02:21 +08:00
473f8ef29c
feat: skip rerank if only one dataset is retrieved (#30075)
...
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2026-01-04 20:22:51 +08:00
cad7101534
feat: support image extraction in PDF RAG extractor (#30399)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-31 15:49:06 +08:00
9007109a6b
fix: [xxx](xxx) render as xxx](xxx) (#30392)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-31 10:30:15 +08:00
30dd50ff83
feat: allow fail fast (#30262)
2025-12-30 09:27:40 +08:00
f610f6895f
fix: retrieval test and knowledge retrieval node failed in multimodal mode (#30210)
...
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-26 21:42:06 +08:00
d20a8d5b77
fix: fix missing not in (#30207)
2025-12-26 16:52:34 +08:00
8611301722
fix: fix DatasetRetrieval._process_metadata_filter_func missing the in operator (#30199)
2025-12-26 16:34:50 +08:00
61d255a6e6
chore: bypass InsufficientPrivilege on Azure PostgreSQL (#30191)
2025-12-26 14:35:05 +08:00
a5309bee25
fix: handle missing credential_id (#30051)
...
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-24 11:21:51 +08:00
111a39b549
fix: fix firecrawl url concat (#30008)
2025-12-24 09:40:32 +08:00
9701a2994b
chore: Translate stray Chinese comment to English (#30024)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-23 14:05:21 +08:00
eaf4146e2f
perf: optimize DatasetRetrieval.retrieve, RetrievalService._deduplicat… (#29981)
2025-12-22 20:08:21 +08:00
32605181bd
feat: use INTERNAL_FILES_URL first, then FILES_URL (#29962)
2025-12-21 16:53:37 +08:00
5067e4f255
fix 29184 (#29188)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-18 17:11:52 +08:00
78ca5ad142
fix: fix fixed_separator (#29861)
2025-12-18 16:50:44 +08:00
8d1e36540a
fix: detect_file_encodings TypeError: tuple indices must be integers or slices, not str (#29595)
...
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-12-17 13:58:05 +08:00
ae4a9040df
Feat/update notion preview (#29345)
...
Co-authored-by: twwu <twwu@dify.ai>
2025-12-16 16:43:45 +08:00
4cc6652424
feat: VECTOR_STORE supports seekdb (#29658)
2025-12-16 12:35:04 +09:00
4bf6c4dafa
chore: add online drive metadata source enum (#29674)
2025-12-15 21:13:23 +08:00
8f3fd9a728
perf: commit once (#29590)
2025-12-15 11:40:26 +08:00
569c593240
feat: Add InterSystems IRIS vector database support (#29480)
...
Co-authored-by: Tomo Okuyama <tomo.okuyama@intersystems.com>
2025-12-15 10:20:43 +08:00
db42f467c8
fix: docx extractor external image failed (#29558)
2025-12-12 13:41:51 +08:00
12e39365fa
perf(core/rag): optimize Excel extractor performance and memory usage (#29551)
...
Co-authored-by: 01393547 <nieronghua@sf-express.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-12 12:15:03 +08:00
69a22af1c9
fix: optimize database query when retrieving knowledge in App (#29467)
2025-12-11 13:50:46 +08:00
18082752a0
fix knowledge pipeline run failing on multimodal documents (#29431)
2025-12-10 20:42:51 +08:00
784008997b
fix parent-child check when child chunk does not exist (#29426)
2025-12-10 18:45:43 +08:00
b49e2646ff
fix: session unbound during parent-child retrieval (#29396)
2025-12-10 14:08:55 +08:00
e205182e1f
fix: Parent instance <DocumentSegment at 0x7955b5572c90> is not bound… (#29377)
2025-12-10 10:01:45 +08:00
9affc546c6
Feat/support multimodal embedding (#29115)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-12-09 14:41:46 +08:00
ca61bb5de0
fix: Weaviate was not closed properly (#29301)
2025-12-09 10:23:29 +08:00
45911ab0af
feat: using charset_normalizer instead of chardet (#29022)
2025-12-05 11:19:19 +08:00
0af8a7b958
feat: enhance OceanBase vector database with SQL injection fixes, unified processing, and improved error handling (#28951)
2025-12-01 09:51:47 +08:00
acbc886ecd
fix: implement score_threshold filtering for OceanBase vector search (#28536)
...
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-29 18:50:21 +08:00
d7010f582f
Fix 500 error in knowledge base when selecting weightedScore and clicking retrieve (#28586)
...
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-26 16:44:00 +08:00
f76a3f545c
Feat/add weaviate tokenization configurable (#28159)
...
Co-authored-by: lijiezhao <lijiezhao@perfect99.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-25 20:07:45 +08:00
83702762c8
use non-root user in docker image by default (#26419)
2025-11-25 19:59:45 +08:00
751ce4ec41
more typed orm (#28577)
...
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-24 21:01:46 +08:00
5f61ca5e6f
feat: Implement partial update for document metadata, allowing merging of new values with existing ones. (#28390)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-11-21 12:58:20 +08:00
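As a rough illustration of the partial-update behavior named in the commit above, the sketch below merges new metadata values into existing ones instead of replacing the whole mapping. The function name `merge_metadata` and the shallow (top-level keys only) merge are assumptions; the actual implementation may differ.

```python
from __future__ import annotations


def merge_metadata(existing: dict | None, updates: dict) -> dict:
    """Return a new dict with `updates` merged over `existing`.

    Hypothetical helper: keys absent from `updates` keep their existing
    values, and keys present in both take the new value.
    """
    merged = dict(existing or {})  # copy so the caller's dict is not mutated
    merged.update(updates)
    return merged
```

A full replacement would drop any key missing from the request, whereas this merge keeps it, which is the difference a partial update is meant to provide.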
c0b7ffd5d0
feat: mysql adaptation for metadb (#28188)
2025-11-20 09:44:39 +08:00
c74eb4fcf3
minor fix(rag): return early when pushing empty tasks to avoid Redis DataError (#28027)
...
Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
2025-11-13 20:18:11 +08:00
81832c14ee
Fix: Correctly handle merged cells in DOCX tables to prevent content duplication and loss (#27871)
...
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-11-13 15:56:24 +08:00