Commit Graph

174 Commits

Author SHA1 Message Date
Yi
b92fced974 Merge branch 'main' into feat/external-knowledge-api 2024-09-27 22:39:04 +08:00
55e6123db9 feat: add min-connection and max-connection for pgvector (#8841) 2024-09-27 18:16:20 +08:00
020766a5e8 Merge branch 'main' into feat/external-knowledge-api
# Conflicts:
#	api/poetry.lock
2024-09-27 17:49:40 +08:00
4c1063e1c5 fix: AnalyticdbVector retrieval scores (#8803) 2024-09-27 12:05:21 +08:00
008e0efeb0 refactor: update delete method as an abstract method (#8794) 2024-09-26 16:36:21 +08:00
bf64ff215b fix: . is missing in file_extension (#8736) 2024-09-25 10:09:20 +08:00
a258f8dfdf remove description 2024-09-24 23:32:23 +08:00
30dc137ccc Merge branch 'main' into feat/external-knowledge-api
# Conflicts:
#	api/core/rag/retrieval/dataset_retrieval.py
2024-09-24 18:03:14 +08:00
089da063d4 External knowledge api 2024-09-24 18:00:45 +08:00
ed92c90a40 External knowledge api 2024-09-24 17:52:16 +08:00
omr
8fd297f8b4 fix: redundant check for available_document_count (#8491) 2024-09-22 13:39:41 +08:00
19c526120c external knowledge api 2024-09-19 17:07:33 +08:00
dcb033d221 Merge branch 'main' into feat/external-knowledge
# Conflicts:
#	api/core/rag/datasource/retrieval_service.py
#	api/models/dataset.py
#	api/services/dataset_service.py
2024-09-18 14:40:43 +08:00
9f894bb3b3 external knowledge api 2024-09-18 14:36:51 +08:00
7e611ffbf3 multi-retrival use dataset's top-k (#8416) 2024-09-14 21:48:44 +08:00
a1104ab97e chore: refurish python code by applying Pylint linter rules (#8322) 2024-09-13 22:42:08 +08:00
6613b8f2e0 chore: fix unnecessary string concatation in single line (#8311) 2024-09-13 14:24:49 +08:00
08c486452f fix: score_threshold handling in vector search methods (#8356) 2024-09-13 14:24:35 +08:00
49cee773c5 fixed score threshold is none (#8342) 2024-09-13 10:21:58 +08:00
89e81873c4 merge error 2024-09-13 09:49:24 +08:00
40fb4d16ef chore: refurbish Python code by applying refurb linter rules (#8296) 2024-09-12 15:50:49 +08:00
c69f5b07ba chore: apply ruff E501 line-too-long linter rule (#8275)
Co-authored-by: -LAN- <laipz8200@outlook.com>
2024-09-12 14:00:36 +08:00
56c90e212a fix(workflow): missing content in the answer node stream output during iterations (#8292)
Co-authored-by: -LAN- <laipz8200@outlook.com>
2024-09-12 13:59:48 +08:00
0f14873255 chore: cleanup ruff flake8-simplify linter rules (#8286)
Co-authored-by: -LAN- <laipz8200@outlook.com>
2024-09-12 12:55:45 +08:00
292220c596 chore: apply pep8-naming rules for naming convention (#8261) 2024-09-11 16:40:52 +08:00
bb3002b173 revert page column (#8217) 2024-09-10 18:21:22 +08:00
2cf1187b32 chore(api/core): apply ruff reformatting (#7624) 2024-09-10 17:00:20 +08:00
af92f19291 filter excel empty sheet (#8194) 2024-09-10 14:55:08 +08:00
2d7954c7da Fix variable typo (#8084) 2024-09-08 13:14:11 +08:00
2060db8e11 fix: change milvus init args from (host, port) to (url, token) (#8019)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
2024-09-06 17:32:48 +08:00
d489b8b3e0 feat: return page number of pdf documents upon retrieval (#7749) 2024-09-05 16:43:26 +08:00
0e71f6db84 fix spliter length missed (#7987) 2024-09-04 21:47:12 +08:00
14af87527f Feat:remove estimation of embedding cost (#7950)
Co-authored-by: jyong <718720800@qq.com>
2024-09-04 14:41:47 +08:00
571415d1a4 fix: split text keep separator (#7930) 2024-09-04 12:59:10 +08:00
d8b6c053a2 fix rerank model value is empty string (#7937) 2024-09-03 21:25:21 +08:00
01581dd35f improve the notion table extract (#7925) 2024-09-03 17:52:07 +08:00
6f33351eb3 ignore linked images when image id is none (#7890) 2024-09-02 19:37:05 +08:00
60001a62c4 fixed chunk_overlap is None (#7703) 2024-08-27 16:38:06 +08:00
122ce41020 feat: rewrite Elasticsearch index and search code to achieve Elasticsearch vector and full-text search (#7641)
Co-authored-by: haokai <haokai@shuwen.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
Co-authored-by: wellCh4n <wellCh4n@foxmail.com>
2024-08-27 11:43:44 +08:00
162faee4f2 fix: set score_threshold to zero if it is None for MyScale vectordb (#7640)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2024-08-27 09:47:16 +08:00
7ae728a9a3 fix nltk averaged_perceptron_tagger download and fix score limit is none (#7582)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2024-08-26 15:14:05 +08:00
f29685f8a1 fix score_threshold is none, return all top K documents (#7581) 2024-08-23 16:59:34 +08:00
0223fc6fd5 feat: add pgvector full_text_search (#7396) 2024-08-20 11:01:13 +08:00
ba79088ffc Fix SQL parser Error in MyScale vdb. (#7255) 2024-08-14 16:41:18 +08:00
f104b930cf feat: support elasticsearch vector database (#3558)
Co-authored-by: miendinh <miendinh@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: crazywoola <427733928@qq.com>
2024-08-13 17:36:20 +08:00
ccb6ddd840 chore: bump Ruff to 0.5.7 (#7174) 2024-08-12 10:24:48 +08:00
Joe
425174e82f feat: update ops trace (#7102) 2024-08-09 15:22:16 +08:00
12095f8cd6 extract docx filter comment element (#7092) 2024-08-08 16:53:29 +08:00
169cde6c3c add nltk punkt resource (#7063) 2024-08-08 14:23:22 +08:00
40c6f3c724 fix: add redis lock to AnalyticdbVector init (#6859)
Co-authored-by: xiaozeyu <xiaozeyu.xzy@alibaba-inc.com>
2024-08-07 17:32:06 +08:00