0501134820
Fix: support tool call config ( #14616 )
...
### What problem does this PR solve?
support tool call config
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-07 15:54:57 +08:00
78188ce9e9
Feat: add OpenDataLoader PDF parser backend ( #14058 ) ( #14097 )
...
### What problem does this PR solve?
Closes #14058.
RAGFlow supports multiple PDF parsing backends (DeepDOC, MinerU,
Docling, TCADP, PaddleOCR). This PR adds **OpenDataLoader**
([opendataloader-project/opendataloader-pdf](https://github.com/opendataloader-project/opendataloader-pdf))
as a new optional backend, giving users a deterministic, local-first
alternative with competitive table extraction accuracy.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---
### Changes
#### Backend
- `deepdoc/parser/opendataloader_parser.py` — new `OpenDataLoaderParser`
class inheriting `RAGFlowPdfParser`. Implements `check_installation()`
(guards Python package + Java 11+ runtime), `parse_pdf()` with
JSON-first extraction (heading/paragraph/table/list/image/formula) and
Markdown fallback, position-tag generation compatible with the shared
`@@page\tx0\tx1\ty0\ty1##` format, and temp-dir lifecycle with cleanup.
- `rag/app/naive.py` — new `by_opendataloader()` wrapper, registered in
`PARSERS` dict, added to `chunk_token_num=0` override list.
- `rag/flow/parser/parser.py` — `"opendataloader"` branch in the
pipeline PDF handler + check validation list.
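To illustrate the shared position-tag layout mentioned above, here is a minimal sketch. Only the `@@page\tx0\tx1\ty0\ty1##` field order comes from this PR description; the helper names and the one-decimal number formatting are assumptions, not the parser's actual code.

```python
import re

# Hypothetical helpers: only the "@@page\tx0\tx1\ty0\ty1##" layout is taken
# from the PR description. Number formatting is an assumption.
TAG_RE = re.compile(r"@@(\d+)\t([\d.]+)\t([\d.]+)\t([\d.]+)\t([\d.]+)##")


def make_position_tag(page: int, x0: float, x1: float, y0: float, y1: float) -> str:
    """Build a position tag for one text block."""
    return f"@@{page}\t{x0:.1f}\t{x1:.1f}\t{y0:.1f}\t{y1:.1f}##"


def parse_position_tag(tag: str):
    """Split a position tag back into (page, (x0, x1, y0, y1))."""
    match = TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a position tag: {tag!r}")
    page, *coords = match.groups()
    return int(page), tuple(float(c) for c in coords)


example_tag = make_position_tag(3, 72.0, 540.5, 100.5, 130.0)
```

A round trip through `parse_position_tag(example_tag)` should give back the page number and the four coordinates.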
#### Infrastructure
- `docker/entrypoint.sh` — `ensure_opendataloader()` function: opt-in
via `USE_OPENDATALOADER=true`, skips gracefully if Java is not on PATH.
#### Frontend
- `web/src/components/layout-recognize-form-field.tsx` —
`OpenDataLoader` added to `ParseDocumentType` enum and parser dropdown.
Cascades automatically to the pipeline editor's Parser component.
#### Docs
- `docs/guides/dataset/select_pdf_parser.md` — added OpenDataLoader
entry and full env-var reference.
---
### Environment variables
| Variable | Default | Description |
|---|---|---|
| `USE_OPENDATALOADER` | `false` | Set `true` to install `opendataloader-pdf` on container startup |
| `OPENDATALOADER_VERSION` | latest | Pin the PyPI release (e.g. `==2.2.1`) |
| `OPENDATALOADER_HYBRID` | _(unset)_ | Enable hybrid AI mode (e.g. `docling-fast`) |
| `OPENDATALOADER_IMAGE_OUTPUT` | _(unset)_ | `off` / `embedded` / `external` |
| `OPENDATALOADER_OUTPUT_DIR` | _(tmp)_ | Persistent output dir; temp dir used + cleaned if unset |
| `OPENDATALOADER_DELETE_OUTPUT` | `1` | `0` to retain intermediate files for debugging |
| `OPENDATALOADER_SANITIZE` | _(unset)_ | `1` to filter prompt-injection patterns from output |
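
As a rough sketch of how the variables in the table could be read, the snippet below collects them into a plain dict. The variable names and defaults come from the table; the function name, the dict keys, and the case-insensitive handling of `true` are assumptions made for illustration.

```python
import os
from typing import Mapping


def load_opendataloader_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the OpenDataLoader variables into one dict (hypothetical helper)."""
    version_spec = env.get("OPENDATALOADER_VERSION", "")  # e.g. "==2.2.1"
    return {
        # Assumption: "true" is matched case-insensitively.
        "enabled": env.get("USE_OPENDATALOADER", "false").strip().lower() == "true",
        # Empty spec means "latest" when handed to pip.
        "pip_requirement": "opendataloader-pdf" + version_spec,
        "hybrid": env.get("OPENDATALOADER_HYBRID") or None,
        "image_output": env.get("OPENDATALOADER_IMAGE_OUTPUT") or None,
        "output_dir": env.get("OPENDATALOADER_OUTPUT_DIR") or None,
        # Default "1": intermediate files are deleted unless set to "0".
        "delete_output": env.get("OPENDATALOADER_DELETE_OUTPUT", "1") != "0",
        "sanitize": env.get("OPENDATALOADER_SANITIZE") == "1",
    }


settings = load_opendataloader_settings(
    {"USE_OPENDATALOADER": "true", "OPENDATALOADER_VERSION": "==2.2.1"}
)
```

Passing a dict instead of `os.environ` keeps the helper easy to test without touching the real environment.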
---
### Dependencies
- **Runtime**: `opendataloader-pdf` (PyPI, Apache 2.0) — opt-in, not
added to `pyproject.toml` core deps. Installed by
`ensure_opendataloader()` at container startup when
`USE_OPENDATALOADER=true`.
- **System**: Java 11+ on PATH (JVM is the underlying engine). The
installer skips with a warning if `java` is not found.
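The Java 11+ requirement could be checked along the following lines. This is a sketch under assumptions, not the code in `check_installation()`: the function names are hypothetical, and it relies on `java -version` printing a `version "..."` string, which common JDK builds do.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_392" means Java 8.
    if first == 1 and match.group(2):
        return int(match.group(2))
    return first


def java_meets_minimum(minimum: int = 11) -> bool:
    """Return True if `java` is on PATH and reports at least `minimum`."""
    java = shutil.which("java")
    if java is None:
        return False
    # `java -version` writes to stderr on most JDKs, so check both streams.
    proc = subprocess.run([java, "-version"], capture_output=True, text=True, check=False)
    major = parse_java_major(proc.stderr + proc.stdout)
    return major is not None and major >= minimum
```

Keeping the parsing separate from the subprocess call makes the version logic testable on machines without Java.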
---
### How to test
**Standalone parser:**
```bash
source .venv/bin/activate
uv pip install opendataloader-pdf
python3 -c "
import sys; sys.path.insert(0, '.')
from deepdoc.parser.opendataloader_parser import OpenDataLoaderParser
p = OpenDataLoaderParser()
print('available:', p.check_installation())
s, t = p.parse_pdf('path/to/test.pdf', parse_method='pipeline')
print(f'sections={len(s)} tables={len(t)}')
"
```
### Benchmark vs Docling
```
file             parser          secs   sections  tables
----------------------------------------------------------------------
text-heavy.pdf   docling         45.29  148       10
text-heavy.pdf   opendataloader  3.14   559       0
table-heavy.pdf  docling         7.05   76        3
table-heavy.pdf  opendataloader  3.71   90        0
complex.pdf      docling         42.67  114       8
complex.pdf      opendataloader  3.51   180       0
```
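The timing columns above can be turned into speed ratios with a few lines of Python. The numbers are copied from the benchmark table; the dict layout and variable names are only for illustration.

```python
# (docling seconds, opendataloader seconds), copied from the table above.
timings = {
    "text-heavy.pdf": (45.29, 3.14),
    "table-heavy.pdf": (7.05, 3.71),
    "complex.pdf": (42.67, 3.51),
}

# Ratio > 1 means opendataloader finished faster than docling on that file.
speedups = {name: docling / odl for name, (docling, odl) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```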
2026-04-25 00:33:02 +08:00
62cb292635
Feat/tenant model ( #13072 )
...
### What problem does this PR solve?
Add id for table tenant_llm and apply in LLMBundle.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com >
Co-authored-by: Liu An <asiro@qq.com >
2026-03-05 17:27:17 +08:00
f13a1fb007
Refa: improve model verification ux ( #13392 )
...
### What problem does this PR solve?
Improve model verification UX. #13395
### Type of change
- [x] Refactoring
---------
Co-authored-by: Liu An <asiro@qq.com >
2026-03-05 17:23:47 +08:00
5fc3bd38b0
Feat: Support siliconflow.com ( #13308 )
...
### What problem does this PR solve?
Feat: Support siliconflow.com
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-03-02 15:37:42 +08:00
1262533b74
Feat: support verify to set llm key and boost bigrams. ( #12980 )
...
#12863
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-02-05 19:19:09 +08:00
2a758402ad
Fix: Hunyuan cannot work properly ( #12843 )
...
### What problem does this PR solve?
Hunyuan cannot work properly
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-27 17:04:53 +08:00
cec06bfb5d
Fix: empty chunk issue. ( #12638 )
...
#12570
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-15 17:46:21 +08:00
2e09db02f3
feat: add paddleocr parser ( #12513 )
...
### What problem does this PR solve?
Add PaddleOCR as a new PDF parser.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:48:45 +08:00
e5f3d5ae26
Refactor add_llm and add speech to text ( #12089 )
...
### What problem does this PR solve?
1. Refactor implementation of add_llm
2. Add speech to text model.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com >
2025-12-22 19:27:26 +08:00
f8fd1ea7e1
Feat: Further update Bedrock model configs ( #12029 )
...
### What problem does this PR solve?
Feat: Further update Bedrock model configs #12020 #12008
<img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69"
src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63 "
/>
<img width="700" alt="afe88ec3c58f745f85c5c507b040c250"
src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef "
/>
<img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69"
src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0 "
/>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:32:20 +08:00
e9710b7aa9
Refa: treat MinerU as an OCR model 2 ( #11905 )
...
### What problem does this PR solve?
Treat MinerU as an OCR model 2. #11903
### Type of change
- [x] Refactoring
2025-12-11 17:33:12 +08:00
a94b3b9df2
Refa: treat MinerU as an OCR model ( #11849 )
...
### What problem does this PR solve?
Treat MinerU as an OCR model.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2025-12-09 18:54:14 +08:00
51ec708c58
Refa: cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats ( #11779 )
...
### What problem does this PR solve?
Cleanup synchronous functions in chat_model and implement
synchronization for conversation and dialog chats.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
2025-12-08 09:43:03 +08:00
b6c4722687
Refa: make RAGFlow more asynchronous ( #11601 )
...
### What problem does this PR solve?
Try to make this more asynchronous. Verified in chat and agent
scenarios, reducing blocking behavior. #11551, #11579.
However, the impact of these changes still requires further
investigation to ensure everything works as expected.
### Type of change
- [x] Refactoring
2025-12-01 14:24:06 +08:00
d1716d865a
Feat: Alter flask to Quart for async API serving. ( #11275 )
...
### What problem does this PR solve?
#11277
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-11-18 17:05:16 +08:00
c30ffb5716
Fix: ollama model list issue. ( #11175 )
...
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-11 19:46:41 +08:00
26cf5131c9
Fix: filter builtin llm factories. ( #11163 )
...
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-11 14:52:59 +08:00
68b952abb1
Don't select vector on infinity ( #11151 )
...
### What problem does this PR solve?
Don't select vector on infinity
### Type of change
- [x] Performance Improvement
2025-11-10 18:01:40 +08:00
5a8fbc5a81
Fix: Can't add more models ( #11076 )
...
### What problem does this PR solve?
Currently we cannot add any models, since factory is a string, and the
return type of get_allowed_llm_factories() is List[object]
https://github.com/infiniflow/ragflow/pull/11003
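The mismatch described above can be reproduced in isolation. In this sketch the `Factory` class is a hypothetical stand-in for whatever objects `get_allowed_llm_factories()` returns; the PR text only says the return type is `List[object]`.

```python
from dataclasses import dataclass


@dataclass
class Factory:
    """Hypothetical stand-in for an LLM factory record."""
    name: str


allowed = [Factory("OpenAI"), Factory("Ollama")]
factory = "OpenAI"  # the request carries the factory as a plain string

# A string never compares equal to a Factory object, so this is always False.
broken_check = factory in allowed

# Comparing against the names restores the intended behaviour.
fixed_check = factory in {f.name for f in allowed}
```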
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-06 18:54:13 +08:00
1a9215bc6f
Move some vars to globals ( #11017 )
...
### What problem does this PR solve?
As title.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com >
2025-11-05 14:14:38 +08:00
3654ae61c1
feat: add allowed factories variable to allow admins to restrict llms users can add ( #11003 )
...
### What problem does this PR solve?
Currently, restricting the factories users can use requires manually
deleting rows from the database table. This PR adds a variable that,
if set, restricts the LLM factories users can see and add. This lets
us avoid touching llm_factories.json or the database when the LLM
factory is already inserted.
Note: all the lint changes came from the pre-commit hook, which I did
not change.
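A minimal sketch of the allow-list idea follows. The function name and the convention that `None` means "variable not set" are assumptions; the PR description does not name the variable or show its implementation.

```python
from typing import Iterable, List, Optional


def filter_factories(all_factories: Iterable[str],
                     allowed: Optional[Iterable[str]]) -> List[str]:
    """Return the factories a user may see and add (hypothetical helper)."""
    if allowed is None:
        # Variable not set: no restriction is applied.
        return list(all_factories)
    allowed_set = set(allowed)
    # Preserve the original ordering of the configured factories.
    return [name for name in all_factories if name in allowed_set]


visible = filter_factories(["OpenAI", "Ollama", "Tongyi-Qianwen"], ["Ollama", "OpenAI"])
```

Filtering at read time means the stored factory list itself is left untouched, which matches the goal stated above.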
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
2025-11-05 10:47:50 +08:00
bab3fce136
Move some constants to common ( #11004 )
...
### What problem does this PR solve?
As title.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com >
2025-11-05 08:01:39 +08:00
3e5a39482e
Feat: Support multiple data sources synchronizations ( #10954 )
...
### What problem does this PR solve?
#10953
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-11-03 19:59:18 +08:00
d008a4df9f
Move base64_image related functions to common directory ( #10957 )
...
### What problem does this PR solve?
As title
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com >
2025-11-03 15:20:46 +08:00
fa38aed01b
Fix: the input length exceeds the context length ( #10895 )
...
### What problem does this PR solve?
Fix: the input length exceeds the context length #10750
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-30 19:00:53 +08:00
40b2c48957
Chore(config): remove Youdao and BAAI embedding model providers ( #10873 )
...
### What problem does this PR solve?
This commit removes the Youdao and BAAI entries from the LLM factories
configuration as they are no longer needed or supported.
### Type of change
- [x] Config update
2025-10-29 19:38:57 +08:00
c0c2a10680
Feat: allow initialize Redis without password ( #10856 )
...
### What problem does this PR solve?
Allow initializing Redis without a password.
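One way to support an optional password is to leave the argument out entirely when none is configured. This is a sketch with a hypothetical helper name, not the code from this PR; the keys match the keyword arguments accepted by redis-py's `redis.Redis`.

```python
from typing import Optional


def redis_connection_kwargs(host: str, port: int = 6379, db: int = 0,
                            password: Optional[str] = None) -> dict:
    """Build connection kwargs, omitting the password when it is empty."""
    kwargs = {"host": host, "port": port, "db": db}
    if password:
        # Only pass a password when one is actually configured.
        kwargs["password"] = password
    return kwargs


no_auth = redis_connection_kwargs("localhost")
with_auth = redis_connection_kwargs("localhost", password="secret")
# With redis-py installed, these could be passed as redis.Redis(**no_auth).
```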
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-10-29 09:45:28 +08:00
73144e278b
Don't release full image ( #10654 )
...
### What problem does this PR solve?
Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag
### Type of change
- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
4e86ee4ff9
Feat: Support Specifying OpenRouter Model Provider ( #10550 )
...
### What problem does this PR solve?
issue:
[#5787](https://github.com/infiniflow/ragflow/issues/5787)
change:
Support Specifying OpenRouter Model Provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-10-16 09:39:59 +08:00
5abd0bbac1
Fix typo ( #9766 )
...
### What problem does this PR solve?
As title
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com >
2025-08-27 18:56:40 +08:00
5e8cd693a5
Refa: split services about llm. ( #9450 )
...
### What problem does this PR solve?
### Type of change
- [x] Refactoring
2025-08-13 16:41:01 +08:00
83771e500c
Refa: migrate chat models to LiteLLM ( #9394 )
...
### What problem does this PR solve?
All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment
with a real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in a
practical environment.
### Type of change
- [x] Refactoring
2025-08-12 10:59:20 +08:00
9ca86d801e
Refa: add provider info while adding model. ( #9273 )
...
### What problem does this PR solve?
#9248
### Type of change
- [x] Refactoring
2025-08-07 09:40:42 +08:00
1409bb30df
Refactor:Improve the logic so that it does not decode base 64 for the test image each time ( #9264 )
...
### What problem does this PR solve?
Improve the logic so that it does not decode the base64 test image
each time
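One common way to avoid repeated decoding is to cache the decoded bytes. The sketch below uses `functools.lru_cache`; the constant, the function name, and the placeholder payload are assumptions, and the PR may have used a different mechanism.

```python
import base64
from functools import lru_cache

# Placeholder payload standing in for the real base64-encoded test image.
TEST_IMAGE_B64 = base64.b64encode(b"placeholder image bytes").decode("ascii")


@lru_cache(maxsize=1)
def get_test_image_bytes() -> bytes:
    """Decode the test image once; later calls reuse the cached bytes."""
    return base64.b64decode(TEST_IMAGE_B64)
```

Every call after the first returns the same cached `bytes` object, so the decode cost is paid only once per process.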
### Type of change
- [x] Refactoring
- [x] Performance Improvement
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com >
2025-08-06 11:42:25 +08:00
b638d3f773
Image validation of the image2text model without using local paths ( #9052 )
...
### What problem does this PR solve?
#9050
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-30 12:57:24 +08:00
6691532079
Feat: Add model editing functionality with improved UI labels ( #8855 )
...
### What problem does this PR solve?
Add edit button for local LLM models
<img width="1531" height="1428" alt="image"
src="https://github.com/user-attachments/assets/19d62255-59a6-4a7e-9772-8b8743101f78 "
/>
<img width="1531" height="1428" alt="image"
src="https://github.com/user-attachments/assets/c3a0f77e-cc6b-4190-95a6-13835463428b "
/>
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Liu An <asiro@qq.com >
2025-07-21 19:16:53 +08:00
163e71d06f
Fix: Hunyuan model adding error. ( #6531 )
...
### What problem does this PR solve?
#6523
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 10:33:33 +08:00
5748d58c74
Refa: refine the error message. ( #6151 )
...
### What problem does this PR solve?
#6138
### Type of change
- [x] Refactoring
2025-03-17 13:07:22 +08:00
471bd92b4c
Fix: empty api-key causes problems. ( #6022 )
...
### What problem does this PR solve?
#5926
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-13 14:57:47 +08:00
45123dcc0a
Fix: ollama model add error. ( #5947 )
...
### What problem does this PR solve?
#5944
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-12 10:56:05 +08:00
82f5d901c8
Refa: add model. ( #5820 )
...
### What problem does this PR solve?
#5783
### Type of change
- [x] Refactoring
2025-03-10 11:22:06 +08:00
4c9a3e918f
Fix: add image2text issue. ( #5431 )
...
### What problem does this PR solve?
#5356
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-27 14:06:49 +08:00
0e920a91dd
FIX: correct typo ( #5387 )
...
### What problem does this PR solve?
Correct typo in supported_models file
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-26 17:21:09 +08:00
cdcaae17c6
Feat: add VLLM ( #5380 )
...
### What problem does this PR solve?
Ready to add VLLM.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-02-26 16:04:53 +08:00
4f40f685d9
Code refactor ( #5371 )
...
### What problem does this PR solve?
#5173
### Type of change
- [x] Refactoring
2025-02-26 15:40:52 +08:00
605cfdb8dc
Refine error message for re-rank model. ( #5278 )
...
### What problem does this PR solve?
#5261
### Type of change
- [x] Refactoring
2025-02-24 13:01:34 +08:00
7ce675030b
Support downloading models from ModelScope Community. ( #5073 )
...
This PR supports downloading models from ModelScope. The main
modifications are as follows:
- New Feature (non-breaking change which adds functionality)
- Documentation Update
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com >
2025-02-24 10:12:20 +08:00
ef8847eda7
Double check error of adding llm. ( #5237 )
...
### What problem does this PR solve?
#5227
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-21 19:09:49 +08:00
78982d88e0
Reformat error message. ( #4829 )
...
### What problem does this PR solve?
#4828
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-10 16:47:53 +08:00