1308 Commits

Author SHA1 Message Date
0a8eb11c3d fix: Add proper error handling for database reconnection attempts (#12650)
## Problem
When database connection is lost, the reconnection logic had a bug: if
the first reconnect attempt failed, the second attempt was not wrapped
in error handling, causing unhandled exceptions.

## Solution
Added proper try-except blocks around the second reconnect attempt in
both MySQL and PostgreSQL database classes to ensure errors are properly
logged and handled.

## Changes
- Fixed `_handle_connection_loss()` in `RetryingPooledMySQLDatabase`
- Fixed `_handle_connection_loss()` in
`RetryingPooledPostgresqlDatabase`

Fixes #12294

---

Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=158349177

Co-authored-by: SID <158349177+0xsid0703@users.noreply.github.com>
2026-01-19 09:48:10 +08:00
38f0a92da9 Use RAGFlow CLI to replace RAGFlow Admin CLI (#12653)
### What problem does this PR solve?

```
$ python admin/client/ragflow_cli.py -t user -u aaa@aaa.com -p 9380

ragflow> list datasets;
ragflow> list default models;
ragflow> show version;

```


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-01-17 17:52:38 +08:00
4f036a881d Fix: Infinity keyword round-trip, highlight fallback, and KB update guards (#12660)
### What problem does this PR solve?

Fixes Infinity-specific API regressions: preserves ```important_kwd```
round‑trip for ```[""]```, restores required highlight key in retrieval
responses, and enforces Infinity guards for unsupported
```parser_id=tag``` and pagerank in ```/v1/kb/update```. Also removes a
slow/buggy pandas row-wise apply that was throwing ```ValueError``` and
causing flakiness.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-16 20:03:52 +08:00
2b20d0b3bb Fix : Web API tests by normalizing errors, validation, and uploads (#12620)
### What problem does this PR solve?

Fixes web API behavior mismatches that caused test failures by
normalizing error responses, tightening validations, correcting error
messages, and closing upload file handles.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-16 11:09:22 +08:00
cec06bfb5d Fix: empty chunk issue. (#12638)
#12570

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-15 17:46:21 +08:00
ac936005e6 fix: ensure deleted chunks are not returned in retrieval (#12520) (#12546)
## Summary
Fixes #12520 - Deleted chunks should not appear in retrieval/reference
results.

## Changes

### Core Fix
- **api/apps/chunk_app.py**: Include \doc_id\ in delete condition to
properly scope the delete operation

### Improved Error Handling
- **api/db/services/document_service.py**: Better separation of concerns
with individual try-catch blocks and proper logging for each cleanup
operation

### Doc Store Updates
- **rag/utils/es_conn.py**: Updated delete query construction to support
compound conditions
- **rag/utils/opensearch_conn.py**: Same updates for OpenSearch
compatibility

### Tests
- **test/testcases/.../test_retrieval_chunks.py**: Added
\TestDeletedChunksNotRetrievable\ class with regression tests
- **test/unit/test_delete_query_construction.py**: Unit tests for delete
query construction

## Testing
- Added regression tests that verify deleted chunks are not returned by
retrieval API
- Tests cover single chunk deletion and batch deletion scenarios
2026-01-15 14:45:55 +08:00
b40a7b2e7d Feat: Hash doc id to avoid duplicate name. (#12573)
### What problem does this PR solve?

Feat: Hash doc id to avoid duplicate name. 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-15 14:02:15 +08:00
9a10558f80 Refa: async retrieval process. (#12629)
### Type of change

- [x] Refactoring
- [x] Performance Improvement
2026-01-15 12:28:49 +08:00
SID
f82628c40c Fix: langfuse connection error handling #12621 (#12626)
## Description

Fixes connection error handling when langfuse service is unavailable.
The application now gracefully handles connection failures instead of
crashing.

## Changes

- Wrapped `langfuse.auth_check()` calls in try-except blocks in:
  - `api/db/services/dialog_service.py`
  - `api/db/services/tenant_llm_service.py`

## Problem

When langfuse service is unavailable or connection is refused,
`langfuse.auth_check()` throws `httpx.ConnectError: [Errno 111]
Connection refused`, causing the application to crash during document
parsing or dialog operations.

## Solution

Added try-except blocks around `langfuse.auth_check()` calls to catch
connection errors and gracefully skip langfuse tracing instead of
crashing. The application continues functioning normally even when
langfuse is unavailable.

## Related Issue

Fixes #12621

---

Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=158349177
2026-01-15 11:23:15 +08:00
15a8bb2e9c Fix: chunk list async issue. (#12615)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-14 17:32:07 +08:00
f72a35188d refactor: remove debug print statements (#12598)
### What problem does this PR solve?

This PR eliminates unnecessary debug print statements that were left in
hot paths of the codebase.

### Type of change

- [x] Refactoring
2026-01-14 10:05:34 +08:00
0795616b34 Align p3 HTTP/SDK tests with current backend behavior (#12563)
### What problem does this PR solve?

Updates pre-existing HTTP API and SDK tests to align with current
backend behavior (validation errors, 404s, and schema defaults). This
ensures p3 regression coverage is accurate without changing production
code.

### Type of change

- [x] Other (please describe): align p3 HTTP/SDK tests with current
backend behavior

---------

Co-authored-by: Liu An <asiro@qq.com>
2026-01-13 19:22:47 +08:00
947e63ca14 Fixed typos and added pptx preview for frontend (#12577)
### What problem does this PR solve?
Previously, we added support for previewing PPT and PPTX files in the
backend. Now, we are adding it to the frontend, so when the slides in
the chat interface are referenced, they will no longer be blank.
### Type of change

- Bug Fix (non-breaking change which fixes an issue)
2026-01-13 17:02:36 +08:00
41c84fd78f Add MIME types for PPT and PPTX files (#12562)
Otherwise, slide files cannot be opened in Chat module

### What problem does this PR solve?

Backend Reason (API): In the api/utils/web_utils.py file of the backend,
the CONTENT_TYPE_MAP dictionary is missing ppt and pptx.
MIME type mapping. This means that when the frontend requests a PPTX
file, the backend cannot correctly inform the browser that it is a PPTX
file, resulting in the file being displayed incorrectly.
Type identification error.

### Type of change

-  Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-13 12:17:49 +08:00
44bada64c9 Feat: support tree structured deep-research policy. (#12559)
### What problem does this PR solve?

#12558
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-13 09:41:35 +08:00
a7dd3b7e9e Add time cost when start servers (#12552)
### What problem does this PR solve?

- API server
- Ingestion server
- Data sync server
- Admin server

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-01-12 12:48:23 +08:00
2e09db02f3 feat: add paddleocr parser (#12513)
### What problem does this PR solve?

Add PaddleOCR as a new PDF parser.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:48:45 +08:00
6abf55c048 Feat: support openapi (#12521)
### What problem does this PR solve?
Support OpenAPI interface description.

The issue of not supporting the Swagger interface after upgrading the
system framework from Flask to Quart has been resolved.

Resolved https://github.com/infiniflow/ragflow/issues/5264

### Type of change
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: puhaiyang <“761396462@qq.com”>
2026-01-09 17:48:20 +08:00
f9d4179bf2 Feat:memory sdk (#12538)
### What problem does this PR solve?

Move memory and message apis to /api, and add sdk support.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:45:58 +08:00
f522391d1e Fix: "AttributeError(\"'list' object has no attribute 'get'\")" (#12518)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/12515

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-09 10:19:51 +08:00
14c250e3d7 Fix adding column error (#12503)
### What problem does this PR solve?

1. Fix redundant column adding
2. Refactor the code

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-01-08 16:44:53 +08:00
a093e616cf Fix: add multimodel models in chat api (#12496)
### What problem does this PR solve?

Fix: add multimodel models in chat api #11986
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-01-08 16:12:08 +08:00
696397ebba Fix: apply kb setting while re-parsing.... (#12501)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-08 16:11:50 +08:00
1996aa0dac Refactor: Enhance delta streaming in chat functions for improved reasoning and content handling (#12453)
### What problem does this PR solve?

change:
Enhance delta streaming in chat functions for improved reasoning and
content handling

### Type of change


- [x] Refactoring
2026-01-08 13:34:16 +08:00
f4e2783eb4 optimize doc id check: do not query db when doc id to validate is empty (#12500)
### What problem does this PR solve?
when a kb contains many documents, say 50000, and the retrieval is only
made against some kb without specifying any doc ids, the query for all
docs from the db is not necessary, and can be omitted to improve
performance.

### Type of change

- [x] Performance Improvement
2026-01-08 13:22:58 +08:00
2fd4a3134d Doc: memory http api (#12499)
### What problem does this PR solve?

Use task save function for add_message api, and added http API document.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
2026-01-08 12:54:10 +08:00
23a9544b73 Fix: toc async issue. (#12485)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-07 15:35:30 +08:00
011bbe9556 Feat: support context window for docx (#12455)
### What problem does this PR solve?

Feat: support context window for docx

#12303

Done:
- [x] naive.py
- [x] one.py

TODO:
- [ ] book.py
- [ ] manual.py

Fix: incorrect image position
Fix: incorrect chunk type tag

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-01-07 15:08:17 +08:00
07845be5bd Fix: display agent name for extract messages (#12480)
### What problem does this PR solve?

Display agent name for extract messages

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-07 13:19:54 +08:00
ca9645f39b Feat: adapt to , arglist (#12468)
### What problem does this PR solve?

Adapt to ',' joined arg list in get method url.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-07 09:59:08 +08:00
bdd9f3d4d1 Fix: try handle authorization as api-token (#12462)
### What problem does this PR solve?

Try handle authorization as api-token when jwt load failed.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-06 19:25:42 +08:00
02e6870755 Refactor: import_test_cases use bulk_create (#12456)
### What problem does this PR solve?

import_test_cases use bulk_create

### Type of change

- [x] Refactoring
2026-01-06 11:39:07 +08:00
fada223249 Feat: process memory (#12445)
### What problem does this PR solve?

Add task status for raw message, and move extract message as a nested
property under raw message

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-05 17:58:32 +08:00
lif
81f9296d79 Fix: handle invalid img_id format in chunk update (#12422)
## Summary
- Fix ValueError when updating chunk with invalid/empty `img_id` format
- Add validation before splitting `img_id` by hyphen
- Use `split("-", 1)` to handle object names containing hyphens

## Test plan
- [x] Verify chunk update works with valid `img_id` (format:
`bucket-objectname`)
- [x] Verify chunk update doesn't crash with empty `img_id`
- [x] Verify chunk update doesn't crash when `img_id` has no hyphen
- [x] Verify ruff check passes

Fixes #12035

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-05 11:27:19 +08:00
5ebe334a2f Refactor setting type (#12425)
### What problem does this PR solve?

Refactor setting type

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-01-04 20:26:12 +08:00
932496a8ec Fix:bug fix (#12423)
### What problem does this PR solve?
change: 
initialize webhook configuration in webhook function
remove debug print statement from airtable_connector
remove redundant uuid import in imap_connector

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-04 19:16:29 +08:00
ac9113b0ef feature: add system setting service (#12408)
### What problem does this PR solve?

#12409 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-01-04 14:21:39 +08:00
KKM
89a97be2c5 Remove duplicated tag_feas assignment in create route (api/apps/chunk_app.py) (#12392)
### What problem does this PR solve?

This PR removes a duplicated assignment of `tag_feas` in the
`@manager.route('/create')` API handler located in
`api/apps/chunk_app.py`.

The same conditional block was unintentionally repeated twice, which had
no
functional impact but reduced code readability and maintainability.
This change eliminates the redundancy while preserving existing
behavior.

### Type of change

- [x] Refactoring

Co-authored-by: 김경만 <kmkim7@humaxit.com>
2026-01-04 10:32:36 +08:00
461c81e14a Fix: KG search issue. (#12364)
### What problem does this PR solve?

Close #12347

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-31 14:40:27 +08:00
f141947085 [Feature] Admin: sort user list by email (#12350)
### What problem does this PR solve?

Sort the user list by user email address.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-31 11:55:18 +08:00
7dac269429 fix: correct session reference initialization to prevent dialogue misalignment (#12343)
## Summary

Fixes #12311

Changes the `reference` field initialization from `[{}]` to `[]` in
session creation.

### Problem

When creating a session via the SDK API, the `reference` field was
incorrectly initialized as `[{}]`. This caused:
- First dialogue round: Empty reference
- Second dialogue round: Reference pointing to first round's data
- Overall misalignment between dialogue rounds and their references

### Solution

Changed the initialization to `[]` (empty list), which:
- Matches the `Conversation` model's expected default
- Ensures references grow correctly one-to-one with assistant responses
- Aligns with the service layer's expectations

### Testing

After applying this fix:
1. Create a session via `POST /api/v1/chats/{conversation_id}/sessions`
2. Send multiple questions via `POST
/api/v1/chats/{conversation_id}/completions`
3. View the conversation on web - references should now align correctly
with each dialogue round
2025-12-31 10:25:49 +08:00
ff2c70608d Fix: judge index exist before delete memory message. (#12318)
### What problem does this PR solve?

Judge index exist before delete memory message.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-30 15:54:07 +08:00
4a6d37f0e8 Fix: use async task to save memory (#12308)
### What problem does this PR solve?

Use async task to save memory.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2025-12-30 11:41:38 +08:00
731e2d5f26 api key delete bug - Bug #3045 (#12299)
Description:
Fixed an issue where deleting an API token would fail because it was
incorrectly using current_user.id as the tenant_id instead of querying
the actual tenant ID from UserTenantService.

Changes:

Updated rm() endpoint to fetch the correct tenant_id from
UserTenantService before deleting the API token
Added proper error handling with try/except block
Code style cleanup: consistent quote usage and formatting
Related Issue: #3045

https://github.com/infiniflow/ragflow/issues/3045

Co-authored-by: Mardani, Ramin <ramin.mardani@sscinc.com>
2025-12-30 11:27:04 +08:00
fddfce303c Fix (sdk): ensure variables defined in rm_chunk API (#12274)
### What problem does this PR solve?

Fixes a bug in the `rm_chunk` SDK interface where an `UnboundLocalError`
could
occur if `chunk_ids` is not provided in the request. 

- `unique_chunk_ids` and `duplicate_messages` are now always initialized
  in the `else` branch when `chunk_ids` is missing.
- API behavior remains unchanged when `chunk_ids` is present.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-29 13:18:23 +08:00
37e4485415 feat: add MDX file support (#12261)
Feat: add MDX file support  #12057 
### What problem does this PR solve?

<img width="1055" height="270" alt="image"
src="https://github.com/user-attachments/assets/a0ab49f9-7806-41cd-8a96-f593591ab36b"
/>

The page states that MDX files are supported, but uploading fails with
the error: "x.mdx: This type of file has not been supported yet!"
<img width="381" height="110" alt="image"
src="https://github.com/user-attachments/assets/4bbb7d08-cb47-416a-95fc-bc90b90fcc39"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-29 12:54:31 +08:00
8d3f9d61da Fix: Delete chunk images on document parser config change. (#12262)
### What problem does this PR solve?

Modifying a document’s parser config previously left behind obsolete
chunk images. If the dataset isn’t manually deleted, these images
accumulate and waste storage. This PR fixes the issue by automatically
removing associated images when the parser config changes.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-29 12:54:11 +08:00
27c55f6514 Fix the consistency of ts and datetime (#12288)
### What problem does this PR solve?

#12279
#11942 

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-29 12:37:13 +08:00
9883c572cd Refactor: keep timestamp consistency (#12279)
### What problem does this PR solve?

keep timestamp consistency

### Type of change

- [x] Refactoring
2025-12-29 12:02:43 +08:00
3364cf96cf Fix: optimize init memory_size (#12254)
### What problem does this PR solve?

Handle 404 exception when init memory size from es.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Liu An <asiro@qq.com>
2025-12-26 21:18:44 +08:00