### What problem does this PR solve?
Since the secret key get and set logic was updated, the Go server also needs
to be updated.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Follow-up to PR https://github.com/infiniflow/ragflow/pull/14602.
Fixes: team members cannot edit an agent.
New behavior: team members are allowed to do everything except delete.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Consider two API servers, A and B, sharing one Redis server.
If A and Redis restart, B keeps holding the obsolete secret key, which
leads to errors.
TODO:
`app.config['SECRET_KEY']` and `app.secret_key` still hold the obsolete secret
key.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Support non-streaming runtime agent completion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add backward compatibility for file conversion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
This PR addresses three related GraphRAG reliability issues that
together allow long-running GraphRAG tasks (10+ hours of LLM extraction)
to be resumed after a crash or pause without re-doing completed work. It
builds on #14096 (per-doc subgraph cache) and extends the same idea to
the resolution and community-detection phases.
Fixes #14236.
## 1. Fix concurrent merge crash
Long GraphRAG runs would crash near the end of entity resolution with:
```
RuntimeError: dictionary keys changed during iteration
```
in `Extractor._merge_graph_nodes`. Two changes:
- `rag/graphrag/general/extractor.py`: snapshot `graph.neighbors(node1)`
via `list(...)` before iterating, so concurrent `add_edge` /
`remove_node` mutations on the shared `nx.Graph` cannot invalidate the
iterator. Also tracks each redirected neighbour in `node0_neighbors` so
a later merged node sharing the same external neighbour takes the
edge-merge branch instead of overwriting via `add_edge`.
- `rag/graphrag/entity_resolution.py`: serialize the merge step with a
dedicated `asyncio.Semaphore(1)`. `nx.Graph` is not thread-safe and
concurrent merges on overlapping neighbourhoods can produce incorrect
results even with the snapshot fix.
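The two changes above can be illustrated with a minimal sketch. It uses a plain dict-of-dicts as a stand-in for the shared `nx.Graph`, and the names `adjacency` and `merge_nodes` are hypothetical, not the code in `extractor.py`. It shows the neighbour snapshot via `list(...)`, the edge-merge branch for shared neighbours, and serialization through `asyncio.Semaphore(1)`.

```python
import asyncio

# Hypothetical stand-in for the shared graph: node -> {neighbour: edge weight}.
adjacency = {
    "a": {"x": 1, "y": 2},
    "b": {"x": 5, "z": 3},
    "x": {"a": 1, "b": 5},
    "y": {"a": 2},
    "z": {"b": 3},
}


async def merge_nodes(graph, keep, drop, merge_lock):
    """Fold `drop` into `keep`, summing weights on shared neighbours."""
    async with merge_lock:  # only one merge touches the graph at a time
        # Snapshot the neighbours first: the loop body mutates graph[drop].
        for neighbour in list(graph[drop]):
            weight = graph[drop].pop(neighbour)
            graph[neighbour].pop(drop, None)
            if neighbour == keep:
                continue
            if neighbour in graph[keep]:
                # Edge-merge branch: combine instead of overwriting.
                graph[keep][neighbour] += weight
            else:
                graph[keep][neighbour] = weight
            graph[neighbour][keep] = graph[keep][neighbour]
        del graph[drop]


async def main():
    merge_lock = asyncio.Semaphore(1)
    await asyncio.gather(
        merge_nodes(adjacency, "a", "b", merge_lock),
        merge_nodes(adjacency, "y", "z", merge_lock),
    )


asyncio.run(main())
```

Iterating `graph[drop]` directly instead of the `list(...)` snapshot would raise a `RuntimeError`, because the loop removes keys from the dict it is walking.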
## 2. Don't wipe partial graph on pause
Previously the pause / cancel UI path called
`settings.docStoreConn.delete({"knowledge_graph_kwd": [...]}, ...)`,
destroying every subgraph, entity, relation, and graph row.
Re-triggering then started GraphRAG from scratch even though #14096 had
already added `load_subgraph_from_store`.
After main was merged in (which deleted `api/apps/kb_app.py` per
#14394), the pause path now lives on the new REST surface `DELETE
/v1/datasets/<id>/<index_type>`:
- `api/apps/services/dataset_api_service.py`: `delete_index` accepts a
`wipe: bool = True` parameter. When `False` the doc-store rows and
GraphRAG phase markers are left intact and only the running task is
cancelled. Default preserves historical behaviour.
- `api/apps/restful_apis/dataset_api.py`: parses `?wipe=false|0|no|off`
from the query string and forwards it.
- `web/src/utils/api.ts` + `web/src/services/knowledge-service.ts`:
`unbindPipelineTask` appends `?wipe=false` when explicitly false.
- The GraphRAG pause action in
`web/src/pages/dataset/dataset/generate-button/hook.ts` passes `wipe:
false` for `KnowledgeGraph`; raptor is unchanged.
**UX impact:** the pause icon next to a running GraphRAG task no longer
wipes graph data. The only path that still wipes is the explicit Delete
action in `GenerateLogButton` (trash icon behind a confirmation modal).
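As an illustration of the query-string handling described above, here is a small sketch of how `?wipe=false|0|no|off` could be mapped to a boolean. The helper name `parse_wipe_flag` is hypothetical, and the case-insensitive, whitespace-tolerant matching is an assumption; the actual parsing in `dataset_api.py` may differ.

```python
# Values that explicitly disable wiping, per the PR description.
FALSY_WIPE_VALUES = {"false", "0", "no", "off"}


def parse_wipe_flag(raw):
    """Return False only for an explicit falsy value; default to wiping."""
    if raw is None:
        return True  # parameter absent: keep the historical behaviour
    # Assumption: matching ignores case and surrounding whitespace.
    return raw.strip().lower() not in FALSY_WIPE_VALUES
```

Defaulting to `True` for a missing or unrecognized value keeps existing callers on the historical wipe behaviour.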
## 3. Phase-completion markers (`rag/graphrag/phase_markers.py`)
A small Redis-backed marker layer at
`graphrag:phase:{kb_id}:{resolution_done|community_done}` (7-day TTL).
`run_graphrag_for_kb` consults the markers on entry and skips phases
that already completed in a prior run. Markers are cleared automatically
when:
- new docs are merged into the graph (which invalidates prior resolution
and community results),
- `delete_index` wipes the graph, or
- `delete_knowledge_graph` is called.
Redis failures never block a run -- markers are an optimization, not a
gate.
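The marker layer can be pictured roughly as follows. This is a sketch under assumptions: the function names and the `InMemoryStore` / `BrokenStore` stand-ins are hypothetical, and only the key format and the 7-day TTL come from the description above. It shows that a store failure degrades to "marker not set" instead of raising.

```python
PHASE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL from the description


def marker_key(kb_id, phase):
    return f"graphrag:phase:{kb_id}:{phase}"


class InMemoryStore:
    """Tiny stand-in for a Redis client (the TTL is accepted but not enforced)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, ex=None):
        self.data[key] = value

    def exists(self, key):
        return int(key in self.data)


class BrokenStore:
    """Stand-in for an unreachable Redis server."""

    def set(self, key, value, ex=None):
        raise ConnectionError("store unavailable")

    def exists(self, key):
        raise ConnectionError("store unavailable")


def set_phase_done(store, kb_id, phase):
    try:
        store.set(marker_key(kb_id, phase), "1", ex=PHASE_TTL_SECONDS)
    except Exception:
        pass  # markers are an optimization: never block the run


def has_phase_done(store, kb_id, phase):
    try:
        return bool(store.exists(marker_key(kb_id, phase)))
    except Exception:
        return False  # on failure, behave as if the phase never completed
```

Treating a failed lookup as "not done" means the worst case is redoing a phase, never skipping one that did not finish.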
## 4. Idempotent community detection
`extract_community` previously did `delete-then-insert` on
`community_report` rows; a crash mid-insert left the dataset with no
reports. Now report IDs are derived deterministically from `(kb_id,
community.title)`, the existing report IDs are snapshotted before
insert, new rows are written, then only stale rows are pruned. A failure
at any step leaves either the prior or the new report set intact --
never a partial mix.
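A minimal sketch of the snapshot / write / prune order described above, using a plain dict as a stand-in for the doc store. The SHA-256 hashing scheme and the names `report_id` and `replace_reports` are assumptions for illustration; the PR only states that IDs are derived deterministically from `(kb_id, community.title)`.

```python
import hashlib


def report_id(kb_id, title):
    """Derive a stable ID so re-running produces the same row keys."""
    return hashlib.sha256(f"{kb_id}:{title}".encode("utf-8")).hexdigest()[:32]


def replace_reports(store, kb_id, reports):
    """`store` is a dict (id -> row) holding one dataset's report rows."""
    existing_ids = set(store)  # 1. snapshot the current IDs before writing
    new_ids = set()
    for report in reports:  # 2. write (upsert) the new rows
        rid = report_id(kb_id, report["title"])
        store[rid] = {"kb_id": kb_id, **report}
        new_ids.add(rid)
    for stale_id in existing_ids - new_ids:  # 3. prune only stale rows
        del store[stale_id]
    return new_ids
```

Because the prune step runs last and only touches IDs absent from the new set, an interruption before it leaves the previous rows in place instead of an empty dataset.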
## 5. Tunable doc-store insert pipeline
The GraphRAG insert loop in `rag/graphrag/utils.py` and the
`community_report` insert in `rag/graphrag/general/index.py` were both
hardcoded to `es_bulk_size = 4` and ran strictly sequentially. On a real
KB this meant 1077 chunks took ~21 minutes for a 100-chunk slice -- pure
round-trip overhead.
- New `insert_chunks_bounded()` helper in `rag/graphrag/utils.py`
batches inserts via a bounded `asyncio.Semaphore`. Same retry / timeout
semantics as the prior loop.
- Defaults: 64 docs per batch, 4 batches in flight (matches the regular
ingest pipeline in `document_service.py`). Tunable per-deployment via
`GRAPHRAG_INSERT_BULK_SIZE` and `GRAPHRAG_INSERT_CONCURRENCY`.
- Both `set_graph` and `extract_community` now use the helper.
This dropped the same 1077-chunk insert from minutes to seconds in local
testing without measurable extra pressure on Infinity (total in-flight
docs ≤ `BULK_SIZE × CONCURRENCY` = 256 by default).
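The batching pattern can be sketched as below. This is not the actual `insert_chunks_bounded()` helper: the name `insert_chunks_bounded_sketch`, its signature, and the `fake_insert` callback are hypothetical, and the retry / timeout handling is omitted. It only shows how a bounded `asyncio.Semaphore` caps the number of batches in flight.

```python
import asyncio


async def insert_chunks_bounded_sketch(chunks, insert_batch, bulk_size=64, concurrency=4):
    """Insert `chunks` in batches, with at most `concurrency` batches in flight."""
    semaphore = asyncio.Semaphore(concurrency)
    batches = [chunks[i:i + bulk_size] for i in range(0, len(chunks), bulk_size)]

    async def guarded_insert(batch):
        async with semaphore:
            await insert_batch(batch)

    await asyncio.gather(*(guarded_insert(batch) for batch in batches))
    return len(batches)


stats = {"in_flight": 0, "peak": 0, "inserted": 0}


async def fake_insert(batch):
    """Stand-in for a doc-store insert; records peak concurrency."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0)  # yield so other batches get a chance to start
    stats["inserted"] += len(batch)
    stats["in_flight"] -= 1


batch_count = asyncio.run(insert_chunks_bounded_sketch(list(range(1077)), fake_insert))
```

With the default values, 1077 chunks split into 17 batches, and the semaphore keeps `stats["peak"]` at or below 4.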
## Tests
- `test/unit_test/rag/graphrag/test_merge_graph_nodes.py` (3 tests):
dense neighbourhood merge, neighbour-snapshot regression, concurrent
serialized merges.
- `test/unit_test/rag/graphrag/test_phase_markers.py` (4 tests): set/has
round-trip, kb-scoped clear, no-op on empty input, graceful Redis
failure.
- `test/testcases/test_web_api/test_dataset_management/test_dataset_sdk_routes_unit.py`:
new `test_delete_index_wipe_flag_unit` covers `wipe=false` for both
GraphRAG and raptor on the new REST route, and confirms the default
still wipes and clears phase markers.
## Compatibility
- Backward compatible: tasks queued before this change behave
identically (default `wipe=true`, no markers expected).
- No schema/migration changes; all new state lives in Redis.
- New optional REST query param `wipe` on `DELETE
/v1/datasets/<id>/<index_type>`.
- New optional env vars `GRAPHRAG_INSERT_BULK_SIZE` and
`GRAPHRAG_INSERT_CONCURRENCY`; defaults preserve safe behaviour.
## Example of resume
Screenshot below shows a test resuming knowledge graph generation after
applying the concurrency fix and re-deploying.
<img width="521" height="677" alt="image"
src="https://github.com/user-attachments/assets/9ef0d405-cbb3-420d-a1a1-e51f3e7e9b7a"
/>
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
## Summary
This fixes a missing authorization check in the beta API document
download endpoint:
- **CWE:** CWE-862 (Missing Authorization)
- **Severity:** Medium
- **Affected route/file:** `GET /api/v1/documents/<document_id>` in
`api/apps/sdk/doc.py`
- **Data flow:** the route reads a bearer beta API token, resolves the
token with `APIToken.query(beta=token)`, accepts `document_id` directly
from the URL, loads the document with
`DocumentService.query(id=document_id)`, and then fetches the backing
object through `File2DocumentService.get_storage_address()` /
`settings.STORAGE_IMPL.get()`.
Before this change, that flow verified that the API token was valid, but
it did not verify that the token's tenant owned the document's knowledge
base. A caller with any valid beta API token and a known document ID
could therefore reach storage for a document belonging to another
tenant.
## Fix
The endpoint now takes the tenant ID from the resolved API token and
checks the document's knowledge base with:
```python
KnowledgebaseService.query(id=doc[0].kb_id, tenant_id=tenant_id)
```
If the knowledge base is not owned by the token tenant, the request
returns an access error before any storage lookup occurs. This mirrors
the tenant-scoped ownership checks used by the dataset-scoped document
download path and keeps the patch small.
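To make the ordering of the checks concrete, here is a self-contained sketch with in-memory stand-ins for the document and knowledge-base tables. All names here (`DOCUMENTS`, `KNOWLEDGE_BASES`, `download_doc_sketch`, `read_storage`) and the "Document not found." message are hypothetical; only the access-denied message comes from this PR.

```python
# Hypothetical in-memory stand-ins for the document and knowledge-base tables.
DOCUMENTS = {"doc-1": {"kb_id": "kb-a", "location": "bucket/doc-1.pdf"}}
KNOWLEDGE_BASES = {"kb-a": {"tenant_id": "tenant-a"}}
storage_reads = []  # records every storage access for inspection


def read_storage(location):
    storage_reads.append(location)
    return b"file-bytes"


def download_doc_sketch(token_tenant_id, document_id):
    doc = DOCUMENTS.get(document_id)
    if doc is None:
        # Hypothetical message, for illustration only.
        return {"ok": False, "message": "Document not found."}
    kb = KNOWLEDGE_BASES.get(doc["kb_id"])
    # Authorization gate: the knowledge base must belong to the token's tenant.
    if kb is None or kb["tenant_id"] != token_tenant_id:
        return {"ok": False, "message": "You do not have access to this document."}
    # Storage is reached only after the ownership check has passed.
    return {"ok": True, "data": read_storage(doc["location"])}
```

A call with another tenant's token returns the access error and leaves `storage_reads` empty, which is the property the unit tests in this PR assert.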
## Tests
Added unit coverage for `download_doc()` to assert that:
- the beta token tenant ID is used in the knowledge-base ownership
lookup;
- cross-tenant access returns `You do not have access to this
document.`;
- storage resolution is not called before tenant authorization succeeds;
- the existing same-tenant empty-file and successful-download paths
still run after the authorization gate passes.
I also verified the final patch is limited to `api/apps/sdk/doc.py` and
the related document SDK route unit test. A local `pytest` invocation
could not complete in this checkout because the shared test fixture
attempts to log in to a RAGFlow server at `127.0.0.1:9380`, which was
not running in the local environment.
## Security analysis
This is exploitable when an attacker has a valid beta API token for
their own tenant and obtains or guesses a document ID from another
tenant. The token alone should not grant access to other tenants' files,
but the direct document route previously authorized only the token
itself and not the requested resource. The new tenant-scoped
knowledge-base check binds the requested document back to the token
tenant before storage is accessed, preventing cross-tenant document
downloads through this endpoint.
Before submitting, we attempted to disprove this by checking whether
existing dataset-scoped routes, token validation, or framework
protections already enforced ownership. They do not apply to this direct
document-ID route: it bypassed the dataset path parameter and used only
`DocumentService.query(id=document_id)` before reading storage.
cc @lewiswigmore
### What problem does this PR solve?
Add legacy agent completion API compatibility.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR fixes missing authorization checks in the Memory API.
Previously, several authenticated endpoints accepted caller-supplied
`tenant_id`, `owner_ids`, or `memory_id` values and used them directly
to list, read, update, delete, or search Memory data.
That could allow an authenticated user to access or mutate another
tenant's Memory records if they knew a tenant ID or memory ID. The fix
centralizes Memory access checks and applies them consistently across
Memory and Memory-message operations.
The change:
- Adds helper logic to parse list filters and compute tenant IDs
accessible to `current_user`.
- Requires direct `memory_id` operations to pass Memory access checks
before reading, updating, deleting, or changing message state.
- Filters list/search/recent-message requests to accessible memories
only.
- Applies Memory visibility filtering before count and pagination in
`MemoryService.get_by_filter`.
- Accepts `owner_ids` in the Memory list route, matching the frontend
owner filter while still intersecting values with the caller's
accessible tenants.
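Two of the points above (intersecting `owner_ids` with the caller's accessible tenants, and filtering before count and pagination) can be sketched as follows. The function names and the dict-based memory records are hypothetical and do not mirror `MemoryService.get_by_filter`.

```python
def resolve_owner_filter(requested_owner_ids, accessible_tenant_ids):
    """Never trust caller-supplied IDs: keep only tenants the caller can access."""
    accessible = set(accessible_tenant_ids)
    if not requested_owner_ids:
        return sorted(accessible)
    return sorted(set(requested_owner_ids) & accessible)


def list_memories_sketch(memories, owner_ids, page, page_size):
    """Apply the visibility filter first, then count and paginate."""
    visible = [m for m in memories if m["tenant_id"] in owner_ids]
    total = len(visible)
    start = (page - 1) * page_size
    return total, visible[start:start + page_size]
```

Filtering before counting matters because a total computed over unfiltered rows would reveal how many records other tenants hold.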
### Related issues
Closes #14534
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: jony376 <jony376@gmail.com>
### What problem does this PR solve?
Currently, RAGFlow's Search and Chat interfaces display only raw
vectorized text chunks during retrieval, without contextual information
about their source documents. Users cannot see document titles, page
numbers, upload dates, or custom metadata fields that would help them
understand and trust the retrieved results.
This PR introduces an **optional metadata display feature** that
enriches retrieved chunks with document-level metadata in both the
Search tab and Chatbot interface.
**Key improvements:**
- **Search results**: Display document metadata as styled badges beneath
chunk snippets
- **Chat citations**: Show metadata in citation popovers and reference
lists for better source context
- **LLM context**: Metadata is injected into the LLM prompt to enable
more accurate, citation-aware responses
- **External API support**: Applications using RAGFlow's SDK retrieval
endpoints (`/v1/retrieval`, `/v1/searchbots/retrieval_test`) can opt-in
via request parameters
- **User control**: Multi-select dropdown UI allows users to choose
which metadata fields to display
**Implementation approach:**
- ✅ Reuses existing `DocMetadataService` infrastructure (no new database
tables or indices)
- ✅ Settings stored in existing JSON configuration fields
(`search_config.reference_metadata`, `prompt_config.reference_metadata`)
- ✅ No database migrations required
- ✅ Disabled by default (fully opt-in and backward-compatible)
- ✅ Dynamic metadata field selection populated from actual document
metadata keys
- ✅ Fixed critical bug where Python's builtin `set()` was shadowed by a
route handler function
**Modified endpoints (all backward-compatible):**
- `POST /v1/retrieval` (Public SDK)
- `POST /v1/searchbots/retrieval_test` (Searchbots)
- `POST /v1/chunk/retrieval_test` (UI/Internal)
- Chat completions endpoints (via `extra_body.reference_metadata` or
`prompt_config`)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Images
<img width="879" height="1275" alt="image"
src="https://github.com/user-attachments/assets/95b2d731-31ae-45a1-b081-bf5893f52aeb"
/>
<br><br>
<br><br>
<img width="1532" height="362" alt="image"
src="https://github.com/user-attachments/assets/9cebc65b-b7a7-459f-b25e-3b13fa9b638e"
/>
<br><br>
<br><br>
<img width="2586" height="1320" alt="image"
src="https://github.com/user-attachments/assets/2153d493-d899-461f-a7a9-041391e07776"
/>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Attili-sys <Attili-sys@users.noreply.github.com>
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
### What problem does this PR solve?
## Summary
Fixed a bug where the **File Logs** tab in the dataset ingestion page
always showed "No logs" even after files were parsed successfully.
## Root Cause
Both the **File Logs** and **Dataset Logs** tabs on the frontend called
the same backend endpoint `/datasets/{dataset_id}/ingestions`. However,
the backend only queried `get_dataset_logs_by_kb_id`, which
hard-filtered records by `document_id == GRAPH_RAPTOR_FAKE_DOC_ID`
(dataset-level logs). As a result, real file-level logs were never
returned, causing the table to appear empty.
## Changes
### Backend
- **`api/apps/restful_apis/dataset_api.py`**
- Added two new query parameters to `list_ingestion_logs`:
- `log_type` — `"file"` or `"dataset"` (default: `"dataset"`)
- `keywords` — search keyword for filtering by document / task name
- **`api/apps/services/dataset_api_service.py`**
- Updated `list_ingestion_logs` signature to accept `log_type` and
`keywords`.
- Added conditional routing:
- When `log_type == "file"`, call
`PipelineOperationLogService.get_file_logs_by_kb_id`
- Otherwise, call
`PipelineOperationLogService.get_dataset_logs_by_kb_id`
- **`api/db/services/pipeline_operation_log_service.py`**
- Extended `get_dataset_logs_by_kb_id` with an optional `keywords`
parameter so dataset logs can also be searched.
### Frontend
- **`web/src/pages/dataset/dataset-overview/hook.ts`**
- Removed the separate API function switching (`listPipelineDatasetLogs`
vs `listDataPipelineLogDocument`).
- Unified both tabs to call `listDataPipelineLogDocument` with the new
`log_type` query parameter (`"file"` or `"dataset"`).
- Ensured `keywords` and filter values are passed through correctly.
## Behavior After Fix
| Tab | `log_type` | Returned Records | Searchable Field |
|---|---|---|---|
| File Logs | `file` | Real document-level logs | `document_name` (file name) |
| Dataset Logs | `dataset` | GraphRAG / RAPTOR / MindMap logs | `document_name` (task type) |
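The routing summarized in the table can be sketched with an in-memory list of log records. The placeholder value of `GRAPH_RAPTOR_FAKE_DOC_ID`, the function name, and the assumption that file logs are exactly the records without that ID are illustrative and not taken from the service code.

```python
GRAPH_RAPTOR_FAKE_DOC_ID = "graph_raptor_placeholder"  # placeholder, not the real value


def list_ingestion_logs_sketch(logs, log_type="dataset", keywords=""):
    """Route by log_type, then optionally filter on document_name."""
    if log_type == "file":
        # Assumption: file-level logs are the records tied to a real document.
        selected = [log for log in logs if log["document_id"] != GRAPH_RAPTOR_FAKE_DOC_ID]
    else:
        selected = [log for log in logs if log["document_id"] == GRAPH_RAPTOR_FAKE_DOC_ID]
    if keywords:
        needle = keywords.lower()
        selected = [log for log in selected if needle in log["document_name"].lower()]
    return selected
```

Keeping `"dataset"` as the default means callers that do not send `log_type` see the same records as before the fix.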
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix (Dataset): when configuring the "general" chunk method, options such
as chunk size and parent-child slicing were unavailable.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
## Summary
Migrate two web API endpoints to REST-style HTTP API endpoints,
following the pattern established in #14222:
| Old Endpoint | New Endpoint |
|---|---|
| `POST /v1/chunk/retrieval_test` | `POST /api/v1/datasets/<dataset_id>/search` |
| `GET /v1/chunk/knowledge_graph` | `GET /api/v1/datasets/<dataset_id>/graph` |
### What problem does this PR solve?
Fix: Google authentication for Gmail and Google Drive.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Always return success when the given task ID does not exist, consistent
with the existing code logic.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Before migration:
Web API: POST /v1/document/change_parser
HTTP API: PATCH /api/v1/datasets/<dataset_id>/documents
After consolidation into the RESTful API:
PATCH /api/v1/datasets/<dataset_id>/documents
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Before migration: GET /v1/document/thumbnails
After migration: GET /api/v1/thumbnails
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Before migration: POST /v1/document/run
After migration: POST /api/v1/documents/ingest/
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Before migration:
Web API: POST /v1/document/change_status
After consolidation into the RESTful API:
POST /api/v1/datasets/<dataset_id>/documents/batch-update-status
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Before migration: POST /v1/document/upload_info/
After migration: POST /api/v1/documents/upload/
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Before migration: GET /v1/document/artifact/<filename>
After migration: GET /api/v1/documents/artifact/<filename>
### Type of change
- [x] Refactoring
### What problem does this PR solve?
## Summary
Closes #6102
When using Infinity as the document store engine (GPU version), calling
`update()` on a non-existent table throws an unhandled
`InfinityException` with error code 3022 (`TABLE_NOT_EXIST`). This
causes users to see a raw "3022" error when clicking on a parsed
document.
## Root Cause
The `update()` methods in both `rag/utils/infinity_conn.py` and
`memory/utils/infinity_conn.py` call `db_instance.get_table(table_name)`
without catching `InfinityException`. In contrast, other CRUD methods
(`insert`, `delete`, `search`) all handle this exception gracefully:
| Method | Handles table-not-exist? | Behavior |
|----------|--------------------------|----------|
| `insert` | ✅ Yes | Auto-creates the table |
| `search` | ✅ Yes | Skips the table |
| `delete` | ✅ Yes | Returns 0 |
| `update` | ❌ **No** | Crashes with 3022 |
Additionally, `api/apps/document_app.py` worked around this with a
fragile string match (`"3022" in msg`) to detect the error.
## Changes
- **`rag/utils/infinity_conn.py`**: Catch `InfinityException` in
`update()`. When `TABLE_NOT_EXIST` is detected, log a warning and return
`False` — consistent with `delete()`.
- **`memory/utils/infinity_conn.py`**: Apply the same fix to its
`update()` method.
- **`api/apps/document_app.py`**: Remove the fragile `"3022"`
string-matching workaround. Table-not-exist is now handled by the `if
not ok` path with an improved error message.
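The shape of the fix can be sketched without the Infinity client. `FakeInfinityException`, `get_table`, and `update_sketch` are hypothetical stand-ins, and the assumption that the exception exposes an `error_code` attribute is for illustration; only the 3022 / `TABLE_NOT_EXIST` code and the "log a warning and return `False`" behaviour come from the description above.

```python
import logging

TABLE_NOT_EXIST = 3022  # error code named in the description


class FakeInfinityException(Exception):
    """Hypothetical stand-in for the client's InfinityException."""

    def __init__(self, error_code, message=""):
        super().__init__(message)
        self.error_code = error_code


def get_table(tables, table_name):
    if table_name not in tables:
        raise FakeInfinityException(TABLE_NOT_EXIST, f"table {table_name} not found")
    return tables[table_name]


def update_sketch(tables, table_name, row_id, new_values):
    """Return False instead of raising when the table is missing."""
    try:
        table = get_table(tables, table_name)
    except FakeInfinityException as exc:
        if exc.error_code == TABLE_NOT_EXIST:
            logging.warning("update skipped: table %s does not exist", table_name)
            return False
        raise  # any other failure still propagates
    table.setdefault(row_id, {}).update(new_values)
    return True
```

Matching on the error code rather than on the string `"3022"` in a message is what lets the caller drop the fragile workaround.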
### Type of change
- [x] Refactoring
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
The POST /upload_info?url=<url> endpoint accepted a user-supplied URL
and passed it directly to AsyncWebCrawler without any validation. There
were no restrictions on URL scheme, destination hostname, or resolved IP
address. This allowed any authenticated user to instruct the server to
make outbound HTTP requests to internal infrastructure — including RFC
1918 private networks, loopback addresses, and cloud metadata services
such as http://169.254.169.254 — effectively using the server as a proxy
for internal network reconnaissance or credential theft.
This PR adds an SSRF guard (_validate_url_for_crawl) that runs before
any crawl is initiated. It enforces an allowlist of safe schemes
(http/https), resolves the hostname at validation time, and rejects any
URL whose resolved IP falls within a private or reserved network range.
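A guard of this kind might look roughly like the sketch below. It is not the `_validate_url_for_crawl` implementation: the function names and the exact set of rejected address classes are assumptions. It shows the three steps named above: scheme allowlist, hostname resolution, and rejection of private or reserved addresses.

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def validate_url_for_crawl_sketch(url):
    """Raise ValueError unless the URL is http(s) and resolves to public IPs only."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parsed.scheme!r}")
    host = parsed.hostname
    if not host:
        raise ValueError("URL has no hostname")
    try:
        infos = socket.getaddrinfo(host, parsed.port or 80, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve host: {host}") from exc
    for info in infos:
        # Drop any IPv6 zone suffix such as "%eth0" before parsing.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
            raise ValueError(f"destination not allowed: {ip}")


def is_url_allowed(url):
    try:
        validate_url_for_crawl_sketch(url)
    except ValueError:
        return False
    return True
```

Every resolved address is checked, not just the first, so a hostname that maps to both a public and an internal IP is still rejected.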
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)