Compare commits

..

245 Commits

Author SHA1 Message Date
08caa4fce3 Merge branch 'feat/summary-index' into deploy/dev 2026-01-19 15:35:41 +08:00
5293fbe8ba fix: hit testing chunk detail summary 2026-01-19 15:35:07 +08:00
ed555c5fe7 Merge branch 'feat/summary-index' into deploy/dev 2026-01-19 15:14:28 +08:00
22974ea6b0 fix: preview chunk summary 2026-01-19 15:13:51 +08:00
754b01366a Merge branch 'chore/relocate-datasets-api-form' into deploy/dev 2026-01-19 14:51:03 +08:00
8af626092e chore: relocate datasets api form 2026-01-19 14:50:01 +08:00
49b3bad26b locl 2026-01-19 11:50:26 +08:00
50616c25d4 Merge branch 'feat/storage-50' into deploy/dev 2026-01-19 11:49:16 +08:00
3b4b5b332c feat(billing): enhance usage info with storage threshold display
- Add storageThreshold, storageTooltip, storageTotalDisplay props to UsageInfo
- Implement indeterminate state in ProgressBar for usage below threshold
- Update VectorSpaceInfo to calculate total based on plan type
- Add i18n for storage threshold tooltip (en-US, ja-JP, zh-Hans)
2026-01-19 11:47:35 +08:00
yyh
e8397ae7a8 fix(web): Zustand testing best practices and state read optimization (#31163)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-19 10:31:34 +08:00
yyh
8893913b3a feat: add Vercel React Best Practices skill for Claude Code (#31133) 2026-01-19 10:30:49 +08:00
14f123802d chore: update vite related version (#31180) 2026-01-19 10:28:06 +08:00
62c3f14570 Merge branch 'main' into feat/summary-index 2026-01-19 10:21:40 +08:00
41c3b1c57c Merge branch 'feat/support-free-try-app' into deploy/dev 2026-01-18 12:58:58 +08:00
7b66bbc35a chore: introduce bulk-suppressions and multithread linting (#31157) 2026-01-17 19:51:56 +08:00
77366f33a4 feat(web): add loading indicators for infinite scroll pagination (#31110)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
2026-01-17 17:36:07 +08:00
yyh
e3b0918dd9 test(web): add global zustand mock for tests (#31149) 2026-01-17 17:29:13 +08:00
994357d8b5 merge 2026-01-17 09:46:38 +08:00
5fb9fe3c94 fix: fix summary index bug. (#31134) 2026-01-16 20:24:57 +08:00
4fb08ae7d2 fix: fix summary index bug. 2026-01-16 20:24:18 +08:00
7481762acb fix: fix summary index bug. (#31125) 2026-01-16 18:56:17 +08:00
fcb2fe55e7 fix: fix summary index bug. 2026-01-16 18:55:10 +08:00
yyh
a0aa8cdb45 Merge remote-tracking branch 'origin/main' into feature/task-quadrant-view 2026-01-16 18:20:29 +08:00
yyh
ae8618877b fix(web): quadrant matrix i18n 2026-01-16 18:17:28 +08:00
fad6fa141d chore: improve accessibility for learn more link (#31120)
Co-authored-by: khmandarrin <jeong-ga-eun@jeong-ga-eun-ui-MacBookAir.local>
2026-01-16 18:12:07 +08:00
30821fd26c chore: Update outdated GitHub Actions versions (#31114) 2026-01-16 17:56:55 +08:00
1a9fdd9a65 refactor: migrate tag list API query parameters to Pydantic (#31097)
Co-authored-by: fghpdf <fghpdf@users.noreply.github.com>
2026-01-16 17:49:52 +08:00
de610cbf39 fix: call get_text_content() instead of casting to str (#31121)
Signed-off-by: Stream <Stream_2@qq.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-16 18:41:00 +09:00
yyh
1c55602445 fix(web): add calendar icon and DDL label to deadline badge in task-item 2026-01-16 17:24:11 +08:00
yyh
a3f1220d23 feat(web): add fullscreen expand mode to quadrant-matrix component
- Add expand button in header to open FullScreenModal
- Add numbered circles (1-4) to quadrant headers
- Add expanded prop to show full content without line-clamp
- Reorder grid layout: Q1 top-left, Q2 top-right, Q3 bottom-left, Q4 bottom-right
- Remove axis labels for cleaner design
2026-01-16 17:16:13 +08:00
4d7384731e fix: call get_text_content on LLMResult
Signed-off-by: Stream <Stream_2@qq.com>
2026-01-16 17:08:39 +08:00
yyh
d62e16b9bb fix(web): improve quadrant-matrix layout and text overflow handling
- Simplify axis label layout with horizontal/vertical arrangement
- Add proper text truncation with line-clamp and tooltips
- Fix overflow issues by adding min-w-0 on flex children
- Move scores inline with task name for compact display
- Add task count badge to quadrant headers
- Reduce maxDisplay to 3 for better density
2026-01-16 16:58:57 +08:00
yyh
13f2a43ccc feat(web): add Eisenhower Matrix visualization component for task quadrants
Add a new quadrant-matrix component that renders tasks in a 2x2 grid based
on importance and urgency scores. Integrate with code-block as a new
'quadrant' language type for markdown rendering.
2026-01-16 16:58:56 +08:00
553dd3266b fix: call get_text_content on LLMResult
Signed-off-by: Stream <Stream_2@qq.com>
2026-01-16 16:46:28 +08:00
yyh
6903c31b84 fix(search-input): retain focus after clearing input (#31107) 2026-01-16 16:22:14 +08:00
b2cc9b255d chore: Update coding agent workflow for backend (#31093) 2026-01-16 14:28:47 +08:00
e9f0e1e839 fix(web): replace Response.json with legacy Response constructor for pre-Chrome 105 compatibility(#31091) (#31095)
Co-authored-by: Xiaoba Yu <xb1823725853@gmail.com>
2026-01-16 14:26:23 +08:00
cd497a8c52 fix(web): use portal for variable picker in code editor (Fixes #31063) (#31066) 2026-01-16 13:31:57 +08:00
7aab4529e6 chore: lint for state hooks (#31088) 2026-01-16 11:58:28 +08:00
E.G
4bff0cd0ab fix: resolve 'Expand all chunks' button not working (#31074)
Co-authored-by: GlobalStar117 <GlobalStar117@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <427733928@qq.com>
2026-01-16 11:34:42 +08:00
5b0590d58e Merge branch 'feat/summary-index' into deploy/dev 2026-01-16 10:56:12 +08:00
d97f2df85c Merge branch 'main' into feat/summary-index 2026-01-16 10:55:58 +08:00
d3c09f16a9 merge feat/summary-index 2026-01-16 10:55:18 +08:00
fde8efa4a2 fix: summary index in parent child chunk 2026-01-16 10:49:38 +08:00
c98870c3f4 refactor: always preserve marketplace search state in URL (#31069)
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
2026-01-16 08:52:53 +09:00
b06c7c8f33 ci: disable limit annotation (#31072) 2026-01-15 23:04:26 +08:00
1a2fce7055 ci: eslint annotation (#31056) 2026-01-15 21:49:46 +08:00
5f6d1297b0 fix: fix summary index bug. (#31058) 2026-01-15 18:10:46 +08:00
869e70964f fix: fix summary index bug. 2026-01-15 18:09:48 +08:00
1f313eb15c fix: pipeline run panel summary 2026-01-15 18:03:09 +08:00
f02adc26e5 fix: pipeline run panel summary 2026-01-15 18:02:19 +08:00
73027eab0a fix: fix summary index bug. (#31057) 2026-01-15 17:58:04 +08:00
74245fea8e fix: fix summary index bug. 2026-01-15 17:57:15 +08:00
lif
2b021e8752 fix: remove hardcoded 48-character limit from text inputs (#30156)
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2026-01-15 17:43:00 +08:00
5bc4bba668 Merge branch 'feat/summary-index' into deploy/dev 2026-01-15 16:09:44 +08:00
1126a2aa95 merge main 2026-01-15 16:08:29 +08:00
2107a3c32c feat: knowledgebase summary index (#31047) 2026-01-15 16:07:17 +08:00
4a197b9458 fix: fix log updated_at is refreshed (#31045) 2026-01-15 15:42:46 +08:00
772ff636ec feat: credential sync fix for enterprise edition (#30626) 2026-01-14 23:33:24 -08:00
ab1c5a2027 refactor: remove manual set query logic (#31039) 2026-01-15 15:25:43 +08:00
33e99f069b fix: message clean service ut (#31038) 2026-01-15 15:13:25 +08:00
22d0c55363 fix: fix summary index bug. 2026-01-15 15:10:38 +08:00
52af829f1f refactor: enhance clean messages task (#29638)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: 非法操作 <hjlarry@163.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-15 14:03:17 +08:00
7c3ce7b1e6 fix: summary index change in create document 2026-01-15 13:48:07 +08:00
0ef8b5a0ca chore: bump version to 1.11.4 (#30961) 2026-01-15 11:36:15 +08:00
2bfc54314e feat: single run add opentelemetry (#31020) 2026-01-15 11:10:55 +08:00
f4d20a02aa feat: fix summary index bug. 2026-01-15 11:06:18 +08:00
bdd8d5b470 test: add unit tests for PluginPage and related components (#30908)
Co-authored-by: CodingOnStar <hanxujiang@dify.ai>
2026-01-15 10:56:02 +08:00
4955de5905 fix: validation error when uploading images with None URL values (#31012)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2026-01-15 10:54:10 +08:00
yyh
3bee2ee067 refactor(contract): restructure console contracts with nested billing module (#30999) 2026-01-15 10:41:18 +08:00
328897f81c build: require node 24.13.0 (#30945) 2026-01-15 10:38:55 +08:00
ab078380a3 feat(web): refactor documents component structure and enhance functionality (#30854)
Co-authored-by: CodingOnStar <hanxujiang@dify.ai>
2026-01-15 10:33:58 +08:00
a33ac77a22 feat: implement document creation pipeline with multi-step wizard and datasource management (#30843)
Co-authored-by: CodingOnStar <hanxujiang@dify.ai>
2026-01-15 10:33:48 +08:00
d3923e7b56 refactor: port AppAnnotationHitHistory (#30922)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-15 10:14:55 +08:00
2f633de45e refactor: port TenantCreditPool (#30926)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-15 10:14:15 +08:00
98c88cec34 refactor: delete_endpoint should be idempotent (#30954) 2026-01-15 10:10:10 +08:00
c6999fb5be fix: fix plugin edit endpoint app disappear (#30951) 2026-01-15 10:09:57 +08:00
f7f9a08fa5 refactor: port TidbAuthBinding( (#31006)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-15 10:07:02 +08:00
5008f5e89b fix: Use raw SQL UPDATE to set read status without triggering updated… (#31015) 2026-01-15 09:51:44 +08:00
1dd89a02ea fix: fix missing id and message_id (#31008) 2026-01-14 23:26:17 +09:00
5bf4114d6f fix: increase name length limit in ExternalDatasetCreatePayload (#31000)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Asuka Minato <i@asukaminato.eu.org>
2026-01-14 22:13:53 +09:00
yyh
a56e94ba8e feat: add .agent/skills symlink and orpc-contract-first skill (#30968) 2026-01-14 21:13:14 +08:00
11f1782df0 fix: correct API Extension documentation link (#30962) 2026-01-14 21:21:15 +09:00
8cf5d9a6a1 fix: fix Cannot destructure property 'name' of 'value' as it is undef… (#30991) 2026-01-14 19:30:47 +08:00
0ec2b12e65 feat: allow pass hostname in docker env (#30975) 2026-01-14 19:30:37 +08:00
7eb65b07c8 feat: Make summary index support vision, and make the code more standardized. 2026-01-14 17:52:27 +08:00
f33b1a3332 fix: redirect after login (#30985) 2026-01-14 17:20:49 +08:00
08026f7399 fix(deps): security updates for pdfminer.six, authlib, werkzeug, aiohttp and others (#30976)
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2026-01-14 17:03:46 +08:00
yyh
18e051bd66 chore(web): remove unused demo service component (#30979) 2026-01-14 17:03:35 +08:00
yyh
42f991dbef chore(web): disable Serwist dev logs (#30980) 2026-01-14 16:23:58 +08:00
yyh
b1b2c9636f fix(web): preserve HTTP method in ORPC fetchCompat mode (#30971)
Co-authored-by: Stephen Zhou <38493346+hyoban@users.noreply.github.com>
2026-01-14 16:18:12 +08:00
01f17b7ddc refactor(http_request_node): apply DI for http request node (#30509)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-14 14:19:48 +08:00
yyh
14b2e5bd0d refactor(web): MCP tool availability to context-based version gating (#30955) 2026-01-14 13:40:16 +08:00
830a7fb034 Merge branch 'main' into feat/summary-index 2026-01-14 13:40:15 +08:00
9b7e807690 feat: summary index (#30950) 2026-01-14 11:26:44 +08:00
af86f8de6f Merge branch 'feat/knowledgebase-summaryIndex' into feat/summary-index 2026-01-14 11:25:15 +08:00
d095bd413b fix: fix LOOP_CHILDREN_Z_INDEX (#30719) 2026-01-14 10:22:31 +08:00
3473ff7ad1 fix: use Factory to create repository in Aliyun Trace (#30899) 2026-01-14 10:21:46 +08:00
138c56bd6e fix(logstore): prevent SQL injection, fix serialization issues, and optimize initialization (#30697) 2026-01-14 10:21:26 +08:00
c327d0bb44 fix: Correction to the full name of Volc TOS (#30741) 2026-01-14 10:11:30 +08:00
e4b97fba29 chore(deps): bump azure-core from 1.36.0 to 1.38.0 in /api (#30941)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-14 10:10:49 +08:00
7f9884e7a1 feat: Add option to delete or keep API keys when uninstalling plugin (#28201)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2026-01-14 10:09:30 +08:00
e389cd1665 chore(deps): bump filelock from 3.20.0 to 3.20.3 in /api (#30939)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-14 09:56:02 +08:00
87f348a0de feat: change param to pydantic model (#30870) 2026-01-14 09:46:41 +08:00
206706987d refactor(variables): clarify base vs union type naming (#30634)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-13 23:39:34 +09:00
91da784f84 refactor: init orpc contract (#30885)
Co-authored-by: yyh <yuanyouhuilyz@gmail.com>
2026-01-13 23:38:28 +09:00
a129e684cc feat: inject traceparent in enterprise api (#30895)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2026-01-13 23:37:39 +09:00
ec78676949 Merge branch 'deploy/dev' into feat/summary-index 2026-01-13 21:30:50 +08:00
fe07c810ba fix: fix instance is not bind to session (#30913) 2026-01-13 21:15:21 +08:00
a22cc5bc5e chore: Bump Dify version to 1.11.3 (#30903) 2026-01-13 17:49:13 +08:00
yyh
1fbdf6b465 refactor(web): setup status caching (#30798) 2026-01-13 16:59:49 +08:00
01a7dbcee8 Merge branch 'main' into feat/summary-index 2026-01-13 16:29:09 +08:00
4fe8d2491e feat: summary index 2026-01-13 16:27:32 +08:00
76da8b4ff3 Merge remote-tracking branch 'origin/deploy/dev' 2026-01-12 17:09:25 +08:00
25bfc1cc3b feat: implement Summary Index feature. 2026-01-12 16:52:21 +08:00
5c2ae922bc merge main 2026-01-12 13:42:17 +08:00
a92df530da mrege main 2026-01-12 13:41:27 +08:00
13eec13a14 feat: summary index 2026-01-12 13:38:18 +08:00
431936beb9 chore: handle callback warning 2026-01-12 11:33:18 +08:00
163540bf4a chore: handle refetch after created 2026-01-12 11:30:03 +08:00
221130b448 chore: remove old i18n 2026-01-12 10:55:02 +08:00
b1eb265fa5 fix: try app not call conversations and sessions 2026-01-09 16:48:03 +08:00
c2a0950660 fix: button ui problem 2026-01-09 15:34:48 +08:00
bfe98009fd chore: fix dataset problems 2026-01-09 14:26:18 +08:00
ea1704d211 fix: try basic detail errors 2026-01-09 14:14:15 +08:00
3ed0937734 merge 2026-01-08 18:27:47 +08:00
1fcf6e4943 Update 2025_12_16_1817-03ea244985ce_add_type_column_not_null_default_tool.py 2025-12-17 11:12:59 +08:00
f4a7efde3d update migration script. 2025-12-16 18:30:12 +08:00
38d4f0fd96 Merge remote-tracking branch 'origin/deploy/dev' 2025-12-16 18:25:54 +08:00
ec4f885dad update migration script. 2025-12-16 18:19:24 +08:00
3781c2a025 [autofix.ci] apply automated fixes 2025-12-16 08:37:32 +00:00
3782f17dc7 Optimize code. 2025-12-16 16:35:15 +08:00
29698aeed2 Merge remote-tracking branch 'origin/deploy/dev' 2025-12-16 16:26:19 +08:00
15ff8efb15 merge alembic head 2025-12-16 16:20:04 +08:00
407e1c8276 [autofix.ci] apply automated fixes 2025-12-16 08:14:05 +00:00
e368825c21 Merge remote-tracking branch 'upstream/main' 2025-12-16 15:50:49 +08:00
8dad6b6a6d Add "type" field to PipelineRecommendedPlugin model; Add query param "type" to recommended-plugins api. 2025-12-16 14:34:59 +08:00
2f54965a72 Add "type" field to PipelineRecommendedPlugin model; Add query param "type" to recommended-plugins api. 2025-12-16 10:43:45 +08:00
a1a3fa0283 Add "type" field to PipelineRecommendedPlugin model; Add query param "type" to recommended-plugins api. 2025-12-15 16:44:32 +08:00
ff7344f3d3 Add "type" field to PipelineRecommendedPlugin model; Add query param "type" to recommended-plugins api. 2025-12-15 16:38:44 +08:00
bcd33be22a Add "type" field to PipelineRecommendedPlugin model; Add query param "type" to recommended-plugins api. 2025-12-15 16:33:06 +08:00
0fb339ca4f fix: saved message 2025-11-18 11:38:12 +08:00
c1871e67aa chore: hide disabed action in try app 2025-11-18 11:28:13 +08:00
f711f9a317 fix: webapp url 2025-11-18 11:22:58 +08:00
9ff3310cb6 chore: handle suggestion readonly 2025-11-18 11:07:01 +08:00
b6bdcc7052 fix: not auther tool in readonly mode 2025-11-18 11:02:46 +08:00
67b0771081 fix: try app not ok in chat 2025-11-17 18:21:43 +08:00
9a07488da9 mrege 2025-11-17 15:42:56 +08:00
ef043c6906 fix: no app not show problem 2025-11-06 14:53:11 +08:00
ab814e3eac fix: inputs overwrite by curr item 2025-10-27 14:08:32 +08:00
a0e1eeb3f1 chore: reset form 2025-10-27 13:57:16 +08:00
b1ebeb67a7 feat: support new chat 2025-10-27 13:50:36 +08:00
082179f70f fix: try chat has not set converstaion 2025-10-27 13:38:41 +08:00
8786ebdbca feat: support use tempalte in create app 2025-10-27 10:58:57 +08:00
b49a4eab62 feat: add app list context 2025-10-24 18:33:54 +08:00
0a7b59f500 feat: add tool requirements to flow 2025-10-24 17:49:29 +08:00
c264d9152f chore: add advanced models 2025-10-24 17:42:38 +08:00
3bf9d898c0 feat: basic app requirements 2025-10-24 17:29:42 +08:00
a7f2849e74 fix: try chatbot ui 2025-10-24 16:22:01 +08:00
0957ece92f fix: the try app always use the curent conversation 2025-10-24 15:57:33 +08:00
949bf38d3c fix: chat setup ui 2025-10-24 15:30:53 +08:00
7bafb7f959 feat: chat info 2025-10-24 14:54:06 +08:00
9735f55ca4 feat: try app alert and i18n 2025-10-24 14:00:24 +08:00
4c1f9b949b feat: alert info and lodash to lodash-es 2025-10-24 11:24:19 +08:00
0af0c94dde fix: preview not full 2025-10-24 10:52:05 +08:00
8e4f0640cc fix: variable readonly in basic app problem 2025-10-24 10:41:18 +08:00
1f513e3b43 chore: remove debug code 2025-10-23 18:26:38 +08:00
aa0841e2a8 chore: 18n 2025-10-23 18:05:34 +08:00
b6a1562357 fix: handle create can not show 2025-10-23 17:54:45 +08:00
bee0797401 feat: create from try app 2025-10-23 17:45:54 +08:00
e085f39c13 chore: description and category 2025-10-23 17:29:32 +08:00
344844d3e0 chore: handle data is large 2025-10-23 16:53:10 +08:00
6e9f82491d chore: reuse the app detail and right meta 2025-10-23 15:51:59 +08:00
372b1c3db8 chore: change detail icon 2025-10-23 15:28:12 +08:00
58d305dbed chore: tab header jp 2025-10-23 15:25:25 +08:00
0360a0416b feat: integration preview page 2025-10-23 15:23:50 +08:00
72282b6e8f feat: try app layout 2025-10-23 14:58:17 +08:00
8391884c4e chore: tab and close btn 2025-10-23 14:45:08 +08:00
b018f2b0a0 feat: can show app detail modal 2025-10-23 14:17:43 +08:00
ab56b4a818 merge main 2025-10-23 11:12:13 +08:00
61ebc756aa feat: workflow preview 2025-10-16 17:38:13 +08:00
4bea38042a feat: text completion form preview 2025-10-16 14:03:30 +08:00
337abc536b fix: update responsive breakpoint and adjust divider visibility in banner component 2025-10-16 13:47:38 +08:00
cc02b78aca feat: different app preview 2025-10-16 11:27:58 +08:00
18f2d24f8e chore: preview input field readonly 2025-10-16 10:42:47 +08:00
0c7b9a462f chore: tools preview readonly 2025-10-16 10:36:36 +08:00
4dd5580854 chore: preview two cols in panel 2025-10-15 18:16:57 +08:00
440bd825d8 feat: can show tools in preview 2025-10-15 17:35:59 +08:00
d2379c38bd chore: handle history panel and completion review crash 2025-10-15 17:35:59 +08:00
cbc55c577b Merge branch 'feat/support-free-try-app' of github.com:langgenius/dify into feat/support-free-try-app 2025-10-15 17:20:20 +08:00
8e962d15d1 feat: improve explore page banner component with enhanced layout and responsive styles 2025-10-15 17:20:00 +08:00
b07c766551 chroe: fix ts problem 2025-10-15 16:00:14 +08:00
9e3dd69277 fix: upload btn not sync right 2025-10-15 15:51:18 +08:00
db9e5665c2 fix: docuemnt and aduio show condition in preview 2025-10-15 15:35:49 +08:00
cad77ce0bf chore: audio config readonly 2025-10-15 15:29:09 +08:00
6f4518ebf7 chore: document readonly 2025-10-15 15:27:18 +08:00
a8f5748dee chore: vision readonly 2025-10-15 15:21:23 +08:00
738d3001be chore: chat input and feature readonly 2025-10-15 15:21:22 +08:00
df4e32aaa0 Merge branch 'feat/support-free-try-app' of github.com:langgenius/dify into feat/support-free-try-app 2025-10-15 14:36:47 +08:00
a25e37a96d feat: implement responsive design and resize handling for explore page banner 2025-10-15 14:36:27 +08:00
f156b46705 chore: user input readonly 2025-10-15 13:48:39 +08:00
3b64e118d0 chore: readonly ui 2025-10-15 11:39:41 +08:00
566cd20849 feat: dataset config support readonly 2025-10-15 11:37:12 +08:00
df76527f29 feat: add pause functionality to explore page banner for improved user interaction 2025-10-15 10:36:09 +08:00
53a80a5dbe feat: enhance explore page banner functionality with state management and animation improvements 2025-10-15 09:55:14 +08:00
1507792a0c Merge branch 'feat/support-free-try-app' of github.com:langgenius/dify into feat/support-free-try-app 2025-10-14 18:54:11 +08:00
00b9bbff75 feat: enhance explore page banner functionality with state management and animation improvements 2025-10-14 18:53:29 +08:00
e1f8b4b387 feat: support show dataset in knowledge 2025-10-14 18:31:42 +08:00
1539d86f7d chore: instruction and vars to readonly 2025-10-14 17:28:49 +08:00
67bb14d3ee chore: update dependencies and improve explore page banner 2025-10-14 15:51:07 +08:00
5653309080 feat: add carousel & new banner of explore page 2025-10-14 15:41:22 +08:00
0f52b34b61 feat: try apps basic app preveiw 2025-10-14 15:38:22 +08:00
75e35857c1 feat: add carousel & new banner of explore page 2025-10-14 14:17:49 +08:00
4f81be70e3 feat: no apps 2025-10-13 18:31:57 +08:00
1d4d627d05 feat: toogle sidebar 2025-10-13 17:36:24 +08:00
2357234f39 chore: sidebar ui 2025-10-13 17:11:51 +08:00
a3f7d8f996 chore: merge main 2025-10-13 16:38:29 +08:00
56f12e70c1 chore: web apps copywritings 2025-10-13 16:18:57 +08:00
b14afda160 chore: app gallary nav 2025-10-13 15:40:13 +08:00
44b4948972 chore: explore card ui and permission 2025-10-13 15:07:25 +08:00
487eac3b91 chore: add banner permission 2025-10-13 11:27:50 +08:00
84b2913cd9 feat: filter title 2025-10-13 11:12:10 +08:00
176d810c8d chore: update category ui 2025-10-13 10:55:49 +08:00
9e66564526 feat: banner placeholder 2025-10-11 15:07:03 +08:00
781a9a56cd feat: explore title change 2025-10-11 14:58:54 +08:00
93be1219eb chore: try app title 2025-10-11 11:00:26 +08:00
3276d6429d chore: handle completion acion 2025-10-11 10:53:24 +08:00
50072a63ae feat: support try agent app 2025-10-11 10:42:55 +08:00
1ab7e1cba8 fix: try chatflow run url problem 2025-10-11 10:11:14 +08:00
b0aef35c63 feat: try chat flow app 2025-10-10 18:24:56 +08:00
ac351b700c chore: some ui 2025-10-10 16:51:49 +08:00
d1e5d30ea9 fix: text generation api url 2025-10-10 16:39:42 +08:00
c73e84d992 feat: can show text completion run result pages 2025-10-10 16:34:10 +08:00
5f0bd5119a chore: temp 2025-09-24 13:39:52 +08:00
8353352bda chore: try app can use web app run 2025-09-22 15:17:11 +08:00
73845cbec5 feat: text generation 2025-09-19 16:32:11 +08:00
c2f94e9e8a feat: api call the try app and support disable feedback 2025-09-19 11:32:30 +08:00
e54efda36f feat: try app page 2025-09-18 14:54:15 +08:00
d4bd19f6d8 fix: api login detect problems 2025-09-17 17:15:23 +08:00
4decbbbf18 chore: remove useless api 2025-09-17 14:34:59 +08:00
b15867f92e chore: feedback api 2025-09-17 14:12:34 +08:00
a5e5fbc6e0 chore: some api change to new 2025-09-17 14:10:56 +08:00
1b1471b6d8 fix: stop response api 2025-09-17 14:07:15 +08:00
5280bffde2 feat: change api to new 2025-09-17 11:17:12 +08:00
db0fc94b39 chore: change api to support try apps 2025-09-16 18:21:23 +08:00
572 changed files with 39952 additions and 7843 deletions

1
.agent/skills Symbolic link
View File

@ -0,0 +1 @@
../.claude/skills

View File

@ -1,11 +1,4 @@
{
"enabledPlugins": {
"feature-dev@claude-plugins-official": true,
"context7@claude-plugins-official": true,
"typescript-lsp@claude-plugins-official": true,
"pyright-lsp@claude-plugins-official": true,
"ralph-loop@claude-plugins-official": true
},
"hooks": {
"PreToolUse": [
{
@ -18,5 +11,10 @@
]
}
]
},
"enabledPlugins": {
"feature-dev@claude-plugins-official": true,
"context7@claude-plugins-official": true,
"ralph-loop@claude-plugins-official": true
}
}

View File

@ -83,6 +83,9 @@ vi.mock('next/navigation', () => ({
usePathname: () => '/test',
}))
// ✅ Zustand stores: Use real stores (auto-mocked globally)
// Set test state with: useAppStore.setState({ ... })
// Shared state for mocks (if needed)
let mockSharedState = false
@ -296,7 +299,7 @@ For each test file generated, aim for:
For more detailed information, refer to:
- `references/workflow.md` - **Incremental testing workflow** (MUST READ for multi-file testing)
- `references/mocking.md` - Mock patterns and best practices
- `references/mocking.md` - Mock patterns, Zustand store testing, and best practices
- `references/async-testing.md` - Async operations and API calls
- `references/domain-components.md` - Workflow, Dataset, Configuration testing
- `references/common-patterns.md` - Frequently used testing patterns

View File

@ -37,16 +37,36 @@ Only mock these categories:
1. **Third-party libraries with side effects** - `next/navigation`, external SDKs
1. **i18n** - Always mock to return keys
### Zustand Stores - DO NOT Mock Manually
**Zustand is globally mocked** in `web/vitest.setup.ts`. Use real stores with `setState()`:
```typescript
// ✅ CORRECT: Use real store, set test state
import { useAppStore } from '@/app/components/app/store'
useAppStore.setState({ appDetail: { id: 'test', name: 'Test' } })
render(<MyComponent />)
// ❌ WRONG: Don't mock the store module
vi.mock('@/app/components/app/store', () => ({ ... }))
```
See [Zustand Store Testing](#zustand-store-testing) section for full details.
## Mock Placement
| Location | Purpose |
|----------|---------|
| `web/vitest.setup.ts` | Global mocks shared by all tests (for example `react-i18next`, `next/image`) |
| `web/vitest.setup.ts` | Global mocks shared by all tests (`react-i18next`, `next/image`, `zustand`) |
| `web/__mocks__/zustand.ts` | Zustand mock implementation (auto-resets stores after each test) |
| `web/__mocks__/` | Reusable mock factories shared across multiple test files |
| Test file | Test-specific mocks, inline with `vi.mock()` |
Modules are not mocked automatically. Use `vi.mock` in test files, or add global mocks in `web/vitest.setup.ts`.
**Note**: Zustand is special - it's globally mocked but you should NOT mock store modules manually. See [Zustand Store Testing](#zustand-store-testing).
## Essential Mocks
### 1. i18n (Auto-loaded via Global Mock)
@ -276,6 +296,7 @@ const renderWithQueryClient = (ui: React.ReactElement) => {
1. **Use real base components** - Import from `@/app/components/base/` directly
1. **Use real project components** - Prefer importing over mocking
1. **Use real Zustand stores** - Set test state via `store.setState()`
1. **Reset mocks in `beforeEach`**, not `afterEach`
1. **Match actual component behavior** in mocks (when mocking is necessary)
1. **Use factory functions** for complex mock data
@ -285,6 +306,7 @@ const renderWithQueryClient = (ui: React.ReactElement) => {
### ❌ DON'T
1. **Don't mock base components** (`Loading`, `Button`, `Tooltip`, etc.)
1. **Don't mock Zustand store modules** - Use real stores with `setState()`
1. Don't mock components you can import directly
1. Don't create overly simplified mocks that miss conditional logic
1. Don't forget to clean up nock after each test
@ -308,10 +330,151 @@ Need to use a component in test?
├─ Is it a third-party lib with side effects?
│ └─ YES → Mock it (next/navigation, external SDKs)
├─ Is it a Zustand store?
│ └─ YES → DO NOT mock the module!
│ Use real store + setState() to set test state
│ (Global mock handles auto-reset)
└─ Is it i18n?
└─ YES → Uses shared mock (auto-loaded). Override only for custom translations
```
## Zustand Store Testing
### Global Zustand Mock (Auto-loaded)
Zustand is globally mocked in `web/vitest.setup.ts` following the [official Zustand testing guide](https://zustand.docs.pmnd.rs/guides/testing). The mock in `web/__mocks__/zustand.ts` provides:
- Real store behavior with `getState()`, `setState()`, `subscribe()` methods
- Automatic store reset after each test via `afterEach`
- Proper test isolation between tests
### ✅ Recommended: Use Real Stores (Official Best Practice)
**DO NOT mock store modules manually.** Import and use the real store, then use `setState()` to set test state:
```typescript
// ✅ CORRECT: Use real store with setState
import { useAppStore } from '@/app/components/app/store'
describe('MyComponent', () => {
it('should render app details', () => {
// Arrange: Set test state via setState
useAppStore.setState({
appDetail: {
id: 'test-app',
name: 'Test App',
mode: 'chat',
},
})
// Act
render(<MyComponent />)
// Assert
expect(screen.getByText('Test App')).toBeInTheDocument()
// Can also verify store state directly
expect(useAppStore.getState().appDetail?.name).toBe('Test App')
})
// No cleanup needed - global mock auto-resets after each test
})
```
### ❌ Avoid: Manual Store Module Mocking
Manual mocking conflicts with the global Zustand mock and loses store functionality:
```typescript
// ❌ WRONG: Don't mock the store module
vi.mock('@/app/components/app/store', () => ({
useStore: (selector) => mockSelector(selector), // Missing getState, setState!
}))
// ❌ WRONG: This conflicts with global zustand mock
vi.mock('@/app/components/workflow/store', () => ({
useWorkflowStore: vi.fn(() => mockState),
}))
```
**Problems with manual mocking:**
1. Loses `getState()`, `setState()`, `subscribe()` methods
1. Conflicts with global Zustand mock behavior
1. Requires manual maintenance of store API
1. Tests don't reflect actual store behavior
### When Manual Store Mocking is Necessary
In rare cases where the store has complex initialization or side effects, you can mock it, but ensure you provide the full store API:
```typescript
// If you MUST mock (rare), include full store API
const mockStore = {
appDetail: { id: 'test', name: 'Test' },
setAppDetail: vi.fn(),
}
vi.mock('@/app/components/app/store', () => ({
useStore: Object.assign(
(selector: (state: typeof mockStore) => unknown) => selector(mockStore),
{
getState: () => mockStore,
setState: vi.fn(),
subscribe: vi.fn(),
},
),
}))
```
### Store Testing Decision Tree
```
Need to test a component using Zustand store?
├─ Can you use the real store?
│ └─ YES → Use real store + setState (RECOMMENDED)
│ useAppStore.setState({ ... })
├─ Does the store have complex initialization/side effects?
│ └─ YES → Consider mocking, but include full API
│ (getState, setState, subscribe)
└─ Are you testing the store itself (not a component)?
└─ YES → Test store directly with getState/setState
const store = useMyStore
store.setState({ count: 0 })
store.getState().increment()
expect(store.getState().count).toBe(1)
```
### Example: Testing Store Actions
```typescript
import { useCounterStore } from '@/stores/counter'
describe('Counter Store', () => {
it('should increment count', () => {
// Initial state (auto-reset by global mock)
expect(useCounterStore.getState().count).toBe(0)
// Call action
useCounterStore.getState().increment()
// Verify state change
expect(useCounterStore.getState().count).toBe(1)
})
it('should reset to initial state', () => {
// Set some state
useCounterStore.setState({ count: 100 })
expect(useCounterStore.getState().count).toBe(100)
// After this test, global mock will reset to initial state
})
})
```
## Factory Function Pattern
```typescript

View File

@ -0,0 +1,46 @@
---
name: orpc-contract-first
description: Guide for implementing oRPC contract-first API patterns in Dify frontend. Triggers when creating new API contracts, adding service endpoints, integrating TanStack Query with typed contracts, or migrating legacy service calls to oRPC. Use for all API layer work in web/contract and web/service directories.
---
# oRPC Contract-First Development
## Project Structure
```
web/contract/
├── base.ts # Base contract (inputStructure: 'detailed')
├── router.ts # Router composition & type exports
├── marketplace.ts # Marketplace contracts
└── console/ # Console contracts by domain
├── system.ts
└── billing.ts
```
## Workflow
1. **Create contract** in `web/contract/console/{domain}.ts`
- Import `base` from `../base` and `type` from `@orpc/contract`
- Define route with `path`, `method`, `input`, `output`
2. **Register in router** at `web/contract/router.ts`
- Import directly from domain file (no barrel files)
- Nest by API prefix: `billing: { invoices, bindPartnerStack }`
3. **Create hooks** in `web/service/use-{domain}.ts`
- Use `consoleQuery.{group}.{contract}.queryKey()` for query keys
- Use `consoleClient.{group}.{contract}()` for API calls
## Key Rules
- **Input structure**: Always use `{ params, query?, body? }` format
- **Path params**: Use `{paramName}` in path, match in `params` object
- **Router nesting**: Group by API prefix (e.g., `/billing/*``billing: {}`)
- **No barrel files**: Import directly from specific files
- **Types**: Import from `@/types/`, use `type<T>()` helper
## Type Export
```typescript
export type ConsoleInputs = InferContractRouterInputs<typeof consoleRouterContract>
```

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,125 @@
---
name: vercel-react-best-practices
description: React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.
license: MIT
metadata:
author: vercel
version: "1.0.0"
---
# Vercel React Best Practices
Comprehensive performance optimization guide for React and Next.js applications, maintained by Vercel. Contains 45 rules across 8 categories, prioritized by impact to guide automated refactoring and code generation.
## When to Apply
Reference these guidelines when:
- Writing new React components or Next.js pages
- Implementing data fetching (client or server-side)
- Reviewing code for performance issues
- Refactoring existing React/Next.js code
- Optimizing bundle size or load times
## Rule Categories by Priority
| Priority | Category | Impact | Prefix |
|----------|----------|--------|--------|
| 1 | Eliminating Waterfalls | CRITICAL | `async-` |
| 2 | Bundle Size Optimization | CRITICAL | `bundle-` |
| 3 | Server-Side Performance | HIGH | `server-` |
| 4 | Client-Side Data Fetching | MEDIUM-HIGH | `client-` |
| 5 | Re-render Optimization | MEDIUM | `rerender-` |
| 6 | Rendering Performance | MEDIUM | `rendering-` |
| 7 | JavaScript Performance | LOW-MEDIUM | `js-` |
| 8 | Advanced Patterns | LOW | `advanced-` |
## Quick Reference
### 1. Eliminating Waterfalls (CRITICAL)
- `async-defer-await` - Move await into branches where actually used
- `async-parallel` - Use Promise.all() for independent operations
- `async-dependencies` - Use better-all for partial dependencies
- `async-api-routes` - Start promises early, await late in API routes
- `async-suspense-boundaries` - Use Suspense to stream content
### 2. Bundle Size Optimization (CRITICAL)
- `bundle-barrel-imports` - Import directly, avoid barrel files
- `bundle-dynamic-imports` - Use next/dynamic for heavy components
- `bundle-defer-third-party` - Load analytics/logging after hydration
- `bundle-conditional` - Load modules only when feature is activated
- `bundle-preload` - Preload on hover/focus for perceived speed
### 3. Server-Side Performance (HIGH)
- `server-cache-react` - Use React.cache() for per-request deduplication
- `server-cache-lru` - Use LRU cache for cross-request caching
- `server-serialization` - Minimize data passed to client components
- `server-parallel-fetching` - Restructure components to parallelize fetches
- `server-after-nonblocking` - Use after() for non-blocking operations
### 4. Client-Side Data Fetching (MEDIUM-HIGH)
- `client-swr-dedup` - Use SWR for automatic request deduplication
- `client-event-listeners` - Deduplicate global event listeners
### 5. Re-render Optimization (MEDIUM)
- `rerender-defer-reads` - Don't subscribe to state only used in callbacks
- `rerender-memo` - Extract expensive work into memoized components
- `rerender-dependencies` - Use primitive dependencies in effects
- `rerender-derived-state` - Subscribe to derived booleans, not raw values
- `rerender-functional-setstate` - Use functional setState for stable callbacks
- `rerender-lazy-state-init` - Pass function to useState for expensive values
- `rerender-transitions` - Use startTransition for non-urgent updates
### 6. Rendering Performance (MEDIUM)
- `rendering-animate-svg-wrapper` - Animate div wrapper, not SVG element
- `rendering-content-visibility` - Use content-visibility for long lists
- `rendering-hoist-jsx` - Extract static JSX outside components
- `rendering-svg-precision` - Reduce SVG coordinate precision
- `rendering-hydration-no-flicker` - Use inline script for client-only data
- `rendering-activity` - Use Activity component for show/hide
- `rendering-conditional-render` - Use ternary, not && for conditionals
### 7. JavaScript Performance (LOW-MEDIUM)
- `js-batch-dom-css` - Group CSS changes via classes or cssText
- `js-index-maps` - Build Map for repeated lookups
- `js-cache-property-access` - Cache object properties in loops
- `js-cache-function-results` - Cache function results in module-level Map
- `js-cache-storage` - Cache localStorage/sessionStorage reads
- `js-combine-iterations` - Combine multiple filter/map into one loop
- `js-length-check-first` - Check array length before expensive comparison
- `js-early-exit` - Return early from functions
- `js-hoist-regexp` - Hoist RegExp creation outside loops
- `js-min-max-loop` - Use loop for min/max instead of sort
- `js-set-map-lookups` - Use Set/Map for O(1) lookups
- `js-tosorted-immutable` - Use toSorted() for immutability
### 8. Advanced Patterns (LOW)
- `advanced-event-handler-refs` - Store event handlers in refs
- `advanced-use-latest` - useLatest for stable callback refs
## How to Use
Read individual rule files for detailed explanations and code examples:
```
rules/async-parallel.md
rules/bundle-barrel-imports.md
rules/_sections.md
```
Each rule file contains:
- Brief explanation of why it matters
- Incorrect code example with explanation
- Correct code example with explanation
- Additional context and references
## Full Compiled Document
For the complete guide with all rules expanded: `AGENTS.md`

View File

@ -0,0 +1,55 @@
---
title: Store Event Handlers in Refs
impact: LOW
impactDescription: stable subscriptions
tags: advanced, hooks, refs, event-handlers, optimization
---
## Store Event Handlers in Refs
Store callbacks in refs when used in effects that shouldn't re-subscribe on callback changes.
**Incorrect (re-subscribes on every render):**
```tsx
function useWindowEvent(event: string, handler: (e) => void) {
useEffect(() => {
window.addEventListener(event, handler)
return () => window.removeEventListener(event, handler)
}, [event, handler])
}
```
**Correct (stable subscription):**
```tsx
function useWindowEvent(event: string, handler: (e) => void) {
const handlerRef = useRef(handler)
useEffect(() => {
handlerRef.current = handler
}, [handler])
useEffect(() => {
const listener = (e) => handlerRef.current(e)
window.addEventListener(event, listener)
return () => window.removeEventListener(event, listener)
}, [event])
}
```
**Alternative: use `useEffectEvent` if you're on latest React:**
```tsx
import { useEffectEvent } from 'react'
function useWindowEvent(event: string, handler: (e) => void) {
const onEvent = useEffectEvent(handler)
useEffect(() => {
window.addEventListener(event, onEvent)
return () => window.removeEventListener(event, onEvent)
}, [event])
}
```
`useEffectEvent` provides a cleaner API for the same pattern: it creates a stable function reference that always calls the latest version of the handler.

View File

@ -0,0 +1,49 @@
---
title: useLatest for Stable Callback Refs
impact: LOW
impactDescription: prevents effect re-runs
tags: advanced, hooks, useLatest, refs, optimization
---
## useLatest for Stable Callback Refs
Access latest values in callbacks without adding them to dependency arrays. Prevents effect re-runs while avoiding stale closures.
**Implementation:**
```typescript
function useLatest<T>(value: T) {
const ref = useRef(value)
useLayoutEffect(() => {
ref.current = value
}, [value])
return ref
}
```
**Incorrect (effect re-runs on every callback change):**
```tsx
function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
const [query, setQuery] = useState('')
useEffect(() => {
const timeout = setTimeout(() => onSearch(query), 300)
return () => clearTimeout(timeout)
}, [query, onSearch])
}
```
**Correct (stable effect, fresh callback):**
```tsx
function SearchInput({ onSearch }: { onSearch: (q: string) => void }) {
const [query, setQuery] = useState('')
const onSearchRef = useLatest(onSearch)
useEffect(() => {
const timeout = setTimeout(() => onSearchRef.current(query), 300)
return () => clearTimeout(timeout)
}, [query])
}
```

View File

@ -0,0 +1,38 @@
---
title: Prevent Waterfall Chains in API Routes
impact: CRITICAL
impactDescription: 2-10× improvement
tags: api-routes, server-actions, waterfalls, parallelization
---
## Prevent Waterfall Chains in API Routes
In API routes and Server Actions, start independent operations immediately, even if you don't await them yet.
**Incorrect (config waits for auth, data waits for both):**
```typescript
export async function GET(request: Request) {
const session = await auth()
const config = await fetchConfig()
const data = await fetchData(session.user.id)
return Response.json({ data, config })
}
```
**Correct (auth and config start immediately):**
```typescript
export async function GET(request: Request) {
const sessionPromise = auth()
const configPromise = fetchConfig()
const session = await sessionPromise
const [config, data] = await Promise.all([
configPromise,
fetchData(session.user.id)
])
return Response.json({ data, config })
}
```
For operations with more complex dependency chains, use `better-all` to automatically maximize parallelism (see Dependency-Based Parallelization).

View File

@ -0,0 +1,80 @@
---
title: Defer Await Until Needed
impact: HIGH
impactDescription: avoids blocking unused code paths
tags: async, await, conditional, optimization
---
## Defer Await Until Needed
Move `await` operations into the branches where they're actually used to avoid blocking code paths that don't need them.
**Incorrect (blocks both branches):**
```typescript
async function handleRequest(userId: string, skipProcessing: boolean) {
const userData = await fetchUserData(userId)
if (skipProcessing) {
// Returns immediately but still waited for userData
return { skipped: true }
}
// Only this branch uses userData
return processUserData(userData)
}
```
**Correct (only blocks when needed):**
```typescript
async function handleRequest(userId: string, skipProcessing: boolean) {
if (skipProcessing) {
// Returns immediately without waiting
return { skipped: true }
}
// Fetch only when needed
const userData = await fetchUserData(userId)
return processUserData(userData)
}
```
**Another example (early return optimization):**
```typescript
// Incorrect: always fetches permissions
async function updateResource(resourceId: string, userId: string) {
const permissions = await fetchPermissions(userId)
const resource = await getResource(resourceId)
if (!resource) {
return { error: 'Not found' }
}
if (!permissions.canEdit) {
return { error: 'Forbidden' }
}
return await updateResourceData(resource, permissions)
}
// Correct: fetches only when needed
async function updateResource(resourceId: string, userId: string) {
const resource = await getResource(resourceId)
if (!resource) {
return { error: 'Not found' }
}
const permissions = await fetchPermissions(userId)
if (!permissions.canEdit) {
return { error: 'Forbidden' }
}
return await updateResourceData(resource, permissions)
}
```
This optimization is especially valuable when the skipped branch is frequently taken, or when the deferred operation is expensive.

View File

@ -0,0 +1,36 @@
---
title: Dependency-Based Parallelization
impact: CRITICAL
impactDescription: 2-10× improvement
tags: async, parallelization, dependencies, better-all
---
## Dependency-Based Parallelization
For operations with partial dependencies, use `better-all` to maximize parallelism. It automatically starts each task at the earliest possible moment.
**Incorrect (profile waits for config unnecessarily):**
```typescript
const [user, config] = await Promise.all([
fetchUser(),
fetchConfig()
])
const profile = await fetchProfile(user.id)
```
**Correct (config and profile run in parallel):**
```typescript
import { all } from 'better-all'
const { user, config, profile } = await all({
async user() { return fetchUser() },
async config() { return fetchConfig() },
async profile() {
return fetchProfile((await this.$.user).id)
}
})
```
Reference: [https://github.com/shuding/better-all](https://github.com/shuding/better-all)

View File

@ -0,0 +1,28 @@
---
title: Promise.all() for Independent Operations
impact: CRITICAL
impactDescription: 2-10× improvement
tags: async, parallelization, promises, waterfalls
---
## Promise.all() for Independent Operations
When async operations have no interdependencies, execute them concurrently using `Promise.all()`.
**Incorrect (sequential execution, 3 round trips):**
```typescript
const user = await fetchUser()
const posts = await fetchPosts()
const comments = await fetchComments()
```
**Correct (parallel execution, 1 round trip):**
```typescript
const [user, posts, comments] = await Promise.all([
fetchUser(),
fetchPosts(),
fetchComments()
])
```

View File

@ -0,0 +1,99 @@
---
title: Strategic Suspense Boundaries
impact: HIGH
impactDescription: faster initial paint
tags: async, suspense, streaming, layout-shift
---
## Strategic Suspense Boundaries
Instead of awaiting data in async components before returning JSX, use Suspense boundaries to show the wrapper UI faster while data loads.
**Incorrect (wrapper blocked by data fetching):**
```tsx
async function Page() {
const data = await fetchData() // Blocks entire page
return (
<div>
<div>Sidebar</div>
<div>Header</div>
<div>
<DataDisplay data={data} />
</div>
<div>Footer</div>
</div>
)
}
```
The entire layout waits for data even though only the middle section needs it.
**Correct (wrapper shows immediately, data streams in):**
```tsx
function Page() {
return (
<div>
<div>Sidebar</div>
<div>Header</div>
<div>
<Suspense fallback={<Skeleton />}>
<DataDisplay />
</Suspense>
</div>
<div>Footer</div>
</div>
)
}
async function DataDisplay() {
const data = await fetchData() // Only blocks this component
return <div>{data.content}</div>
}
```
Sidebar, Header, and Footer render immediately. Only DataDisplay waits for data.
**Alternative (share promise across components):**
```tsx
function Page() {
// Start fetch immediately, but don't await
const dataPromise = fetchData()
return (
<div>
<div>Sidebar</div>
<div>Header</div>
<Suspense fallback={<Skeleton />}>
<DataDisplay dataPromise={dataPromise} />
<DataSummary dataPromise={dataPromise} />
</Suspense>
<div>Footer</div>
</div>
)
}
function DataDisplay({ dataPromise }: { dataPromise: Promise<Data> }) {
const data = use(dataPromise) // Unwraps the promise
return <div>{data.content}</div>
}
function DataSummary({ dataPromise }: { dataPromise: Promise<Data> }) {
const data = use(dataPromise) // Reuses the same promise
return <div>{data.summary}</div>
}
```
Both components share the same promise, so only one fetch occurs. Layout renders immediately while both components wait together.
**When NOT to use this pattern:**
- Critical data needed for layout decisions (affects positioning)
- SEO-critical content above the fold
- Small, fast queries where suspense overhead isn't worth it
- When you want to avoid layout shift (loading → content jump)
**Trade-off:** Faster initial paint vs potential layout shift. Choose based on your UX priorities.

View File

@ -0,0 +1,59 @@
---
title: Avoid Barrel File Imports
impact: CRITICAL
impactDescription: 200-800ms import cost, slow builds
tags: bundle, imports, tree-shaking, barrel-files, performance
---
## Avoid Barrel File Imports
Import directly from source files instead of barrel files to avoid loading thousands of unused modules. **Barrel files** are entry points that re-export multiple modules (e.g., `index.js` that does `export * from './module'`).
Popular icon and component libraries can have **up to 10,000 re-exports** in their entry file. For many React packages, **it takes 200-800ms just to import them**, affecting both development speed and production cold starts.
**Why tree-shaking doesn't help:** When a library is marked as external (not bundled), the bundler can't optimize it. If you bundle it to enable tree-shaking, builds become substantially slower analyzing the entire module graph.
**Incorrect (imports entire library):**
```tsx
import { Check, X, Menu } from 'lucide-react'
// Loads 1,583 modules, takes ~2.8s extra in dev
// Runtime cost: 200-800ms on every cold start
import { Button, TextField } from '@mui/material'
// Loads 2,225 modules, takes ~4.2s extra in dev
```
**Correct (imports only what you need):**
```tsx
import Check from 'lucide-react/dist/esm/icons/check'
import X from 'lucide-react/dist/esm/icons/x'
import Menu from 'lucide-react/dist/esm/icons/menu'
// Loads only 3 modules (~2KB vs ~1MB)
import Button from '@mui/material/Button'
import TextField from '@mui/material/TextField'
// Loads only what you use
```
**Alternative (Next.js 13.5+):**
```js
// next.config.js - use optimizePackageImports
module.exports = {
experimental: {
optimizePackageImports: ['lucide-react', '@mui/material']
}
}
// Then you can keep the ergonomic barrel imports:
import { Check, X, Menu } from 'lucide-react'
// Automatically transformed to direct imports at build time
```
Direct imports provide 15-70% faster dev boot, 28% faster builds, 40% faster cold starts, and significantly faster HMR.
Libraries commonly affected: `lucide-react`, `@mui/material`, `@mui/icons-material`, `@tabler/icons-react`, `react-icons`, `@headlessui/react`, `@radix-ui/react-*`, `lodash`, `ramda`, `date-fns`, `rxjs`, `react-use`.
Reference: [How we optimized package imports in Next.js](https://vercel.com/blog/how-we-optimized-package-imports-in-next-js)

View File

@ -0,0 +1,31 @@
---
title: Conditional Module Loading
impact: HIGH
impactDescription: loads large data only when needed
tags: bundle, conditional-loading, lazy-loading
---
## Conditional Module Loading
Load large data or modules only when a feature is activated.
**Example (lazy-load animation frames):**
```tsx
function AnimationPlayer({ enabled, setEnabled }: { enabled: boolean; setEnabled: React.Dispatch<React.SetStateAction<boolean>> }) {
const [frames, setFrames] = useState<Frame[] | null>(null)
useEffect(() => {
if (enabled && !frames && typeof window !== 'undefined') {
import('./animation-frames.js')
.then(mod => setFrames(mod.frames))
.catch(() => setEnabled(false))
}
}, [enabled, frames, setEnabled])
if (!frames) return <Skeleton />
return <Canvas frames={frames} />
}
```
The `typeof window !== 'undefined'` check prevents bundling this module for SSR, optimizing server bundle size and build speed.

View File

@ -0,0 +1,49 @@
---
title: Defer Non-Critical Third-Party Libraries
impact: MEDIUM
impactDescription: loads after hydration
tags: bundle, third-party, analytics, defer
---
## Defer Non-Critical Third-Party Libraries
Analytics, logging, and error tracking don't block user interaction. Load them after hydration.
**Incorrect (blocks initial bundle):**
```tsx
import { Analytics } from '@vercel/analytics/react'
export default function RootLayout({ children }) {
return (
<html>
<body>
{children}
<Analytics />
</body>
</html>
)
}
```
**Correct (loads after hydration):**
```tsx
import dynamic from 'next/dynamic'
const Analytics = dynamic(
() => import('@vercel/analytics/react').then(m => m.Analytics),
{ ssr: false }
)
export default function RootLayout({ children }) {
return (
<html>
<body>
{children}
<Analytics />
</body>
</html>
)
}
```

View File

@ -0,0 +1,35 @@
---
title: Dynamic Imports for Heavy Components
impact: CRITICAL
impactDescription: directly affects TTI and LCP
tags: bundle, dynamic-import, code-splitting, next-dynamic
---
## Dynamic Imports for Heavy Components
Use `next/dynamic` to lazy-load large components not needed on initial render.
**Incorrect (Monaco bundles with main chunk ~300KB):**
```tsx
import { MonacoEditor } from './monaco-editor'
function CodePanel({ code }: { code: string }) {
return <MonacoEditor value={code} />
}
```
**Correct (Monaco loads on demand):**
```tsx
import dynamic from 'next/dynamic'
const MonacoEditor = dynamic(
() => import('./monaco-editor').then(m => m.MonacoEditor),
{ ssr: false }
)
function CodePanel({ code }: { code: string }) {
return <MonacoEditor value={code} />
}
```

View File

@ -0,0 +1,50 @@
---
title: Preload Based on User Intent
impact: MEDIUM
impactDescription: reduces perceived latency
tags: bundle, preload, user-intent, hover
---
## Preload Based on User Intent
Preload heavy bundles before they're needed to reduce perceived latency.
**Example (preload on hover/focus):**
```tsx
function EditorButton({ onClick }: { onClick: () => void }) {
const preload = () => {
if (typeof window !== 'undefined') {
void import('./monaco-editor')
}
}
return (
<button
onMouseEnter={preload}
onFocus={preload}
onClick={onClick}
>
Open Editor
</button>
)
}
```
**Example (preload when feature flag is enabled):**
```tsx
function FlagsProvider({ children, flags }: Props) {
useEffect(() => {
if (flags.editorEnabled && typeof window !== 'undefined') {
void import('./monaco-editor').then(mod => mod.init())
}
}, [flags.editorEnabled])
return <FlagsContext.Provider value={flags}>
{children}
</FlagsContext.Provider>
}
```
The `typeof window !== 'undefined'` check prevents bundling preloaded modules for SSR, optimizing server bundle size and build speed.

View File

@ -0,0 +1,74 @@
---
title: Deduplicate Global Event Listeners
impact: LOW
impactDescription: single listener for N components
tags: client, swr, event-listeners, subscription
---
## Deduplicate Global Event Listeners
Use `useSWRSubscription()` to share global event listeners across component instances.
**Incorrect (N instances = N listeners):**
```tsx
function useKeyboardShortcut(key: string, callback: () => void) {
useEffect(() => {
const handler = (e: KeyboardEvent) => {
if (e.metaKey && e.key === key) {
callback()
}
}
window.addEventListener('keydown', handler)
return () => window.removeEventListener('keydown', handler)
}, [key, callback])
}
```
When using the `useKeyboardShortcut` hook multiple times, each instance will register a new listener.
**Correct (N instances = 1 listener):**
```tsx
import useSWRSubscription from 'swr/subscription'
// Module-level Map to track callbacks per key
const keyCallbacks = new Map<string, Set<() => void>>()
function useKeyboardShortcut(key: string, callback: () => void) {
// Register this callback in the Map
useEffect(() => {
if (!keyCallbacks.has(key)) {
keyCallbacks.set(key, new Set())
}
keyCallbacks.get(key)!.add(callback)
return () => {
const set = keyCallbacks.get(key)
if (set) {
set.delete(callback)
if (set.size === 0) {
keyCallbacks.delete(key)
}
}
}
}, [key, callback])
useSWRSubscription('global-keydown', () => {
const handler = (e: KeyboardEvent) => {
if (e.metaKey && keyCallbacks.has(e.key)) {
keyCallbacks.get(e.key)!.forEach(cb => cb())
}
}
window.addEventListener('keydown', handler)
return () => window.removeEventListener('keydown', handler)
})
}
function Profile() {
// Multiple shortcuts will share the same listener
useKeyboardShortcut('p', () => { /* ... */ })
useKeyboardShortcut('k', () => { /* ... */ })
// ...
}
```

View File

@ -0,0 +1,71 @@
---
title: Version and Minimize localStorage Data
impact: MEDIUM
impactDescription: prevents schema conflicts, reduces storage size
tags: client, localStorage, storage, versioning, data-minimization
---
## Version and Minimize localStorage Data
Add version prefix to keys and store only needed fields. Prevents schema conflicts and accidental storage of sensitive data.
**Incorrect:**
```typescript
// No version, stores everything, no error handling
localStorage.setItem('userConfig', JSON.stringify(fullUserObject))
const data = localStorage.getItem('userConfig')
```
**Correct:**
```typescript
const VERSION = 'v2'
function saveConfig(config: { theme: string; language: string }) {
try {
localStorage.setItem(`userConfig:${VERSION}`, JSON.stringify(config))
} catch {
// Throws in incognito/private browsing, quota exceeded, or disabled
}
}
function loadConfig() {
try {
const data = localStorage.getItem(`userConfig:${VERSION}`)
return data ? JSON.parse(data) : null
} catch {
return null
}
}
// Migration from v1 to v2
function migrate() {
try {
const v1 = localStorage.getItem('userConfig:v1')
if (v1) {
const old = JSON.parse(v1)
saveConfig({ theme: old.darkMode ? 'dark' : 'light', language: old.lang })
localStorage.removeItem('userConfig:v1')
}
} catch {}
}
```
**Store minimal fields from server responses:**
```typescript
// User object has 20+ fields, only store what UI needs
function cachePrefs(user: FullUser) {
try {
localStorage.setItem('prefs:v1', JSON.stringify({
theme: user.preferences.theme,
notifications: user.preferences.notifications
}))
} catch {}
}
```
**Always wrap in try-catch:** `getItem()` and `setItem()` throw in incognito/private browsing (Safari, Firefox), when quota exceeded, or when disabled.
**Benefits:** Schema evolution via versioning, reduced storage size, prevents storing tokens/PII/internal flags.

View File

@ -0,0 +1,48 @@
---
title: Use Passive Event Listeners for Scrolling Performance
impact: MEDIUM
impactDescription: eliminates scroll delay caused by event listeners
tags: client, event-listeners, scrolling, performance, touch, wheel
---
## Use Passive Event Listeners for Scrolling Performance
Add `{ passive: true }` to touch and wheel event listeners to enable immediate scrolling. Browsers normally wait for listeners to finish to check if `preventDefault()` is called, causing scroll delay.
**Incorrect:**
```typescript
useEffect(() => {
const handleTouch = (e: TouchEvent) => console.log(e.touches[0].clientX)
const handleWheel = (e: WheelEvent) => console.log(e.deltaY)
document.addEventListener('touchstart', handleTouch)
document.addEventListener('wheel', handleWheel)
return () => {
document.removeEventListener('touchstart', handleTouch)
document.removeEventListener('wheel', handleWheel)
}
}, [])
```
**Correct:**
```typescript
useEffect(() => {
const handleTouch = (e: TouchEvent) => console.log(e.touches[0].clientX)
const handleWheel = (e: WheelEvent) => console.log(e.deltaY)
document.addEventListener('touchstart', handleTouch, { passive: true })
document.addEventListener('wheel', handleWheel, { passive: true })
return () => {
document.removeEventListener('touchstart', handleTouch)
document.removeEventListener('wheel', handleWheel)
}
}, [])
```
**Use passive when:** tracking/analytics, logging, any listener that doesn't call `preventDefault()`.
**Don't use passive when:** implementing custom swipe gestures, custom zoom controls, or any listener that needs `preventDefault()`.

View File

@ -0,0 +1,56 @@
---
title: Use SWR for Automatic Deduplication
impact: MEDIUM-HIGH
impactDescription: automatic deduplication
tags: client, swr, deduplication, data-fetching
---
## Use SWR for Automatic Deduplication
SWR enables request deduplication, caching, and revalidation across component instances.
**Incorrect (no deduplication, each instance fetches):**
```tsx
function UserList() {
const [users, setUsers] = useState([])
useEffect(() => {
fetch('/api/users')
.then(r => r.json())
.then(setUsers)
}, [])
}
```
**Correct (multiple instances share one request):**
```tsx
import useSWR from 'swr'
function UserList() {
const { data: users } = useSWR('/api/users', fetcher)
}
```
**For immutable data:**
```tsx
import { useImmutableSWR } from '@/lib/swr'
function StaticContent() {
const { data } = useImmutableSWR('/api/config', fetcher)
}
```
**For mutations:**
```tsx
import { useSWRMutation } from 'swr/mutation'
function UpdateButton() {
const { trigger } = useSWRMutation('/api/user', updateUser)
return <button onClick={() => trigger()}>Update</button>
}
```
Reference: [https://swr.vercel.app](https://swr.vercel.app)

View File

@ -0,0 +1,57 @@
---
title: Batch DOM CSS Changes
impact: MEDIUM
impactDescription: reduces reflows/repaints
tags: javascript, dom, css, performance, reflow
---
## Batch DOM CSS Changes
Avoid interleaving style writes with layout reads. When you read a layout property (like `offsetWidth`, `getBoundingClientRect()`, or `getComputedStyle()`) between style changes, the browser is forced to trigger a synchronous reflow.
**Incorrect (interleaved reads and writes force reflows):**
```typescript
function updateElementStyles(element: HTMLElement) {
element.style.width = '100px'
const width = element.offsetWidth // Forces reflow
element.style.height = '200px'
const height = element.offsetHeight // Forces another reflow
}
```
**Correct (batch writes, then read once):**
```typescript
function updateElementStyles(element: HTMLElement) {
// Batch all writes together
element.style.width = '100px'
element.style.height = '200px'
element.style.backgroundColor = 'blue'
element.style.border = '1px solid black'
// Read after all writes are done (single reflow)
const { width, height } = element.getBoundingClientRect()
}
```
**Better: use CSS classes**
```css
.highlighted-box {
width: 100px;
height: 200px;
background-color: blue;
border: 1px solid black;
}
```
```typescript
function updateElementStyles(element: HTMLElement) {
element.classList.add('highlighted-box')
const { width, height } = element.getBoundingClientRect()
}
```
Prefer CSS classes over inline styles when possible. CSS files are cached by the browser, and classes provide better separation of concerns and are easier to maintain.

View File

@ -0,0 +1,80 @@
---
title: Cache Repeated Function Calls
impact: MEDIUM
impactDescription: avoid redundant computation
tags: javascript, cache, memoization, performance
---
## Cache Repeated Function Calls
Use a module-level Map to cache function results when the same function is called repeatedly with the same inputs during render.
**Incorrect (redundant computation):**
```typescript
function ProjectList({ projects }: { projects: Project[] }) {
return (
<div>
{projects.map(project => {
// slugify() called 100+ times for same project names
const slug = slugify(project.name)
return <ProjectCard key={project.id} slug={slug} />
})}
</div>
)
}
```
**Correct (cached results):**
```typescript
// Module-level cache
const slugifyCache = new Map<string, string>()
function cachedSlugify(text: string): string {
if (slugifyCache.has(text)) {
return slugifyCache.get(text)!
}
const result = slugify(text)
slugifyCache.set(text, result)
return result
}
function ProjectList({ projects }: { projects: Project[] }) {
return (
<div>
{projects.map(project => {
// Computed only once per unique project name
const slug = cachedSlugify(project.name)
return <ProjectCard key={project.id} slug={slug} />
})}
</div>
)
}
```
**Simpler pattern for single-value functions:**
```typescript
let isLoggedInCache: boolean | null = null
function isLoggedIn(): boolean {
if (isLoggedInCache !== null) {
return isLoggedInCache
}
isLoggedInCache = document.cookie.includes('auth=')
return isLoggedInCache
}
// Clear cache when auth changes
function onAuthChange() {
isLoggedInCache = null
}
```
Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
Reference: [How we made the Vercel Dashboard twice as fast](https://vercel.com/blog/how-we-made-the-vercel-dashboard-twice-as-fast)

View File

@ -0,0 +1,28 @@
---
title: Cache Property Access in Loops
impact: LOW-MEDIUM
impactDescription: reduces lookups
tags: javascript, loops, optimization, caching
---
## Cache Property Access in Loops
Cache object property lookups in hot paths.
**Incorrect (3 lookups × N iterations):**
```typescript
for (let i = 0; i < arr.length; i++) {
process(obj.config.settings.value)
}
```
**Correct (1 lookup total):**
```typescript
const value = obj.config.settings.value
const len = arr.length
for (let i = 0; i < len; i++) {
process(value)
}
```

View File

@ -0,0 +1,70 @@
---
title: Cache Storage API Calls
impact: LOW-MEDIUM
impactDescription: reduces expensive I/O
tags: javascript, localStorage, storage, caching, performance
---
## Cache Storage API Calls
`localStorage`, `sessionStorage`, and `document.cookie` are synchronous and expensive. Cache reads in memory.
**Incorrect (reads storage on every call):**
```typescript
function getTheme() {
return localStorage.getItem('theme') ?? 'light'
}
// Called 10 times = 10 storage reads
```
**Correct (Map cache):**
```typescript
const storageCache = new Map<string, string | null>()
function getLocalStorage(key: string) {
if (!storageCache.has(key)) {
storageCache.set(key, localStorage.getItem(key))
}
return storageCache.get(key)
}
function setLocalStorage(key: string, value: string) {
localStorage.setItem(key, value)
storageCache.set(key, value) // keep cache in sync
}
```
Use a Map (not a hook) so it works everywhere: utilities, event handlers, not just React components.
**Cookie caching:**
```typescript
let cookieCache: Record<string, string> | null = null
function getCookie(name: string) {
if (!cookieCache) {
cookieCache = Object.fromEntries(
document.cookie.split('; ').map(c => c.split('='))
)
}
return cookieCache[name]
}
```
**Important (invalidate on external changes):**
If storage can change externally (another tab, server-set cookies), invalidate cache:
```typescript
window.addEventListener('storage', (e) => {
if (e.key) storageCache.delete(e.key)
})
document.addEventListener('visibilitychange', () => {
if (document.visibilityState === 'visible') {
storageCache.clear()
}
})
```

View File

@ -0,0 +1,32 @@
---
title: Combine Multiple Array Iterations
impact: LOW-MEDIUM
impactDescription: reduces iterations
tags: javascript, arrays, loops, performance
---
## Combine Multiple Array Iterations
Multiple `.filter()` or `.map()` calls iterate the array multiple times. Combine into one loop.
**Incorrect (3 iterations):**
```typescript
const admins = users.filter(u => u.isAdmin)
const testers = users.filter(u => u.isTester)
const inactive = users.filter(u => !u.isActive)
```
**Correct (1 iteration):**
```typescript
const admins: User[] = []
const testers: User[] = []
const inactive: User[] = []
for (const user of users) {
if (user.isAdmin) admins.push(user)
if (user.isTester) testers.push(user)
if (!user.isActive) inactive.push(user)
}
```

View File

@ -0,0 +1,50 @@
---
title: Early Return from Functions
impact: LOW-MEDIUM
impactDescription: avoids unnecessary computation
tags: javascript, functions, optimization, early-return
---
## Early Return from Functions
Return early when result is determined to skip unnecessary processing.
**Incorrect (processes all items even after finding answer):**
```typescript
function validateUsers(users: User[]) {
let hasError = false
let errorMessage = ''
for (const user of users) {
if (!user.email) {
hasError = true
errorMessage = 'Email required'
}
if (!user.name) {
hasError = true
errorMessage = 'Name required'
}
// Continues checking all users even after error found
}
return hasError ? { valid: false, error: errorMessage } : { valid: true }
}
```
**Correct (returns immediately on first error):**
```typescript
function validateUsers(users: User[]) {
for (const user of users) {
if (!user.email) {
return { valid: false, error: 'Email required' }
}
if (!user.name) {
return { valid: false, error: 'Name required' }
}
}
return { valid: true }
}
```

View File

@ -0,0 +1,45 @@
---
title: Hoist RegExp Creation
impact: LOW-MEDIUM
impactDescription: avoids recreation
tags: javascript, regexp, optimization, memoization
---
## Hoist RegExp Creation
Don't create RegExp inside render. Hoist to module scope or memoize with `useMemo()`.
**Incorrect (new RegExp every render):**
```tsx
function Highlighter({ text, query }: Props) {
const regex = new RegExp(`(${query})`, 'gi')
const parts = text.split(regex)
return <>{parts.map((part, i) => ...)}</>
}
```
**Correct (memoize or hoist):**
```tsx
const EMAIL_REGEX = /^[^\s@]+@[^\s@]+\.[^\s@]+$/
function Highlighter({ text, query }: Props) {
const regex = useMemo(
() => new RegExp(`(${escapeRegex(query)})`, 'gi'),
[query]
)
const parts = text.split(regex)
return <>{parts.map((part, i) => ...)}</>
}
```
**Warning (global regex has mutable state):**
Global regex (`/g`) has mutable `lastIndex` state:
```typescript
const regex = /foo/g
regex.test('foo') // true, lastIndex = 3
regex.test('foo') // false, lastIndex = 0
```

View File

@ -0,0 +1,37 @@
---
title: Build Index Maps for Repeated Lookups
impact: LOW-MEDIUM
impactDescription: 1M ops to 2K ops
tags: javascript, map, indexing, optimization, performance
---
## Build Index Maps for Repeated Lookups
Multiple `.find()` calls by the same key should use a Map.
**Incorrect (O(n) per lookup):**
```typescript
function processOrders(orders: Order[], users: User[]) {
return orders.map(order => ({
...order,
user: users.find(u => u.id === order.userId)
}))
}
```
**Correct (O(1) per lookup):**
```typescript
function processOrders(orders: Order[], users: User[]) {
const userById = new Map(users.map(u => [u.id, u]))
return orders.map(order => ({
...order,
user: userById.get(order.userId)
}))
}
```
Build map once (O(n)), then all lookups are O(1).
For 1000 orders × 1000 users: 1M ops → 2K ops.

View File

@ -0,0 +1,49 @@
---
title: Early Length Check for Array Comparisons
impact: MEDIUM-HIGH
impactDescription: avoids expensive operations when lengths differ
tags: javascript, arrays, performance, optimization, comparison
---
## Early Length Check for Array Comparisons
When comparing arrays with expensive operations (sorting, deep equality, serialization), check lengths first. If lengths differ, the arrays cannot be equal.
In real-world applications, this optimization is especially valuable when the comparison runs in hot paths (event handlers, render loops).
**Incorrect (always runs expensive comparison):**
```typescript
function hasChanges(current: string[], original: string[]) {
// Always sorts and joins, even when lengths differ
return current.sort().join() !== original.sort().join()
}
```
Two O(n log n) sorts run even when `current.length` is 5 and `original.length` is 100. There is also overhead of joining the arrays and comparing the strings.
**Correct (O(1) length check first):**
```typescript
function hasChanges(current: string[], original: string[]) {
// Early return if lengths differ
if (current.length !== original.length) {
return true
}
// Only sort when lengths match
const currentSorted = current.toSorted()
const originalSorted = original.toSorted()
for (let i = 0; i < currentSorted.length; i++) {
if (currentSorted[i] !== originalSorted[i]) {
return true
}
}
return false
}
```
This new approach is more efficient because:
- It avoids the overhead of sorting and joining the arrays when lengths differ
- It avoids consuming memory for the joined strings (especially important for large arrays)
- It avoids mutating the original arrays
- It returns early when a difference is found

View File

@ -0,0 +1,82 @@
---
title: Use Loop for Min/Max Instead of Sort
impact: LOW
impactDescription: O(n) instead of O(n log n)
tags: javascript, arrays, performance, sorting, algorithms
---
## Use Loop for Min/Max Instead of Sort
Finding the smallest or largest element only requires a single pass through the array. Sorting is wasteful and slower.
**Incorrect (O(n log n) - sort to find latest):**
```typescript
interface Project {
id: string
name: string
updatedAt: number
}
function getLatestProject(projects: Project[]) {
const sorted = [...projects].sort((a, b) => b.updatedAt - a.updatedAt)
return sorted[0]
}
```
Sorts the entire array just to find the maximum value.
**Incorrect (O(n log n) - sort for oldest and newest):**
```typescript
function getOldestAndNewest(projects: Project[]) {
const sorted = [...projects].sort((a, b) => a.updatedAt - b.updatedAt)
return { oldest: sorted[0], newest: sorted[sorted.length - 1] }
}
```
Still sorts unnecessarily when only min/max are needed.
**Correct (O(n) - single loop):**
```typescript
function getLatestProject(projects: Project[]) {
if (projects.length === 0) return null
let latest = projects[0]
for (let i = 1; i < projects.length; i++) {
if (projects[i].updatedAt > latest.updatedAt) {
latest = projects[i]
}
}
return latest
}
function getOldestAndNewest(projects: Project[]) {
if (projects.length === 0) return { oldest: null, newest: null }
let oldest = projects[0]
let newest = projects[0]
for (let i = 1; i < projects.length; i++) {
if (projects[i].updatedAt < oldest.updatedAt) oldest = projects[i]
if (projects[i].updatedAt > newest.updatedAt) newest = projects[i]
}
return { oldest, newest }
}
```
Single pass through the array, no copying, no sorting.
**Alternative (Math.min/Math.max for small arrays):**
```typescript
const numbers = [5, 2, 8, 1, 9]
const min = Math.min(...numbers)
const max = Math.max(...numbers)
```
This works for small arrays, but can be slower or just throw an error for very large arrays due to spread operator limitations. Maximal array length is approximately 124000 in Chrome 143 and 638000 in Safari 18; exact numbers may vary - see [the fiddle](https://jsfiddle.net/qw1jabsx/4/). Use the loop approach for reliability.

View File

@ -0,0 +1,24 @@
---
title: Use Set/Map for O(1) Lookups
impact: LOW-MEDIUM
impactDescription: O(n) to O(1)
tags: javascript, set, map, data-structures, performance
---
## Use Set/Map for O(1) Lookups
Convert arrays to Set/Map for repeated membership checks.
**Incorrect (O(n) per check):**
```typescript
const allowedIds = ['a', 'b', 'c', ...]
items.filter(item => allowedIds.includes(item.id))
```
**Correct (O(1) per check):**
```typescript
const allowedIds = new Set(['a', 'b', 'c', ...])
items.filter(item => allowedIds.has(item.id))
```

View File

@ -0,0 +1,57 @@
---
title: Use toSorted() Instead of sort() for Immutability
impact: MEDIUM-HIGH
impactDescription: prevents mutation bugs in React state
tags: javascript, arrays, immutability, react, state, mutation
---
## Use toSorted() Instead of sort() for Immutability
`.sort()` mutates the array in place, which can cause bugs with React state and props. Use `.toSorted()` to create a new sorted array without mutation.
**Incorrect (mutates original array):**
```typescript
function UserList({ users }: { users: User[] }) {
// Mutates the users prop array!
const sorted = useMemo(
() => users.sort((a, b) => a.name.localeCompare(b.name)),
[users]
)
return <div>{sorted.map(renderUser)}</div>
}
```
**Correct (creates new array):**
```typescript
function UserList({ users }: { users: User[] }) {
// Creates new sorted array, original unchanged
const sorted = useMemo(
() => users.toSorted((a, b) => a.name.localeCompare(b.name)),
[users]
)
return <div>{sorted.map(renderUser)}</div>
}
```
**Why this matters in React:**
1. Props/state mutations break React's immutability model - React expects props and state to be treated as read-only
2. Causes stale closure bugs - Mutating arrays inside closures (callbacks, effects) can lead to unexpected behavior
**Browser support (fallback for older browsers):**
`.toSorted()` is available in all modern browsers (Chrome 110+, Safari 16+, Firefox 115+, Node.js 20+). For older environments, use spread operator:
```typescript
// Fallback for older browsers
const sorted = [...items].sort((a, b) => a.value - b.value)
```
**Other immutable array methods:**
- `.toSorted()` - immutable sort
- `.toReversed()` - immutable reverse
- `.toSpliced()` - immutable splice
- `.with()` - immutable element replacement

View File

@ -0,0 +1,26 @@
---
title: Use Activity Component for Show/Hide
impact: MEDIUM
impactDescription: preserves state/DOM
tags: rendering, activity, visibility, state-preservation
---
## Use Activity Component for Show/Hide
Use React's `<Activity>` to preserve state/DOM for expensive components that frequently toggle visibility.
**Usage:**
```tsx
import { Activity } from 'react'
function Dropdown({ isOpen }: Props) {
return (
<Activity mode={isOpen ? 'visible' : 'hidden'}>
<ExpensiveMenu />
</Activity>
)
}
```
Avoids expensive re-renders and state loss.

View File

@ -0,0 +1,47 @@
---
title: Animate SVG Wrapper Instead of SVG Element
impact: LOW
impactDescription: enables hardware acceleration
tags: rendering, svg, css, animation, performance
---
## Animate SVG Wrapper Instead of SVG Element
Many browsers don't have hardware acceleration for CSS3 animations on SVG elements. Wrap SVG in a `<div>` and animate the wrapper instead.
**Incorrect (animating SVG directly - no hardware acceleration):**
```tsx
function LoadingSpinner() {
return (
<svg
className="animate-spin"
width="24"
height="24"
viewBox="0 0 24 24"
>
<circle cx="12" cy="12" r="10" stroke="currentColor" />
</svg>
)
}
```
**Correct (animating wrapper div - hardware accelerated):**
```tsx
function LoadingSpinner() {
return (
<div className="animate-spin">
<svg
width="24"
height="24"
viewBox="0 0 24 24"
>
<circle cx="12" cy="12" r="10" stroke="currentColor" />
</svg>
</div>
)
}
```
This applies to all CSS transforms and transitions (`transform`, `opacity`, `translate`, `scale`, `rotate`). The wrapper div allows browsers to use GPU acceleration for smoother animations.

View File

@ -0,0 +1,40 @@
---
title: Use Explicit Conditional Rendering
impact: LOW
impactDescription: prevents rendering 0 or NaN
tags: rendering, conditional, jsx, falsy-values
---
## Use Explicit Conditional Rendering
Use explicit ternary operators (`? :`) instead of `&&` for conditional rendering when the condition can be `0`, `NaN`, or other falsy values that render.
**Incorrect (renders "0" when count is 0):**
```tsx
function Badge({ count }: { count: number }) {
return (
<div>
{count && <span className="badge">{count}</span>}
</div>
)
}
// When count = 0, renders: <div>0</div>
// When count = 5, renders: <div><span class="badge">5</span></div>
```
**Correct (renders nothing when count is 0):**
```tsx
function Badge({ count }: { count: number }) {
return (
<div>
{count > 0 ? <span className="badge">{count}</span> : null}
</div>
)
}
// When count = 0, renders: <div></div>
// When count = 5, renders: <div><span class="badge">5</span></div>
```

View File

@ -0,0 +1,38 @@
---
title: CSS content-visibility for Long Lists
impact: HIGH
impactDescription: faster initial render
tags: rendering, css, content-visibility, long-lists
---
## CSS content-visibility for Long Lists
Apply `content-visibility: auto` to defer off-screen rendering.
**CSS:**
```css
.message-item {
content-visibility: auto;
contain-intrinsic-size: 0 80px;
}
```
**Example:**
```tsx
function MessageList({ messages }: { messages: Message[] }) {
return (
<div className="overflow-y-auto h-screen">
{messages.map(msg => (
<div key={msg.id} className="message-item">
<Avatar user={msg.author} />
<div>{msg.content}</div>
</div>
))}
</div>
)
}
```
For 1000 messages, browser skips layout/paint for ~990 off-screen items (10× faster initial render).

View File

@ -0,0 +1,46 @@
---
title: Hoist Static JSX Elements
impact: LOW
impactDescription: avoids re-creation
tags: rendering, jsx, static, optimization
---
## Hoist Static JSX Elements
Extract static JSX outside components to avoid re-creation.
**Incorrect (recreates element every render):**
```tsx
function LoadingSkeleton() {
return <div className="animate-pulse h-20 bg-gray-200" />
}
function Container() {
return (
<div>
{loading && <LoadingSkeleton />}
</div>
)
}
```
**Correct (reuses same element):**
```tsx
const loadingSkeleton = (
<div className="animate-pulse h-20 bg-gray-200" />
)
function Container() {
return (
<div>
{loading && loadingSkeleton}
</div>
)
}
```
This is especially helpful for large and static SVG nodes, which can be expensive to recreate on every render.
**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler automatically hoists static JSX elements and optimizes component re-renders, making manual hoisting unnecessary.

View File

@ -0,0 +1,82 @@
---
title: Prevent Hydration Mismatch Without Flickering
impact: MEDIUM
impactDescription: avoids visual flicker and hydration errors
tags: rendering, ssr, hydration, localStorage, flicker
---
## Prevent Hydration Mismatch Without Flickering
When rendering content that depends on client-side storage (localStorage, cookies), avoid both SSR breakage and post-hydration flickering by injecting a synchronous script that updates the DOM before React hydrates.
**Incorrect (breaks SSR):**
```tsx
function ThemeWrapper({ children }: { children: ReactNode }) {
// localStorage is not available on server - throws error
const theme = localStorage.getItem('theme') || 'light'
return (
<div className={theme}>
{children}
</div>
)
}
```
Server-side rendering will fail because `localStorage` is undefined.
**Incorrect (visual flickering):**
```tsx
function ThemeWrapper({ children }: { children: ReactNode }) {
const [theme, setTheme] = useState('light')
useEffect(() => {
// Runs after hydration - causes visible flash
const stored = localStorage.getItem('theme')
if (stored) {
setTheme(stored)
}
}, [])
return (
<div className={theme}>
{children}
</div>
)
}
```
Component first renders with default value (`light`), then updates after hydration, causing a visible flash of incorrect content.
**Correct (no flicker, no hydration mismatch):**
```tsx
function ThemeWrapper({ children }: { children: ReactNode }) {
return (
<>
<div id="theme-wrapper">
{children}
</div>
<script
dangerouslySetInnerHTML={{
__html: `
(function() {
try {
var theme = localStorage.getItem('theme') || 'light';
var el = document.getElementById('theme-wrapper');
if (el) el.className = theme;
} catch (e) {}
})();
`,
}}
/>
</>
)
}
```
The inline script executes synchronously before showing the element, ensuring the DOM already has the correct value. No flickering, no hydration mismatch.
This pattern is especially useful for theme toggles, user preferences, authentication states, and any client-only data that should render immediately without flashing default values.

View File

@ -0,0 +1,28 @@
---
title: Optimize SVG Precision
impact: LOW
impactDescription: reduces file size
tags: rendering, svg, optimization, svgo
---
## Optimize SVG Precision
Reduce SVG coordinate precision to decrease file size. The optimal precision depends on the viewBox size, but in general reducing precision should be considered.
**Incorrect (excessive precision):**
```svg
<path d="M 10.293847 20.847362 L 30.938472 40.192837" />
```
**Correct (1 decimal place):**
```svg
<path d="M 10.3 20.8 L 30.9 40.2" />
```
**Automate with SVGO:**
```bash
npx svgo --precision=1 --multipass icon.svg
```

View File

@ -0,0 +1,39 @@
---
title: Defer State Reads to Usage Point
impact: MEDIUM
impactDescription: avoids unnecessary subscriptions
tags: rerender, searchParams, localStorage, optimization
---
## Defer State Reads to Usage Point
Don't subscribe to dynamic state (searchParams, localStorage) if you only read it inside callbacks.
**Incorrect (subscribes to all searchParams changes):**
```tsx
function ShareButton({ chatId }: { chatId: string }) {
const searchParams = useSearchParams()
const handleShare = () => {
const ref = searchParams.get('ref')
shareChat(chatId, { ref })
}
return <button onClick={handleShare}>Share</button>
}
```
**Correct (reads on demand, no subscription):**
```tsx
function ShareButton({ chatId }: { chatId: string }) {
const handleShare = () => {
const params = new URLSearchParams(window.location.search)
const ref = params.get('ref')
shareChat(chatId, { ref })
}
return <button onClick={handleShare}>Share</button>
}
```

View File

@ -0,0 +1,45 @@
---
title: Narrow Effect Dependencies
impact: LOW
impactDescription: minimizes effect re-runs
tags: rerender, useEffect, dependencies, optimization
---
## Narrow Effect Dependencies
Specify primitive dependencies instead of objects to minimize effect re-runs.
**Incorrect (re-runs on any user field change):**
```tsx
useEffect(() => {
console.log(user.id)
}, [user])
```
**Correct (re-runs only when id changes):**
```tsx
useEffect(() => {
console.log(user.id)
}, [user.id])
```
**For derived state, compute outside effect:**
```tsx
// Incorrect: runs on width=767, 766, 765...
useEffect(() => {
if (width < 768) {
enableMobileMode()
}
}, [width])
// Correct: runs only on boolean transition
const isMobile = width < 768
useEffect(() => {
if (isMobile) {
enableMobileMode()
}
}, [isMobile])
```

View File

@ -0,0 +1,29 @@
---
title: Subscribe to Derived State
impact: MEDIUM
impactDescription: reduces re-render frequency
tags: rerender, derived-state, media-query, optimization
---
## Subscribe to Derived State
Subscribe to derived boolean state instead of continuous values to reduce re-render frequency.
**Incorrect (re-renders on every pixel change):**
```tsx
function Sidebar() {
const width = useWindowWidth() // updates continuously
const isMobile = width < 768
return <nav className={isMobile ? 'mobile' : 'desktop'} />
}
```
**Correct (re-renders only when boolean changes):**
```tsx
function Sidebar() {
const isMobile = useMediaQuery('(max-width: 767px)')
return <nav className={isMobile ? 'mobile' : 'desktop'} />
}
```

View File

@ -0,0 +1,74 @@
---
title: Use Functional setState Updates
impact: MEDIUM
impactDescription: prevents stale closures and unnecessary callback recreations
tags: react, hooks, useState, useCallback, callbacks, closures
---
## Use Functional setState Updates
When updating state based on the current state value, use the functional update form of setState instead of directly referencing the state variable. This prevents stale closures, eliminates unnecessary dependencies, and creates stable callback references.
**Incorrect (requires state as dependency):**
```tsx
function TodoList() {
const [items, setItems] = useState(initialItems)
// Callback must depend on items, recreated on every items change
const addItems = useCallback((newItems: Item[]) => {
setItems([...items, ...newItems])
}, [items]) // ❌ items dependency causes recreations
// Risk of stale closure if dependency is forgotten
const removeItem = useCallback((id: string) => {
setItems(items.filter(item => item.id !== id))
}, []) // ❌ Missing items dependency - will use stale items!
return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
}
```
The first callback is recreated every time `items` changes, which can cause child components to re-render unnecessarily. The second callback has a stale closure bug—it will always reference the initial `items` value.
**Correct (stable callbacks, no stale closures):**
```tsx
function TodoList() {
const [items, setItems] = useState(initialItems)
// Stable callback, never recreated
const addItems = useCallback((newItems: Item[]) => {
setItems(curr => [...curr, ...newItems])
}, []) // ✅ No dependencies needed
// Always uses latest state, no stale closure risk
const removeItem = useCallback((id: string) => {
setItems(curr => curr.filter(item => item.id !== id))
}, []) // ✅ Safe and stable
return <ItemsEditor items={items} onAdd={addItems} onRemove={removeItem} />
}
```
**Benefits:**
1. **Stable callback references** - Callbacks don't need to be recreated when state changes
2. **No stale closures** - Always operates on the latest state value
3. **Fewer dependencies** - Simplifies dependency arrays and reduces memory leaks
4. **Prevents bugs** - Eliminates the most common source of React closure bugs
**When to use functional updates:**
- Any setState that depends on the current state value
- Inside useCallback/useMemo when state is needed
- Event handlers that reference state
- Async operations that update state
**When direct updates are fine:**
- Setting state to a static value: `setCount(0)`
- Setting state from props/arguments only: `setName(newName)`
- State doesn't depend on previous value
**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, the compiler can automatically optimize some cases, but functional updates are still recommended for correctness and to prevent stale closure bugs.

View File

@ -0,0 +1,58 @@
---
title: Use Lazy State Initialization
impact: MEDIUM
impactDescription: wasted computation on every render
tags: react, hooks, useState, performance, initialization
---
## Use Lazy State Initialization
Pass a function to `useState` for expensive initial values. Without the function form, the initializer runs on every render even though the value is only used once.
**Incorrect (runs on every render):**
```tsx
function FilteredList({ items }: { items: Item[] }) {
// buildSearchIndex() runs on EVERY render, even after initialization
const [searchIndex, setSearchIndex] = useState(buildSearchIndex(items))
const [query, setQuery] = useState('')
// When query changes, buildSearchIndex runs again unnecessarily
return <SearchResults index={searchIndex} query={query} />
}
function UserProfile() {
// JSON.parse runs on every render
const [settings, setSettings] = useState(
JSON.parse(localStorage.getItem('settings') || '{}')
)
return <SettingsForm settings={settings} onChange={setSettings} />
}
```
**Correct (runs only once):**
```tsx
function FilteredList({ items }: { items: Item[] }) {
// buildSearchIndex() runs ONLY on initial render
const [searchIndex, setSearchIndex] = useState(() => buildSearchIndex(items))
const [query, setQuery] = useState('')
return <SearchResults index={searchIndex} query={query} />
}
function UserProfile() {
// JSON.parse runs only on initial render
const [settings, setSettings] = useState(() => {
const stored = localStorage.getItem('settings')
return stored ? JSON.parse(stored) : {}
})
return <SettingsForm settings={settings} onChange={setSettings} />
}
```
Use lazy initialization when computing initial values from localStorage/sessionStorage, building data structures (indexes, maps), reading from the DOM, or performing heavy transformations.
For simple primitives (`useState(0)`), direct references (`useState(props.value)`), or cheap literals (`useState({})`), the function form is unnecessary.

View File

@ -0,0 +1,44 @@
---
title: Extract to Memoized Components
impact: MEDIUM
impactDescription: enables early returns
tags: rerender, memo, useMemo, optimization
---
## Extract to Memoized Components
Extract expensive work into memoized components to enable early returns before computation.
**Incorrect (computes avatar even when loading):**
```tsx
function Profile({ user, loading }: Props) {
const avatar = useMemo(() => {
const id = computeAvatarId(user)
return <Avatar id={id} />
}, [user])
if (loading) return <Skeleton />
return <div>{avatar}</div>
}
```
**Correct (skips computation when loading):**
```tsx
const UserAvatar = memo(function UserAvatar({ user }: { user: User }) {
const id = useMemo(() => computeAvatarId(user), [user])
return <Avatar id={id} />
})
function Profile({ user, loading }: Props) {
if (loading) return <Skeleton />
return (
<div>
<UserAvatar user={user} />
</div>
)
}
```
**Note:** If your project has [React Compiler](https://react.dev/learn/react-compiler) enabled, manual memoization with `memo()` and `useMemo()` is not necessary. The compiler automatically optimizes re-renders.

View File

@ -0,0 +1,40 @@
---
title: Use Transitions for Non-Urgent Updates
impact: MEDIUM
impactDescription: maintains UI responsiveness
tags: rerender, transitions, startTransition, performance
---
## Use Transitions for Non-Urgent Updates
Mark frequent, non-urgent state updates as transitions to maintain UI responsiveness.
**Incorrect (blocks UI on every scroll):**
```tsx
function ScrollTracker() {
const [scrollY, setScrollY] = useState(0)
useEffect(() => {
const handler = () => setScrollY(window.scrollY)
window.addEventListener('scroll', handler, { passive: true })
return () => window.removeEventListener('scroll', handler)
}, [])
}
```
**Correct (non-blocking updates):**
```tsx
import { startTransition } from 'react'
function ScrollTracker() {
const [scrollY, setScrollY] = useState(0)
useEffect(() => {
const handler = () => {
startTransition(() => setScrollY(window.scrollY))
}
window.addEventListener('scroll', handler, { passive: true })
return () => window.removeEventListener('scroll', handler)
}, [])
}
```

View File

@ -0,0 +1,73 @@
---
title: Use after() for Non-Blocking Operations
impact: MEDIUM
impactDescription: faster response times
tags: server, async, logging, analytics, side-effects
---
## Use after() for Non-Blocking Operations
Use Next.js's `after()` to schedule work that should execute after a response is sent. This prevents logging, analytics, and other side effects from blocking the response.
**Incorrect (blocks response):**
```tsx
import { logUserAction } from '@/app/utils'
export async function POST(request: Request) {
// Perform mutation
await updateDatabase(request)
// Logging blocks the response
const userAgent = request.headers.get('user-agent') || 'unknown'
await logUserAction({ userAgent })
return new Response(JSON.stringify({ status: 'success' }), {
status: 200,
headers: { 'Content-Type': 'application/json' }
})
}
```
**Correct (non-blocking):**
```tsx
import { after } from 'next/server'
import { headers, cookies } from 'next/headers'
import { logUserAction } from '@/app/utils'
export async function POST(request: Request) {
// Perform mutation
await updateDatabase(request)
// Log after response is sent
after(async () => {
const userAgent = (await headers()).get('user-agent') || 'unknown'
const sessionCookie = (await cookies()).get('session-id')?.value || 'anonymous'
logUserAction({ sessionCookie, userAgent })
})
return new Response(JSON.stringify({ status: 'success' }), {
status: 200,
headers: { 'Content-Type': 'application/json' }
})
}
```
The response is sent immediately while logging happens in the background.
**Common use cases:**
- Analytics tracking
- Audit logging
- Sending notifications
- Cache invalidation
- Cleanup tasks
**Important notes:**
- `after()` runs even if the response fails or redirects
- Works in Server Actions, Route Handlers, and Server Components
Reference: [https://nextjs.org/docs/app/api-reference/functions/after](https://nextjs.org/docs/app/api-reference/functions/after)

View File

@ -0,0 +1,41 @@
---
title: Cross-Request LRU Caching
impact: HIGH
impactDescription: caches across requests
tags: server, cache, lru, cross-request
---
## Cross-Request LRU Caching
`React.cache()` only works within one request. For data shared across sequential requests (user clicks button A then button B), use an LRU cache.
**Implementation:**
```typescript
import { LRUCache } from 'lru-cache'
const cache = new LRUCache<string, any>({
max: 1000,
ttl: 5 * 60 * 1000 // 5 minutes
})
export async function getUser(id: string) {
const cached = cache.get(id)
if (cached) return cached
const user = await db.user.findUnique({ where: { id } })
cache.set(id, user)
return user
}
// Request 1: DB query, result cached
// Request 2: cache hit, no DB query
```
Use when sequential user actions hit multiple endpoints needing the same data within seconds.
**With Vercel's [Fluid Compute](https://vercel.com/docs/fluid-compute):** LRU caching is especially effective because multiple concurrent requests can share the same function instance and cache. This means the cache persists across requests without needing external storage like Redis.
**In traditional serverless:** Each invocation runs in isolation, so consider Redis for cross-process caching.
Reference: [https://github.com/isaacs/node-lru-cache](https://github.com/isaacs/node-lru-cache)

View File

@ -0,0 +1,76 @@
---
title: Per-Request Deduplication with React.cache()
impact: MEDIUM
impactDescription: deduplicates within request
tags: server, cache, react-cache, deduplication
---
## Per-Request Deduplication with React.cache()
Use `React.cache()` for server-side request deduplication. Authentication and database queries benefit most.
**Usage:**
```typescript
import { cache } from 'react'
export const getCurrentUser = cache(async () => {
const session = await auth()
if (!session?.user?.id) return null
return await db.user.findUnique({
where: { id: session.user.id }
})
})
```
Within a single request, multiple calls to `getCurrentUser()` execute the query only once.
**Avoid inline objects as arguments:**
`React.cache()` uses shallow equality (`Object.is`) to determine cache hits. Inline objects create new references each call, preventing cache hits.
**Incorrect (always cache miss):**
```typescript
const getUser = cache(async (params: { uid: number }) => {
return await db.user.findUnique({ where: { id: params.uid } })
})
// Each call creates new object, never hits cache
getUser({ uid: 1 })
getUser({ uid: 1 }) // Cache miss, runs query again
```
**Correct (cache hit):**
```typescript
const getUser = cache(async (uid: number) => {
return await db.user.findUnique({ where: { id: uid } })
})
// Primitive args use value equality
getUser(1)
getUser(1) // Cache hit, returns cached result
```
If you must pass objects, pass the same reference:
```typescript
const params = { uid: 1 }
getUser(params) // Query runs
getUser(params) // Cache hit (same reference)
```
**Next.js-Specific Note:**
In Next.js, the `fetch` API is automatically extended with request memoization. Requests with the same URL and options are automatically deduplicated within a single request, so you don't need `React.cache()` for `fetch` calls. However, `React.cache()` is still essential for other async tasks:
- Database queries (Prisma, Drizzle, etc.)
- Heavy computations
- Authentication checks
- File system operations
- Any non-fetch async work
Use `React.cache()` to deduplicate these operations across your component tree.
Reference: [React.cache documentation](https://react.dev/reference/react/cache)

View File

@ -0,0 +1,83 @@
---
title: Parallel Data Fetching with Component Composition
impact: CRITICAL
impactDescription: eliminates server-side waterfalls
tags: server, rsc, parallel-fetching, composition
---
## Parallel Data Fetching with Component Composition
React Server Components execute sequentially within a tree. Restructure with composition to parallelize data fetching.
**Incorrect (Sidebar waits for Page's fetch to complete):**
```tsx
export default async function Page() {
const header = await fetchHeader()
return (
<div>
<div>{header}</div>
<Sidebar />
</div>
)
}
async function Sidebar() {
const items = await fetchSidebarItems()
return <nav>{items.map(renderItem)}</nav>
}
```
**Correct (both fetch simultaneously):**
```tsx
async function Header() {
const data = await fetchHeader()
return <div>{data}</div>
}
async function Sidebar() {
const items = await fetchSidebarItems()
return <nav>{items.map(renderItem)}</nav>
}
export default function Page() {
return (
<div>
<Header />
<Sidebar />
</div>
)
}
```
**Alternative with children prop:**
```tsx
async function Header() {
const data = await fetchHeader()
return <div>{data}</div>
}
async function Sidebar() {
const items = await fetchSidebarItems()
return <nav>{items.map(renderItem)}</nav>
}
function Layout({ children }: { children: ReactNode }) {
return (
<div>
<Header />
{children}
</div>
)
}
export default function Page() {
return (
<Layout>
<Sidebar />
</Layout>
)
}
```

View File

@ -0,0 +1,38 @@
---
title: Minimize Serialization at RSC Boundaries
impact: HIGH
impactDescription: reduces data transfer size
tags: server, rsc, serialization, props
---
## Minimize Serialization at RSC Boundaries
The React Server/Client boundary serializes all object properties into strings and embeds them in the HTML response and subsequent RSC requests. This serialized data directly impacts page weight and load time, so **size matters a lot**. Only pass fields that the client actually uses.
**Incorrect (serializes all 50 fields):**
```tsx
async function Page() {
const user = await fetchUser() // 50 fields
return <Profile user={user} />
}
'use client'
function Profile({ user }: { user: User }) {
return <div>{user.name}</div> // uses 1 field
}
```
**Correct (serializes only 1 field):**
```tsx
async function Page() {
const user = await fetchUser()
return <Profile name={user.name} />
}
'use client'
function Profile({ name }: { name: string }) {
return <div>{name}</div>
}
```

View File

@ -16,14 +16,14 @@ jobs:
- name: Check Docker Compose inputs
id: docker-compose-changes
uses: tj-actions/changed-files@v46
uses: tj-actions/changed-files@v47
with:
files: |
docker/generate_docker_compose
docker/.env.example
docker/docker-compose-template.yaml
docker/docker-compose.yaml
- uses: actions/setup-python@v5
- uses: actions/setup-python@v6
with:
python-version: "3.11"
@ -82,6 +82,6 @@ jobs:
# mdformat breaks YAML front matter in markdown files. Add --exclude for directories containing YAML front matter.
- name: mdformat
run: |
uvx --python 3.13 mdformat . --exclude ".claude/skills/**/SKILL.md"
uvx --python 3.13 mdformat . --exclude ".claude/skills/**"
- uses: autofix-ci/action@635ffb0c9798bd160680f18fd73371e355b85f27

View File

@ -112,7 +112,7 @@ jobs:
context: "web"
steps:
- name: Download digests
uses: actions/download-artifact@v4
uses: actions/download-artifact@v7
with:
path: /tmp/digests
pattern: digests-${{ matrix.context }}-*

View File

@ -19,7 +19,7 @@ jobs:
github.event.workflow_run.head_branch == 'deploy/agent-dev'
steps:
- name: Deploy to server
uses: appleboy/ssh-action@v0.1.8
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.AGENT_DEV_SSH_HOST }}
username: ${{ secrets.SSH_USER }}

View File

@ -16,7 +16,7 @@ jobs:
github.event.workflow_run.head_branch == 'deploy/dev'
steps:
- name: Deploy to server
uses: appleboy/ssh-action@v0.1.8
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.SSH_HOST }}
username: ${{ secrets.SSH_USER }}

View File

@ -20,7 +20,7 @@ jobs:
)
steps:
- name: Deploy to server
uses: appleboy/ssh-action@v0.1.8
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.HITL_SSH_HOST }}
username: ${{ secrets.SSH_USER }}

View File

@ -18,7 +18,7 @@ jobs:
pull-requests: write
steps:
- uses: actions/stale@v5
- uses: actions/stale@v10
with:
days-before-issue-stale: 15
days-before-issue-close: 3

View File

@ -65,6 +65,9 @@ jobs:
defaults:
run:
working-directory: ./web
permissions:
checks: write
pull-requests: read
steps:
- name: Checkout code
@ -90,7 +93,7 @@ jobs:
uses: actions/setup-node@v6
if: steps.changed-files.outputs.any_changed == 'true'
with:
node-version: 22
node-version: 24
cache: pnpm
cache-dependency-path: ./web/pnpm-lock.yaml
@ -103,7 +106,16 @@ jobs:
if: steps.changed-files.outputs.any_changed == 'true'
working-directory: ./web
run: |
pnpm run lint
pnpm run lint:ci
# pnpm run lint:report
# continue-on-error: true
# - name: Annotate Code
# if: steps.changed-files.outputs.any_changed == 'true' && github.event_name == 'pull_request'
# uses: DerLev/eslint-annotations@51347b3a0abfb503fc8734d5ae31c4b151297fae
# with:
# eslint-report: web/eslint_report.json
# github-token: ${{ secrets.GITHUB_TOKEN }}
- name: Web type check
if: steps.changed-files.outputs.any_changed == 'true'
@ -115,11 +127,6 @@ jobs:
working-directory: ./web
run: pnpm run knip
- name: Web build check
if: steps.changed-files.outputs.any_changed == 'true'
working-directory: ./web
run: pnpm run build
superlinter:
name: SuperLinter
runs-on: ubuntu-latest

View File

@ -16,10 +16,6 @@ jobs:
name: unit test for Node.js SDK
runs-on: ubuntu-latest
strategy:
matrix:
node-version: [16, 18, 20, 22]
defaults:
run:
working-directory: sdks/nodejs-client
@ -29,10 +25,10 @@ jobs:
with:
persist-credentials: false
- name: Use Node.js ${{ matrix.node-version }}
- name: Use Node.js
uses: actions/setup-node@v6
with:
node-version: ${{ matrix.node-version }}
node-version: 24
cache: ''
cache-dependency-path: 'pnpm-lock.yaml'

View File

@ -57,7 +57,7 @@ jobs:
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: 'lts/*'
node-version: 24
cache: pnpm
cache-dependency-path: ./web/pnpm-lock.yaml

View File

@ -21,7 +21,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v6
with:
fetch-depth: 0

View File

@ -31,7 +31,7 @@ jobs:
- name: Setup Node.js
uses: actions/setup-node@v6
with:
node-version: 22
node-version: 24
cache: pnpm
cache-dependency-path: ./web/pnpm-lock.yaml
@ -366,3 +366,48 @@ jobs:
path: web/coverage
retention-days: 30
if-no-files-found: error
web-build:
name: Web Build
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./web
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
persist-credentials: false
- name: Check changed files
id: changed-files
uses: tj-actions/changed-files@v47
with:
files: |
web/**
.github/workflows/web-tests.yml
- name: Install pnpm
uses: pnpm/action-setup@v4
with:
package_json_file: web/package.json
run_install: false
- name: Setup NodeJS
uses: actions/setup-node@v6
if: steps.changed-files.outputs.any_changed == 'true'
with:
node-version: 24
cache: pnpm
cache-dependency-path: ./web/pnpm-lock.yaml
- name: Web dependencies
if: steps.changed-files.outputs.any_changed == 'true'
working-directory: ./web
run: pnpm install --frozen-lockfile
- name: Web build check
if: steps.changed-files.outputs.any_changed == 'true'
working-directory: ./web
run: pnpm run build

View File

@ -12,12 +12,8 @@ The codebase is split into:
## Backend Workflow
- Read `api/AGENTS.md` for details
- Run backend CLI commands through `uv run --project api <command>`.
- Before submission, all backend modifications must pass local checks: `make lint`, `make type-check`, and `uv run --project api --dev dev/pytest/pytest_unit_tests.sh`.
- Use Makefile targets for linting and formatting; `make lint` and `make type-check` cover the required checks.
- Integration tests are CI-only and are not expected to run in the local environment.
## Frontend Workflow

View File

@ -61,7 +61,8 @@ check:
lint:
@echo "🔧 Running ruff format, check with fixes, import linter, and dotenv-linter..."
@uv run --project api --dev sh -c 'ruff format ./api && ruff check --fix ./api'
@uv run --project api --dev ruff format ./api
@uv run --project api --dev ruff check --fix ./api
@uv run --directory api --dev lint-imports
@uv run --project api --dev dotenv-linter ./api/.env.example ./web/.env.example
@echo "✅ Linting complete"
@ -73,7 +74,12 @@ type-check:
test:
@echo "🧪 Running backend unit tests..."
@uv run --project api --dev dev/pytest/pytest_unit_tests.sh
@if [ -n "$(TARGET_TESTS)" ]; then \
echo "Target: $(TARGET_TESTS)"; \
uv run --project api --dev pytest $(TARGET_TESTS); \
else \
uv run --project api --dev dev/pytest/pytest_unit_tests.sh; \
fi
@echo "✅ Tests complete"
# Build Docker images
@ -125,7 +131,7 @@ help:
@echo " make check - Check code with ruff"
@echo " make lint - Format, fix, and lint code (ruff, imports, dotenv)"
@echo " make type-check - Run type checking with basedpyright"
@echo " make test - Run backend unit tests"
@echo " make test - Run backend unit tests (or TARGET_TESTS=./api/tests/<target_tests>)"
@echo ""
@echo "Docker Build Targets:"
@echo " make build-web - Build web Docker image"

0
agent-notes/.gitkeep Normal file
View File

View File

@ -417,6 +417,8 @@ SMTP_USERNAME=123
SMTP_PASSWORD=abc
SMTP_USE_TLS=true
SMTP_OPPORTUNISTIC_TLS=false
# Optional: override the local hostname used for SMTP HELO/EHLO
SMTP_LOCAL_HOSTNAME=
# Sendgid configuration
SENDGRID_API_KEY=
# Sentry configuration
@ -713,3 +715,4 @@ ANNOTATION_IMPORT_MAX_CONCURRENT=5
SANDBOX_EXPIRED_RECORDS_CLEAN_GRACEFUL_PERIOD=21
SANDBOX_EXPIRED_RECORDS_CLEAN_BATCH_SIZE=1000
SANDBOX_EXPIRED_RECORDS_RETENTION_DAYS=30

View File

@ -1,62 +1,236 @@
# Agent Skill Index
# API Agent Guide
## Agent Notes (must-check)
Before you start work on any backend file under `api/`, you MUST check whether a related note exists under:
- `agent-notes/<same-relative-path-as-target-file>.md`
Rules:
- **Path mapping**: for a target file `<path>/<name>.py`, the note must be `agent-notes/<path>/<name>.py.md` (same folder structure, same filename, plus `.md`).
- **Before working**:
- If the note exists, read it first and follow any constraints/decisions recorded there.
- If the note conflicts with the current code, or references an "origin" file/path that has been deleted, renamed, or migrated, treat the **code as the single source of truth** and update the note to match reality.
- If the note does not exist, create it with a short architecture/intent summary and any relevant invariants/edge cases.
- **During working**:
- Keep the note in sync as you discover constraints, make decisions, or change approach.
- If you move/rename a file, migrate its note to the new mapped path (and fix any outdated references inside the note).
- Record non-obvious edge cases, trade-offs, and the test/verification plan as you go (not just at the end).
- Keep notes **coherent**: integrate new findings into the relevant sections and rewrite for clarity; avoid append-only “recent fix” / changelog-style additions unless the note is explicitly intended to be a changelog.
- **When finishing work**:
- Update the related note(s) to reflect what changed, why, and any new edge cases/tests.
- If a file is deleted, remove or clearly deprecate the corresponding note so it cannot be mistaken as current guidance.
- Keep notes concise and accurate; they are meant to prevent repeated rediscovery.
## Skill Index
Start with the section that best matches your need. Each entry lists the problems it solves plus key files/concepts so you know what to expect before opening it.
______________________________________________________________________
### Platform Foundations
## Platform Foundations
- **[Infrastructure Overview](agent_skills/infra.md)**\
When to read this:
#### [Infrastructure Overview](agent_skills/infra.md)
- **When to read this**
- You need to understand where a feature belongs in the architecture.
- Youre wiring storage, Redis, vector stores, or OTEL.
- Youre about to add CLI commands or async jobs.\
What it covers: configuration stack (`configs/app_config.py`, remote settings), storage entry points (`extensions/ext_storage.py`, `core/file/file_manager.py`), Redis conventions (`extensions/ext_redis.py`), plugin runtime topology, vector-store factory (`core/rag/datasource/vdb/*`), observability hooks, SSRF proxy usage, and core CLI commands.
- Youre about to add CLI commands or async jobs.
- **What it covers**
- Configuration stack (`configs/app_config.py`, remote settings)
- Storage entry points (`extensions/ext_storage.py`, `core/file/file_manager.py`)
- Redis conventions (`extensions/ext_redis.py`)
- Plugin runtime topology
- Vector-store factory (`core/rag/datasource/vdb/*`)
- Observability hooks
- SSRF proxy usage
- Core CLI commands
- **[Coding Style](agent_skills/coding_style.md)**\
When to read this:
### Plugin & Extension Development
- Youre writing or reviewing backend code and need the authoritative checklist.
- Youre unsure about Pydantic validators, SQLAlchemy session usage, or logging patterns.
- You want the exact lint/type/test commands used in PRs.\
Includes: Ruff & BasedPyright commands, no-annotation policy, session examples (`with Session(db.engine, ...)`), `@field_validator` usage, logging expectations, and the rule set for file size, helpers, and package management.
______________________________________________________________________
## Plugin & Extension Development
- **[Plugin Systems](agent_skills/plugin.md)**\
When to read this:
#### [Plugin Systems](agent_skills/plugin.md)
- **When to read this**
- Youre building or debugging a marketplace plugin.
- You need to know how manifests, providers, daemons, and migrations fit together.\
What it covers: plugin manifests (`core/plugin/entities/plugin.py`), installation/upgrade flows (`services/plugin/plugin_service.py`, CLI commands), runtime adapters (`core/plugin/impl/*` for tool/model/datasource/trigger/endpoint/agent), daemon coordination (`core/plugin/entities/plugin_daemon.py`), and how provider registries surface capabilities to the rest of the platform.
- You need to know how manifests, providers, daemons, and migrations fit together.
- **What it covers**
- Plugin manifests (`core/plugin/entities/plugin.py`)
- Installation/upgrade flows (`services/plugin/plugin_service.py`, CLI commands)
- Runtime adapters (`core/plugin/impl/*` for tool/model/datasource/trigger/endpoint/agent)
- Daemon coordination (`core/plugin/entities/plugin_daemon.py`)
- How provider registries surface capabilities to the rest of the platform
- **[Plugin OAuth](agent_skills/plugin_oauth.md)**\
When to read this:
#### [Plugin OAuth](agent_skills/plugin_oauth.md)
- **When to read this**
- You must integrate OAuth for a plugin or datasource.
- Youre handling credential encryption or refresh flows.\
Topics: credential storage, encryption helpers (`core/helper/provider_encryption.py`), OAuth client bootstrap (`services/plugin/oauth_service.py`, `services/plugin/plugin_parameter_service.py`), and how console/API layers expose the flows.
- Youre handling credential encryption or refresh flows.
- **Topics**
- Credential storage
- Encryption helpers (`core/helper/provider_encryption.py`)
- OAuth client bootstrap (`services/plugin/oauth_service.py`, `services/plugin/plugin_parameter_service.py`)
- How console/API layers expose the flows
______________________________________________________________________
### Workflow Entry & Execution
## Workflow Entry & Execution
#### [Trigger Concepts](agent_skills/trigger.md)
- **[Trigger Concepts](agent_skills/trigger.md)**\
When to read this:
- **When to read this**
- Youre debugging why a workflow didnt start.
- Youre adding a new trigger type or hook.
- You need to trace async execution, draft debugging, or webhook/schedule pipelines.\
Details: Start-node taxonomy, webhook & schedule internals (`core/workflow/nodes/trigger_*`, `services/trigger/*`), async orchestration (`services/async_workflow_service.py`, Celery queues), debug event bus, and storage/logging interactions.
- You need to trace async execution, draft debugging, or webhook/schedule pipelines.
- **Details**
- Start-node taxonomy
- Webhook & schedule internals (`core/workflow/nodes/trigger_*`, `services/trigger/*`)
- Async orchestration (`services/async_workflow_service.py`, Celery queues)
- Debug event bus
- Storage/logging interactions
______________________________________________________________________
## General Reminders
## Additional Notes for Agents
- All skill docs assume you follow the coding style guide—run Ruff/BasedPyright/tests listed there before submitting changes.
- All skill docs assume you follow the coding style rules below—run the lint/type/test commands before submitting changes.
- When you cannot find an answer in these briefs, search the codebase using the paths referenced (e.g., `core/plugin/impl/tool.py`, `services/dataset_service.py`).
- If you run into cross-cutting concerns (tenancy, configuration, storage), check the infrastructure guide first; it links to most supporting modules.
- Keep multi-tenancy and configuration central: everything flows through `configs.dify_config` and `tenant_id`.
- When touching plugins or triggers, consult both the system overview and the specialised doc to ensure you adjust lifecycle, storage, and observability consistently.
## Coding Style
This is the default standard for backend code in this repo. Follow it for new code and use it as the checklist when reviewing changes.
### Linting & Formatting
- Use Ruff for formatting and linting (follow `.ruff.toml`).
- Keep each line under 120 characters (including spaces).
### Naming Conventions
- Use `snake_case` for variables and functions.
- Use `PascalCase` for classes.
- Use `UPPER_CASE` for constants.
### Typing & Class Layout
- Code should usually include type annotations that match the repos current Python version (avoid untyped public APIs and “mystery” values).
- Prefer modern typing forms (e.g. `list[str]`, `dict[str, int]`) and avoid `Any` unless theres a strong reason.
- For classes, declare member variables at the top of the class body (before `__init__`) so the class shape is obvious at a glance:
```python
from datetime import datetime
class Example:
user_id: str
created_at: datetime
def __init__(self, user_id: str, created_at: datetime) -> None:
self.user_id = user_id
self.created_at = created_at
```
### General Rules
- Use Pydantic v2 conventions.
- Use `uv` for Python package management in this repo (usually with `--project api`).
- Prefer simple functions over small “utility classes” for lightweight helpers.
- Avoid implementing dunder methods unless its clearly needed and matches existing patterns.
- Never start long-running services as part of agent work (`uv run app.py`, `flask run`, etc.); running tests is allowed.
- Keep files below ~800 lines; split when necessary.
- Keep code readable and explicit—avoid clever hacks.
### Architecture & Boundaries
- Mirror the layered architecture: controller → service → core/domain.
- Reuse existing helpers in `core/`, `services/`, and `libs/` before creating new abstractions.
- Optimise for observability: deterministic control flow, clear logging, actionable errors.
### Logging & Errors
- Never use `print`; use a module-level logger:
- `logger = logging.getLogger(__name__)`
- Include tenant/app/workflow identifiers in log context when relevant.
- Raise domain-specific exceptions (`services/errors`, `core/errors`) and translate them into HTTP responses in controllers.
- Log retryable events at `warning`, terminal failures at `error`.
### SQLAlchemy Patterns
- Models inherit from `models.base.TypeBase`; do not create ad-hoc metadata or engines.
- Open sessions with context managers:
```python
from sqlalchemy.orm import Session
with Session(db.engine, expire_on_commit=False) as session:
stmt = select(Workflow).where(
Workflow.id == workflow_id,
Workflow.tenant_id == tenant_id,
)
workflow = session.execute(stmt).scalar_one_or_none()
```
- Prefer SQLAlchemy expressions; avoid raw SQL unless necessary.
- Always scope queries by `tenant_id` and protect write paths with safeguards (`FOR UPDATE`, row counts, etc.).
- Introduce repository abstractions only for very large tables (e.g., workflow executions) or when alternative storage strategies are required.
### Storage & External I/O
- Access storage via `extensions.ext_storage.storage`.
- Use `core.helper.ssrf_proxy` for outbound HTTP fetches.
- Background tasks that touch storage must be idempotent, and should log relevant object identifiers.
### Pydantic Usage
- Define DTOs with Pydantic v2 models and forbid extras by default.
- Use `@field_validator` / `@model_validator` for domain rules.
Example:
```python
from pydantic import BaseModel, ConfigDict, HttpUrl, field_validator
class TriggerConfig(BaseModel):
endpoint: HttpUrl
secret: str
model_config = ConfigDict(extra="forbid")
@field_validator("secret")
def ensure_secret_prefix(cls, value: str) -> str:
if not value.startswith("dify_"):
raise ValueError("secret must start with dify_")
return value
```
### Generics & Protocols
- Use `typing.Protocol` to define behavioural contracts (e.g., cache interfaces).
- Apply generics (`TypeVar`, `Generic`) for reusable utilities like caches or providers.
- Validate dynamic inputs at runtime when generics cannot enforce safety alone.
### Tooling & Checks
Quick checks while iterating:
- Format: `make format`
- Lint (includes auto-fix): `make lint`
- Type check: `make type-check`
- Targeted tests: `make test TARGET_TESTS=./api/tests/<target_tests>`
Before opening a PR / submitting:
- `make lint`
- `make type-check`
- `make test`
### Controllers & Services
- Controllers: parse input via Pydantic, invoke services, return serialised responses; no business logic.
- Services: coordinate repositories, providers, background tasks; keep side effects explicit.
- Document non-obvious behaviour with concise comments.
### Miscellaneous
- Use `configs.dify_config` for configuration—never read environment variables directly.
- Maintain tenant awareness end-to-end; `tenant_id` must flow through every layer touching shared resources.
- Queue async work through `services/async_workflow_service`; implement tasks under `tasks/` with explicit queue selection.
- Keep experimental scripts under `dev/`; do not ship them in production builds.

View File

@ -1,115 +0,0 @@
## Linter
- Always follow `.ruff.toml`.
- Run `uv run ruff check --fix --unsafe-fixes`.
- Keep each line under 100 characters (including spaces).
## Code Style
- `snake_case` for variables and functions.
- `PascalCase` for classes.
- `UPPER_CASE` for constants.
## Rules
- Use Pydantic v2 standard.
- Use `uv` for package management.
- Do not override dunder methods like `__init__`, `__iadd__`, etc.
- Never launch services (`uv run app.py`, `flask run`, etc.); running tests under `tests/` is allowed.
- Prefer simple functions over classes for lightweight helpers.
- Keep files below 800 lines; split when necessary.
- Keep code readable—no clever hacks.
- Never use `print`; log with `logger = logging.getLogger(__name__)`.
## Guiding Principles
- Mirror the projects layered architecture: controller → service → core/domain.
- Reuse existing helpers in `core/`, `services/`, and `libs/` before creating new abstractions.
- Optimise for observability: deterministic control flow, clear logging, actionable errors.
## SQLAlchemy Patterns
- Models inherit from `models.base.Base`; never create ad-hoc metadata or engines.
- Open sessions with context managers:
```python
from sqlalchemy.orm import Session
with Session(db.engine, expire_on_commit=False) as session:
stmt = select(Workflow).where(
Workflow.id == workflow_id,
Workflow.tenant_id == tenant_id,
)
workflow = session.execute(stmt).scalar_one_or_none()
```
- Use SQLAlchemy expressions; avoid raw SQL unless necessary.
- Introduce repository abstractions only for very large tables (e.g., workflow executions) to support alternative storage strategies.
- Always scope queries by `tenant_id` and protect write paths with safeguards (`FOR UPDATE`, row counts, etc.).
## Storage & External IO
- Access storage via `extensions.ext_storage.storage`.
- Use `core.helper.ssrf_proxy` for outbound HTTP fetches.
- Background tasks that touch storage must be idempotent and log the relevant object identifiers.
## Pydantic Usage
- Define DTOs with Pydantic v2 models and forbid extras by default.
- Use `@field_validator` / `@model_validator` for domain rules.
- Example:
```python
from pydantic import BaseModel, ConfigDict, HttpUrl, field_validator
class TriggerConfig(BaseModel):
endpoint: HttpUrl
secret: str
model_config = ConfigDict(extra="forbid")
@field_validator("secret")
def ensure_secret_prefix(cls, value: str) -> str:
if not value.startswith("dify_"):
raise ValueError("secret must start with dify_")
return value
```
## Generics & Protocols
- Use `typing.Protocol` to define behavioural contracts (e.g., cache interfaces).
- Apply generics (`TypeVar`, `Generic`) for reusable utilities like caches or providers.
- Validate dynamic inputs at runtime when generics cannot enforce safety alone.
## Error Handling & Logging
- Raise domain-specific exceptions (`services/errors`, `core/errors`) and translate to HTTP responses in controllers.
- Declare `logger = logging.getLogger(__name__)` at module top.
- Include tenant/app/workflow identifiers in log context.
- Log retryable events at `warning`, terminal failures at `error`.
## Tooling & Checks
- Format/lint: `uv run --project api --dev ruff format ./api` and `uv run --project api --dev ruff check --fix --unsafe-fixes ./api`.
- Type checks: `uv run --directory api --dev basedpyright`.
- Tests: `uv run --project api --dev dev/pytest/pytest_unit_tests.sh`.
- Run all of the above before submitting your work.
## Controllers & Services
- Controllers: parse input via Pydantic, invoke services, return serialised responses; no business logic.
- Services: coordinate repositories, providers, background tasks; keep side effects explicit.
- Avoid repositories unless necessary; direct SQLAlchemy usage is preferred for typical tables.
- Document non-obvious behaviour with concise comments.
## Miscellaneous
- Use `configs.dify_config` for configuration—never read environment variables directly.
- Maintain tenant awareness end-to-end; `tenant_id` must flow through every layer touching shared resources.
- Queue async work through `services/async_workflow_service`; implement tasks under `tasks/` with explicit queue selection.
- Keep experimental scripts under `dev/`; do not ship them in production builds.

View File

@ -3,6 +3,7 @@ import datetime
import json
import logging
import secrets
import time
from typing import Any
import click
@ -46,6 +47,8 @@ from services.clear_free_plan_tenant_expired_logs import ClearFreePlanTenantExpi
from services.plugin.data_migration import PluginDataMigration
from services.plugin.plugin_migration import PluginMigration
from services.plugin.plugin_service import PluginService
from services.retention.conversation.messages_clean_policy import create_message_clean_policy
from services.retention.conversation.messages_clean_service import MessagesCleanService
from services.retention.workflow_run.clear_free_plan_expired_workflow_run_logs import WorkflowRunCleanup
from tasks.remove_app_and_related_data_task import delete_draft_variables_batch
@ -2172,3 +2175,79 @@ def migrate_oss(
except Exception as e:
db.session.rollback()
click.echo(click.style(f"Failed to update DB storage_type: {str(e)}", fg="red"))
@click.command("clean-expired-messages", help="Clean expired messages.")
@click.option(
"--start-from",
type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
required=True,
help="Lower bound (inclusive) for created_at.",
)
@click.option(
"--end-before",
type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
required=True,
help="Upper bound (exclusive) for created_at.",
)
@click.option("--batch-size", default=1000, show_default=True, help="Batch size for selecting messages.")
@click.option(
"--graceful-period",
default=21,
show_default=True,
help="Graceful period in days after subscription expiration, will be ignored when billing is disabled.",
)
@click.option("--dry-run", is_flag=True, default=False, help="Show messages logs would be cleaned without deleting")
def clean_expired_messages(
batch_size: int,
graceful_period: int,
start_from: datetime.datetime,
end_before: datetime.datetime,
dry_run: bool,
):
"""
Clean expired messages and related data for tenants based on clean policy.
"""
click.echo(click.style("clean_messages: start clean messages.", fg="green"))
start_at = time.perf_counter()
try:
# Create policy based on billing configuration
# NOTE: graceful_period will be ignored when billing is disabled.
policy = create_message_clean_policy(graceful_period_days=graceful_period)
# Create and run the cleanup service
service = MessagesCleanService.from_time_range(
policy=policy,
start_from=start_from,
end_before=end_before,
batch_size=batch_size,
dry_run=dry_run,
)
stats = service.run()
end_at = time.perf_counter()
click.echo(
click.style(
f"clean_messages: completed successfully\n"
f" - Latency: {end_at - start_at:.2f}s\n"
f" - Batches processed: {stats['batches']}\n"
f" - Total messages scanned: {stats['total_messages']}\n"
f" - Messages filtered: {stats['filtered_messages']}\n"
f" - Messages deleted: {stats['total_deleted']}",
fg="green",
)
)
except Exception as e:
end_at = time.perf_counter()
logger.exception("clean_messages failed")
click.echo(
click.style(
f"clean_messages: failed after {end_at - start_at:.2f}s - {str(e)}",
fg="red",
)
)
raise
click.echo(click.style("messages cleanup completed.", fg="green"))

View File

@ -949,6 +949,12 @@ class MailConfig(BaseSettings):
default=False,
)
SMTP_LOCAL_HOSTNAME: str | None = Field(
description="Override the local hostname used in SMTP HELO/EHLO. "
"Useful behind NAT or when the default hostname causes rejections.",
default=None,
)
EMAIL_SEND_IP_LIMIT_PER_MINUTE: PositiveInt = Field(
description="Maximum number of emails allowed to be sent from the same IP address in a minute",
default=50,

View File

@ -4,7 +4,7 @@ from pydantic_settings import BaseSettings
class VolcengineTOSStorageConfig(BaseSettings):
"""
Configuration settings for Volcengine Tinder Object Storage (TOS)
Configuration settings for Volcengine Torch Object Storage (TOS)
"""
VOLCENGINE_TOS_BUCKET_NAME: str | None = Field(

View File

@ -592,9 +592,12 @@ def _get_conversation(app_model, conversation_id):
if not conversation:
raise NotFound("Conversation Not Exists.")
if not conversation.read_at:
conversation.read_at = naive_utc_now()
conversation.read_account_id = current_user.id
db.session.commit()
db.session.execute(
sa.update(Conversation)
.where(Conversation.id == conversation_id, Conversation.read_at.is_(None))
.values(read_at=naive_utc_now(), read_account_id=current_user.id)
)
db.session.commit()
db.session.refresh(conversation)
return conversation

View File

@ -146,6 +146,7 @@ class DatasetUpdatePayload(BaseModel):
embedding_model: str | None = None
embedding_model_provider: str | None = None
retrieval_model: dict[str, Any] | None = None
summary_index_setting: dict[str, Any] | None = None
partial_member_list: list[dict[str, str]] | None = None
external_retrieval_model: dict[str, Any] | None = None
external_knowledge_id: str | None = None

View File

@ -7,7 +7,7 @@ from typing import Literal, cast
import sqlalchemy as sa
from flask import request
from flask_restx import Resource, fields, marshal, marshal_with
from pydantic import BaseModel
from pydantic import BaseModel, Field
from sqlalchemy import asc, desc, select
from werkzeug.exceptions import Forbidden, NotFound
@ -39,9 +39,10 @@ from fields.document_fields import (
from libs.datetime_utils import naive_utc_now
from libs.login import current_account_with_tenant, login_required
from models import DatasetProcessRule, Document, DocumentSegment, UploadFile
from models.dataset import DocumentPipelineExecutionLog
from models.dataset import DocumentPipelineExecutionLog, DocumentSegmentSummary
from services.dataset_service import DatasetService, DocumentService
from services.entities.knowledge_entities.knowledge_entities import KnowledgeConfig, ProcessRule, RetrievalModel
from tasks.generate_summary_index_task import generate_summary_index_task
from ..app.error import (
ProviderModelCurrentlyNotSupportError,
@ -104,6 +105,19 @@ class DocumentRenamePayload(BaseModel):
name: str
class GenerateSummaryPayload(BaseModel):
document_list: list[str]
class DocumentDatasetListParam(BaseModel):
page: int = Field(1, title="Page", description="Page number.")
limit: int = Field(20, title="Limit", description="Page size.")
search: str | None = Field(None, alias="keyword", title="Search", description="Search keyword.")
sort_by: str = Field("-created_at", alias="sort", title="SortBy", description="Sort by field.")
status: str | None = Field(None, title="Status", description="Document status.")
fetch_val: str = Field("false", alias="fetch")
register_schema_models(
console_ns,
KnowledgeConfig,
@ -111,6 +125,7 @@ register_schema_models(
RetrievalModel,
DocumentRetryPayload,
DocumentRenamePayload,
GenerateSummaryPayload,
)
@ -225,14 +240,16 @@ class DatasetDocumentListApi(Resource):
def get(self, dataset_id):
current_user, current_tenant_id = current_account_with_tenant()
dataset_id = str(dataset_id)
page = request.args.get("page", default=1, type=int)
limit = request.args.get("limit", default=20, type=int)
search = request.args.get("keyword", default=None, type=str)
sort = request.args.get("sort", default="-created_at", type=str)
status = request.args.get("status", default=None, type=str)
raw_args = request.args.to_dict()
param = DocumentDatasetListParam.model_validate(raw_args)
page = param.page
limit = param.limit
search = param.search
sort = param.sort_by
status = param.status
# "yes", "true", "t", "y", "1" convert to True, while others convert to False.
try:
fetch_val = request.args.get("fetch", default="false")
fetch_val = param.fetch_val
if isinstance(fetch_val, bool):
fetch = fetch_val
else:
@ -295,6 +312,94 @@ class DatasetDocumentListApi(Resource):
paginated_documents = db.paginate(select=query, page=page, per_page=limit, max_per_page=100, error_out=False)
documents = paginated_documents.items
# Check if dataset has summary index enabled
has_summary_index = dataset.summary_index_setting and dataset.summary_index_setting.get("enable") is True
# Filter documents that need summary calculation
documents_need_summary = [doc for doc in documents if doc.need_summary is True]
document_ids_need_summary = [str(doc.id) for doc in documents_need_summary]
# Calculate summary_index_status for documents that need summary (only if dataset summary index is enabled)
summary_status_map = {}
if has_summary_index and document_ids_need_summary:
# Get all segments for these documents (excluding qa_model and re_segment)
segments = (
db.session.query(DocumentSegment.id, DocumentSegment.document_id)
.where(
DocumentSegment.document_id.in_(document_ids_need_summary),
DocumentSegment.status != "re_segment",
DocumentSegment.tenant_id == current_tenant_id,
)
.all()
)
# Group segments by document_id
document_segments_map = {}
for segment in segments:
doc_id = str(segment.document_id)
if doc_id not in document_segments_map:
document_segments_map[doc_id] = []
document_segments_map[doc_id].append(segment.id)
# Get all summary records for these segments
all_segment_ids = [seg.id for seg in segments]
summaries = {}
if all_segment_ids:
summary_records = (
db.session.query(DocumentSegmentSummary)
.where(
DocumentSegmentSummary.chunk_id.in_(all_segment_ids),
DocumentSegmentSummary.dataset_id == dataset_id,
DocumentSegmentSummary.enabled == True, # Only count enabled summaries
)
.all()
)
summaries = {summary.chunk_id: summary.status for summary in summary_records}
# Calculate summary_index_status for each document
for doc_id in document_ids_need_summary:
segment_ids = document_segments_map.get(doc_id, [])
if not segment_ids:
# No segments, status is "GENERATING" (waiting to generate)
summary_status_map[doc_id] = "GENERATING"
continue
# Count summary statuses for this document's segments
status_counts = {"completed": 0, "generating": 0, "error": 0, "not_started": 0}
for segment_id in segment_ids:
status = summaries.get(segment_id, "not_started")
if status in status_counts:
status_counts[status] += 1
else:
status_counts["not_started"] += 1
total_segments = len(segment_ids)
completed_count = status_counts["completed"]
generating_count = status_counts["generating"]
error_count = status_counts["error"]
# Determine overall status (only three states: GENERATING, COMPLETED, ERROR)
if completed_count == total_segments:
summary_status_map[doc_id] = "COMPLETED"
elif error_count > 0:
# Has errors (even if some are completed or generating)
summary_status_map[doc_id] = "ERROR"
elif generating_count > 0 or status_counts["not_started"] > 0:
# Still generating or not started
summary_status_map[doc_id] = "GENERATING"
else:
# Default to generating
summary_status_map[doc_id] = "GENERATING"
# Add summary_index_status to each document
for document in documents:
if has_summary_index and document.need_summary is True:
document.summary_index_status = summary_status_map.get(str(document.id), "GENERATING")
else:
# Return null if summary index is not enabled or document doesn't need summary
document.summary_index_status = None
if fetch:
for document in documents:
completed_segments = (
@ -780,6 +885,7 @@ class DocumentApi(DocumentResource):
"display_status": document.display_status,
"doc_form": document.doc_form,
"doc_language": document.doc_language,
"need_summary": document.need_summary if document.need_summary is not None else False,
}
else:
dataset_process_rules = DatasetService.get_process_rules(dataset_id)
@ -815,6 +921,7 @@ class DocumentApi(DocumentResource):
"display_status": document.display_status,
"doc_form": document.doc_form,
"doc_language": document.doc_language,
"need_summary": document.need_summary if document.need_summary is not None else False,
}
return response, 200
@ -1182,3 +1289,216 @@ class DocumentPipelineExecutionLogApi(DocumentResource):
"input_data": log.input_data,
"datasource_node_id": log.datasource_node_id,
}, 200
@console_ns.route("/datasets/<uuid:dataset_id>/documents/generate-summary")
class DocumentGenerateSummaryApi(Resource):
@console_ns.doc("generate_summary_for_documents")
@console_ns.doc(description="Generate summary index for documents")
@console_ns.doc(params={"dataset_id": "Dataset ID"})
@console_ns.expect(console_ns.models[GenerateSummaryPayload.__name__])
@console_ns.response(200, "Summary generation started successfully")
@console_ns.response(400, "Invalid request or dataset configuration")
@console_ns.response(403, "Permission denied")
@console_ns.response(404, "Dataset not found")
@setup_required
@login_required
@account_initialization_required
@cloud_edition_billing_rate_limit_check("knowledge")
def post(self, dataset_id):
"""
Generate summary index for specified documents.
This endpoint checks if the dataset configuration supports summary generation
(indexing_technique must be 'high_quality' and summary_index_setting.enable must be true),
then asynchronously generates summary indexes for the provided documents.
"""
current_user, _ = current_account_with_tenant()
dataset_id = str(dataset_id)
# Get dataset
dataset = DatasetService.get_dataset(dataset_id)
if not dataset:
raise NotFound("Dataset not found.")
# Check permissions
if not current_user.is_dataset_editor:
raise Forbidden()
try:
DatasetService.check_dataset_permission(dataset, current_user)
except services.errors.account.NoPermissionError as e:
raise Forbidden(str(e))
# Validate request payload
payload = GenerateSummaryPayload.model_validate(console_ns.payload or {})
document_list = payload.document_list
if not document_list:
raise ValueError("document_list cannot be empty.")
# Check if dataset configuration supports summary generation
if dataset.indexing_technique != "high_quality":
raise ValueError(
f"Summary generation is only available for 'high_quality' indexing technique. "
f"Current indexing technique: {dataset.indexing_technique}"
)
summary_index_setting = dataset.summary_index_setting
if not summary_index_setting or not summary_index_setting.get("enable"):
raise ValueError("Summary index is not enabled for this dataset. Please enable it in the dataset settings.")
# Verify all documents exist and belong to the dataset
documents = (
db.session.query(Document)
.filter(
Document.id.in_(document_list),
Document.dataset_id == dataset_id,
)
.all()
)
if len(documents) != len(document_list):
found_ids = {doc.id for doc in documents}
missing_ids = set(document_list) - found_ids
raise NotFound(f"Some documents not found: {list(missing_ids)}")
# Dispatch async tasks for each document
for document in documents:
# Skip qa_model documents as they don't generate summaries
if document.doc_form == "qa_model":
logger.info("Skipping summary generation for qa_model document %s", document.id)
continue
# Dispatch async task
generate_summary_index_task(dataset_id, document.id)
logger.info(
"Dispatched summary generation task for document %s in dataset %s",
document.id,
dataset_id,
)
return {"result": "success"}, 200
@console_ns.route("/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/summary-status")
class DocumentSummaryStatusApi(DocumentResource):
@console_ns.doc("get_document_summary_status")
@console_ns.doc(description="Get summary index generation status for a document")
@console_ns.doc(params={"dataset_id": "Dataset ID", "document_id": "Document ID"})
@console_ns.response(200, "Summary status retrieved successfully")
@console_ns.response(404, "Document not found")
@setup_required
@login_required
@account_initialization_required
def get(self, dataset_id, document_id):
"""
Get summary index generation status for a document.
Returns:
- total_segments: Total number of segments in the document
- summary_status: Dictionary with status counts
- completed: Number of summaries completed
- generating: Number of summaries being generated
- error: Number of summaries with errors
- not_started: Number of segments without summary records
- summaries: List of summary records with status and content preview
"""
current_user, _ = current_account_with_tenant()
dataset_id = str(dataset_id)
document_id = str(document_id)
# Get document
document = self.get_document(dataset_id, document_id)
# Get dataset
dataset = DatasetService.get_dataset(dataset_id)
if not dataset:
raise NotFound("Dataset not found.")
# Check permissions
try:
DatasetService.check_dataset_permission(dataset, current_user)
except services.errors.account.NoPermissionError as e:
raise Forbidden(str(e))
# Get all segments for this document
segments = (
db.session.query(DocumentSegment)
.filter(
DocumentSegment.document_id == document_id,
DocumentSegment.dataset_id == dataset_id,
DocumentSegment.status == "completed",
DocumentSegment.enabled == True,
)
.all()
)
total_segments = len(segments)
# Get all summary records for these segments
segment_ids = [segment.id for segment in segments]
summaries = []
if segment_ids:
summaries = (
db.session.query(DocumentSegmentSummary)
.filter(
DocumentSegmentSummary.document_id == document_id,
DocumentSegmentSummary.dataset_id == dataset_id,
DocumentSegmentSummary.chunk_id.in_(segment_ids),
DocumentSegmentSummary.enabled == True, # Only return enabled summaries
)
.all()
)
# Create a mapping of chunk_id to summary
summary_map = {summary.chunk_id: summary for summary in summaries}
# Count statuses
status_counts = {
"completed": 0,
"generating": 0,
"error": 0,
"not_started": 0,
}
summary_list = []
for segment in segments:
summary = summary_map.get(segment.id)
if summary:
status = summary.status
status_counts[status] = status_counts.get(status, 0) + 1
summary_list.append(
{
"segment_id": segment.id,
"segment_position": segment.position,
"status": summary.status,
"summary_preview": (
summary.summary_content[:100] + "..."
if summary.summary_content and len(summary.summary_content) > 100
else summary.summary_content
),
"error": summary.error,
"created_at": int(summary.created_at.timestamp()) if summary.created_at else None,
"updated_at": int(summary.updated_at.timestamp()) if summary.updated_at else None,
}
)
else:
status_counts["not_started"] += 1
summary_list.append(
{
"segment_id": segment.id,
"segment_position": segment.position,
"status": "not_started",
"summary_preview": None,
"error": None,
"created_at": None,
"updated_at": None,
}
)
return {
"total_segments": total_segments,
"summary_status": status_counts,
"summaries": summary_list,
}, 200

View File

@ -32,7 +32,7 @@ from extensions.ext_redis import redis_client
from fields.segment_fields import child_chunk_fields, segment_fields
from libs.helper import escape_like_pattern
from libs.login import current_account_with_tenant, login_required
from models.dataset import ChildChunk, DocumentSegment
from models.dataset import ChildChunk, DocumentSegment, DocumentSegmentSummary
from models.model import UploadFile
from services.dataset_service import DatasetService, DocumentService, SegmentService
from services.entities.knowledge_entities.knowledge_entities import ChildChunkUpdateArgs, SegmentUpdateArgs
@ -41,6 +41,23 @@ from services.errors.chunk import ChildChunkIndexingError as ChildChunkIndexingS
from tasks.batch_create_segment_to_index_task import batch_create_segment_to_index_task
def _get_segment_with_summary(segment, dataset_id):
"""Helper function to marshal segment and add summary information."""
segment_dict = marshal(segment, segment_fields)
# Query summary for this segment (only enabled summaries)
summary = (
db.session.query(DocumentSegmentSummary)
.where(
DocumentSegmentSummary.chunk_id == segment.id,
DocumentSegmentSummary.dataset_id == dataset_id,
DocumentSegmentSummary.enabled == True, # Only return enabled summaries
)
.first()
)
segment_dict["summary"] = summary.summary_content if summary else None
return segment_dict
class SegmentListQuery(BaseModel):
limit: int = Field(default=20, ge=1, le=100)
status: list[str] = Field(default_factory=list)
@ -63,6 +80,7 @@ class SegmentUpdatePayload(BaseModel):
keywords: list[str] | None = None
regenerate_child_chunks: bool = False
attachment_ids: list[str] | None = None
summary: str | None = None # Summary content for summary index
class BatchImportPayload(BaseModel):
@ -180,8 +198,32 @@ class DatasetDocumentSegmentListApi(Resource):
segments = db.paginate(select=query, page=page, per_page=limit, max_per_page=100, error_out=False)
# Query summaries for all segments in this page (batch query for efficiency)
segment_ids = [segment.id for segment in segments.items]
summaries = {}
if segment_ids:
summary_records = (
db.session.query(DocumentSegmentSummary)
.where(
DocumentSegmentSummary.chunk_id.in_(segment_ids),
DocumentSegmentSummary.dataset_id == dataset_id,
)
.all()
)
# Only include enabled summaries
summaries = {
summary.chunk_id: summary.summary_content for summary in summary_records if summary.enabled is True
}
# Add summary to each segment
segments_with_summary = []
for segment in segments.items:
segment_dict = marshal(segment, segment_fields)
segment_dict["summary"] = summaries.get(segment.id)
segments_with_summary.append(segment_dict)
response = {
"data": marshal(segments.items, segment_fields),
"data": segments_with_summary,
"limit": limit,
"total": segments.total,
"total_pages": segments.pages,
@ -327,7 +369,7 @@ class DatasetDocumentSegmentAddApi(Resource):
payload_dict = payload.model_dump(exclude_none=True)
SegmentService.segment_create_args_validate(payload_dict, document)
segment = SegmentService.create_segment(payload_dict, document, dataset)
return {"data": marshal(segment, segment_fields), "doc_form": document.doc_form}, 200
return {"data": _get_segment_with_summary(segment, dataset_id), "doc_form": document.doc_form}, 200
@console_ns.route("/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/segments/<uuid:segment_id>")
@ -389,10 +431,12 @@ class DatasetDocumentSegmentUpdateApi(Resource):
payload = SegmentUpdatePayload.model_validate(console_ns.payload or {})
payload_dict = payload.model_dump(exclude_none=True)
SegmentService.segment_create_args_validate(payload_dict, document)
# Update segment (summary update with change detection is handled in SegmentService.update_segment)
segment = SegmentService.update_segment(
SegmentUpdateArgs.model_validate(payload.model_dump(exclude_none=True)), segment, document, dataset
)
return {"data": marshal(segment, segment_fields), "doc_form": document.doc_form}, 200
return {"data": _get_segment_with_summary(segment, dataset_id), "doc_form": document.doc_form}, 200
@setup_required
@login_required

View File

@ -81,7 +81,7 @@ class ExternalKnowledgeApiPayload(BaseModel):
class ExternalDatasetCreatePayload(BaseModel):
external_knowledge_api_id: str
external_knowledge_id: str
name: str = Field(..., min_length=1, max_length=40)
name: str = Field(..., min_length=1, max_length=100)
description: str | None = Field(None, max_length=400)
external_retrieval_model: dict[str, object] | None = None

View File

@ -1,6 +1,13 @@
from flask_restx import Resource
from flask_restx import Resource, fields
from controllers.common.schema import register_schema_model
from fields.hit_testing_fields import (
child_chunk_fields,
document_fields,
files_fields,
hit_testing_record_fields,
segment_fields,
)
from libs.login import login_required
from .. import console_ns
@ -14,13 +21,45 @@ from ..wraps import (
register_schema_model(console_ns, HitTestingPayload)
def _get_or_create_model(model_name: str, field_def):
"""Get or create a flask_restx model to avoid dict type issues in Swagger."""
existing = console_ns.models.get(model_name)
if existing is None:
existing = console_ns.model(model_name, field_def)
return existing
# Register models for flask_restx to avoid dict type issues in Swagger
document_model = _get_or_create_model("HitTestingDocument", document_fields)
segment_fields_copy = segment_fields.copy()
segment_fields_copy["document"] = fields.Nested(document_model)
segment_model = _get_or_create_model("HitTestingSegment", segment_fields_copy)
child_chunk_model = _get_or_create_model("HitTestingChildChunk", child_chunk_fields)
files_model = _get_or_create_model("HitTestingFile", files_fields)
hit_testing_record_fields_copy = hit_testing_record_fields.copy()
hit_testing_record_fields_copy["segment"] = fields.Nested(segment_model)
hit_testing_record_fields_copy["child_chunks"] = fields.List(fields.Nested(child_chunk_model))
hit_testing_record_fields_copy["files"] = fields.List(fields.Nested(files_model))
hit_testing_record_model = _get_or_create_model("HitTestingRecord", hit_testing_record_fields_copy)
# Response model for hit testing API
hit_testing_response_fields = {
"query": fields.String,
"records": fields.List(fields.Nested(hit_testing_record_model)),
}
hit_testing_response_model = _get_or_create_model("HitTestingResponse", hit_testing_response_fields)
@console_ns.route("/datasets/<uuid:dataset_id>/hit-testing")
class HitTestingApi(Resource, DatasetsHitTestingBase):
@console_ns.doc("test_dataset_retrieval")
@console_ns.doc(description="Test dataset knowledge retrieval")
@console_ns.doc(params={"dataset_id": "Dataset ID"})
@console_ns.expect(console_ns.models[HitTestingPayload.__name__])
@console_ns.response(200, "Hit testing completed successfully")
@console_ns.response(200, "Hit testing completed successfully", model=hit_testing_response_model)
@console_ns.response(404, "Dataset not found")
@console_ns.response(400, "Invalid parameters")
@setup_required

View File

@ -30,6 +30,11 @@ class TagBindingRemovePayload(BaseModel):
type: Literal["knowledge", "app"] | None = Field(default=None, description="Tag type")
class TagListQueryParam(BaseModel):
type: Literal["knowledge", "app", ""] = Field("", description="Tag type filter")
keyword: str | None = Field(None, description="Search keyword")
register_schema_models(
console_ns,
TagBasePayload,
@ -43,12 +48,15 @@ class TagListApi(Resource):
@setup_required
@login_required
@account_initialization_required
@console_ns.doc(
params={"type": 'Tag type filter. Can be "knowledge" or "app".', "keyword": "Search keyword for tag name."}
)
@marshal_with(dataset_tag_fields)
def get(self):
_, current_tenant_id = current_account_with_tenant()
tag_type = request.args.get("type", type=str, default="")
keyword = request.args.get("keyword", default=None, type=str)
tags = TagService.get_tags(tag_type, current_tenant_id, keyword)
raw_args = request.args.to_dict()
param = TagListQueryParam.model_validate(raw_args)
tags = TagService.get_tags(param.type, current_tenant_id, param.keyword)
return tags, 200

View File

@ -24,7 +24,7 @@ from core.app.layers.conversation_variable_persist_layer import ConversationVari
from core.db.session_factory import session_factory
from core.moderation.base import ModerationError
from core.moderation.input_moderation import InputModeration
from core.variables.variables import VariableUnion
from core.variables.variables import Variable
from core.workflow.enums import WorkflowType
from core.workflow.graph_engine.command_channels.redis_channel import RedisChannel
from core.workflow.graph_engine.layers.base import GraphEngineLayer
@ -149,8 +149,8 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
system_variables=system_inputs,
user_inputs=inputs,
environment_variables=self._workflow.environment_variables,
# Based on the definition of `VariableUnion`,
# `list[Variable]` can be safely used as `list[VariableUnion]` since they are compatible.
# Based on the definition of `Variable`,
# `VariableBase` instances can be safely used as `Variable` since they are compatible.
conversation_variables=conversation_variables,
)
@ -318,7 +318,7 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
trace_manager=app_generate_entity.trace_manager,
)
def _initialize_conversation_variables(self) -> list[VariableUnion]:
def _initialize_conversation_variables(self) -> list[Variable]:
"""
Initialize conversation variables for the current conversation.
@ -343,7 +343,7 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
conversation_variables = [var.to_variable() for var in existing_variables]
session.commit()
return cast(list[VariableUnion], conversation_variables)
return cast(list[Variable], conversation_variables)
def _load_existing_conversation_variables(self, session: Session) -> list[ConversationVariable]:
"""

View File

@ -1,6 +1,6 @@
import logging
from core.variables import Variable
from core.variables import VariableBase
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID
from core.workflow.conversation_variable_updater import ConversationVariableUpdater
from core.workflow.enums import NodeType
@ -44,7 +44,7 @@ class ConversationVariablePersistenceLayer(GraphEngineLayer):
if selector[0] != CONVERSATION_VARIABLE_NODE_ID:
continue
variable = self.graph_runtime_state.variable_pool.get(selector)
if not isinstance(variable, Variable):
if not isinstance(variable, VariableBase):
logger.warning(
"Conversation variable not found in variable pool. selector=%s",
selector,

View File

@ -3,6 +3,7 @@ from pydantic import BaseModel, Field, field_validator
class PreviewDetail(BaseModel):
content: str
summary: str | None = None
child_chunks: list[str] | None = None

View File

@ -33,6 +33,10 @@ class MaxRetriesExceededError(ValueError):
pass
request_error = httpx.RequestError
max_retries_exceeded_error = MaxRetriesExceededError
def _create_proxy_mounts() -> dict[str, httpx.HTTPTransport]:
return {
"http://": httpx.HTTPTransport(

View File

@ -311,14 +311,18 @@ class IndexingRunner:
qa_preview_texts: list[QAPreviewDetail] = []
total_segments = 0
# doc_form represents the segmentation method (general, parent-child, QA)
index_type = doc_form
index_processor = IndexProcessorFactory(index_type).init_index_processor()
# one extract_setting is one source document
for extract_setting in extract_settings:
# extract
processing_rule = DatasetProcessRule(
mode=tmp_processing_rule["mode"], rules=json.dumps(tmp_processing_rule["rules"])
)
# Extract document content
text_docs = index_processor.extract(extract_setting, process_rule_mode=tmp_processing_rule["mode"])
# Cleaning and segmentation
documents = index_processor.transform(
text_docs,
current_user=None,
@ -361,6 +365,12 @@ class IndexingRunner:
if doc_form and doc_form == "qa_model":
return IndexingEstimate(total_segments=total_segments * 20, qa_preview=qa_preview_texts, preview=[])
# Generate summary preview
summary_index_setting = tmp_processing_rule.get("summary_index_setting")
if summary_index_setting and summary_index_setting.get("enable") and preview_texts:
preview_texts = index_processor.generate_summary_preview(tenant_id, preview_texts, summary_index_setting)
return IndexingEstimate(total_segments=total_segments, preview=preview_texts)
def _extract(

View File

@ -71,7 +71,7 @@ class LLMGenerator:
response: LLMResult = model_instance.invoke_llm(
prompt_messages=list(prompts), model_parameters={"max_tokens": 500, "temperature": 1}, stream=False
)
answer = cast(str, response.message.content)
answer = response.message.get_text_content()
if answer is None:
return ""
try:
@ -113,9 +113,11 @@ class LLMGenerator:
output_parser = SuggestedQuestionsAfterAnswerOutputParser()
format_instructions = output_parser.get_format_instructions()
prompt_template = PromptTemplateParser(template="{{histories}}\n{{format_instructions}}\nquestions:\n")
prompt_template = PromptTemplateParser(
template="{{histories}}\n{{format_instructions}}\nquestions:\n")
prompt = prompt_template.format({"histories": histories, "format_instructions": format_instructions})
prompt = prompt_template.format(
{"histories": histories, "format_instructions": format_instructions})
try:
model_manager = ModelManager()
@ -141,11 +143,13 @@ class LLMGenerator:
)
text_content = response.message.get_text_content()
questions = output_parser.parse(text_content) if text_content else []
questions = output_parser.parse(
text_content) if text_content else []
except InvokeError:
questions = []
except Exception:
logger.exception("Failed to generate suggested questions after answer")
logger.exception(
"Failed to generate suggested questions after answer")
questions = []
return questions
@ -156,10 +160,12 @@ class LLMGenerator:
error = ""
error_step = ""
rule_config = {"prompt": "", "variables": [], "opening_statement": "", "error": ""}
rule_config = {"prompt": "", "variables": [],
"opening_statement": "", "error": ""}
model_parameters = model_config.get("completion_params", {})
if no_variable:
prompt_template = PromptTemplateParser(WORKFLOW_RULE_CONFIG_PROMPT_GENERATE_TEMPLATE)
prompt_template = PromptTemplateParser(
WORKFLOW_RULE_CONFIG_PROMPT_GENERATE_TEMPLATE)
prompt_generate = prompt_template.format(
inputs={
@ -184,13 +190,14 @@ class LLMGenerator:
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
rule_config["prompt"] = cast(str, response.message.content)
rule_config["prompt"] = response.message.get_text_content()
except InvokeError as e:
error = str(e)
error_step = "generate rule config"
except Exception as e:
logger.exception("Failed to generate rule config, model: %s", model_config.get("name"))
logger.exception(
"Failed to generate rule config, model: %s", model_config.get("name"))
rule_config["error"] = str(e)
rule_config["error"] = f"Failed to {error_step}. Error: {error}" if error else ""
@ -237,33 +244,34 @@ class LLMGenerator:
return rule_config
rule_config["prompt"] = cast(str, prompt_content.message.content)
rule_config["prompt"] = prompt_content.message.get_text_content()
if not isinstance(prompt_content.message.content, str):
raise NotImplementedError("prompt content is not a string")
parameter_generate_prompt = parameter_template.format(
inputs={
"INPUT_TEXT": prompt_content.message.content,
"INPUT_TEXT": prompt_content.message.get_text_content(),
},
remove_template_variables=False,
)
parameter_messages = [UserPromptMessage(content=parameter_generate_prompt)]
parameter_messages = [UserPromptMessage(
content=parameter_generate_prompt)]
# the second step to generate the task_parameter and task_statement
statement_generate_prompt = statement_template.format(
inputs={
"TASK_DESCRIPTION": instruction,
"INPUT_TEXT": prompt_content.message.content,
"INPUT_TEXT": prompt_content.message.get_text_content(),
},
remove_template_variables=False,
)
statement_messages = [UserPromptMessage(content=statement_generate_prompt)]
statement_messages = [UserPromptMessage(
content=statement_generate_prompt)]
try:
parameter_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(parameter_messages), model_parameters=model_parameters, stream=False
)
rule_config["variables"] = re.findall(r'"\s*([^"]+)\s*"', cast(str, parameter_content.message.content))
rule_config["variables"] = re.findall(
r'"\s*([^"]+)\s*"', parameter_content.message.get_text_content())
except InvokeError as e:
error = str(e)
error_step = "generate variables"
@ -272,13 +280,15 @@ class LLMGenerator:
statement_content: LLMResult = model_instance.invoke_llm(
prompt_messages=list(statement_messages), model_parameters=model_parameters, stream=False
)
rule_config["opening_statement"] = cast(str, statement_content.message.content)
rule_config["opening_statement"] = statement_content.message.get_text_content(
)
except InvokeError as e:
error = str(e)
error_step = "generate conversation opener"
except Exception as e:
logger.exception("Failed to generate rule config, model: %s", model_config.get("name"))
logger.exception(
"Failed to generate rule config, model: %s", model_config.get("name"))
rule_config["error"] = str(e)
rule_config["error"] = f"Failed to {error_step}. Error: {error}" if error else ""
@ -288,9 +298,11 @@ class LLMGenerator:
@classmethod
def generate_code(cls, tenant_id: str, instruction: str, model_config: dict, code_language: str = "javascript"):
if code_language == "python":
prompt_template = PromptTemplateParser(PYTHON_CODE_GENERATOR_PROMPT_TEMPLATE)
prompt_template = PromptTemplateParser(
PYTHON_CODE_GENERATOR_PROMPT_TEMPLATE)
else:
prompt_template = PromptTemplateParser(JAVASCRIPT_CODE_GENERATOR_PROMPT_TEMPLATE)
prompt_template = PromptTemplateParser(
JAVASCRIPT_CODE_GENERATOR_PROMPT_TEMPLATE)
prompt = prompt_template.format(
inputs={
@ -315,7 +327,7 @@ class LLMGenerator:
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
generated_code = cast(str, response.message.content)
generated_code = response.message.get_text_content()
return {"code": generated_code, "language": code_language, "error": ""}
except InvokeError as e:
@ -323,7 +335,8 @@ class LLMGenerator:
return {"code": "", "language": code_language, "error": f"Failed to generate code. Error: {error}"}
except Exception as e:
logger.exception(
"Failed to invoke LLM model, model: %s, language: %s", model_config.get("name"), code_language
"Failed to invoke LLM model, model: %s, language: %s", model_config.get(
"name"), code_language
)
return {"code": "", "language": code_language, "error": f"An unexpected error occurred: {str(e)}"}
@ -337,7 +350,8 @@ class LLMGenerator:
model_type=ModelType.LLM,
)
prompt_messages: list[PromptMessage] = [SystemPromptMessage(content=prompt), UserPromptMessage(content=query)]
prompt_messages: list[PromptMessage] = [SystemPromptMessage(
content=prompt), UserPromptMessage(content=query)]
# Explicitly use the non-streaming overload
result = model_instance.invoke_llm(
@ -351,7 +365,7 @@ class LLMGenerator:
raise TypeError("Expected LLMResult when stream=False")
response = result
answer = cast(str, response.message.content)
answer = response.message.get_text_content()
return answer.strip()
@classmethod
@ -375,10 +389,7 @@ class LLMGenerator:
prompt_messages=list(prompt_messages), model_parameters=model_parameters, stream=False
)
raw_content = response.message.content
if not isinstance(raw_content, str):
raise ValueError(f"LLM response content must be a string, got: {type(raw_content)}")
raw_content = response.message.get_text_content()
try:
parsed_content = json.loads(raw_content)
@ -386,16 +397,19 @@ class LLMGenerator:
parsed_content = json_repair.loads(raw_content)
if not isinstance(parsed_content, dict | list):
raise ValueError(f"Failed to parse structured output from llm: {raw_content}")
raise ValueError(
f"Failed to parse structured output from llm: {raw_content}")
generated_json_schema = json.dumps(parsed_content, indent=2, ensure_ascii=False)
generated_json_schema = json.dumps(
parsed_content, indent=2, ensure_ascii=False)
return {"output": generated_json_schema, "error": ""}
except InvokeError as e:
error = str(e)
return {"output": "", "error": f"Failed to generate JSON Schema. Error: {error}"}
except Exception as e:
logger.exception("Failed to invoke LLM model, model: %s", model_config.get("name"))
logger.exception(
"Failed to invoke LLM model, model: %s", model_config.get("name"))
return {"output": "", "error": f"An unexpected error occurred: {str(e)}"}
@staticmethod
@ -403,7 +417,8 @@ class LLMGenerator:
tenant_id: str, flow_id: str, current: str, instruction: str, model_config: dict, ideal_output: str | None
):
last_run: Message | None = (
db.session.query(Message).where(Message.app_id == flow_id).order_by(Message.created_at.desc()).first()
db.session.query(Message).where(Message.app_id == flow_id).order_by(
Message.created_at.desc()).first()
)
if not last_run:
return LLMGenerator.__instruction_modify_common(
@ -451,7 +466,8 @@ class LLMGenerator:
workflow = workflow_service.get_draft_workflow(app_model=app)
if not workflow:
raise ValueError("Workflow not found for the given app model.")
last_run = workflow_service.get_node_last_run(app_model=app, workflow=workflow, node_id=node_id)
last_run = workflow_service.get_node_last_run(
app_model=app, workflow=workflow, node_id=node_id)
try:
node_type = cast(WorkflowNodeExecutionModel, last_run).node_type
except Exception:
@ -475,7 +491,8 @@ class LLMGenerator:
)
def agent_log_of(node_execution: WorkflowNodeExecutionModel) -> Sequence:
raw_agent_log = node_execution.execution_metadata_dict.get(WorkflowNodeExecutionMetadataKey.AGENT_LOG, [])
raw_agent_log = node_execution.execution_metadata_dict.get(
WorkflowNodeExecutionMetadataKey.AGENT_LOG, [])
if not raw_agent_log:
return []
@ -523,11 +540,14 @@ class LLMGenerator:
ERROR_MESSAGE = "{{#error_message#}}"
injected_instruction = instruction
if LAST_RUN in injected_instruction:
injected_instruction = injected_instruction.replace(LAST_RUN, json.dumps(last_run))
injected_instruction = injected_instruction.replace(
LAST_RUN, json.dumps(last_run))
if CURRENT in injected_instruction:
injected_instruction = injected_instruction.replace(CURRENT, current or "null")
injected_instruction = injected_instruction.replace(
CURRENT, current or "null")
if ERROR_MESSAGE in injected_instruction:
injected_instruction = injected_instruction.replace(ERROR_MESSAGE, error_message or "null")
injected_instruction = injected_instruction.replace(
ERROR_MESSAGE, error_message or "null")
model_instance = ModelManager().get_model_instance(
tenant_id=tenant_id,
model_type=ModelType.LLM,
@ -565,11 +585,13 @@ class LLMGenerator:
first_brace = generated_raw.find("{")
last_brace = generated_raw.rfind("}")
if first_brace == -1 or last_brace == -1 or last_brace < first_brace:
raise ValueError(f"Could not find a valid JSON object in response: {generated_raw}")
json_str = generated_raw[first_brace : last_brace + 1]
raise ValueError(
f"Could not find a valid JSON object in response: {generated_raw}")
json_str = generated_raw[first_brace: last_brace + 1]
data = json_repair.loads(json_str)
if not isinstance(data, dict):
raise TypeError(f"Expected a JSON object, but got {type(data).__name__}")
raise TypeError(
f"Expected a JSON object, but got {type(data).__name__}")
return data
except InvokeError as e:
error = str(e)

View File

@ -434,3 +434,20 @@ INSTRUCTION_GENERATE_TEMPLATE_PROMPT = """The output of this prompt is not as ex
You should edit the prompt according to the IDEAL OUTPUT."""
INSTRUCTION_GENERATE_TEMPLATE_CODE = """Please fix the errors in the {{#error_message#}}."""
DEFAULT_GENERATOR_SUMMARY_PROMPT = (
"""Summarize the following content. Extract only the key information and main points. """
"""Remove redundant details.
Requirements:
1. Write a concise summary in plain text
2. Use the same language as the input content
3. Focus on important facts, concepts, and details
4. If images are included, describe their key information
5. Do not use words like "好的", "ok", "I understand", "This text discusses", "The content mentions"
6. Write directly without extra words
Output only the summary text. Start summarizing now:
"""
)

View File

@ -55,7 +55,7 @@ from core.ops.entities.trace_entity import (
ToolTraceInfo,
WorkflowTraceInfo,
)
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.entities import WorkflowNodeExecution
from core.workflow.enums import NodeType, WorkflowNodeExecutionMetadataKey
from extensions.ext_database import db
@ -275,7 +275,7 @@ class AliyunDataTrace(BaseTraceInstance):
service_account = self.get_service_account_with_tenant(app_id)
session_factory = sessionmaker(bind=db.engine)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=service_account,
app_id=app_id,

View File

@ -1,5 +1,6 @@
from core.plugin.entities.endpoint import EndpointEntityWithInstance
from core.plugin.impl.base import BasePluginClient
from core.plugin.impl.exc import PluginDaemonInternalServerError
class PluginEndpointClient(BasePluginClient):
@ -70,18 +71,27 @@ class PluginEndpointClient(BasePluginClient):
def delete_endpoint(self, tenant_id: str, user_id: str, endpoint_id: str):
"""
Delete the given endpoint.
This operation is idempotent: if the endpoint is already deleted (record not found),
it will return True instead of raising an error.
"""
return self._request_with_plugin_daemon_response(
"POST",
f"plugin/{tenant_id}/endpoint/remove",
bool,
data={
"endpoint_id": endpoint_id,
},
headers={
"Content-Type": "application/json",
},
)
try:
return self._request_with_plugin_daemon_response(
"POST",
f"plugin/{tenant_id}/endpoint/remove",
bool,
data={
"endpoint_id": endpoint_id,
},
headers={
"Content-Type": "application/json",
},
)
except PluginDaemonInternalServerError as e:
# Make delete idempotent: if record is not found, consider it a success
if "record not found" in str(e.description).lower():
return True
raise
def enable_endpoint(self, tenant_id: str, user_id: str, endpoint_id: str):
"""

View File

@ -389,15 +389,14 @@ class RetrievalService:
.all()
}
records = []
include_segment_ids = set()
segment_child_map = {}
valid_dataset_documents = {}
image_doc_ids: list[Any] = []
child_index_node_ids = []
index_node_ids = []
doc_to_document_map = {}
summary_segment_ids = set() # Track segments retrieved via summary
# First pass: collect all document IDs and identify summary documents
for document in documents:
document_id = document.metadata.get("document_id")
if document_id not in dataset_documents:
@ -408,16 +407,24 @@ class RetrievalService:
continue
valid_dataset_documents[document_id] = dataset_document
doc_id = document.metadata.get("doc_id") or ""
doc_to_document_map[doc_id] = document
# Check if this is a summary document
is_summary = document.metadata.get("is_summary", False)
if is_summary:
# For summary documents, find the original chunk via original_chunk_id
original_chunk_id = document.metadata.get("original_chunk_id")
if original_chunk_id:
summary_segment_ids.add(original_chunk_id)
continue # Skip adding to other lists for summary documents
if dataset_document.doc_form == IndexStructureType.PARENT_CHILD_INDEX:
doc_id = document.metadata.get("doc_id") or ""
doc_to_document_map[doc_id] = document
if document.metadata.get("doc_type") == DocType.IMAGE:
image_doc_ids.append(doc_id)
else:
child_index_node_ids.append(doc_id)
else:
doc_id = document.metadata.get("doc_id") or ""
doc_to_document_map[doc_id] = document
if document.metadata.get("doc_type") == DocType.IMAGE:
image_doc_ids.append(doc_id)
else:
@ -433,6 +440,7 @@ class RetrievalService:
attachment_map: dict[str, list[dict[str, Any]]] = {}
child_chunk_map: dict[str, list[ChildChunk]] = {}
doc_segment_map: dict[str, list[str]] = {}
segment_summary_map: dict[str, str] = {} # Map segment_id to summary content
with session_factory.create_session() as session:
attachments = cls.get_segment_attachment_infos(image_doc_ids, session)
@ -447,6 +455,7 @@ class RetrievalService:
doc_segment_map[attachment["segment_id"]].append(attachment["attachment_id"])
else:
doc_segment_map[attachment["segment_id"]] = [attachment["attachment_id"]]
child_chunk_stmt = select(ChildChunk).where(ChildChunk.index_node_id.in_(child_index_node_ids))
child_index_nodes = session.execute(child_chunk_stmt).scalars().all()
@ -470,6 +479,7 @@ class RetrievalService:
index_node_segments = session.execute(document_segment_stmt).scalars().all() # type: ignore
for index_node_segment in index_node_segments:
doc_segment_map[index_node_segment.id] = [index_node_segment.index_node_id]
if segment_ids:
document_segment_stmt = select(DocumentSegment).where(
DocumentSegment.enabled == True,
@ -481,6 +491,42 @@ class RetrievalService:
if index_node_segments:
segments.extend(index_node_segments)
# Handle summary documents: query segments by original_chunk_id
if summary_segment_ids:
summary_segment_ids_list = list(summary_segment_ids)
summary_segment_stmt = select(DocumentSegment).where(
DocumentSegment.enabled == True,
DocumentSegment.status == "completed",
DocumentSegment.id.in_(summary_segment_ids_list),
)
summary_segments = session.execute(summary_segment_stmt).scalars().all() # type: ignore
segments.extend(summary_segments)
# Add summary segment IDs to segment_ids for summary query
for seg in summary_segments:
if seg.id not in segment_ids:
segment_ids.append(seg.id)
# Batch query summaries for segments retrieved via summary (only enabled summaries)
if summary_segment_ids:
from models.dataset import DocumentSegmentSummary
summaries = (
session.query(DocumentSegmentSummary)
.filter(
DocumentSegmentSummary.chunk_id.in_(list(summary_segment_ids)),
DocumentSegmentSummary.status == "completed",
DocumentSegmentSummary.enabled == True, # Only retrieve enabled summaries
)
.all()
)
for summary in summaries:
if summary.summary_content:
segment_summary_map[summary.chunk_id] = summary.summary_content
include_segment_ids = set()
segment_child_map: dict[str, dict[str, Any]] = {}
records: list[dict[str, Any]] = []
for segment in segments:
child_chunks: list[ChildChunk] = child_chunk_map.get(segment.id, [])
attachment_infos: list[dict[str, Any]] = attachment_map.get(segment.id, [])
@ -493,7 +539,7 @@ class RetrievalService:
child_chunk_details = []
max_score = 0.0
for child_chunk in child_chunks:
document = doc_to_document_map[child_chunk.index_node_id]
document = doc_to_document_map.get(child_chunk.index_node_id)
child_chunk_detail = {
"id": child_chunk.id,
"content": child_chunk.content,
@ -503,7 +549,7 @@ class RetrievalService:
child_chunk_details.append(child_chunk_detail)
max_score = max(max_score, document.metadata.get("score", 0.0) if document else 0.0)
for attachment_info in attachment_infos:
file_document = doc_to_document_map[attachment_info["id"]]
file_document = doc_to_document_map.get(attachment_info["id"])
max_score = max(
max_score, file_document.metadata.get("score", 0.0) if file_document else 0.0
)
@ -576,9 +622,16 @@ class RetrievalService:
else None
)
# Extract summary if this segment was retrieved via summary
summary_content = segment_summary_map.get(segment.id)
# Create RetrievalSegments object
retrieval_segment = RetrievalSegments(
segment=segment, child_chunks=child_chunks_list, score=score, files=files
segment=segment,
child_chunks=child_chunks_list,
score=score,
files=files,
summary=summary_content,
)
result.append(retrieval_segment)

View File

@ -20,3 +20,4 @@ class RetrievalSegments(BaseModel):
child_chunks: list[RetrievalChildChunk] | None = None
score: float | None = None
files: list[dict[str, str | int]] | None = None
summary: str | None = None # Summary content if retrieved via summary index

View File

@ -13,6 +13,7 @@ from urllib.parse import unquote, urlparse
import httpx
from configs import dify_config
from core.entities.knowledge_entities import PreviewDetail
from core.helper import ssrf_proxy
from core.rag.extractor.entity.extract_setting import ExtractSetting
from core.rag.index_processor.constant.doc_type import DocType
@ -45,6 +46,17 @@ class BaseIndexProcessor(ABC):
def transform(self, documents: list[Document], current_user: Account | None = None, **kwargs) -> list[Document]:
raise NotImplementedError
@abstractmethod
def generate_summary_preview(
self, tenant_id: str, preview_texts: list[PreviewDetail], summary_index_setting: dict
) -> list[PreviewDetail]:
"""
For each segment in preview_texts, generate a summary using LLM and attach it to the segment.
The summary can be stored in a new attribute, e.g., summary.
This method should be implemented by subclasses.
"""
raise NotImplementedError
@abstractmethod
def load(
self,

View File

@ -1,9 +1,25 @@
"""Paragraph index processor."""
import logging
import re
import uuid
from collections.abc import Mapping
from typing import Any
logger = logging.getLogger(__name__)
from core.entities.knowledge_entities import PreviewDetail
from core.file import File, FileTransferMethod, FileType, file_manager
from core.llm_generator.prompts import DEFAULT_GENERATOR_SUMMARY_PROMPT
from core.model_manager import ModelInstance
from core.model_runtime.entities.message_entities import (
ImagePromptMessageContent,
PromptMessageContentUnionTypes,
TextPromptMessageContent,
UserPromptMessage,
)
from core.model_runtime.entities.model_entities import ModelFeature, ModelType
from core.provider_manager import ProviderManager
from core.rag.cleaner.clean_processor import CleanProcessor
from core.rag.datasource.keyword.keyword_factory import Keyword
from core.rag.datasource.retrieval_service import RetrievalService
@ -17,12 +33,16 @@ from core.rag.index_processor.index_processor_base import BaseIndexProcessor
from core.rag.models.document import AttachmentDocument, Document, MultimodalGeneralStructureChunk
from core.rag.retrieval.retrieval_methods import RetrievalMethod
from core.tools.utils.text_processing_utils import remove_leading_symbols
from extensions.ext_database import db
from factories.file_factory import build_from_mapping
from libs import helper
from models import UploadFile
from models.account import Account
from models.dataset import Dataset, DatasetProcessRule
from models.dataset import Dataset, DatasetProcessRule, DocumentSegment, SegmentAttachmentBinding
from models.dataset import Document as DatasetDocument
from services.account_service import AccountService
from services.entities.knowledge_entities.knowledge_entities import Rule
from services.summary_index_service import SummaryIndexService
class ParagraphIndexProcessor(BaseIndexProcessor):
@ -108,6 +128,29 @@ class ParagraphIndexProcessor(BaseIndexProcessor):
keyword.add_texts(documents)
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
# Note: Summary indexes are now disabled (not deleted) when segments are disabled.
# This method is called for actual deletion scenarios (e.g., when segment is deleted).
# For disable operations, disable_summaries_for_segments is called directly in the task.
# Only delete summaries if explicitly requested (e.g., when segment is actually deleted)
delete_summaries = kwargs.get("delete_summaries", False)
if delete_summaries:
if node_ids:
# Find segments by index_node_id
segments = (
db.session.query(DocumentSegment)
.filter(
DocumentSegment.dataset_id == dataset.id,
DocumentSegment.index_node_id.in_(node_ids),
)
.all()
)
segment_ids = [segment.id for segment in segments]
if segment_ids:
SummaryIndexService.delete_summaries_for_segments(dataset, segment_ids)
else:
# Delete all summaries for the dataset
SummaryIndexService.delete_summaries_for_segments(dataset, None)
if dataset.indexing_technique == "high_quality":
vector = Vector(dataset)
if node_ids:
@ -227,3 +270,263 @@ class ParagraphIndexProcessor(BaseIndexProcessor):
}
else:
raise ValueError("Chunks is not a list")
def generate_summary_preview(
self, tenant_id: str, preview_texts: list[PreviewDetail], summary_index_setting: dict
) -> list[PreviewDetail]:
"""
For each segment, concurrently call generate_summary to generate a summary
and write it to the summary attribute of PreviewDetail.
"""
import concurrent.futures
from flask import current_app
# Capture Flask app context for worker threads
flask_app = None
try:
flask_app = current_app._get_current_object() # type: ignore
except RuntimeError:
logger.warning("No Flask application context available, summary generation may fail")
def process(preview: PreviewDetail) -> None:
"""Generate summary for a single preview item."""
try:
if flask_app:
# Ensure Flask app context in worker thread
with flask_app.app_context():
summary = self.generate_summary(tenant_id, preview.content, summary_index_setting)
preview.summary = summary
else:
# Fallback: try without app context (may fail)
summary = self.generate_summary(tenant_id, preview.content, summary_index_setting)
preview.summary = summary
except Exception:
logger.exception("Failed to generate summary for preview")
# Don't fail the entire preview if summary generation fails
preview.summary = None
with concurrent.futures.ThreadPoolExecutor() as executor:
list(executor.map(process, preview_texts))
return preview_texts
@staticmethod
def generate_summary(
tenant_id: str,
text: str,
summary_index_setting: dict | None = None,
segment_id: str | None = None,
) -> str:
"""
Generate summary for the given text using ModelInstance.invoke_llm and the default or custom summary prompt,
and supports vision models by including images from the segment attachments or text content.
Args:
tenant_id: Tenant ID
text: Text content to summarize
summary_index_setting: Summary index configuration
segment_id: Optional segment ID to fetch attachments from SegmentAttachmentBinding table
"""
if not summary_index_setting or not summary_index_setting.get("enable"):
raise ValueError("summary_index_setting is required and must be enabled to generate summary.")
model_name = summary_index_setting.get("model_name")
model_provider_name = summary_index_setting.get("model_provider_name")
summary_prompt = summary_index_setting.get("summary_prompt")
# Import default summary prompt
if not summary_prompt:
summary_prompt = DEFAULT_GENERATOR_SUMMARY_PROMPT
provider_manager = ProviderManager()
provider_model_bundle = provider_manager.get_provider_model_bundle(
tenant_id, model_provider_name, ModelType.LLM
)
model_instance = ModelInstance(provider_model_bundle, model_name)
# Get model schema to check if vision is supported
model_schema = model_instance.model_type_instance.get_model_schema(model_name, model_instance.credentials)
supports_vision = model_schema and model_schema.features and ModelFeature.VISION in model_schema.features
# Extract images if model supports vision
image_files = []
if supports_vision:
# First, try to get images from SegmentAttachmentBinding (preferred method)
if segment_id:
image_files = ParagraphIndexProcessor._extract_images_from_segment_attachments(tenant_id, segment_id)
# If no images from attachments, fall back to extracting from text
if not image_files:
image_files = ParagraphIndexProcessor._extract_images_from_text(tenant_id, text)
# Build prompt messages
prompt_messages = []
if image_files:
# If we have images, create a UserPromptMessage with both text and images
prompt_message_contents: list[PromptMessageContentUnionTypes] = []
# Add images first
for file in image_files:
try:
file_content = file_manager.to_prompt_message_content(
file, image_detail_config=ImagePromptMessageContent.DETAIL.LOW
)
prompt_message_contents.append(file_content)
except Exception as e:
logger.warning("Failed to convert image file to prompt message content: %s", str(e))
continue
# Add text content
if prompt_message_contents: # Only add text if we successfully added images
prompt_message_contents.append(TextPromptMessageContent(data=f"{summary_prompt}\n{text}"))
prompt_messages.append(UserPromptMessage(content=prompt_message_contents))
else:
# If image conversion failed, fall back to text-only
prompt = f"{summary_prompt}\n{text}"
prompt_messages.append(UserPromptMessage(content=prompt))
else:
# No images, use simple text prompt
prompt = f"{summary_prompt}\n{text}"
prompt_messages.append(UserPromptMessage(content=prompt))
result = model_instance.invoke_llm(prompt_messages=prompt_messages, model_parameters={}, stream=False)
return getattr(result.message, "content", "")
@staticmethod
def _extract_images_from_text(tenant_id: str, text: str) -> list[File]:
"""
Extract images from markdown text and convert them to File objects.
Args:
tenant_id: Tenant ID
text: Text content that may contain markdown image links
Returns:
List of File objects representing images found in the text
"""
# Extract markdown images using regex pattern
pattern = r"!\[.*?\]\((.*?)\)"
images = re.findall(pattern, text)
if not images:
return []
upload_file_id_list = []
for image in images:
# For data before v0.10.0
pattern = r"/files/([a-f0-9\-]+)/image-preview(?:\?.*?)?"
match = re.search(pattern, image)
if match:
upload_file_id = match.group(1)
upload_file_id_list.append(upload_file_id)
continue
# For data after v0.10.0
pattern = r"/files/([a-f0-9\-]+)/file-preview(?:\?.*?)?"
match = re.search(pattern, image)
if match:
upload_file_id = match.group(1)
upload_file_id_list.append(upload_file_id)
continue
# For tools directory - direct file formats (e.g., .png, .jpg, etc.)
pattern = r"/files/tools/([a-f0-9\-]+)\.([a-zA-Z0-9]+)(?:\?[^\s\)\"\']*)?"
match = re.search(pattern, image)
if match:
# Tool files are handled differently, skip for now
continue
if not upload_file_id_list:
return []
# Get unique IDs for database query
unique_upload_file_ids = list(set(upload_file_id_list))
upload_files = (
db.session.query(UploadFile)
.where(UploadFile.id.in_(unique_upload_file_ids), UploadFile.tenant_id == tenant_id)
.all()
)
# Create File objects from UploadFile records
file_objects = []
for upload_file in upload_files:
# Only process image files
if not upload_file.mime_type or "image" not in upload_file.mime_type:
continue
mapping = {
"upload_file_id": upload_file.id,
"transfer_method": FileTransferMethod.LOCAL_FILE.value,
"type": FileType.IMAGE.value,
}
try:
file_obj = build_from_mapping(
mapping=mapping,
tenant_id=tenant_id,
)
file_objects.append(file_obj)
except Exception as e:
logger.warning("Failed to create File object from UploadFile %s: %s", upload_file.id, str(e))
continue
return file_objects
@staticmethod
def _extract_images_from_segment_attachments(tenant_id: str, segment_id: str) -> list[File]:
"""
Extract images from SegmentAttachmentBinding table (preferred method).
This matches how DatasetRetrieval gets segment attachments.
Args:
tenant_id: Tenant ID
segment_id: Segment ID to fetch attachments for
Returns:
List of File objects representing images found in segment attachments
"""
from sqlalchemy import select
# Query attachments from SegmentAttachmentBinding table
attachments_with_bindings = db.session.execute(
select(SegmentAttachmentBinding, UploadFile)
.join(UploadFile, UploadFile.id == SegmentAttachmentBinding.attachment_id)
.where(
SegmentAttachmentBinding.segment_id == segment_id,
SegmentAttachmentBinding.tenant_id == tenant_id,
)
).all()
if not attachments_with_bindings:
return []
file_objects = []
for _, upload_file in attachments_with_bindings:
# Only process image files
if not upload_file.mime_type or "image" not in upload_file.mime_type:
continue
try:
# Create File object directly (similar to DatasetRetrieval)
file_obj = File(
id=upload_file.id,
filename=upload_file.name,
extension="." + upload_file.extension,
mime_type=upload_file.mime_type,
tenant_id=tenant_id,
type=FileType.IMAGE,
transfer_method=FileTransferMethod.LOCAL_FILE,
remote_url=upload_file.source_url,
related_id=upload_file.id,
size=upload_file.size,
storage_key=upload_file.key,
)
file_objects.append(file_obj)
except Exception as e:
logger.warning("Failed to create File object from UploadFile %s: %s", upload_file.id, str(e))
continue
return file_objects

View File

@ -1,11 +1,13 @@
"""Paragraph index processor."""
import json
import logging
import uuid
from collections.abc import Mapping
from typing import Any
from configs import dify_config
from core.entities.knowledge_entities import PreviewDetail
from core.model_manager import ModelInstance
from core.rag.cleaner.clean_processor import CleanProcessor
from core.rag.datasource.retrieval_service import RetrievalService
@ -25,6 +27,9 @@ from models.dataset import ChildChunk, Dataset, DatasetProcessRule, DocumentSegm
from models.dataset import Document as DatasetDocument
from services.account_service import AccountService
from services.entities.knowledge_entities.knowledge_entities import ParentMode, Rule
from services.summary_index_service import SummaryIndexService
logger = logging.getLogger(__name__)
class ParentChildIndexProcessor(BaseIndexProcessor):
@ -135,6 +140,29 @@ class ParentChildIndexProcessor(BaseIndexProcessor):
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
# node_ids is segment's node_ids
# Note: Summary indexes are now disabled (not deleted) when segments are disabled.
# This method is called for actual deletion scenarios (e.g., when segment is deleted).
# For disable operations, disable_summaries_for_segments is called directly in the task.
# Only delete summaries if explicitly requested (e.g., when segment is actually deleted)
delete_summaries = kwargs.get("delete_summaries", False)
if delete_summaries:
if node_ids:
# Find segments by index_node_id
segments = (
db.session.query(DocumentSegment)
.filter(
DocumentSegment.dataset_id == dataset.id,
DocumentSegment.index_node_id.in_(node_ids),
)
.all()
)
segment_ids = [segment.id for segment in segments]
if segment_ids:
SummaryIndexService.delete_summaries_for_segments(dataset, segment_ids)
else:
# Delete all summaries for the dataset
SummaryIndexService.delete_summaries_for_segments(dataset, None)
if dataset.indexing_technique == "high_quality":
delete_child_chunks = kwargs.get("delete_child_chunks") or False
precomputed_child_node_ids = kwargs.get("precomputed_child_node_ids")
@ -326,3 +354,57 @@ class ParentChildIndexProcessor(BaseIndexProcessor):
"preview": preview,
"total_segments": len(parent_childs.parent_child_chunks),
}
def generate_summary_preview(
self, tenant_id: str, preview_texts: list[PreviewDetail], summary_index_setting: dict
) -> list[PreviewDetail]:
"""
For each parent chunk in preview_texts, concurrently call generate_summary to generate a summary
and write it to the summary attribute of PreviewDetail.
Note: For parent-child structure, we only generate summaries for parent chunks.
"""
import concurrent.futures
from flask import current_app
# Capture Flask app context for worker threads
flask_app = None
try:
flask_app = current_app._get_current_object() # type: ignore
except RuntimeError:
logger.warning("No Flask application context available, summary generation may fail")
def process(preview: PreviewDetail) -> None:
"""Generate summary for a single preview item (parent chunk)."""
try:
if flask_app:
# Ensure Flask app context in worker thread
with flask_app.app_context():
# Use ParagraphIndexProcessor's generate_summary method
from core.rag.index_processor.processor.paragraph_index_processor import ParagraphIndexProcessor
summary = ParagraphIndexProcessor.generate_summary(
tenant_id=tenant_id,
text=preview.content,
summary_index_setting=summary_index_setting,
)
if summary:
preview.summary = summary
else:
# Fallback: try without app context (may fail)
from core.rag.index_processor.processor.paragraph_index_processor import ParagraphIndexProcessor
summary = ParagraphIndexProcessor.generate_summary(
tenant_id=tenant_id,
text=preview.content,
summary_index_setting=summary_index_setting,
)
if summary:
preview.summary = summary
except Exception:
logger.exception("Failed to generate summary for preview")
# Don't fail the entire preview if summary generation fails
preview.summary = None
with concurrent.futures.ThreadPoolExecutor() as executor:
list(executor.map(process, preview_texts))
return preview_texts

View File

@ -11,6 +11,7 @@ import pandas as pd
from flask import Flask, current_app
from werkzeug.datastructures import FileStorage
from core.entities.knowledge_entities import PreviewDetail
from core.llm_generator.llm_generator import LLMGenerator
from core.rag.cleaner.clean_processor import CleanProcessor
from core.rag.datasource.retrieval_service import RetrievalService
@ -25,9 +26,10 @@ from core.rag.retrieval.retrieval_methods import RetrievalMethod
from core.tools.utils.text_processing_utils import remove_leading_symbols
from libs import helper
from models.account import Account
from models.dataset import Dataset
from models.dataset import Dataset, DocumentSegment
from models.dataset import Document as DatasetDocument
from services.entities.knowledge_entities.knowledge_entities import Rule
from services.summary_index_service import SummaryIndexService
logger = logging.getLogger(__name__)
@ -144,6 +146,30 @@ class QAIndexProcessor(BaseIndexProcessor):
vector.create_multimodal(multimodal_documents)
def clean(self, dataset: Dataset, node_ids: list[str] | None, with_keywords: bool = True, **kwargs):
# Note: Summary indexes are now disabled (not deleted) when segments are disabled.
# This method is called for actual deletion scenarios (e.g., when segment is deleted).
# For disable operations, disable_summaries_for_segments is called directly in the task.
# Note: qa_model doesn't generate summaries, but we clean them for completeness
# Only delete summaries if explicitly requested (e.g., when segment is actually deleted)
delete_summaries = kwargs.get("delete_summaries", False)
if delete_summaries:
if node_ids:
# Find segments by index_node_id
segments = (
db.session.query(DocumentSegment)
.filter(
DocumentSegment.dataset_id == dataset.id,
DocumentSegment.index_node_id.in_(node_ids),
)
.all()
)
segment_ids = [segment.id for segment in segments]
if segment_ids:
SummaryIndexService.delete_summaries_for_segments(dataset, segment_ids)
else:
# Delete all summaries for the dataset
SummaryIndexService.delete_summaries_for_segments(dataset, None)
vector = Vector(dataset)
if node_ids:
vector.delete_by_ids(node_ids)
@ -212,6 +238,17 @@ class QAIndexProcessor(BaseIndexProcessor):
"total_segments": len(qa_chunks.qa_chunks),
}
def generate_summary_preview(
self, tenant_id: str, preview_texts: list[PreviewDetail], summary_index_setting: dict
) -> list[PreviewDetail]:
"""
QA model doesn't generate summaries, so this method returns preview_texts unchanged.
Note: QA model uses question-answer pairs, which don't require summary generation.
"""
# QA model doesn't generate summaries, return as-is
return preview_texts
def _format_qa_document(self, flask_app: Flask, tenant_id: str, document_node, all_qa_documents, document_language):
format_documents = []
if document_node.page_content is None or not document_node.page_content.strip():

View File

@ -7,8 +7,8 @@ from typing import Any, cast
from flask import has_request_context
from sqlalchemy import select
from sqlalchemy.orm import Session
from core.db.session_factory import session_factory
from core.file import FILE_MODEL_IDENTITY, File, FileTransferMethod
from core.model_runtime.entities.llm_entities import LLMUsage, LLMUsageMetadata
from core.tools.__base.tool import Tool
@ -20,7 +20,6 @@ from core.tools.entities.tool_entities import (
ToolProviderType,
)
from core.tools.errors import ToolInvokeError
from extensions.ext_database import db
from factories.file_factory import build_from_mapping
from libs.login import current_user
from models import Account, Tenant
@ -230,30 +229,32 @@ class WorkflowTool(Tool):
"""
Resolve user from database (worker/Celery context).
"""
with session_factory.create_session() as session:
tenant_stmt = select(Tenant).where(Tenant.id == self.runtime.tenant_id)
tenant = session.scalar(tenant_stmt)
if not tenant:
return None
user_stmt = select(Account).where(Account.id == user_id)
user = session.scalar(user_stmt)
if user:
user.current_tenant = tenant
session.expunge(user)
return user
end_user_stmt = select(EndUser).where(EndUser.id == user_id, EndUser.tenant_id == tenant.id)
end_user = session.scalar(end_user_stmt)
if end_user:
session.expunge(end_user)
return end_user
tenant_stmt = select(Tenant).where(Tenant.id == self.runtime.tenant_id)
tenant = db.session.scalar(tenant_stmt)
if not tenant:
return None
user_stmt = select(Account).where(Account.id == user_id)
user = db.session.scalar(user_stmt)
if user:
user.current_tenant = tenant
return user
end_user_stmt = select(EndUser).where(EndUser.id == user_id, EndUser.tenant_id == tenant.id)
end_user = db.session.scalar(end_user_stmt)
if end_user:
return end_user
return None
def _get_workflow(self, app_id: str, version: str) -> Workflow:
"""
get the workflow by app id and version
"""
with Session(db.engine, expire_on_commit=False) as session, session.begin():
with session_factory.create_session() as session, session.begin():
if not version:
stmt = (
select(Workflow)
@ -265,22 +266,24 @@ class WorkflowTool(Tool):
stmt = select(Workflow).where(Workflow.app_id == app_id, Workflow.version == version)
workflow = session.scalar(stmt)
if not workflow:
raise ValueError("workflow not found or not published")
if not workflow:
raise ValueError("workflow not found or not published")
return workflow
session.expunge(workflow)
return workflow
def _get_app(self, app_id: str) -> App:
"""
get the app by app id
"""
stmt = select(App).where(App.id == app_id)
with Session(db.engine, expire_on_commit=False) as session, session.begin():
with session_factory.create_session() as session, session.begin():
app = session.scalar(stmt)
if not app:
raise ValueError("app not found")
if not app:
raise ValueError("app not found")
return app
session.expunge(app)
return app
def _transform_args(self, tool_parameters: dict) -> tuple[dict, list[dict]]:
"""

View File

@ -30,6 +30,7 @@ from .variables import (
SecretVariable,
StringVariable,
Variable,
VariableBase,
)
__all__ = [
@ -62,4 +63,5 @@ __all__ = [
"StringSegment",
"StringVariable",
"Variable",
"VariableBase",
]

View File

@ -232,7 +232,7 @@ def get_segment_discriminator(v: Any) -> SegmentType | None:
# - All variants in `SegmentUnion` must inherit from the `Segment` class.
# - The union must include all non-abstract subclasses of `Segment`, except:
# - `SegmentGroup`, which is not added to the variable pool.
# - `Variable` and its subclasses, which are handled by `VariableUnion`.
# - `VariableBase` and its subclasses, which are handled by `Variable`.
SegmentUnion: TypeAlias = Annotated[
(
Annotated[NoneSegment, Tag(SegmentType.NONE)]

View File

@ -27,7 +27,7 @@ from .segments import (
from .types import SegmentType
class Variable(Segment):
class VariableBase(Segment):
"""
A variable is a segment that has a name.
@ -45,23 +45,23 @@ class Variable(Segment):
selector: Sequence[str] = Field(default_factory=list)
class StringVariable(StringSegment, Variable):
class StringVariable(StringSegment, VariableBase):
pass
class FloatVariable(FloatSegment, Variable):
class FloatVariable(FloatSegment, VariableBase):
pass
class IntegerVariable(IntegerSegment, Variable):
class IntegerVariable(IntegerSegment, VariableBase):
pass
class ObjectVariable(ObjectSegment, Variable):
class ObjectVariable(ObjectSegment, VariableBase):
pass
class ArrayVariable(ArraySegment, Variable):
class ArrayVariable(ArraySegment, VariableBase):
pass
@ -89,16 +89,16 @@ class SecretVariable(StringVariable):
return encrypter.obfuscated_token(self.value)
class NoneVariable(NoneSegment, Variable):
class NoneVariable(NoneSegment, VariableBase):
value_type: SegmentType = SegmentType.NONE
value: None = None
class FileVariable(FileSegment, Variable):
class FileVariable(FileSegment, VariableBase):
pass
class BooleanVariable(BooleanSegment, Variable):
class BooleanVariable(BooleanSegment, VariableBase):
pass
@ -139,13 +139,13 @@ class RAGPipelineVariableInput(BaseModel):
value: Any
# The `VariableUnion`` type is used to enable serialization and deserialization with Pydantic.
# Use `Variable` for type hinting when serialization is not required.
# The `Variable` type is used to enable serialization and deserialization with Pydantic.
# Use `VariableBase` for type hinting when serialization is not required.
#
# Note:
# - All variants in `VariableUnion` must inherit from the `Variable` class.
# - The union must include all non-abstract subclasses of `Segment`, except:
VariableUnion: TypeAlias = Annotated[
# - All variants in `Variable` must inherit from the `VariableBase` class.
# - The union must include all non-abstract subclasses of `VariableBase`.
Variable: TypeAlias = Annotated[
(
Annotated[NoneVariable, Tag(SegmentType.NONE)]
| Annotated[StringVariable, Tag(SegmentType.STRING)]

Some files were not shown because too many files have changed in this diff Show More