Compare commits

..

69 Commits

Author SHA1 Message Date
d34c95bf8e feat: convert components to dynamic imports for improved performance 2025-07-17 15:32:42 +08:00
1df1ffa2ec fix(apps): add translation and document title for Apps component 2025-07-17 14:08:00 +08:00
947dbd8854 chore(apps): move app list components to components folder 2025-07-17 14:00:29 +08:00
ce619287b3 feat(app-publisher): add relative time formatting for timestamps 2025-07-16 18:34:03 +08:00
cd1ec65286 feat: convert components to dynamic imports for improved performance 2025-07-16 16:51:55 +08:00
bdb9f29948 feat(app): support custom max_active_requests per app (#22073) 2025-07-16 15:31:19 +08:00
66cc1b4308 feat(variable-list): add drag-and-drop functionality for variables in code node (#22127) 2025-07-16 15:24:19 +08:00
d52fb18457 feat: auto-fill MCP server description with app description #22443 (#22477) 2025-07-16 15:03:33 +08:00
4a2169bd5f Chore/update gh template (#22480) 2025-07-16 14:22:51 +08:00
2c9ee54a16 fix aliyun trace session_id (#22468) 2025-07-16 13:56:44 +08:00
aef67ed7ec fix: add background color for chat bubble in light and dark themes (#22472) 2025-07-16 13:36:51 +08:00
ddfd8c8525 feat(api): add UUIDv7 implementation in SQL and Python (#22058)
This PR introduces UUIDv7 implementations in both Python and SQL to establish the foundation for migrating from UUIDv4 to UUIDv7 as proposed in #19754.

The ID generation algorithm of existing models is not changed; new models should use UUIDv7 for ID generation.

Close #19754.
2025-07-16 13:07:08 +08:00
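For reference, a minimal Python sketch of the UUIDv7 layout this commit introduces (48-bit millisecond timestamp, version and variant bits, then random bits per RFC 9562). This illustrates the format only and is not the implementation added in #22058:

```python
import os
import time
import uuid


def uuid7() -> uuid.UUID:
    """Build a UUIDv7: time-ordered 48-bit ms timestamp + 74 random bits."""
    ts_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF              # 12 random bits
    rand_b = int.from_bytes(os.urandom(8), "big") & 0x3FFFFFFFFFFFFFFF  # 62 random bits
    value = (ts_ms & 0xFFFFFFFFFFFF) << 80
    value |= 0x7 << 76    # version field = 7
    value |= rand_a << 64
    value |= 0b10 << 62   # RFC 9562 variant
    value |= rand_b
    return uuid.UUID(int=value)
```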
2c1ab4879f refactor(api): Separate SegmentType for Integer/Float to Enable Pydantic Serialization (#22025)

This PR addresses serialization issues in the VariablePool model by separating the `value_type` tags for `IntegerSegment`/`FloatSegment` and `IntegerVariable`/`FloatVariable`. Previously, both Integer and Float types shared the same `SegmentType.NUMBER` tag, causing conflicts during serialization.

Key changes:
- Introduce distinct `value_type` tags for Integer and Float segments/variables
- Add `VariableUnion` and `SegmentUnion` types for proper type discrimination
- Leverage Pydantic's discriminated union feature for seamless serialization/deserialization
- Enable accurate serialization of data structures containing these types

Closes #22024.
2025-07-16 12:31:37 +08:00
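As a rough illustration of the discriminated-union pattern the PR describes (the tag names and class fields here are assumptions, not the actual Dify definitions):

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class IntegerSegment(BaseModel):
    value_type: Literal["integer"] = "integer"  # distinct tag instead of a shared "number"
    value: int


class FloatSegment(BaseModel):
    value_type: Literal["float"] = "float"
    value: float


# Pydantic picks the concrete class by the value_type tag during validation.
SegmentUnion = Annotated[Union[IntegerSegment, FloatSegment], Field(discriminator="value_type")]

adapter = TypeAdapter(list[SegmentUnion])
restored = adapter.validate_python([{"value_type": "float", "value": 2.5}])
assert isinstance(restored[0], FloatSegment)
```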
229b4d621e Improve Tooltip UX by enabling delay by default (#21383) 2025-07-16 11:26:54 +08:00
0dee41c074 fix: When var value changed, PromptEditor should be reset (#22219) 2025-07-16 11:22:54 +08:00
bf542233a9 minor fix: using Pydantic model_validate instead of deprecated parse_obj (#22239)
Signed-off-by: neatguycoding <15627489+NeatGuyCoding@users.noreply.github.com>
2025-07-16 10:57:08 +08:00
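This is the Pydantic v2 API rename; a minimal before/after sketch (the model fields are assumed for illustration):

```python
from pydantic import BaseModel


class OAuthTokens(BaseModel):
    access_token: str
    token_type: str = "Bearer"


payload = {"access_token": "abc123"}
# tokens = OAuthTokens.parse_obj(payload)     # deprecated in Pydantic v2
tokens = OAuthTokens.model_validate(payload)  # preferred replacement
```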
38106074b4 test: add comprehensive unit tests for console authentication and authorization decorators (#22439) 2025-07-16 10:07:01 +08:00
1f4b3591ae adding tooltip for bindingCount (#22450)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-16 09:59:42 +08:00
7bf3d2c8bf fix(api): Fix potential thread leak in MCP BaseSession (#22169)
The `BaseSession` class in the `core/mcp/session` package uses `ThreadPoolExecutor` 
to run the receive loop but fails to properly clean up the executor and receiver 
future, leading to potential thread leaks.

This PR addresses this issue by:
- Initializing `_executor` and `_receiver_future` attributes to `None` for proper cleanup checks
- Adding graceful shutdown with a 5-second timeout in the `__exit__` method
- Ensuring the ThreadPoolExecutor is properly shut down to prevent resource leaks

This fix prevents memory leaks and hanging threads in long-running scenarios where 
multiple MCP sessions are created and destroyed.

Signed-off-by: neatguycoding <15627489+NeatGuyCoding@users.noreply.github.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-16 00:01:44 +08:00
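A condensed sketch of the cleanup pattern described above (dedicated single-worker executor, bounded wait in `__exit__`); the real `BaseSession` has more moving parts:

```python
from concurrent.futures import Future, ThreadPoolExecutor, TimeoutError


class SessionSketch:
    def __init__(self) -> None:
        # Initialized to None so __exit__ can safely check what needs cleanup.
        self._executor: ThreadPoolExecutor | None = None
        self._receiver_future: Future | None = None

    def _receive_loop(self) -> None:
        ...  # read incoming messages until the session closes

    def __enter__(self) -> "SessionSketch":
        # One worker is enough: the pool only runs the receive loop.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._receiver_future = self._executor.submit(self._receive_loop)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        if self._receiver_future is not None:
            try:
                self._receiver_future.result(timeout=5)  # graceful 5-second window
            except TimeoutError:
                pass  # loop did not stop in time; still release the pool below
        if self._executor is not None:
            self._executor.shutdown(wait=False)
```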
da53bf511f chore: add SQLALCHEMY_POOL_USE_LIFO option and missing SQLALCHEMY_POOL_PRE_PING env default value. (#22371) 2025-07-15 19:46:48 +08:00
7388fd1ec6 fix: Disable question editing in chat history (#22438) 2025-07-15 19:41:51 +08:00
b803eeb528 fix: Update condition items to support variable type acquisition (#22414) 2025-07-15 19:38:13 +08:00
14f79ee652 fix: create api workflow run repository error (#22422) 2025-07-15 16:12:02 +08:00
df89629e04 fix: conversation statistic including data from debugger (#22412)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-07-15 15:45:45 +08:00
d427088ab5 fix: remove PickerPanel padding (#22419) 2025-07-15 15:37:13 +08:00
32c541a9ed fix: generate deterministic operationId for root endpoints without one (#19888) 2025-07-15 14:19:55 +08:00
7e666dc3b1 fix(prompt-editor): show error warning for destructive env and conv var (#21802) 2025-07-15 14:10:50 +08:00
5247c19498 fix: code result included "error" field (#22392) 2025-07-15 13:55:00 +08:00
9823edd3a2 fix workflow node iterator . (#21008)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-15 10:55:49 +08:00
88537991d6 fix: Metadata filtering with Manual option in Agent mode does not take effect when specifying input variables. (#20362) 2025-07-15 10:47:20 +08:00
8e910d8c59 fix(plugin): introduce response_type parameter in plugin list API to enable paginated response support (#22251) 2025-07-15 10:10:37 +08:00
a0b32b6027 feat(config-modal): add space to underscore conversion in variable name input of start node (#22284) 2025-07-15 10:00:19 +08:00
bf7b2c339b tablestore vector support more method (#22225)
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-15 09:58:48 +08:00
a1dfe6d402 chore: bump nextjs to 15.3 (#22262) 2025-07-15 09:35:17 +08:00
d2a3e8b9b1 Provides a set of Kubernetes manifests supporting version 1.6.0 (#22287) 2025-07-15 09:34:17 +08:00
ebb88bbe0b improve opik workflow_trace span name to node name (#22356) 2025-07-15 09:33:06 +08:00
b690a9d839 fix: aliyun trace title&description (#22347) 2025-07-14 17:14:24 +08:00
9d9423808e Update README.md (#22351) 2025-07-14 17:13:50 +08:00
3e96c0c468 fix: close session before doing long latency operation (#22306) 2025-07-14 15:16:10 +08:00
6eb155ae69 feat(api/repo): Allow to config repository implementation (#21458)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
2025-07-14 14:54:38 +08:00
b27c540379 Fix: Remove height and overflow style settings (#22327) 2025-07-14 13:57:53 +08:00
8b1f428ead Chore: Replace lodash/noop with lodash-es/noop (#22331) 2025-07-14 13:57:26 +08:00
1d54ffcf89 fix: error parsing object type parameters for code node (#22230) 2025-07-14 10:37:26 +08:00
d9eb5554b3 fix: prevent trigger form submit action when press 'enter' (#22313) 2025-07-14 09:59:20 +08:00
da94bdeb54 Update README.md (#22305) 2025-07-14 09:37:17 +08:00
27e5e2745b test: add comprehensive unit tests for login decorator (#22294) 2025-07-14 09:34:13 +08:00
1b26f9a4c6 fixing Enum part in backend and making it same as front end (#22296) 2025-07-14 09:34:04 +08:00
df886259bd test(web): add password regexp test case (#22308) 2025-07-14 09:32:34 +08:00
016ff0feae fix(ui): prevent var icon hidden when only one var in list of start node (#22290) 2025-07-13 20:10:15 +08:00
aa6cad5f1d fix: tool's model selector and app selector not work (#22291) 2025-07-13 20:04:29 +08:00
e7388779a1 chore: bump ruff to 0.12.x (#22259) 2025-07-12 20:00:54 +08:00
6c233e05a9 minor fix: wrong and (#22242) 2025-07-12 19:59:07 +08:00
9f013f7644 Add unit test for account service (#22278) 2025-07-12 19:58:42 +08:00
253d8e5a5f test: add comprehensive unit tests for PassportService with exception handling optimization (#22268) 2025-07-12 19:56:20 +08:00
7f5087c6db reject whitespace characters in password regexp (#22232) 2025-07-11 19:18:18 +08:00
817071e448 fix: iteration itemType support conversation var (#22220) (#22236) 2025-07-11 19:16:30 +08:00
f193e9764e fix: Optimize the workspace panel width calculation (#22195) 2025-07-11 19:12:12 +08:00
5f9628e027 feat(workflow): add drag-and-drop support for variable list items for start node (#22150) 2025-07-11 18:53:29 +08:00
76d21743fd fix(web): Optimize AppInfo Component Layout (#22212) (#22218) 2025-07-11 17:54:09 +08:00
2d3c5b3b7c fix(emoji-picker): Adjust the style of the emoji picker (#22161) (#22231) 2025-07-11 17:52:16 +08:00
1d85979a74 chore:extract last run common logic (#22214) 2025-07-11 16:41:25 +08:00
2a85f28963 fix:Fixed the problem of plugin installation failure caused by incons… (#22156) 2025-07-11 15:18:42 +08:00
fe4e2f7921 feat: support var in suggested questions (#17340)
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-11 15:07:32 +08:00
9a9ec0c99b feat: Add Audio configuration setting to app configuration UI (#21957) 2025-07-11 14:04:42 +08:00
d5624ba671 fix: resolve Docker file URL networking issue for plugins (#21334) (#21382)
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-11 12:11:59 +08:00
c805238471 fix: adjust layout styles for header and dataset update (#22182) 2025-07-11 11:17:28 +08:00
e576b989b8 feat(tool): add support for API key authentication via query parameter (#21656) 2025-07-11 10:39:20 +08:00
f929bfb94c minor fix: remove duplicates, fix typo, and add restriction for get mcp server (#22170)
Signed-off-by: neatguycoding <15627489+NeatGuyCoding@users.noreply.github.com>
2025-07-11 09:40:17 +08:00
f4df80e093 fix(custom_tool): omit optional parameters instead of setting them to None (#22171) 2025-07-10 20:56:45 +08:00
290 changed files with 10617 additions and 2428 deletions

View File

@ -8,13 +8,13 @@ body:
label: Self Checks
description: "To make sure we get to you in time, please check the following :)"
options:
- label: I have read the [Contributing Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) and [Language Policy](https://github.com/langgenius/dify/issues/1542).
required: true
- label: This is only for bug report, if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
required: true
- label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
required: true
- label: I confirm that I am using English to submit this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
required: true
- label: "[FOR CHINESE USERS] 请务必使用英文提交 Issue否则会被关闭。谢谢:)"
- label: I confirm that I am using English to submit this report, otherwise it will be closed.
required: true
- label: "Please do not modify this template :) and fill in all the required fields."
required: true
@ -42,20 +42,22 @@ body:
attributes:
label: Steps to reproduce
description: We highly suggest including screenshots and a bug report log. Please use the right markdown syntax for code blocks.
placeholder: Having detailed steps helps us reproduce the bug.
placeholder: Having detailed steps helps us reproduce the bug. If you have logs, please use fenced code blocks (triple backticks ```) to format them.
validations:
required: true
- type: textarea
attributes:
label: ✔️ Expected Behavior
placeholder: What were you expecting?
description: Describe what you expected to happen.
placeholder: What were you expecting? Please do not copy and paste the steps to reproduce here.
validations:
required: false
required: true
- type: textarea
attributes:
label: ❌ Actual Behavior
placeholder: What happened instead?
description: Describe what actually happened.
placeholder: What happened instead? Please do not copy and paste the steps to reproduce here.
validations:
required: false

View File

@ -1,5 +1,11 @@
blank_issues_enabled: false
contact_links:
- name: "\U0001F4A1 Model Providers & Plugins"
url: "https://github.com/langgenius/dify-official-plugins/issues/new/choose"
about: Report issues with official plugins or model providers, you will need to provide the plugin version and other relevant details.
- name: "\U0001F4AC Documentation Issues"
url: "https://github.com/langgenius/dify-docs/issues/new"
about: Report issues with the documentation, such as typos, outdated information, or missing content. Please provide the specific section and details of the issue.
- name: "\U0001F4E7 Discussions"
url: https://github.com/langgenius/dify/discussions/categories/general
about: General discussions and request help from the community
about: General discussions and seek help from the community

View File

@ -1,24 +0,0 @@
name: "📚 Documentation Issue"
description: Report issues in our documentation
labels:
- documentation
body:
- type: checkboxes
attributes:
label: Self Checks
description: "To make sure we get to you in time, please check the following :)"
options:
- label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
required: true
- label: I confirm that I am using English to submit report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
required: true
- label: "[FOR CHINESE USERS] 请务必使用英文提交 Issue否则会被关闭。谢谢:)"
required: true
- label: "Please do not modify this template :) and fill in all the required fields."
required: true
- type: textarea
attributes:
label: Provide a description of requested docs changes
placeholder: Briefly describe which document needs to be corrected and why.
validations:
required: true

View File

@ -8,11 +8,11 @@ body:
label: Self Checks
description: "To make sure we get to you in time, please check the following :)"
options:
- label: I have read the [Contributing Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) and [Language Policy](https://github.com/langgenius/dify/issues/1542).
required: true
- label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
required: true
- label: I confirm that I am using English to submit this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
required: true
- label: "[FOR CHINESE USERS] 请务必使用英文提交 Issue否则会被关闭。谢谢:)"
- label: I confirm that I am using English to submit this report, otherwise it will be closed.
required: true
- label: "Please do not modify this template :) and fill in all the required fields."
required: true

View File

@ -1,55 +0,0 @@
name: "🌐 Localization/Translation issue"
description: Report incorrect translations. [please use English :)]
labels:
- translation
body:
- type: checkboxes
attributes:
label: Self Checks
description: "To make sure we get to you in time, please check the following :)"
options:
- label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
required: true
- label: I confirm that I am using English to submit this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
required: true
- label: "[FOR CHINESE USERS] 请务必使用英文提交 Issue否则会被关闭。谢谢:)"
required: true
- label: "Please do not modify this template :) and fill in all the required fields."
required: true
- type: input
attributes:
label: Dify version
description: Hover over system tray icon or look at Settings
validations:
required: true
- type: input
attributes:
label: Utility with translation issue
placeholder: Some area
description: Please input here the utility with the translation issue
validations:
required: true
- type: input
attributes:
label: 🌐 Language affected
placeholder: "German"
validations:
required: true
- type: textarea
attributes:
label: ❌ Actual phrase(s)
placeholder: What is there? Please include a screenshot as that is extremely helpful.
validations:
required: true
- type: textarea
attributes:
label: ✔️ Expected phrase(s)
placeholder: What was expected?
validations:
required: true
- type: textarea
attributes:
label: Why is the current translation wrong
placeholder: Why do you feel this is incorrect?
validations:
required: true

View File

@ -54,7 +54,7 @@
<a href="./README_BN.md"><img alt="README in বাংলা" src="https://img.shields.io/badge/বাংলা-d9d9d9"></a>
</p>
Dify is an open-source LLM app development platform. Its intuitive interface combines agentic AI workflow, RAG pipeline, agent capabilities, model management, observability features, and more, allowing you to quickly move from prototype to production.
Dify is an open-source platform for developing LLM applications. Its intuitive interface combines agentic AI workflows, RAG pipelines, agent capabilities, model management, observability features, and more, allowing you to quickly move from prototype to production.
## Quick start
@ -65,7 +65,7 @@ Dify is an open-source LLM app development platform. Its intuitive interface com
</br>
The easiest way to start the Dify server is through [docker compose](docker/docker-compose.yaml). Before running Dify with the following commands, make sure that [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) are installed on your machine:
The easiest way to start the Dify server is through [Docker Compose](docker/docker-compose.yaml). Before running Dify with the following commands, make sure that [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) are installed on your machine:
```bash
cd dify
@ -205,6 +205,7 @@ If you'd like to configure a highly-available setup, there are community-contrib
- [Helm Chart by @magicsong](https://github.com/magicsong/ai-charts)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML files (Supports Dify v1.6.0) by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Using Terraform for Deployment
@ -261,8 +262,8 @@ At the same time, please consider supporting Dify by sharing it on social media
## Security disclosure
To protect your privacy, please avoid posting security issues on GitHub. Instead, send your questions to security@dify.ai and we will provide you with a more detailed answer.
To protect your privacy, please avoid posting security issues on GitHub. Instead, report issues to security@dify.ai, and our team will respond with a detailed answer.
## License
This repository is available under the [Dify Open Source License](LICENSE), which is essentially Apache 2.0 with a few additional restrictions.
This repository is licensed under the [Dify Open Source License](LICENSE), based on Apache 2.0 with additional conditions.

View File

@ -188,6 +188,7 @@ docker compose up -d
- [رسم بياني Helm من قبل @magicsong](https://github.com/magicsong/ai-charts)
- [ملف YAML من قبل @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [ملف YAML من قبل @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 جديد! ملفات YAML (تدعم Dify v1.6.0) بواسطة @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### استخدام Terraform للتوزيع

View File

@ -204,6 +204,8 @@ GitHub-এ ডিফাইকে স্টার দিয়ে রাখুন
- [Helm Chart by @magicsong](https://github.com/magicsong/ai-charts)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 নতুন! YAML ফাইলসমূহ (Dify v1.6.0 সমর্থিত) তৈরি করেছেন @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### টেরাফর্ম ব্যবহার করে ডিপ্লয়

View File

@ -194,9 +194,9 @@ docker compose up -d
如果您需要自定义配置,请参考 [.env.example](docker/.env.example) 文件中的注释,并更新 `.env` 文件中对应的值。此外,您可能需要根据您的具体部署环境和需求对 `docker-compose.yaml` 文件本身进行调整,例如更改镜像版本、端口映射或卷挂载。完成任何更改后,请重新运行 `docker-compose up -d`。您可以在[此处](https://docs.dify.ai/getting-started/install-self-hosted/environments)找到可用环境变量的完整列表。
#### 使用 Helm Chart 部署
#### 使用 Helm Chart 或 Kubernetes 资源清单（YAML）部署
使用 [Helm Chart](https://helm.sh/) 版本或者 YAML 文件,可以在 Kubernetes 上部署 Dify。
使用 [Helm Chart](https://helm.sh/) 版本或者 Kubernetes 资源清单（YAML），可以在 Kubernetes 上部署 Dify。
- [Helm Chart by @LeoQuote](https://github.com/douban/charts/tree/master/charts/dify)
- [Helm Chart by @BorisPolonsky](https://github.com/BorisPolonsky/dify-helm)
@ -204,6 +204,10 @@ docker compose up -d
- [YAML 文件 by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML 文件 (支持 Dify v1.6.0) by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### 使用 Terraform 部署
使用 [terraform](https://www.terraform.io/) 一键将 Dify 部署到云平台

View File

@ -203,6 +203,7 @@ Falls Sie eine hochverfügbare Konfiguration einrichten möchten, gibt es von de
- [Helm Chart by @magicsong](https://github.com/magicsong/ai-charts)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML files (Supports Dify v1.6.0) by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Terraform für die Bereitstellung verwenden

View File

@ -203,6 +203,7 @@ Si desea configurar una configuración de alta disponibilidad, la comunidad prop
- [Gráfico Helm por @magicsong](https://github.com/magicsong/ai-charts)
- [Ficheros YAML por @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [Ficheros YAML por @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 ¡NUEVO! Archivos YAML (compatible con Dify v1.6.0) por @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Uso de Terraform para el despliegue

View File

@ -201,6 +201,7 @@ Si vous souhaitez configurer une configuration haute disponibilité, la communau
- [Helm Chart par @magicsong](https://github.com/magicsong/ai-charts)
- [Fichier YAML par @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [Fichier YAML par @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NOUVEAU ! Fichiers YAML (compatible avec Dify v1.6.0) par @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Utilisation de Terraform pour le déploiement

View File

@ -202,6 +202,7 @@ docker compose up -d
- [Helm Chart by @magicsong](https://github.com/magicsong/ai-charts)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 新着！YAML ファイル（Dify v1.6.0 対応）by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Terraformを使用したデプロイ

View File

@ -201,6 +201,7 @@ If you'd like to configure a highly-available setup, there are community-contrib
- [Helm Chart by @magicsong](https://github.com/magicsong/ai-charts)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML files (Supports Dify v1.6.0) by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Terraform atorlugu pilersitsineq

View File

@ -195,6 +195,7 @@ Dify를 Kubernetes에 배포하고 프리미엄 스케일링 설정을 구성했
- [Helm Chart by @magicsong](https://github.com/magicsong/ai-charts)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML files (Supports Dify v1.6.0) by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Terraform을 사용한 배포

View File

@ -200,6 +200,7 @@ Se deseja configurar uma instalação de alta disponibilidade, há [Helm Charts]
- [Helm Chart de @magicsong](https://github.com/magicsong/ai-charts)
- [Arquivo YAML por @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [Arquivo YAML por @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NOVO! Arquivos YAML (Compatível com Dify v1.6.0) por @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Usando o Terraform para Implantação

View File

@ -201,6 +201,7 @@ Star Dify on GitHub and be instantly notified of new releases.
- [Helm Chart by @BorisPolonsky](https://github.com/BorisPolonsky/dify-helm)
- [YAML file by @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [YAML file by @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML files (Supports Dify v1.6.0) by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Uporaba Terraform za uvajanje

View File

@ -194,6 +194,7 @@ Yüksek kullanılabilirliğe sahip bir kurulum yapılandırmak isterseniz, Dify'
- [@BorisPolonsky tarafından Helm Chart](https://github.com/BorisPolonsky/dify-helm)
- [@Winson-030 tarafından YAML dosyası](https://github.com/Winson-030/dify-kubernetes)
- [@wyy-holding tarafından YAML dosyası](https://github.com/wyy-holding/dify-k8s)
- [🚀 YENİ! YAML dosyaları (Dify v1.6.0 destekli) @Zhoneym tarafından](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Dağıtım için Terraform Kullanımı

View File

@ -197,12 +197,13 @@ Dify 的所有功能都提供相應的 API因此您可以輕鬆地將 Dify
如果您需要自定義配置,請參考我們的 [.env.example](docker/.env.example) 文件中的註釋,並在您的 `.env` 文件中更新相應的值。此外,根據您特定的部署環境和需求,您可能需要調整 `docker-compose.yaml` 文件本身,例如更改映像版本、端口映射或卷掛載。進行任何更改後,請重新運行 `docker-compose up -d`。您可以在[這裡](https://docs.dify.ai/getting-started/install-self-hosted/environments)找到可用環境變數的完整列表。
如果您想配置高可用性設置,社區貢獻的 [Helm Charts](https://helm.sh/) 和 YAML 文件允許在 Kubernetes 上部署 Dify。
如果您想配置高可用性設置，社區貢獻的 [Helm Charts](https://helm.sh/) 和 Kubernetes 資源清單（YAML）允許在 Kubernetes 上部署 Dify。
- [由 @LeoQuote 提供的 Helm Chart](https://github.com/douban/charts/tree/master/charts/dify)
- [由 @BorisPolonsky 提供的 Helm Chart](https://github.com/BorisPolonsky/dify-helm)
- [由 @Winson-030 提供的 YAML 文件](https://github.com/Winson-030/dify-kubernetes)
- [由 @wyy-holding 提供的 YAML 文件](https://github.com/wyy-holding/dify-k8s)
- [🚀 NEW! YAML 檔案（支援 Dify v1.6.0）by @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
### 使用 Terraform 進行部署

View File

@ -196,6 +196,7 @@ Nếu bạn muốn cấu hình một cài đặt có độ sẵn sàng cao, có
- [Helm Chart bởi @BorisPolonsky](https://github.com/BorisPolonsky/dify-helm)
- [Tệp YAML bởi @Winson-030](https://github.com/Winson-030/dify-kubernetes)
- [Tệp YAML bởi @wyy-holding](https://github.com/wyy-holding/dify-k8s)
- [🚀 MỚI! Tệp YAML (Hỗ trợ Dify v1.6.0) bởi @Zhoneym](https://github.com/Zhoneym/DifyAI-Kubernetes)
#### Sử dụng Terraform để Triển khai

View File

@ -17,6 +17,11 @@ APP_WEB_URL=http://127.0.0.1:3000
# Files URL
FILES_URL=http://127.0.0.1:5001
# INTERNAL_FILES_URL is used for plugin daemon communication within Docker network.
# Set this to the internal Docker service URL for proper plugin file access.
# Example: INTERNAL_FILES_URL=http://api:5001
INTERNAL_FILES_URL=http://127.0.0.1:5001
# The time in seconds after the signature is rejected
FILES_ACCESS_TIMEOUT=300
@ -444,6 +449,19 @@ MAX_VARIABLE_SIZE=204800
# hybrid: Save new data to object storage, read from both object storage and RDBMS
WORKFLOW_NODE_EXECUTION_STORAGE=rdbms
# Repository configuration
# Core workflow execution repository implementation
CORE_WORKFLOW_EXECUTION_REPOSITORY=core.repositories.sqlalchemy_workflow_execution_repository.SQLAlchemyWorkflowExecutionRepository
# Core workflow node execution repository implementation
CORE_WORKFLOW_NODE_EXECUTION_REPOSITORY=core.repositories.sqlalchemy_workflow_node_execution_repository.SQLAlchemyWorkflowNodeExecutionRepository
# API workflow node execution repository implementation
API_WORKFLOW_NODE_EXECUTION_REPOSITORY=repositories.sqlalchemy_api_workflow_node_execution_repository.DifyAPISQLAlchemyWorkflowNodeExecutionRepository
# API workflow run repository implementation
API_WORKFLOW_RUN_REPOSITORY=repositories.sqlalchemy_api_workflow_run_repository.DifyAPISQLAlchemyWorkflowRunRepository
# App configuration
APP_MAX_EXECUTION_TIME=1200
APP_MAX_ACTIVE_REQUESTS=0

View File

@ -237,6 +237,13 @@ class FileAccessConfig(BaseSettings):
default="",
)
INTERNAL_FILES_URL: str = Field(
description="Internal base URL for file access within Docker network,"
" used for plugin daemon and internal service communication."
" Falls back to FILES_URL if not specified.",
default="",
)
FILES_ACCESS_TIMEOUT: int = Field(
description="Expiration time in seconds for file access URLs",
default=300,
@ -530,6 +537,33 @@ class WorkflowNodeExecutionConfig(BaseSettings):
)
class RepositoryConfig(BaseSettings):
"""
Configuration for repository implementations
"""
CORE_WORKFLOW_EXECUTION_REPOSITORY: str = Field(
description="Repository implementation for WorkflowExecution. Specify as a module path",
default="core.repositories.sqlalchemy_workflow_execution_repository.SQLAlchemyWorkflowExecutionRepository",
)
CORE_WORKFLOW_NODE_EXECUTION_REPOSITORY: str = Field(
description="Repository implementation for WorkflowNodeExecution. Specify as a module path",
default="core.repositories.sqlalchemy_workflow_node_execution_repository.SQLAlchemyWorkflowNodeExecutionRepository",
)
API_WORKFLOW_NODE_EXECUTION_REPOSITORY: str = Field(
description="Service-layer repository implementation for WorkflowNodeExecutionModel operations. "
"Specify as a module path",
default="repositories.sqlalchemy_api_workflow_node_execution_repository.DifyAPISQLAlchemyWorkflowNodeExecutionRepository",
)
API_WORKFLOW_RUN_REPOSITORY: str = Field(
description="Service-layer repository implementation for WorkflowRun operations. Specify as a module path",
default="repositories.sqlalchemy_api_workflow_run_repository.DifyAPISQLAlchemyWorkflowRunRepository",
)
class AuthConfig(BaseSettings):
"""
Configuration for authentication and OAuth
@ -896,6 +930,7 @@ class FeatureConfig(
MultiModalTransferConfig,
PositionConfig,
RagEtlConfig,
RepositoryConfig,
SecurityConfig,
ToolConfig,
UpdateConfig,
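
The repository settings above are plain module-path strings; a factory can resolve them with `importlib`. A hedged sketch of that idea (the actual `DifyCoreRepositoryFactory` wiring may differ):

```python
import importlib
from typing import Any


def resolve_repository_class(module_path: str) -> type[Any]:
    """Turn 'package.module.ClassName' into the class object it names."""
    module_name, _, class_name = module_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# e.g. resolve_repository_class(dify_config.CORE_WORKFLOW_EXECUTION_REPOSITORY)
```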

View File

@ -162,6 +162,11 @@ class DatabaseConfig(BaseSettings):
default=3600,
)
SQLALCHEMY_POOL_USE_LIFO: bool = Field(
description="If True, SQLAlchemy will use last-in-first-out way to retrieve connections from pool.",
default=False,
)
SQLALCHEMY_POOL_PRE_PING: bool = Field(
description="If True, enables connection pool pre-ping feature to check connections.",
default=False,
@ -199,6 +204,7 @@ class DatabaseConfig(BaseSettings):
"pool_recycle": self.SQLALCHEMY_POOL_RECYCLE,
"pool_pre_ping": self.SQLALCHEMY_POOL_PRE_PING,
"connect_args": connect_args,
"pool_use_lifo": self.SQLALCHEMY_POOL_USE_LIFO,
}
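
For context, `pool_use_lifo=True` makes SQLAlchemy's pool hand back the most recently returned connection first, so idle connections can age out and be recycled. A small, generic sketch of how these options reach the engine (the DSN is a placeholder):

```python
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://user:password@localhost:5432/dify",  # placeholder DSN
    pool_size=30,
    pool_recycle=3600,
    pool_pre_ping=True,   # SQLALCHEMY_POOL_PRE_PING
    pool_use_lifo=True,   # SQLALCHEMY_POOL_USE_LIFO
)
```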

View File

@ -151,6 +151,7 @@ class AppApi(Resource):
parser.add_argument("icon", type=str, location="json")
parser.add_argument("icon_background", type=str, location="json")
parser.add_argument("use_icon_as_answer_icon", type=bool, location="json")
parser.add_argument("max_active_requests", type=int, location="json")
args = parser.parse_args()
app_service = AppService()

View File

@ -35,16 +35,20 @@ class AppMCPServerController(Resource):
@get_app_model
@marshal_with(app_server_fields)
def post(self, app_model):
# The role of the current user in the ta table must be editor, admin, or owner
if not current_user.is_editor:
raise NotFound()
parser = reqparse.RequestParser()
parser.add_argument("description", type=str, required=True, location="json")
parser.add_argument("description", type=str, required=False, location="json")
parser.add_argument("parameters", type=dict, required=True, location="json")
args = parser.parse_args()
description = args.get("description")
if not description:
description = app_model.description or ""
server = AppMCPServer(
name=app_model.name,
description=args["description"],
description=description,
parameters=json.dumps(args["parameters"], ensure_ascii=False),
status=AppMCPServerStatus.ACTIVE,
app_id=app_model.id,
@ -65,14 +69,22 @@ class AppMCPServerController(Resource):
raise NotFound()
parser = reqparse.RequestParser()
parser.add_argument("id", type=str, required=True, location="json")
parser.add_argument("description", type=str, required=True, location="json")
parser.add_argument("description", type=str, required=False, location="json")
parser.add_argument("parameters", type=dict, required=True, location="json")
parser.add_argument("status", type=str, required=False, location="json")
args = parser.parse_args()
server = db.session.query(AppMCPServer).filter(AppMCPServer.id == args["id"]).first()
if not server:
raise NotFound()
server.description = args["description"]
description = args.get("description")
if description is None:
pass
elif not description:
server.description = app_model.description or ""
else:
server.description = description
server.parameters = json.dumps(args["parameters"], ensure_ascii=False)
if args["status"]:
if args["status"] not in [status.value for status in AppMCPServerStatus]:
@ -90,7 +102,12 @@ class AppMCPServerRefreshController(Resource):
def get(self, server_id):
if not current_user.is_editor:
raise NotFound()
server = db.session.query(AppMCPServer).filter(AppMCPServer.id == server_id).first()
server = (
db.session.query(AppMCPServer)
.filter(AppMCPServer.id == server_id)
.filter(AppMCPServer.tenant_id == current_user.current_tenant_id)
.first()
)
if not server:
raise NotFound()
server.server_code = AppMCPServer.generate_server_code(16)

View File

@ -2,6 +2,7 @@ from datetime import datetime
from decimal import Decimal
import pytz
import sqlalchemy as sa
from flask import jsonify
from flask_login import current_user
from flask_restful import Resource, reqparse
@ -9,10 +10,11 @@ from flask_restful import Resource, reqparse
from controllers.console import api
from controllers.console.app.wraps import get_app_model
from controllers.console.wraps import account_initialization_required, setup_required
from core.app.entities.app_invoke_entities import InvokeFrom
from extensions.ext_database import db
from libs.helper import DatetimeString
from libs.login import login_required
from models.model import AppMode
from models import AppMode, Message
class DailyMessageStatistic(Resource):
@ -85,46 +87,41 @@ class DailyConversationStatistic(Resource):
parser.add_argument("end", type=DatetimeString("%Y-%m-%d %H:%M"), location="args")
args = parser.parse_args()
sql_query = """SELECT
DATE(DATE_TRUNC('day', created_at AT TIME ZONE 'UTC' AT TIME ZONE :tz )) AS date,
COUNT(DISTINCT messages.conversation_id) AS conversation_count
FROM
messages
WHERE
app_id = :app_id"""
arg_dict = {"tz": account.timezone, "app_id": app_model.id}
timezone = pytz.timezone(account.timezone)
utc_timezone = pytz.utc
stmt = (
sa.select(
sa.func.date(
sa.func.date_trunc("day", sa.text("created_at AT TIME ZONE 'UTC' AT TIME ZONE :tz"))
).label("date"),
sa.func.count(sa.distinct(Message.conversation_id)).label("conversation_count"),
)
.select_from(Message)
.where(Message.app_id == app_model.id, Message.invoke_from != InvokeFrom.DEBUGGER.value)
)
if args["start"]:
start_datetime = datetime.strptime(args["start"], "%Y-%m-%d %H:%M")
start_datetime = start_datetime.replace(second=0)
start_datetime_timezone = timezone.localize(start_datetime)
start_datetime_utc = start_datetime_timezone.astimezone(utc_timezone)
sql_query += " AND created_at >= :start"
arg_dict["start"] = start_datetime_utc
stmt = stmt.where(Message.created_at >= start_datetime_utc)
if args["end"]:
end_datetime = datetime.strptime(args["end"], "%Y-%m-%d %H:%M")
end_datetime = end_datetime.replace(second=0)
end_datetime_timezone = timezone.localize(end_datetime)
end_datetime_utc = end_datetime_timezone.astimezone(utc_timezone)
stmt = stmt.where(Message.created_at < end_datetime_utc)
sql_query += " AND created_at < :end"
arg_dict["end"] = end_datetime_utc
sql_query += " GROUP BY date ORDER BY date"
stmt = stmt.group_by("date").order_by("date")
response_data = []
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({"date": str(i.date), "conversation_count": i.conversation_count})
rs = conn.execute(stmt, {"tz": account.timezone})
for row in rs:
response_data.append({"date": str(row.date), "conversation_count": row.conversation_count})
return jsonify({"data": response_data})

View File

@ -68,13 +68,18 @@ def _create_pagination_parser():
return parser
def _serialize_variable_type(workflow_draft_var: WorkflowDraftVariable) -> str:
value_type = workflow_draft_var.value_type
return value_type.exposed_type().value
_WORKFLOW_DRAFT_VARIABLE_WITHOUT_VALUE_FIELDS = {
"id": fields.String,
"type": fields.String(attribute=lambda model: model.get_variable_type()),
"name": fields.String,
"description": fields.String,
"selector": fields.List(fields.String, attribute=lambda model: model.get_selector()),
"value_type": fields.String,
"value_type": fields.String(attribute=_serialize_variable_type),
"edited": fields.Boolean(attribute=lambda model: model.edited),
"visible": fields.Boolean,
}
@ -90,7 +95,7 @@ _WORKFLOW_DRAFT_ENV_VARIABLE_FIELDS = {
"name": fields.String,
"description": fields.String,
"selector": fields.List(fields.String, attribute=lambda model: model.get_selector()),
"value_type": fields.String,
"value_type": fields.String(attribute=_serialize_variable_type),
"edited": fields.Boolean(attribute=lambda model: model.edited),
"visible": fields.Boolean,
}
@ -396,7 +401,7 @@ class EnvironmentVariableCollectionApi(Resource):
"name": v.name,
"description": v.description,
"selector": v.selector,
"value_type": v.value_type.value,
"value_type": v.value_type.exposed_type().value,
"value": v.value,
# Do not track edited for env vars.
"edited": False,

View File

@ -35,8 +35,6 @@ def get_app_model(view: Optional[Callable] = None, *, mode: Union[AppMode, list[
raise AppNotFoundError()
app_mode = AppMode.value_of(app_model.mode)
if app_mode == AppMode.CHANNEL:
raise AppNotFoundError()
if mode is not None:
if isinstance(mode, list):

View File

@ -3,7 +3,7 @@ import logging
from dateutil.parser import isoparse
from flask_restful import Resource, fields, marshal_with, reqparse
from flask_restful.inputs import int_range
from sqlalchemy.orm import Session
from sqlalchemy.orm import Session, sessionmaker
from werkzeug.exceptions import InternalServerError
from controllers.service_api import api
@ -30,7 +30,7 @@ from fields.workflow_app_log_fields import workflow_app_log_pagination_fields
from libs import helper
from libs.helper import TimestampField
from models.model import App, AppMode, EndUser
from models.workflow import WorkflowRun
from repositories.factory import DifyAPIRepositoryFactory
from services.app_generate_service import AppGenerateService
from services.errors.llm import InvokeRateLimitError
from services.workflow_app_service import WorkflowAppService
@ -63,7 +63,15 @@ class WorkflowRunDetailApi(Resource):
if app_mode not in [AppMode.WORKFLOW, AppMode.ADVANCED_CHAT]:
raise NotWorkflowAppError()
workflow_run = db.session.query(WorkflowRun).filter(WorkflowRun.id == workflow_run_id).first()
# Use repository to get workflow run
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
workflow_run_repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(session_maker)
workflow_run = workflow_run_repo.get_workflow_run_by_id(
tenant_id=app_model.tenant_id,
app_id=app_model.id,
run_id=workflow_run_id,
)
return workflow_run

View File

@ -3,6 +3,8 @@ import logging
import uuid
from typing import Optional, Union, cast
from sqlalchemy import select
from core.agent.entities import AgentEntity, AgentToolEntity
from core.app.app_config.features.file_upload.manager import FileUploadConfigManager
from core.app.apps.agent_chat.app_config_manager import AgentChatAppConfig
@ -417,12 +419,15 @@ class BaseAgentRunner(AppRunner):
if isinstance(prompt_message, SystemPromptMessage):
result.append(prompt_message)
messages: list[Message] = (
db.session.query(Message)
.filter(
Message.conversation_id == self.message.conversation_id,
messages = (
(
db.session.execute(
select(Message)
.where(Message.conversation_id == self.message.conversation_id)
.order_by(Message.created_at.desc())
)
)
.order_by(Message.created_at.desc())
.scalars()
.all()
)

View File

@ -25,8 +25,7 @@ from core.app.entities.task_entities import ChatbotAppBlockingResponse, ChatbotA
from core.model_runtime.errors.invoke import InvokeAuthorizationError
from core.ops.ops_trace_manager import TraceQueueManager
from core.prompt.utils.get_thread_messages_length import get_thread_messages_length
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories.sqlalchemy_workflow_execution_repository import SQLAlchemyWorkflowExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.repositories.draft_variable_repository import (
DraftVariableSaverFactory,
)
@ -183,14 +182,14 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
workflow_triggered_from = WorkflowRunTriggeredFrom.DEBUGGING
else:
workflow_triggered_from = WorkflowRunTriggeredFrom.APP_RUN
workflow_execution_repository = SQLAlchemyWorkflowExecutionRepository(
workflow_execution_repository = DifyCoreRepositoryFactory.create_workflow_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
triggered_from=workflow_triggered_from,
)
# Create workflow node execution repository
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
@ -260,14 +259,14 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
# Create session factory
session_factory = sessionmaker(bind=db.engine, expire_on_commit=False)
# Create workflow execution(aka workflow run) repository
workflow_execution_repository = SQLAlchemyWorkflowExecutionRepository(
workflow_execution_repository = DifyCoreRepositoryFactory.create_workflow_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
triggered_from=WorkflowRunTriggeredFrom.DEBUGGING,
)
# Create workflow node execution repository
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
@ -343,14 +342,14 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
# Create session factory
session_factory = sessionmaker(bind=db.engine, expire_on_commit=False)
# Create workflow execution(aka workflow run) repository
workflow_execution_repository = SQLAlchemyWorkflowExecutionRepository(
workflow_execution_repository = DifyCoreRepositoryFactory.create_workflow_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
triggered_from=WorkflowRunTriggeredFrom.DEBUGGING,
)
# Create workflow node execution repository
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,

View File

@ -16,9 +16,10 @@ from core.app.entities.queue_entities import (
QueueTextChunkEvent,
)
from core.moderation.base import ModerationError
from core.variables.variables import VariableUnion
from core.workflow.callbacks import WorkflowCallback, WorkflowLoggingCallback
from core.workflow.entities.variable_pool import VariablePool
from core.workflow.enums import SystemVariableKey
from core.workflow.system_variable import SystemVariable
from core.workflow.variable_loader import VariableLoader
from core.workflow.workflow_entry import WorkflowEntry
from extensions.ext_database import db
@ -64,7 +65,7 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
if not workflow:
raise ValueError("Workflow not initialized")
user_id = None
user_id: str | None = None
if self.application_generate_entity.invoke_from in {InvokeFrom.WEB_APP, InvokeFrom.SERVICE_API}:
end_user = db.session.query(EndUser).filter(EndUser.id == self.application_generate_entity.user_id).first()
if end_user:
@ -136,23 +137,25 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
session.commit()
# Create a variable pool.
system_inputs = {
SystemVariableKey.QUERY: query,
SystemVariableKey.FILES: files,
SystemVariableKey.CONVERSATION_ID: self.conversation.id,
SystemVariableKey.USER_ID: user_id,
SystemVariableKey.DIALOGUE_COUNT: self._dialogue_count,
SystemVariableKey.APP_ID: app_config.app_id,
SystemVariableKey.WORKFLOW_ID: app_config.workflow_id,
SystemVariableKey.WORKFLOW_EXECUTION_ID: self.application_generate_entity.workflow_run_id,
}
system_inputs = SystemVariable(
query=query,
files=files,
conversation_id=self.conversation.id,
user_id=user_id,
dialogue_count=self._dialogue_count,
app_id=app_config.app_id,
workflow_id=app_config.workflow_id,
workflow_execution_id=self.application_generate_entity.workflow_run_id,
)
# init variable pool
variable_pool = VariablePool(
system_variables=system_inputs,
user_inputs=inputs,
environment_variables=workflow.environment_variables,
conversation_variables=conversation_variables,
# Based on the definition of `VariableUnion`,
# `list[Variable]` can be safely used as `list[VariableUnion]` since they are compatible.
conversation_variables=cast(list[VariableUnion], conversation_variables),
)
# init graph

View File

@ -61,12 +61,12 @@ from core.base.tts import AppGeneratorTTSPublisher, AudioTrunk
from core.model_runtime.entities.llm_entities import LLMUsage
from core.ops.ops_trace_manager import TraceQueueManager
from core.workflow.entities.workflow_execution import WorkflowExecutionStatus, WorkflowType
from core.workflow.enums import SystemVariableKey
from core.workflow.graph_engine.entities.graph_runtime_state import GraphRuntimeState
from core.workflow.nodes import NodeType
from core.workflow.repositories.draft_variable_repository import DraftVariableSaverFactory
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from core.workflow.system_variable import SystemVariable
from core.workflow.workflow_cycle_manager import CycleManagerWorkflowInfo, WorkflowCycleManager
from events.message_event import message_was_created
from extensions.ext_database import db
@ -116,16 +116,16 @@ class AdvancedChatAppGenerateTaskPipeline:
self._workflow_cycle_manager = WorkflowCycleManager(
application_generate_entity=application_generate_entity,
workflow_system_variables={
SystemVariableKey.QUERY: message.query,
SystemVariableKey.FILES: application_generate_entity.files,
SystemVariableKey.CONVERSATION_ID: conversation.id,
SystemVariableKey.USER_ID: user_session_id,
SystemVariableKey.DIALOGUE_COUNT: dialogue_count,
SystemVariableKey.APP_ID: application_generate_entity.app_config.app_id,
SystemVariableKey.WORKFLOW_ID: workflow.id,
SystemVariableKey.WORKFLOW_EXECUTION_ID: application_generate_entity.workflow_run_id,
},
workflow_system_variables=SystemVariable(
query=message.query,
files=application_generate_entity.files,
conversation_id=conversation.id,
user_id=user_session_id,
dialogue_count=dialogue_count,
app_id=application_generate_entity.app_config.app_id,
workflow_id=workflow.id,
workflow_execution_id=application_generate_entity.workflow_run_id,
),
workflow_info=CycleManagerWorkflowInfo(
workflow_id=workflow.id,
workflow_type=WorkflowType(workflow.type),

View File

@ -23,8 +23,7 @@ from core.app.entities.app_invoke_entities import InvokeFrom, WorkflowAppGenerat
from core.app.entities.task_entities import WorkflowAppBlockingResponse, WorkflowAppStreamResponse
from core.model_runtime.errors.invoke import InvokeAuthorizationError
from core.ops.ops_trace_manager import TraceQueueManager
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories.sqlalchemy_workflow_execution_repository import SQLAlchemyWorkflowExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.repositories.draft_variable_repository import DraftVariableSaverFactory
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
@ -156,14 +155,14 @@ class WorkflowAppGenerator(BaseAppGenerator):
workflow_triggered_from = WorkflowRunTriggeredFrom.DEBUGGING
else:
workflow_triggered_from = WorkflowRunTriggeredFrom.APP_RUN
workflow_execution_repository = SQLAlchemyWorkflowExecutionRepository(
workflow_execution_repository = DifyCoreRepositoryFactory.create_workflow_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
triggered_from=workflow_triggered_from,
)
# Create workflow node execution repository
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
@ -306,16 +305,14 @@ class WorkflowAppGenerator(BaseAppGenerator):
# Create session factory
session_factory = sessionmaker(bind=db.engine, expire_on_commit=False)
# Create workflow execution(aka workflow run) repository
workflow_execution_repository = SQLAlchemyWorkflowExecutionRepository(
workflow_execution_repository = DifyCoreRepositoryFactory.create_workflow_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
triggered_from=WorkflowRunTriggeredFrom.DEBUGGING,
)
# Create workflow node execution repository
session_factory = sessionmaker(bind=db.engine, expire_on_commit=False)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
@ -390,16 +387,14 @@ class WorkflowAppGenerator(BaseAppGenerator):
# Create session factory
session_factory = sessionmaker(bind=db.engine, expire_on_commit=False)
# Create workflow execution(aka workflow run) repository
workflow_execution_repository = SQLAlchemyWorkflowExecutionRepository(
workflow_execution_repository = DifyCoreRepositoryFactory.create_workflow_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,
triggered_from=WorkflowRunTriggeredFrom.DEBUGGING,
)
# Create workflow node execution repository
session_factory = sessionmaker(bind=db.engine, expire_on_commit=False)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=user,
app_id=application_generate_entity.app_config.app_id,

View File

@ -11,7 +11,7 @@ from core.app.entities.app_invoke_entities import (
)
from core.workflow.callbacks import WorkflowCallback, WorkflowLoggingCallback
from core.workflow.entities.variable_pool import VariablePool
from core.workflow.enums import SystemVariableKey
from core.workflow.system_variable import SystemVariable
from core.workflow.variable_loader import VariableLoader
from core.workflow.workflow_entry import WorkflowEntry
from extensions.ext_database import db
@ -95,13 +95,14 @@ class WorkflowAppRunner(WorkflowBasedAppRunner):
files = self.application_generate_entity.files
# Create a variable pool.
system_inputs = {
SystemVariableKey.FILES: files,
SystemVariableKey.USER_ID: user_id,
SystemVariableKey.APP_ID: app_config.app_id,
SystemVariableKey.WORKFLOW_ID: app_config.workflow_id,
SystemVariableKey.WORKFLOW_EXECUTION_ID: self.application_generate_entity.workflow_execution_id,
}
system_inputs = SystemVariable(
files=files,
user_id=user_id,
app_id=app_config.app_id,
workflow_id=app_config.workflow_id,
workflow_execution_id=self.application_generate_entity.workflow_execution_id,
)
variable_pool = VariablePool(
system_variables=system_inputs,

View File

@ -3,7 +3,6 @@ import time
from collections.abc import Generator
from typing import Optional, Union
from sqlalchemy import select
from sqlalchemy.orm import Session
from constants.tts_auto_play_timeout import TTS_AUTO_PLAY_TIMEOUT, TTS_AUTO_PLAY_YIELD_CPU_TIME
@ -55,10 +54,10 @@ from core.app.task_pipeline.based_generate_task_pipeline import BasedGenerateTas
from core.base.tts import AppGeneratorTTSPublisher, AudioTrunk
from core.ops.ops_trace_manager import TraceQueueManager
from core.workflow.entities.workflow_execution import WorkflowExecution, WorkflowExecutionStatus, WorkflowType
from core.workflow.enums import SystemVariableKey
from core.workflow.repositories.draft_variable_repository import DraftVariableSaverFactory
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from core.workflow.system_variable import SystemVariable
from core.workflow.workflow_cycle_manager import CycleManagerWorkflowInfo, WorkflowCycleManager
from extensions.ext_database import db
from models.account import Account
@ -68,7 +67,6 @@ from models.workflow import (
Workflow,
WorkflowAppLog,
WorkflowAppLogCreatedFrom,
WorkflowRun,
)
logger = logging.getLogger(__name__)
@ -109,13 +107,13 @@ class WorkflowAppGenerateTaskPipeline:
self._workflow_cycle_manager = WorkflowCycleManager(
application_generate_entity=application_generate_entity,
workflow_system_variables={
SystemVariableKey.FILES: application_generate_entity.files,
SystemVariableKey.USER_ID: user_session_id,
SystemVariableKey.APP_ID: application_generate_entity.app_config.app_id,
SystemVariableKey.WORKFLOW_ID: workflow.id,
SystemVariableKey.WORKFLOW_EXECUTION_ID: application_generate_entity.workflow_execution_id,
},
workflow_system_variables=SystemVariable(
files=application_generate_entity.files,
user_id=user_session_id,
app_id=application_generate_entity.app_config.app_id,
workflow_id=workflow.id,
workflow_execution_id=application_generate_entity.workflow_execution_id,
),
workflow_info=CycleManagerWorkflowInfo(
workflow_id=workflow.id,
workflow_type=WorkflowType(workflow.type),
@ -562,8 +560,6 @@ class WorkflowAppGenerateTaskPipeline:
tts_publisher.publish(None)
def _save_workflow_app_log(self, *, session: Session, workflow_execution: WorkflowExecution) -> None:
workflow_run = session.scalar(select(WorkflowRun).where(WorkflowRun.id == workflow_execution.id_))
assert workflow_run is not None
invoke_from = self._application_generate_entity.invoke_from
if invoke_from == InvokeFrom.SERVICE_API:
created_from = WorkflowAppLogCreatedFrom.SERVICE_API
@ -576,10 +572,10 @@ class WorkflowAppGenerateTaskPipeline:
return
workflow_app_log = WorkflowAppLog()
workflow_app_log.tenant_id = workflow_run.tenant_id
workflow_app_log.app_id = workflow_run.app_id
workflow_app_log.workflow_id = workflow_run.workflow_id
workflow_app_log.workflow_run_id = workflow_run.id
workflow_app_log.tenant_id = self._application_generate_entity.app_config.tenant_id
workflow_app_log.app_id = self._application_generate_entity.app_config.app_id
workflow_app_log.workflow_id = workflow_execution.workflow_id
workflow_app_log.workflow_run_id = workflow_execution.id_
workflow_app_log.created_from = created_from.value
workflow_app_log.created_by_role = self._created_by_role
workflow_app_log.created_by = self._user_id

View File

@ -62,6 +62,7 @@ from core.workflow.graph_engine.entities.event import (
from core.workflow.graph_engine.entities.graph import Graph
from core.workflow.nodes import NodeType
from core.workflow.nodes.node_mapping import NODE_TYPE_CLASSES_MAPPING
from core.workflow.system_variable import SystemVariable
from core.workflow.variable_loader import DUMMY_VARIABLE_LOADER, VariableLoader, load_into_variable_pool
from core.workflow.workflow_entry import WorkflowEntry
from extensions.ext_database import db
@ -166,7 +167,7 @@ class WorkflowBasedAppRunner(AppRunner):
# init variable pool
variable_pool = VariablePool(
system_variables={},
system_variables=SystemVariable.empty(),
user_inputs={},
environment_variables=workflow.environment_variables,
)
@ -263,7 +264,7 @@ class WorkflowBasedAppRunner(AppRunner):
# init variable pool
variable_pool = VariablePool(
system_variables={},
system_variables=SystemVariable.empty(),
user_inputs={},
environment_variables=workflow.environment_variables,
)

View File

@ -21,7 +21,9 @@ def get_signed_file_url(upload_file_id: str) -> str:
def get_signed_file_url_for_plugin(filename: str, mimetype: str, tenant_id: str, user_id: str) -> str:
url = f"{dify_config.FILES_URL}/files/upload/for-plugin"
# Plugin access should use internal URL for Docker network communication
base_url = dify_config.INTERNAL_FILES_URL or dify_config.FILES_URL
url = f"{base_url}/files/upload/for-plugin"
if user_id is None:
user_id = "DEFAULT-USER"

View File

@ -5,6 +5,8 @@ from base64 import b64encode
from collections.abc import Mapping
from typing import Any
from core.variables.utils import SegmentJSONEncoder
class TemplateTransformer(ABC):
_code_placeholder: str = "{{code}}"
@ -43,17 +45,13 @@ class TemplateTransformer(ABC):
result_str = cls.extract_result_str_from_response(response)
result = json.loads(result_str)
except json.JSONDecodeError as e:
raise ValueError(f"Failed to parse JSON response: {str(e)}. Response content: {result_str[:200]}...")
raise ValueError(f"Failed to parse JSON response: {str(e)}.")
except ValueError as e:
# Re-raise ValueError from extract_result_str_from_response
raise e
except Exception as e:
raise ValueError(f"Unexpected error during response transformation: {str(e)}")
# Check if the result contains an error
if isinstance(result, dict) and "error" in result:
raise ValueError(f"JavaScript execution error: {result['error']}")
if not isinstance(result, dict):
raise ValueError(f"Result must be a dict, got {type(result).__name__}")
if not all(isinstance(k, str) for k in result):
@ -95,7 +93,7 @@ class TemplateTransformer(ABC):
@classmethod
def serialize_inputs(cls, inputs: Mapping[str, Any]) -> str:
inputs_json_str = json.dumps(inputs, ensure_ascii=False).encode()
inputs_json_str = json.dumps(inputs, ensure_ascii=False, cls=SegmentJSONEncoder).encode()
input_base64_encoded = b64encode(inputs_json_str).decode("utf-8")
return input_base64_encoded
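For context (not part of the diff): serialize_inputs JSON-encodes the inputs (now with a custom encoder for Segment values) and then base64-encodes the bytes. A rough illustration of that round trip, with the plain json encoder standing in for SegmentJSONEncoder:

import json
from base64 import b64decode, b64encode

def serialize_inputs(inputs: dict) -> str:
    # json.dumps(...).encode() -> bytes -> base64 -> str, mirroring the transformer above
    raw = json.dumps(inputs, ensure_ascii=False).encode()
    return b64encode(raw).decode("utf-8")

def deserialize_inputs(encoded: str) -> dict:
    return json.loads(b64decode(encoded))

payload = {"name": "dify", "count": 3}
assert deserialize_inputs(serialize_inputs(payload)) == payload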

View File

@ -240,7 +240,7 @@ def refresh_authorization(
response = requests.post(token_url, data=params)
if not response.ok:
raise ValueError(f"Token refresh failed: HTTP {response.status_code}")
return OAuthTokens.parse_obj(response.json())
return OAuthTokens.model_validate(response.json())
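For context (not part of the diff): parse_obj is deprecated in Pydantic v2 in favor of model_validate. A self-contained example of the replacement call, using a toy model rather than the real OAuthTokens:

from pydantic import BaseModel

class OAuthTokensExample(BaseModel):
    # Hypothetical fields, for illustration only
    access_token: str
    refresh_token: str | None = None

data = {"access_token": "abc123", "refresh_token": "def456"}
tokens = OAuthTokensExample.model_validate(data)  # Pydantic v2 replacement for parse_obj
print(tokens.access_token)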
def register_client(

View File

@ -112,13 +112,13 @@ class MCPServerStreamableHTTPRequestHandler:
def initialize(self):
request = cast(types.InitializeRequest, self.request.root)
client_info = request.params.clientInfo
clinet_name = f"{client_info.name}@{client_info.version}"
client_name = f"{client_info.name}@{client_info.version}"
if not self.end_user:
end_user = EndUser(
tenant_id=self.app.tenant_id,
app_id=self.app.id,
type="mcp",
name=clinet_name,
name=client_name,
session_id=generate_session_id(),
external_user_id=self.mcp_server.id,
)

View File

@ -1,7 +1,7 @@
import logging
import queue
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import Future, ThreadPoolExecutor, TimeoutError
from contextlib import ExitStack
from datetime import timedelta
from types import TracebackType
@ -171,23 +171,41 @@ class BaseSession(
self._session_read_timeout_seconds = read_timeout_seconds
self._in_flight = {}
self._exit_stack = ExitStack()
# Initialize executor and future to None for proper cleanup checks
self._executor: ThreadPoolExecutor | None = None
self._receiver_future: Future | None = None
def __enter__(self) -> Self:
self._executor = ThreadPoolExecutor()
# The thread pool is dedicated to running `_receive_loop`. Setting `max_workers` to 1
# ensures no unnecessary threads are created.
self._executor = ThreadPoolExecutor(max_workers=1)
self._receiver_future = self._executor.submit(self._receive_loop)
return self
def check_receiver_status(self) -> None:
if self._receiver_future.done():
"""`check_receiver_status` ensures that any exceptions raised during the
execution of `_receive_loop` are retrieved and propagated."""
if self._receiver_future and self._receiver_future.done():
self._receiver_future.result()
def __exit__(
self, exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None
) -> None:
self._exit_stack.close()
self._read_stream.put(None)
self._write_stream.put(None)
# Wait for the receiver loop to finish
if self._receiver_future:
try:
self._receiver_future.result(timeout=5.0) # Wait up to 5 seconds
except TimeoutError:
# If the receiver loop is still running after timeout, we'll force shutdown
pass
# Shutdown the executor
if self._executor:
self._executor.shutdown(wait=True)
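For context (not part of the diff): the session now owns a single-worker executor whose receive-loop future is polled for errors and awaited on exit before the executor is shut down. A standalone sketch of the same lifecycle pattern; class and method names here are illustrative, not the MCP API:

from concurrent.futures import Future, ThreadPoolExecutor, TimeoutError

class LoopRunner:
    def __enter__(self):
        # One dedicated worker is enough: it only runs the receive loop.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._future: Future = self._executor.submit(self._loop)
        return self

    def _loop(self):
        pass  # placeholder for a blocking receive loop

    def check(self):
        # Surfaces any exception raised inside the loop.
        if self._future.done():
            self._future.result()

    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            self._future.result(timeout=5.0)  # give the loop a bounded time to finish
        except TimeoutError:
            pass  # loop still running; fall through to shutdown
        self._executor.shutdown(wait=True)

with LoopRunner() as runner:
    runner.check()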
def send_request(
self,
request: SendRequestT,

View File

@ -1,6 +1,8 @@
from collections.abc import Sequence
from typing import Optional
from sqlalchemy import select
from core.app.app_config.features.file_upload.manager import FileUploadConfigManager
from core.file import file_manager
from core.model_manager import ModelInstance
@ -17,11 +19,15 @@ from core.prompt.utils.extract_thread_messages import extract_thread_messages
from extensions.ext_database import db
from factories import file_factory
from models.model import AppMode, Conversation, Message, MessageFile
from models.workflow import WorkflowRun
from models.workflow import Workflow, WorkflowRun
class TokenBufferMemory:
def __init__(self, conversation: Conversation, model_instance: ModelInstance) -> None:
def __init__(
self,
conversation: Conversation,
model_instance: ModelInstance,
) -> None:
self.conversation = conversation
self.model_instance = model_instance
@ -36,20 +42,8 @@ class TokenBufferMemory:
app_record = self.conversation.app
# fetch limited messages, and return reversed
query = (
db.session.query(
Message.id,
Message.query,
Message.answer,
Message.created_at,
Message.workflow_run_id,
Message.parent_message_id,
Message.answer_tokens,
)
.filter(
Message.conversation_id == self.conversation.id,
)
.order_by(Message.created_at.desc())
stmt = (
select(Message).where(Message.conversation_id == self.conversation.id).order_by(Message.created_at.desc())
)
if message_limit and message_limit > 0:
@ -57,7 +51,9 @@ class TokenBufferMemory:
else:
message_limit = 500
messages = query.limit(message_limit).all()
stmt = stmt.limit(message_limit)
messages = db.session.scalars(stmt).all()
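For context (not part of the diff): the query is rewritten from the legacy db.session.query(columns) style to SQLAlchemy 2.0 select() plus session.scalars(), which returns full ORM objects. A minimal standalone example of that pattern with an in-memory SQLite database and a toy model (not Dify's Message):

from sqlalchemy import String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass

class Note(Base):
    __tablename__ = "notes"
    id: Mapped[int] = mapped_column(primary_key=True)
    conversation_id: Mapped[str] = mapped_column(String(36))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Note(conversation_id="c1"), Note(conversation_id="c2")])
    session.commit()
    # 2.0 style: build a statement, then materialize ORM objects via scalars()
    stmt = select(Note).where(Note.conversation_id == "c1").order_by(Note.id.desc()).limit(10)
    notes = session.scalars(stmt).all()
    print(len(notes))  # -> 1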
# instead of all messages from the conversation, we only need to extract messages
# that belong to the thread of last message
@ -74,18 +70,20 @@ class TokenBufferMemory:
files = db.session.query(MessageFile).filter(MessageFile.message_id == message.id).all()
if files:
file_extra_config = None
if self.conversation.mode not in {AppMode.ADVANCED_CHAT, AppMode.WORKFLOW}:
if self.conversation.mode in {AppMode.AGENT_CHAT, AppMode.COMPLETION, AppMode.CHAT}:
file_extra_config = FileUploadConfigManager.convert(self.conversation.model_config)
elif self.conversation.mode in {AppMode.ADVANCED_CHAT, AppMode.WORKFLOW}:
workflow_run = db.session.scalar(
select(WorkflowRun).where(WorkflowRun.id == message.workflow_run_id)
)
if not workflow_run:
raise ValueError(f"Workflow run not found: {message.workflow_run_id}")
workflow = db.session.scalar(select(Workflow).where(Workflow.id == workflow_run.workflow_id))
if not workflow:
raise ValueError(f"Workflow not found: {workflow_run.workflow_id}")
file_extra_config = FileUploadConfigManager.convert(workflow.features_dict, is_vision=False)
else:
if message.workflow_run_id:
workflow_run = (
db.session.query(WorkflowRun).filter(WorkflowRun.id == message.workflow_run_id).first()
)
if workflow_run and workflow_run.workflow:
file_extra_config = FileUploadConfigManager.convert(
workflow_run.workflow.features_dict, is_vision=False
)
raise AssertionError(f"Invalid app mode: {self.conversation.mode}")
detail = ImagePromptMessageContent.DETAIL.LOW
if file_extra_config and app_record:

View File

@ -284,7 +284,8 @@ class AliyunDataTrace(BaseTraceInstance):
else:
node_span = self.build_workflow_task_span(trace_id, workflow_span_id, trace_info, node_execution)
return node_span
except Exception:
except Exception as e:
logging.debug(f"Error occurred in build_workflow_node_span: {e}", exc_info=True)
return None
def get_workflow_node_status(self, node_execution: WorkflowNodeExecution) -> Status:
@ -306,7 +307,7 @@ class AliyunDataTrace(BaseTraceInstance):
start_time=convert_datetime_to_nanoseconds(node_execution.created_at),
end_time=convert_datetime_to_nanoseconds(node_execution.finished_at),
attributes={
GEN_AI_SESSION_ID: trace_info.metadata.get("conversation_id", ""),
GEN_AI_SESSION_ID: trace_info.metadata.get("conversation_id") or "",
GEN_AI_SPAN_KIND: GenAISpanKind.TASK.value,
GEN_AI_FRAMEWORK: "dify",
INPUT_VALUE: json.dumps(node_execution.inputs, ensure_ascii=False),
@ -381,7 +382,7 @@ class AliyunDataTrace(BaseTraceInstance):
start_time=convert_datetime_to_nanoseconds(node_execution.created_at),
end_time=convert_datetime_to_nanoseconds(node_execution.finished_at),
attributes={
GEN_AI_SESSION_ID: trace_info.metadata.get("conversation_id", ""),
GEN_AI_SESSION_ID: trace_info.metadata.get("conversation_id") or "",
GEN_AI_SPAN_KIND: GenAISpanKind.LLM.value,
GEN_AI_FRAMEWORK: "dify",
GEN_AI_MODEL_NAME: process_data.get("model_name", ""),
@ -415,7 +416,7 @@ class AliyunDataTrace(BaseTraceInstance):
start_time=convert_datetime_to_nanoseconds(trace_info.start_time),
end_time=convert_datetime_to_nanoseconds(trace_info.end_time),
attributes={
GEN_AI_SESSION_ID: trace_info.metadata.get("conversation_id", ""),
GEN_AI_SESSION_ID: trace_info.metadata.get("conversation_id") or "",
GEN_AI_USER_ID: str(user_id),
GEN_AI_SPAN_KIND: GenAISpanKind.CHAIN.value,
GEN_AI_FRAMEWORK: "dify",

View File

@ -28,7 +28,7 @@ from core.ops.langfuse_trace.entities.langfuse_trace_entity import (
UnitEnum,
)
from core.ops.utils import filter_none_values
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.nodes.enums import NodeType
from extensions.ext_database import db
from models import EndUser, WorkflowNodeExecutionTriggeredFrom
@ -123,10 +123,10 @@ class LangFuseDataTrace(BaseTraceInstance):
service_account = self.get_service_account_with_tenant(app_id)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=service_account,
app_id=trace_info.metadata.get("app_id"),
app_id=app_id,
triggered_from=WorkflowNodeExecutionTriggeredFrom.WORKFLOW_RUN,
)

View File

@ -27,7 +27,7 @@ from core.ops.langsmith_trace.entities.langsmith_trace_entity import (
LangSmithRunUpdateModel,
)
from core.ops.utils import filter_none_values, generate_dotted_order
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionMetadataKey
from core.workflow.nodes.enums import NodeType
from extensions.ext_database import db
@ -145,10 +145,10 @@ class LangSmithDataTrace(BaseTraceInstance):
service_account = self.get_service_account_with_tenant(app_id)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=service_account,
app_id=trace_info.metadata.get("app_id"),
app_id=app_id,
triggered_from=WorkflowNodeExecutionTriggeredFrom.WORKFLOW_RUN,
)

View File

@ -21,7 +21,7 @@ from core.ops.entities.trace_entity import (
TraceTaskName,
WorkflowTraceInfo,
)
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionMetadataKey
from core.workflow.nodes.enums import NodeType
from extensions.ext_database import db
@ -160,10 +160,10 @@ class OpikDataTrace(BaseTraceInstance):
service_account = self.get_service_account_with_tenant(app_id)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=service_account,
app_id=trace_info.metadata.get("app_id"),
app_id=app_id,
triggered_from=WorkflowNodeExecutionTriggeredFrom.WORKFLOW_RUN,
)
@ -241,7 +241,7 @@ class OpikDataTrace(BaseTraceInstance):
"trace_id": opik_trace_id,
"id": prepare_opik_uuid(created_at, node_execution_id),
"parent_span_id": prepare_opik_uuid(trace_info.start_time, parent_span_id),
"name": node_type,
"name": node_name,
"type": run_type,
"start_time": created_at,
"end_time": finished_at,

View File

@ -22,7 +22,7 @@ from core.ops.entities.trace_entity import (
WorkflowTraceInfo,
)
from core.ops.weave_trace.entities.weave_trace_entity import WeaveTraceModel
from core.repositories import SQLAlchemyWorkflowNodeExecutionRepository
from core.repositories import DifyCoreRepositoryFactory
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionMetadataKey
from core.workflow.nodes.enums import NodeType
from extensions.ext_database import db
@ -144,10 +144,10 @@ class WeaveDataTrace(BaseTraceInstance):
service_account = self.get_service_account_with_tenant(app_id)
workflow_node_execution_repository = SQLAlchemyWorkflowNodeExecutionRepository(
workflow_node_execution_repository = DifyCoreRepositoryFactory.create_workflow_node_execution_repository(
session_factory=session_factory,
user=service_account,
app_id=trace_info.metadata.get("app_id"),
app_id=app_id,
triggered_from=WorkflowNodeExecutionTriggeredFrom.WORKFLOW_RUN,
)

View File

@ -36,7 +36,7 @@ class PluginInstaller(BasePluginClient):
"GET",
f"plugin/{tenant_id}/management/list",
PluginListResponse,
params={"page": 1, "page_size": 256},
params={"page": 1, "page_size": 256, "response_type": "paged"},
)
return result.list
@ -45,7 +45,7 @@ class PluginInstaller(BasePluginClient):
"GET",
f"plugin/{tenant_id}/management/list",
PluginListResponse,
params={"page": page, "page_size": page_size},
params={"page": page, "page_size": page_size, "response_type": "paged"},
)
def upload_pkg(

View File

@ -158,7 +158,7 @@ class AdvancedPromptTransform(PromptTransform):
if prompt_item.edition_type == "basic" or not prompt_item.edition_type:
if self.with_variable_tmpl:
vp = VariablePool()
vp = VariablePool.empty()
for k, v in inputs.items():
if k.startswith("#"):
vp.add(k[1:-1].split("."), v)

View File

@ -1,10 +1,11 @@
from typing import Any
from collections.abc import Sequence
from constants import UUID_NIL
from models import Message
def extract_thread_messages(messages: list[Any]):
thread_messages = []
def extract_thread_messages(messages: Sequence[Message]):
thread_messages: list[Message] = []
next_message = None
for message in messages:

View File

@ -1,3 +1,5 @@
from sqlalchemy import select
from core.prompt.utils.extract_thread_messages import extract_thread_messages
from extensions.ext_database import db
from models.model import Message
@ -8,19 +10,9 @@ def get_thread_messages_length(conversation_id: str) -> int:
Get the number of thread messages based on the parent message id.
"""
# Fetch all messages related to the conversation
query = (
db.session.query(
Message.id,
Message.parent_message_id,
Message.answer,
)
.filter(
Message.conversation_id == conversation_id,
)
.order_by(Message.created_at.desc())
)
stmt = select(Message).where(Message.conversation_id == conversation_id).order_by(Message.created_at.desc())
messages = query.all()
messages = db.session.scalars(stmt).all()
# Extract thread messages
thread_messages = extract_thread_messages(messages)

View File

@ -3,7 +3,7 @@ from concurrent.futures import ThreadPoolExecutor
from typing import Optional
from flask import Flask, current_app
from sqlalchemy.orm import load_only
from sqlalchemy.orm import Session, load_only
from configs import dify_config
from core.rag.data_post_processor.data_post_processor import DataPostProcessor
@ -144,7 +144,8 @@ class RetrievalService:
@classmethod
def _get_dataset(cls, dataset_id: str) -> Optional[Dataset]:
return db.session.query(Dataset).filter(Dataset.id == dataset_id).first()
with Session(db.engine) as session:
return session.query(Dataset).filter(Dataset.id == dataset_id).first()
@classmethod
def keyword_search(

View File

@ -4,6 +4,7 @@ from typing import Any, Optional
import tablestore # type: ignore
from pydantic import BaseModel, model_validator
from tablestore import BatchGetRowRequest, TableInBatchGetRowItem
from configs import dify_config
from core.rag.datasource.vdb.field import Field
@ -50,6 +51,29 @@ class TableStoreVector(BaseVector):
self._index_name = f"{collection_name}_idx"
self._tags_field = f"{Field.METADATA_KEY.value}_tags"
def create_collection(self, embeddings: list[list[float]], **kwargs):
dimension = len(embeddings[0])
self._create_collection(dimension)
def get_by_ids(self, ids: list[str]) -> list[Document]:
docs = []
request = BatchGetRowRequest()
columns_to_get = [Field.METADATA_KEY.value, Field.CONTENT_KEY.value]
rows_to_get = [[("id", _id)] for _id in ids]
request.add(TableInBatchGetRowItem(self._table_name, rows_to_get, columns_to_get, None, 1))
result = self._tablestore_client.batch_get_row(request)
table_result = result.get_result_by_table(self._table_name)
for item in table_result:
if item.is_ok and item.row:
kv = {k: v for k, v, t in item.row.attribute_columns}
docs.append(
Document(
page_content=kv[Field.CONTENT_KEY.value], metadata=json.loads(kv[Field.METADATA_KEY.value])
)
)
return docs
def get_type(self) -> str:
return VectorType.TABLESTORE

View File

@ -9,6 +9,7 @@ from typing import Any, Optional, Union, cast
from flask import Flask, current_app
from sqlalchemy import Float, and_, or_, text
from sqlalchemy import cast as sqlalchemy_cast
from sqlalchemy.orm import Session
from core.app.app_config.entities import (
DatasetEntity,
@ -598,7 +599,8 @@ class DatasetRetrieval:
metadata_condition: Optional[MetadataCondition] = None,
):
with flask_app.app_context():
dataset = db.session.query(Dataset).filter(Dataset.id == dataset_id).first()
with Session(db.engine) as session:
dataset = session.query(Dataset).filter(Dataset.id == dataset_id).first()
if not dataset:
return []

View File

@ -5,8 +5,11 @@ This package contains concrete implementations of the repository interfaces
defined in the core.workflow.repository package.
"""
from core.repositories.factory import DifyCoreRepositoryFactory, RepositoryImportError
from core.repositories.sqlalchemy_workflow_node_execution_repository import SQLAlchemyWorkflowNodeExecutionRepository
__all__ = [
"DifyCoreRepositoryFactory",
"RepositoryImportError",
"SQLAlchemyWorkflowNodeExecutionRepository",
]

View File

@ -0,0 +1,224 @@
"""
Repository factory for dynamically creating repository instances based on configuration.
This module provides a Django-like settings system for repository implementations,
allowing users to configure different repository backends through string paths.
"""
import importlib
import inspect
import logging
from typing import Protocol, Union
from sqlalchemy.engine import Engine
from sqlalchemy.orm import sessionmaker
from configs import dify_config
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from models import Account, EndUser
from models.enums import WorkflowRunTriggeredFrom
from models.workflow import WorkflowNodeExecutionTriggeredFrom
logger = logging.getLogger(__name__)
class RepositoryImportError(Exception):
"""Raised when a repository implementation cannot be imported or instantiated."""
pass
class DifyCoreRepositoryFactory:
"""
Factory for creating repository instances based on configuration.
This factory supports Django-like settings where repository implementations
are specified as module paths (e.g., 'module.submodule.ClassName').
"""
@staticmethod
def _import_class(class_path: str) -> type:
"""
Import a class from a module path string.
Args:
class_path: Full module path to the class (e.g., 'module.submodule.ClassName')
Returns:
The imported class
Raises:
RepositoryImportError: If the class cannot be imported
"""
try:
module_path, class_name = class_path.rsplit(".", 1)
module = importlib.import_module(module_path)
repo_class = getattr(module, class_name)
assert isinstance(repo_class, type)
return repo_class
except (ValueError, ImportError, AttributeError) as e:
raise RepositoryImportError(f"Cannot import repository class '{class_path}': {e}") from e
@staticmethod
def _validate_repository_interface(repository_class: type, expected_interface: type[Protocol]) -> None: # type: ignore
"""
Validate that a class implements the expected repository interface.
Args:
repository_class: The class to validate
expected_interface: The expected interface/protocol
Raises:
RepositoryImportError: If the class doesn't implement the interface
"""
# Check if the class has all required methods from the protocol
required_methods = [
method
for method in dir(expected_interface)
if not method.startswith("_") and callable(getattr(expected_interface, method, None))
]
missing_methods = []
for method_name in required_methods:
if not hasattr(repository_class, method_name):
missing_methods.append(method_name)
if missing_methods:
raise RepositoryImportError(
f"Repository class '{repository_class.__name__}' does not implement required methods "
f"{missing_methods} from interface '{expected_interface.__name__}'"
)
@staticmethod
def _validate_constructor_signature(repository_class: type, required_params: list[str]) -> None:
"""
Validate that a repository class constructor accepts required parameters.
Args:
repository_class: The class to validate
required_params: List of required parameter names
Raises:
RepositoryImportError: If the constructor doesn't accept required parameters
"""
try:
# MyPy may flag the line below with the following error:
#
# > Accessing "__init__" on an instance is unsound, since
# > instance.__init__ could be from an incompatible subclass.
#
# Despite this, we need to ensure that the constructor of `repository_class`
# has a compatible signature.
signature = inspect.signature(repository_class.__init__) # type: ignore[misc]
param_names = list(signature.parameters.keys())
# Remove 'self' parameter
if "self" in param_names:
param_names.remove("self")
missing_params = [param for param in required_params if param not in param_names]
if missing_params:
raise RepositoryImportError(
f"Repository class '{repository_class.__name__}' constructor does not accept required parameters: "
f"{missing_params}. Expected parameters: {required_params}"
)
except Exception as e:
raise RepositoryImportError(
f"Failed to validate constructor signature for '{repository_class.__name__}': {e}"
) from e
@classmethod
def create_workflow_execution_repository(
cls,
session_factory: Union[sessionmaker, Engine],
user: Union[Account, EndUser],
app_id: str,
triggered_from: WorkflowRunTriggeredFrom,
) -> WorkflowExecutionRepository:
"""
Create a WorkflowExecutionRepository instance based on configuration.
Args:
session_factory: SQLAlchemy sessionmaker or engine
user: Account or EndUser object
app_id: Application ID
triggered_from: Source of the execution trigger
Returns:
Configured WorkflowExecutionRepository instance
Raises:
RepositoryImportError: If the configured repository cannot be created
"""
class_path = dify_config.CORE_WORKFLOW_EXECUTION_REPOSITORY
logger.debug(f"Creating WorkflowExecutionRepository from: {class_path}")
try:
repository_class = cls._import_class(class_path)
cls._validate_repository_interface(repository_class, WorkflowExecutionRepository)
cls._validate_constructor_signature(
repository_class, ["session_factory", "user", "app_id", "triggered_from"]
)
return repository_class( # type: ignore[no-any-return]
session_factory=session_factory,
user=user,
app_id=app_id,
triggered_from=triggered_from,
)
except RepositoryImportError:
# Re-raise our custom errors as-is
raise
except Exception as e:
logger.exception("Failed to create WorkflowExecutionRepository")
raise RepositoryImportError(f"Failed to create WorkflowExecutionRepository from '{class_path}': {e}") from e
@classmethod
def create_workflow_node_execution_repository(
cls,
session_factory: Union[sessionmaker, Engine],
user: Union[Account, EndUser],
app_id: str,
triggered_from: WorkflowNodeExecutionTriggeredFrom,
) -> WorkflowNodeExecutionRepository:
"""
Create a WorkflowNodeExecutionRepository instance based on configuration.
Args:
session_factory: SQLAlchemy sessionmaker or engine
user: Account or EndUser object
app_id: Application ID
triggered_from: Source of the execution trigger
Returns:
Configured WorkflowNodeExecutionRepository instance
Raises:
RepositoryImportError: If the configured repository cannot be created
"""
class_path = dify_config.CORE_WORKFLOW_NODE_EXECUTION_REPOSITORY
logger.debug(f"Creating WorkflowNodeExecutionRepository from: {class_path}")
try:
repository_class = cls._import_class(class_path)
cls._validate_repository_interface(repository_class, WorkflowNodeExecutionRepository)
cls._validate_constructor_signature(
repository_class, ["session_factory", "user", "app_id", "triggered_from"]
)
return repository_class( # type: ignore[no-any-return]
session_factory=session_factory,
user=user,
app_id=app_id,
triggered_from=triggered_from,
)
except RepositoryImportError:
# Re-raise our custom errors as-is
raise
except Exception as e:
logger.exception("Failed to create WorkflowNodeExecutionRepository")
raise RepositoryImportError(
f"Failed to create WorkflowNodeExecutionRepository from '{class_path}': {e}"
) from e
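For context (not part of the diff): at its core the factory is a Django-settings-style dynamic import — split a dotted path, import the module, fetch the class, then validate and construct it. A self-contained sketch of that mechanism, using a stdlib class as the target so it actually runs; in the factory itself the path comes from configuration:

import importlib

def import_class(class_path: str) -> type:
    # "package.module.ClassName" -> the class object, mirroring _import_class above
    module_path, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    if not isinstance(cls, type):
        raise TypeError(f"{class_path} is not a class")
    return cls

ordered_dict_cls = import_class("collections.OrderedDict")
print(ordered_dict_cls())  # OrderedDict()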

View File

@ -39,19 +39,22 @@ class ApiToolProviderController(ToolProviderController):
type=ProviderConfig.Type.SELECT,
options=[
ProviderConfig.Option(value="none", label=I18nObject(en_US="None", zh_Hans="")),
ProviderConfig.Option(value="api_key", label=I18nObject(en_US="api_key", zh_Hans="api_key")),
ProviderConfig.Option(value="api_key_header", label=I18nObject(en_US="Header", zh_Hans="请求头")),
ProviderConfig.Option(
value="api_key_query", label=I18nObject(en_US="Query Param", zh_Hans="查询参数")
),
],
default="none",
help=I18nObject(en_US="The auth type of the api provider", zh_Hans="api provider 的认证类型"),
)
]
if auth_type == ApiProviderAuthType.API_KEY:
if auth_type == ApiProviderAuthType.API_KEY_HEADER:
credentials_schema = [
*credentials_schema,
ProviderConfig(
name="api_key_header",
required=False,
default="api_key",
default="Authorization",
type=ProviderConfig.Type.TEXT_INPUT,
help=I18nObject(en_US="The header name of the api key", zh_Hans="携带 api key 的 header 名称"),
),
@ -74,6 +77,25 @@ class ApiToolProviderController(ToolProviderController):
],
),
]
elif auth_type == ApiProviderAuthType.API_KEY_QUERY:
credentials_schema = [
*credentials_schema,
ProviderConfig(
name="api_key_query_param",
required=False,
default="key",
type=ProviderConfig.Type.TEXT_INPUT,
help=I18nObject(
en_US="The query parameter name of the api key", zh_Hans="携带 api key 的查询参数名称"
),
),
ProviderConfig(
name="api_key_value",
required=True,
type=ProviderConfig.Type.SECRET_INPUT,
help=I18nObject(en_US="The api key", zh_Hans="api key 的值"),
),
]
elif auth_type == ApiProviderAuthType.NONE:
pass

View File

@ -78,8 +78,8 @@ class ApiTool(Tool):
if "auth_type" not in credentials:
raise ToolProviderCredentialValidationError("Missing auth_type")
if credentials["auth_type"] == "api_key":
api_key_header = "api_key"
if credentials["auth_type"] in ("api_key_header", "api_key"): # backward compatibility:
api_key_header = "Authorization"
if "api_key_header" in credentials:
api_key_header = credentials["api_key_header"]
@ -100,6 +100,11 @@ class ApiTool(Tool):
headers[api_key_header] = credentials["api_key_value"]
elif credentials["auth_type"] == "api_key_query":
# For query parameter authentication, we don't add anything to headers
# The query parameter will be added in do_http_request method
pass
needed_parameters = [parameter for parameter in (self.api_bundle.parameters or []) if parameter.required]
for parameter in needed_parameters:
if parameter.required and parameter.name not in parameters:
@ -154,6 +159,15 @@ class ApiTool(Tool):
cookies = {}
files = []
# Add API key to query parameters if auth_type is api_key_query
if self.runtime and self.runtime.credentials:
credentials = self.runtime.credentials
if credentials.get("auth_type") == "api_key_query":
api_key_query_param = credentials.get("api_key_query_param", "key")
api_key_value = credentials.get("api_key_value")
if api_key_value:
params[api_key_query_param] = api_key_value
# check parameters
for parameter in self.api_bundle.openapi.get("parameters", []):
value = self.get_parameter_value(parameter, parameters)
@ -213,7 +227,8 @@ class ApiTool(Tool):
elif "default" in property:
body[name] = property["default"]
else:
body[name] = None
# omit optional parameters that weren't provided, instead of setting them to None
pass
break
# replace path parameters

View File

@ -96,7 +96,8 @@ class ApiProviderAuthType(Enum):
"""
NONE = "none"
API_KEY = "api_key"
API_KEY_HEADER = "api_key_header"
API_KEY_QUERY = "api_key_query"
@classmethod
def value_of(cls, value: str) -> "ApiProviderAuthType":

View File

@ -9,9 +9,10 @@ from configs import dify_config
def sign_tool_file(tool_file_id: str, extension: str) -> str:
"""
sign file to get a temporary url
sign file to get a temporary url for plugin access
"""
base_url = dify_config.FILES_URL
# Use internal URL for plugin/tool file access in Docker environments
base_url = dify_config.INTERNAL_FILES_URL or dify_config.FILES_URL
file_preview_url = f"{base_url}/files/tools/{tool_file_id}{extension}"
timestamp = str(int(time.time()))

View File

@ -35,9 +35,10 @@ class ToolFileManager:
@staticmethod
def sign_file(tool_file_id: str, extension: str) -> str:
"""
sign file to get a temporary url
sign file to get a temporary url for plugin access
"""
base_url = dify_config.FILES_URL
# Use internal URL for plugin/tool file access in Docker environments
base_url = dify_config.INTERNAL_FILES_URL or dify_config.FILES_URL
file_preview_url = f"{base_url}/files/tools/{tool_file_id}{extension}"
timestamp = str(int(time.time()))

View File

@ -684,9 +684,16 @@ class ToolManager:
if provider is None:
raise ToolProviderNotFoundError(f"api provider {provider_id} not found")
auth_type = ApiProviderAuthType.NONE
provider_auth_type = provider.credentials.get("auth_type")
if provider_auth_type in ("api_key_header", "api_key"): # backward compatibility
auth_type = ApiProviderAuthType.API_KEY_HEADER
elif provider_auth_type == "api_key_query":
auth_type = ApiProviderAuthType.API_KEY_QUERY
controller = ApiToolProviderController.from_db(
provider,
ApiProviderAuthType.API_KEY if provider.credentials["auth_type"] == "api_key" else ApiProviderAuthType.NONE,
auth_type,
)
controller.load_bundled_tools(provider.tools)
@ -745,9 +752,16 @@ class ToolManager:
credentials = {}
# package tool provider controller
auth_type = ApiProviderAuthType.NONE
credentials_auth_type = credentials.get("auth_type")
if credentials_auth_type in ("api_key_header", "api_key"): # backward compatibility
auth_type = ApiProviderAuthType.API_KEY_HEADER
elif credentials_auth_type == "api_key_query":
auth_type = ApiProviderAuthType.API_KEY_QUERY
controller = ApiToolProviderController.from_db(
provider_obj,
ApiProviderAuthType.API_KEY if credentials["auth_type"] == "api_key" else ApiProviderAuthType.NONE,
auth_type,
)
# init tool configuration
tool_configuration = ProviderConfigEncrypter(

View File

@ -1,5 +1,4 @@
import re
import uuid
from json import dumps as json_dumps
from json import loads as json_loads
from json.decoder import JSONDecodeError
@ -154,7 +153,7 @@ class ApiBasedToolSchemaParser:
# remove special characters like / to ensure the operation id is valid ^[a-zA-Z0-9_-]{1,64}$
path = re.sub(r"[^a-zA-Z0-9_-]", "", path)
if not path:
path = str(uuid.uuid4())
path = "<root>"
interface["operation"]["operationId"] = f"{path}_{interface['method']}"

View File

@ -1,9 +1,9 @@
import json
import sys
from collections.abc import Mapping, Sequence
from typing import Any
from typing import Annotated, Any, TypeAlias
from pydantic import BaseModel, ConfigDict, field_validator
from pydantic import BaseModel, ConfigDict, Discriminator, Tag, field_validator
from core.file import File
@ -11,6 +11,11 @@ from .types import SegmentType
class Segment(BaseModel):
"""Segment is runtime type used during the execution of workflow.
Note: this class is abstract, you should use subclasses of this class instead.
"""
model_config = ConfigDict(frozen=True)
value_type: SegmentType
@ -73,7 +78,7 @@ class StringSegment(Segment):
class FloatSegment(Segment):
value_type: SegmentType = SegmentType.NUMBER
value_type: SegmentType = SegmentType.FLOAT
value: float
# NOTE(QuantumGhost): seems that the equality for FloatSegment with `NaN` value has some problems.
# The following tests cannot pass.
@ -92,7 +97,7 @@ class FloatSegment(Segment):
class IntegerSegment(Segment):
value_type: SegmentType = SegmentType.NUMBER
value_type: SegmentType = SegmentType.INTEGER
value: int
@ -181,3 +186,46 @@ class ArrayFileSegment(ArraySegment):
@property
def text(self) -> str:
return ""
def get_segment_discriminator(v: Any) -> SegmentType | None:
if isinstance(v, Segment):
return v.value_type
elif isinstance(v, dict):
value_type = v.get("value_type")
if value_type is None:
return None
try:
seg_type = SegmentType(value_type)
except ValueError:
return None
return seg_type
else:
# return None if the discriminator value isn't found
return None
# The `SegmentUnion` type is used to enable serialization and deserialization with Pydantic.

# Use `Segment` for type hinting when serialization is not required.
#
# Note:
# - All variants in `SegmentUnion` must inherit from the `Segment` class.
# - The union must include all non-abstract subclasses of `Segment`, except:
# - `SegmentGroup`, which is not added to the variable pool.
# - `Variable` and its subclasses, which are handled by `VariableUnion`.
SegmentUnion: TypeAlias = Annotated[
(
Annotated[NoneSegment, Tag(SegmentType.NONE)]
| Annotated[StringSegment, Tag(SegmentType.STRING)]
| Annotated[FloatSegment, Tag(SegmentType.FLOAT)]
| Annotated[IntegerSegment, Tag(SegmentType.INTEGER)]
| Annotated[ObjectSegment, Tag(SegmentType.OBJECT)]
| Annotated[FileSegment, Tag(SegmentType.FILE)]
| Annotated[ArrayAnySegment, Tag(SegmentType.ARRAY_ANY)]
| Annotated[ArrayStringSegment, Tag(SegmentType.ARRAY_STRING)]
| Annotated[ArrayNumberSegment, Tag(SegmentType.ARRAY_NUMBER)]
| Annotated[ArrayObjectSegment, Tag(SegmentType.ARRAY_OBJECT)]
| Annotated[ArrayFileSegment, Tag(SegmentType.ARRAY_FILE)]
),
Discriminator(get_segment_discriminator),
]
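For context (not part of the diff): the discriminated union lets Pydantic pick the correct Segment subclass from the value_type tag during deserialization. A rough sketch of the same idea with toy classes (not the real Segment hierarchy), using a callable Discriminator in the same way:

from typing import Annotated, Any, Literal, TypeAlias, Union

from pydantic import BaseModel, Discriminator, Tag, TypeAdapter

class IntSeg(BaseModel):
    value_type: Literal["integer"] = "integer"
    value: int

class StrSeg(BaseModel):
    value_type: Literal["string"] = "string"
    value: str

def discriminate(v: Any) -> str | None:
    # Mirror get_segment_discriminator: read the tag from an instance or a dict.
    if isinstance(v, BaseModel):
        return getattr(v, "value_type", None)
    if isinstance(v, dict):
        return v.get("value_type")
    return None

SegUnion: TypeAlias = Annotated[
    Union[Annotated[IntSeg, Tag("integer")], Annotated[StrSeg, Tag("string")]],
    Discriminator(discriminate),
]

adapter = TypeAdapter(SegUnion)
seg = adapter.validate_python({"value_type": "integer", "value": 42})
print(type(seg).__name__, adapter.dump_python(seg))  # IntSeg {'value_type': 'integer', 'value': 42}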

View File

@ -1,8 +1,27 @@
from collections.abc import Mapping
from enum import StrEnum
from typing import Any, Optional
from core.file.models import File
class ArrayValidation(StrEnum):
"""Strategy for validating array elements"""
# Skip element validation (only check array container)
NONE = "none"
# Validate the first element (if array is non-empty)
FIRST = "first"
# Validate all elements in the array.
ALL = "all"
class SegmentType(StrEnum):
NUMBER = "number"
INTEGER = "integer"
FLOAT = "float"
STRING = "string"
OBJECT = "object"
SECRET = "secret"
@ -19,16 +38,141 @@ class SegmentType(StrEnum):
GROUP = "group"
def is_array_type(self):
def is_array_type(self) -> bool:
return self in _ARRAY_TYPES
@classmethod
def infer_segment_type(cls, value: Any) -> Optional["SegmentType"]:
"""
Attempt to infer the `SegmentType` based on the Python type of the `value` parameter.
Returns `None` if no appropriate `SegmentType` can be determined for the given `value`.
For example, this may occur if the input is a generic Python object of type `object`.
"""
if isinstance(value, list):
elem_types: set[SegmentType] = set()
for i in value:
segment_type = cls.infer_segment_type(i)
if segment_type is None:
return None
elem_types.add(segment_type)
if len(elem_types) != 1:
if elem_types.issubset(_NUMERICAL_TYPES):
return SegmentType.ARRAY_NUMBER
return SegmentType.ARRAY_ANY
elif all(i.is_array_type() for i in elem_types):
return SegmentType.ARRAY_ANY
match elem_types.pop():
case SegmentType.STRING:
return SegmentType.ARRAY_STRING
case SegmentType.NUMBER | SegmentType.INTEGER | SegmentType.FLOAT:
return SegmentType.ARRAY_NUMBER
case SegmentType.OBJECT:
return SegmentType.ARRAY_OBJECT
case SegmentType.FILE:
return SegmentType.ARRAY_FILE
case SegmentType.NONE:
return SegmentType.ARRAY_ANY
case _:
# This should be unreachable.
raise ValueError(f"not supported value {value}")
if value is None:
return SegmentType.NONE
elif isinstance(value, int) and not isinstance(value, bool):
return SegmentType.INTEGER
elif isinstance(value, float):
return SegmentType.FLOAT
elif isinstance(value, str):
return SegmentType.STRING
elif isinstance(value, dict):
return SegmentType.OBJECT
elif isinstance(value, File):
return SegmentType.FILE
else:
return None
def _validate_array(self, value: Any, array_validation: ArrayValidation) -> bool:
if not isinstance(value, list):
return False
# Skip element validation if array is empty
if len(value) == 0:
return True
if self == SegmentType.ARRAY_ANY:
return True
element_type = _ARRAY_ELEMENT_TYPES_MAPPING[self]
if array_validation == ArrayValidation.NONE:
return True
elif array_validation == ArrayValidation.FIRST:
return element_type.is_valid(value[0])
else:
return all(element_type.is_valid(i, array_validation=ArrayValidation.NONE) for i in value)
def is_valid(self, value: Any, array_validation: ArrayValidation = ArrayValidation.FIRST) -> bool:
"""
Check if a value matches the segment type.
Users of `SegmentType` should call this method, instead of using
`isinstance` manually.
Args:
value: The value to validate
array_validation: Validation strategy for array types (ignored for non-array types)
Returns:
True if the value matches the type under the given validation strategy
"""
if self.is_array_type():
return self._validate_array(value, array_validation)
elif self == SegmentType.NUMBER:
return isinstance(value, (int, float))
elif self == SegmentType.STRING:
return isinstance(value, str)
elif self == SegmentType.OBJECT:
return isinstance(value, dict)
elif self == SegmentType.SECRET:
return isinstance(value, str)
elif self == SegmentType.FILE:
return isinstance(value, File)
elif self == SegmentType.NONE:
return value is None
else:
raise AssertionError("this statement should be unreachable.")
def exposed_type(self) -> "SegmentType":
"""Returns the type exposed to the frontend.
The frontend treats `INTEGER` and `FLOAT` as `NUMBER`, so these are returned as `NUMBER` here.
"""
if self in (SegmentType.INTEGER, SegmentType.FLOAT):
return SegmentType.NUMBER
return self
_ARRAY_ELEMENT_TYPES_MAPPING: Mapping[SegmentType, SegmentType] = {
# ARRAY_ANY does not have a corresponding element type.
SegmentType.ARRAY_STRING: SegmentType.STRING,
SegmentType.ARRAY_NUMBER: SegmentType.NUMBER,
SegmentType.ARRAY_OBJECT: SegmentType.OBJECT,
SegmentType.ARRAY_FILE: SegmentType.FILE,
}
_ARRAY_TYPES = frozenset(
[
SegmentType.ARRAY_ANY,
SegmentType.ARRAY_STRING,
SegmentType.ARRAY_NUMBER,
SegmentType.ARRAY_OBJECT,
SegmentType.ARRAY_FILE,
]
)
_ARRAY_TYPES = frozenset(
list(_ARRAY_ELEMENT_TYPES_MAPPING.keys())
+ [
SegmentType.ARRAY_ANY,
]
)
_NUMERICAL_TYPES = frozenset(
[
SegmentType.NUMBER,
SegmentType.INTEGER,
SegmentType.FLOAT,
]
)
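For context (not part of the diff): infer_segment_type maps plain Python values to tags (with list handling on top), and is_valid checks a value against a tag under a configurable array strategy. A condensed, standalone re-implementation of just the scalar dispatch order above, to make the rules concrete; this is an illustration, not the Dify class:

from enum import StrEnum
from typing import Any, Optional

class SegType(StrEnum):
    NONE = "none"
    INTEGER = "integer"
    FLOAT = "float"
    STRING = "string"
    OBJECT = "object"

def infer_scalar(value: Any) -> Optional[SegType]:
    # Same ordering as infer_segment_type: None first, then int (bools excluded),
    # float, str, dict; anything else is "unknown" and yields None.
    if value is None:
        return SegType.NONE
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return SegType.INTEGER
    if isinstance(value, float):
        return SegType.FLOAT
    if isinstance(value, str):
        return SegType.STRING
    if isinstance(value, dict):
        return SegType.OBJECT
    return None

print(infer_scalar(3), infer_scalar(3.0), infer_scalar(True))  # integer float None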

View File

@ -1,8 +1,8 @@
from collections.abc import Sequence
from typing import cast
from typing import Annotated, TypeAlias, cast
from uuid import uuid4
from pydantic import Field
from pydantic import Discriminator, Field, Tag
from core.helper import encrypter
@ -20,6 +20,7 @@ from .segments import (
ObjectSegment,
Segment,
StringSegment,
get_segment_discriminator,
)
from .types import SegmentType
@ -27,6 +28,10 @@ from .types import SegmentType
class Variable(Segment):
"""
A variable is a segment that has a name.
It is mainly used to store segments and their selector in VariablePool.
Note: this class is abstract, you should use subclasses of this class instead.
"""
id: str = Field(
@ -93,3 +98,28 @@ class FileVariable(FileSegment, Variable):
class ArrayFileVariable(ArrayFileSegment, ArrayVariable):
pass
# The `VariableUnion` type is used to enable serialization and deserialization with Pydantic.
# Use `Variable` for type hinting when serialization is not required.
#
# Note:
# - All variants in `VariableUnion` must inherit from the `Variable` class.
# - The union must include all non-abstract subclasses of `Variable`.
VariableUnion: TypeAlias = Annotated[
(
Annotated[NoneVariable, Tag(SegmentType.NONE)]
| Annotated[StringVariable, Tag(SegmentType.STRING)]
| Annotated[FloatVariable, Tag(SegmentType.FLOAT)]
| Annotated[IntegerVariable, Tag(SegmentType.INTEGER)]
| Annotated[ObjectVariable, Tag(SegmentType.OBJECT)]
| Annotated[FileVariable, Tag(SegmentType.FILE)]
| Annotated[ArrayAnyVariable, Tag(SegmentType.ARRAY_ANY)]
| Annotated[ArrayStringVariable, Tag(SegmentType.ARRAY_STRING)]
| Annotated[ArrayNumberVariable, Tag(SegmentType.ARRAY_NUMBER)]
| Annotated[ArrayObjectVariable, Tag(SegmentType.ARRAY_OBJECT)]
| Annotated[ArrayFileVariable, Tag(SegmentType.ARRAY_FILE)]
| Annotated[SecretVariable, Tag(SegmentType.SECRET)]
),
Discriminator(get_segment_discriminator),
]

View File

@ -1,7 +1,7 @@
import re
from collections import defaultdict
from collections.abc import Mapping, Sequence
from typing import Any, Union
from typing import Annotated, Any, Union, cast
from pydantic import BaseModel, Field
@ -9,8 +9,9 @@ from core.file import File, FileAttribute, file_manager
from core.variables import Segment, SegmentGroup, Variable
from core.variables.consts import MIN_SELECTORS_LENGTH
from core.variables.segments import FileSegment, NoneSegment
from core.variables.variables import VariableUnion
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID, ENVIRONMENT_VARIABLE_NODE_ID, SYSTEM_VARIABLE_NODE_ID
from core.workflow.enums import SystemVariableKey
from core.workflow.system_variable import SystemVariable
from factories import variable_factory
VariableValue = Union[str, int, float, dict, list, File]
@ -23,31 +24,31 @@ class VariablePool(BaseModel):
# The first element of the selector is the node id, it's the first-level key in the dictionary.
# Other elements of the selector are the keys in the second-level dictionary. To get the key, we hash the
# elements of the selector except the first one.
variable_dictionary: dict[str, dict[int, Segment]] = Field(
variable_dictionary: defaultdict[str, Annotated[dict[int, VariableUnion], Field(default_factory=dict)]] = Field(
description="Variables mapping",
default=defaultdict(dict),
)
# TODO: This user inputs is not used for pool.
# The `user_inputs` is used only when constructing the inputs for the `StartNode`. It's not used elsewhere.
user_inputs: Mapping[str, Any] = Field(
description="User inputs",
default_factory=dict,
)
system_variables: Mapping[SystemVariableKey, Any] = Field(
system_variables: SystemVariable = Field(
description="System variables",
default_factory=dict,
)
environment_variables: Sequence[Variable] = Field(
environment_variables: Sequence[VariableUnion] = Field(
description="Environment variables.",
default_factory=list,
)
conversation_variables: Sequence[Variable] = Field(
conversation_variables: Sequence[VariableUnion] = Field(
description="Conversation variables.",
default_factory=list,
)
def model_post_init(self, context: Any, /) -> None:
for key, value in self.system_variables.items():
self.add((SYSTEM_VARIABLE_NODE_ID, key.value), value)
# Create a mapping from field names to SystemVariableKey enum values
self._add_system_variables(self.system_variables)
# Add environment variables to the variable pool
for var in self.environment_variables:
self.add((ENVIRONMENT_VARIABLE_NODE_ID, var.name), var)
@ -83,8 +84,22 @@ class VariablePool(BaseModel):
segment = variable_factory.build_segment(value)
variable = variable_factory.segment_to_variable(segment=segment, selector=selector)
hash_key = hash(tuple(selector[1:]))
self.variable_dictionary[selector[0]][hash_key] = variable
key, hash_key = self._selector_to_keys(selector)
# Based on the definition of `VariableUnion`,
# `list[Variable]` can be safely used as `list[VariableUnion]` since they are compatible.
self.variable_dictionary[key][hash_key] = cast(VariableUnion, variable)
@classmethod
def _selector_to_keys(cls, selector: Sequence[str]) -> tuple[str, int]:
return selector[0], hash(tuple(selector[1:]))
def _has(self, selector: Sequence[str]) -> bool:
key, hash_key = self._selector_to_keys(selector)
if key not in self.variable_dictionary:
return False
if hash_key not in self.variable_dictionary[key]:
return False
return True
def get(self, selector: Sequence[str], /) -> Segment | None:
"""
@ -102,8 +117,8 @@ class VariablePool(BaseModel):
if len(selector) < MIN_SELECTORS_LENGTH:
return None
hash_key = hash(tuple(selector[1:]))
value = self.variable_dictionary[selector[0]].get(hash_key)
key, hash_key = self._selector_to_keys(selector)
value: Segment | None = self.variable_dictionary[key].get(hash_key)
if value is None:
selector, attr = selector[:-1], selector[-1]
@ -136,8 +151,9 @@ class VariablePool(BaseModel):
if len(selector) == 1:
self.variable_dictionary[selector[0]] = {}
return
key, hash_key = self._selector_to_keys(selector)
hash_key = hash(tuple(selector[1:]))
self.variable_dictionary[selector[0]].pop(hash_key, None)
self.variable_dictionary[key].pop(hash_key, None)
def convert_template(self, template: str, /):
parts = VARIABLE_PATTERN.split(template)
@ -154,3 +170,20 @@ class VariablePool(BaseModel):
if isinstance(segment, FileSegment):
return segment
return None
def _add_system_variables(self, system_variable: SystemVariable):
sys_var_mapping = system_variable.to_dict()
for key, value in sys_var_mapping.items():
if value is None:
continue
selector = (SYSTEM_VARIABLE_NODE_ID, key)
# If the system variable already exists, do not add it again.
# This ensures that we can keep the id of the system variables intact.
if self._has(selector):
continue
self.add(selector, value) # type: ignore
@classmethod
def empty(cls) -> "VariablePool":
"""Create an empty variable pool."""
return cls(system_variables=SystemVariable.empty())
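For context (not part of the diff): the pool keys each variable by its node id plus a hash of the rest of the selector, skips None-valued system variables, and gains an empty() constructor. Since the real classes aren't importable here, a compact dictionary sketch of the selector-to-keys scheme (illustration only):

from collections import defaultdict
from collections.abc import Sequence
from typing import Any

# selector = [node_id, name, ...]; the first element is the outer key,
# the remaining elements are hashed into the inner key (as in _selector_to_keys).
pool: defaultdict[str, dict[int, Any]] = defaultdict(dict)

def selector_to_keys(selector: Sequence[str]) -> tuple[str, int]:
    return selector[0], hash(tuple(selector[1:]))

def add(selector: Sequence[str], value: Any) -> None:
    key, hash_key = selector_to_keys(selector)
    pool[key][hash_key] = value

def get(selector: Sequence[str]) -> Any | None:
    key, hash_key = selector_to_keys(selector)
    return pool[key].get(hash_key)

add(("sys", "user_id"), "u-123")
print(get(("sys", "user_id")))  # -> u-123
print(get(("sys", "missing")))  # -> None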

View File

@ -17,8 +17,12 @@ class GraphRuntimeState(BaseModel):
"""total tokens"""
llm_usage: LLMUsage = LLMUsage.empty_usage()
"""llm usage info"""
# The `outputs` field stores the final output values generated by executing workflows or chatflows.
#
# Note: Since the type of this field is `dict[str, Any]`, its values may not remain consistent
# after a serialization and deserialization round trip.
outputs: dict[str, Any] = {}
"""outputs"""
node_run_steps: int = 0
"""node run steps"""

View File

@ -521,18 +521,52 @@ class IterationNode(BaseNode[IterationNodeData]):
)
return
elif self.node_data.error_handle_mode == ErrorHandleMode.TERMINATED:
yield IterationRunFailedEvent(
iteration_id=self.id,
iteration_node_id=self.node_id,
iteration_node_type=self.node_type,
iteration_node_data=self.node_data,
start_at=start_at,
inputs=inputs,
outputs={"output": None},
steps=len(iterator_list_value),
metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
error=event.error,
yield NodeInIterationFailedEvent(
**metadata_event.model_dump(),
)
outputs[current_index] = None
# clean nodes resources
for node_id in iteration_graph.node_ids:
variable_pool.remove([node_id])
# iteration run failed
if self.node_data.is_parallel:
yield IterationRunFailedEvent(
iteration_id=self.id,
iteration_node_id=self.node_id,
iteration_node_type=self.node_type,
iteration_node_data=self.node_data,
parallel_mode_run_id=parallel_mode_run_id,
start_at=start_at,
inputs=inputs,
outputs={"output": outputs},
steps=len(iterator_list_value),
metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
error=event.error,
)
else:
yield IterationRunFailedEvent(
iteration_id=self.id,
iteration_node_id=self.node_id,
iteration_node_type=self.node_type,
iteration_node_data=self.node_data,
start_at=start_at,
inputs=inputs,
outputs={"output": outputs},
steps=len(iterator_list_value),
metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
error=event.error,
)
# stop the iterator
yield RunCompletedEvent(
run_result=NodeRunResult(
status=WorkflowNodeExecutionStatus.FAILED,
error=event.error,
)
)
return
yield metadata_event
current_output_segment = variable_pool.get(self.node_data.output_selector)

View File

@ -144,6 +144,8 @@ class KnowledgeRetrievalNode(LLMNode):
error=str(e),
error_type=type(e).__name__,
)
finally:
db.session.close()
def _fetch_dataset_retriever(self, node_data: KnowledgeRetrievalNodeData, query: str) -> list[dict[str, Any]]:
available_datasets = []
@ -171,6 +173,9 @@ class KnowledgeRetrievalNode(LLMNode):
.all()
)
# avoid blocking at retrieval
db.session.close()
for dataset in results:
# pass if dataset is not available
if not dataset:

View File

@ -1,11 +1,29 @@
from collections.abc import Mapping
from typing import Any, Literal, Optional
from typing import Annotated, Any, Literal, Optional
from pydantic import BaseModel, Field
from pydantic import AfterValidator, BaseModel, Field
from core.variables.types import SegmentType
from core.workflow.nodes.base import BaseLoopNodeData, BaseLoopState, BaseNodeData
from core.workflow.utils.condition.entities import Condition
_VALID_VAR_TYPE = frozenset(
[
SegmentType.STRING,
SegmentType.NUMBER,
SegmentType.OBJECT,
SegmentType.ARRAY_STRING,
SegmentType.ARRAY_NUMBER,
SegmentType.ARRAY_OBJECT,
]
)
def _is_valid_var_type(seg_type: SegmentType) -> SegmentType:
if seg_type not in _VALID_VAR_TYPE:
raise ValueError(...)
return seg_type
class LoopVariableData(BaseModel):
"""
@ -13,7 +31,7 @@ class LoopVariableData(BaseModel):
"""
label: str
var_type: Literal["string", "number", "object", "array[string]", "array[number]", "array[object]"]
var_type: Annotated[SegmentType, AfterValidator(_is_valid_var_type)]
value_type: Literal["variable", "constant"]
value: Optional[Any | list[str]] = None
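For context (not part of the diff): var_type is now a SegmentType constrained by an AfterValidator instead of a string Literal. A small generic example of the AfterValidator pattern with a toy enum and model (not the Dify classes):

from enum import StrEnum
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError

class Kind(StrEnum):
    STRING = "string"
    NUMBER = "number"
    SECRET = "secret"

_ALLOWED = frozenset({Kind.STRING, Kind.NUMBER})

def _check_kind(kind: Kind) -> Kind:
    # Runs after normal enum validation; rejects values outside the allowed subset.
    if kind not in _ALLOWED:
        raise ValueError(f"unsupported kind: {kind}")
    return kind

class LoopVar(BaseModel):
    var_type: Annotated[Kind, AfterValidator(_check_kind)]

print(LoopVar(var_type="number").var_type)  # number
try:
    LoopVar(var_type="secret")
except ValidationError as e:
    print("rejected:", e.error_count(), "error")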

View File

@ -7,14 +7,9 @@ from typing import TYPE_CHECKING, Any, Literal, cast
from configs import dify_config
from core.variables import (
ArrayNumberSegment,
ArrayObjectSegment,
ArrayStringSegment,
IntegerSegment,
ObjectSegment,
Segment,
SegmentType,
StringSegment,
)
from core.workflow.entities.node_entities import NodeRunResult
from core.workflow.entities.workflow_node_execution import WorkflowNodeExecutionMetadataKey, WorkflowNodeExecutionStatus
@ -39,6 +34,7 @@ from core.workflow.nodes.enums import NodeType
from core.workflow.nodes.event import NodeEvent, RunCompletedEvent
from core.workflow.nodes.loop.entities import LoopNodeData
from core.workflow.utils.condition.processor import ConditionProcessor
from factories.variable_factory import TypeMismatchError, build_segment_with_type
if TYPE_CHECKING:
from core.workflow.entities.variable_pool import VariablePool
@ -505,23 +501,21 @@ class LoopNode(BaseNode[LoopNodeData]):
return variable_mapping
@staticmethod
def _get_segment_for_constant(var_type: str, value: Any) -> Segment:
def _get_segment_for_constant(var_type: SegmentType, value: Any) -> Segment:
"""Get the appropriate segment type for a constant value."""
segment_mapping: dict[str, tuple[type[Segment], SegmentType]] = {
"string": (StringSegment, SegmentType.STRING),
"number": (IntegerSegment, SegmentType.NUMBER),
"object": (ObjectSegment, SegmentType.OBJECT),
"array[string]": (ArrayStringSegment, SegmentType.ARRAY_STRING),
"array[number]": (ArrayNumberSegment, SegmentType.ARRAY_NUMBER),
"array[object]": (ArrayObjectSegment, SegmentType.ARRAY_OBJECT),
}
if var_type in ["array[string]", "array[number]", "array[object]"]:
if value:
if value and isinstance(value, str):
value = json.loads(value)
else:
value = []
segment_info = segment_mapping.get(var_type)
if not segment_info:
raise ValueError(f"Invalid variable type: {var_type}")
segment_class, value_type = segment_info
return segment_class(value=value, value_type=value_type)
try:
return build_segment_with_type(var_type, value)
except TypeMismatchError as type_exc:
# Attempt to parse the value as a JSON-encoded string, if applicable.
if not isinstance(value, str):
raise
try:
value = json.loads(value)
except ValueError:
raise type_exc
return build_segment_with_type(var_type, value)
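For context (not part of the diff): constant handling now delegates to build_segment_with_type and, on a TypeMismatchError, retries after JSON-decoding string input. A generic sketch of that "try the typed build, fall back to json.loads for strings" shape; the build function and error type below are stand-ins, not Dify's:

import json
from typing import Any

class TypeMismatch(Exception):
    pass

def build_typed(expected: str, value: Any) -> Any:
    # Stand-in for build_segment_with_type: only accepts a list for "array[number]".
    if expected == "array[number]" and not isinstance(value, list):
        raise TypeMismatch(f"expected list, got {type(value).__name__}")
    return value

def constant_to_value(expected: str, value: Any) -> Any:
    try:
        return build_typed(expected, value)
    except TypeMismatch:
        if not isinstance(value, str):
            raise
        # The UI may hand arrays over as JSON strings; decode and retry once.
        return build_typed(expected, json.loads(value))

print(constant_to_value("array[number]", "[1, 2, 3]"))  # -> [1, 2, 3]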

View File

@ -16,7 +16,7 @@ class StartNode(BaseNode[StartNodeData]):
def _run(self) -> NodeRunResult:
node_inputs = dict(self.graph_runtime_state.variable_pool.user_inputs)
system_inputs = self.graph_runtime_state.variable_pool.system_variables
system_inputs = self.graph_runtime_state.variable_pool.system_variables.to_dict()
# TODO: System variables should be directly accessible, no need for special handling
# Set system variables as node outputs.

View File

@ -130,6 +130,7 @@ class VariableAssignerNode(BaseNode[VariableAssignerData]):
def get_zero_value(t: SegmentType):
# TODO(QuantumGhost): this should be a method of `SegmentType`.
match t:
case SegmentType.ARRAY_OBJECT | SegmentType.ARRAY_STRING | SegmentType.ARRAY_NUMBER:
return variable_factory.build_segment([])
@ -137,6 +138,10 @@ def get_zero_value(t: SegmentType):
return variable_factory.build_segment({})
case SegmentType.STRING:
return variable_factory.build_segment("")
case SegmentType.INTEGER:
return variable_factory.build_segment(0)
case SegmentType.FLOAT:
return variable_factory.build_segment(0.0)
case SegmentType.NUMBER:
return variable_factory.build_segment(0)
case _:

View File

@ -1,5 +1,6 @@
from core.variables import SegmentType
# Note: This mapping is duplicated with `get_zero_value`. Consider refactoring to avoid redundancy.
EMPTY_VALUE_MAPPING = {
SegmentType.STRING: "",
SegmentType.NUMBER: 0,

View File

@ -10,10 +10,16 @@ def is_operation_supported(*, variable_type: SegmentType, operation: Operation):
case Operation.OVER_WRITE | Operation.CLEAR:
return True
case Operation.SET:
return variable_type in {SegmentType.OBJECT, SegmentType.STRING, SegmentType.NUMBER}
return variable_type in {
SegmentType.OBJECT,
SegmentType.STRING,
SegmentType.NUMBER,
SegmentType.INTEGER,
SegmentType.FLOAT,
}
case Operation.ADD | Operation.SUBTRACT | Operation.MULTIPLY | Operation.DIVIDE:
# Only number variable can be added, subtracted, multiplied or divided
return variable_type == SegmentType.NUMBER
return variable_type in {SegmentType.NUMBER, SegmentType.INTEGER, SegmentType.FLOAT}
case Operation.APPEND | Operation.EXTEND:
# Only array variable can be appended or extended
return variable_type in {
@ -46,7 +52,7 @@ def is_constant_input_supported(*, variable_type: SegmentType, operation: Operat
match variable_type:
case SegmentType.STRING | SegmentType.OBJECT:
return operation in {Operation.OVER_WRITE, Operation.SET}
case SegmentType.NUMBER:
case SegmentType.NUMBER | SegmentType.INTEGER | SegmentType.FLOAT:
return operation in {
Operation.OVER_WRITE,
Operation.SET,
@ -66,7 +72,7 @@ def is_input_value_valid(*, variable_type: SegmentType, operation: Operation, va
case SegmentType.STRING:
return isinstance(value, str)
case SegmentType.NUMBER:
case SegmentType.NUMBER | SegmentType.INTEGER | SegmentType.FLOAT:
if not isinstance(value, int | float):
return False
if operation == Operation.DIVIDE and value == 0:

View File

@ -0,0 +1,89 @@
from collections.abc import Sequence
from typing import Any
from pydantic import AliasChoices, BaseModel, ConfigDict, Field, model_validator
from core.file.models import File
from core.workflow.enums import SystemVariableKey
class SystemVariable(BaseModel):
"""A model for managing system variables.
Fields with a value of `None` are treated as absent and will not be included
in the variable pool.
"""
model_config = ConfigDict(
extra="forbid",
serialize_by_alias=True,
validate_by_alias=True,
)
user_id: str | None = None
# Ideally, `app_id` and `workflow_id` should be required and not `None`.
# However, there are scenarios in the codebase where these fields are not set.
# To maintain compatibility, they are marked as optional here.
app_id: str | None = None
workflow_id: str | None = None
files: Sequence[File] = Field(default_factory=list)
# NOTE: The `workflow_execution_id` field was previously named `workflow_run_id`.
# To maintain compatibility with existing workflows, it must be serialized
# as `workflow_run_id` in dictionaries or JSON objects, and also referenced
# as `workflow_run_id` in the variable pool.
workflow_execution_id: str | None = Field(
validation_alias=AliasChoices("workflow_execution_id", "workflow_run_id"),
serialization_alias="workflow_run_id",
default=None,
)
# Chatflow related fields.
query: str | None = None
conversation_id: str | None = None
dialogue_count: int | None = None
@model_validator(mode="before")
@classmethod
def validate_json_fields(cls, data):
if isinstance(data, dict):
# For JSON validation, only allow workflow_run_id
if "workflow_execution_id" in data and "workflow_run_id" not in data:
# This is likely from direct instantiation, allow it
return data
elif "workflow_execution_id" in data and "workflow_run_id" in data:
# Both present, remove workflow_execution_id
data = data.copy()
data.pop("workflow_execution_id")
return data
return data
@classmethod
def empty(cls) -> "SystemVariable":
return cls()
def to_dict(self) -> dict[SystemVariableKey, Any]:
# NOTE: This method is provided for compatibility with legacy code.
# New code should use the `SystemVariable` object directly instead of converting
# it to a dictionary, as this conversion results in the loss of type information
# for each key, making static analysis more difficult.
d: dict[SystemVariableKey, Any] = {
SystemVariableKey.FILES: self.files,
}
if self.user_id is not None:
d[SystemVariableKey.USER_ID] = self.user_id
if self.app_id is not None:
d[SystemVariableKey.APP_ID] = self.app_id
if self.workflow_id is not None:
d[SystemVariableKey.WORKFLOW_ID] = self.workflow_id
if self.workflow_execution_id is not None:
d[SystemVariableKey.WORKFLOW_EXECUTION_ID] = self.workflow_execution_id
if self.query is not None:
d[SystemVariableKey.QUERY] = self.query
if self.conversation_id is not None:
d[SystemVariableKey.CONVERSATION_ID] = self.conversation_id
if self.dialogue_count is not None:
d[SystemVariableKey.DIALOGUE_COUNT] = self.dialogue_count
return d
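
A hedged usage sketch of the aliasing behaviour described in the comments above, using a trimmed-down stand-in model (the real `SystemVariable` also carries files, app/workflow IDs and the chatflow fields). It shows that payloads using the legacy `workflow_run_id` key still validate, and that serialization emits the legacy key.

```python
from pydantic import AliasChoices, BaseModel, ConfigDict, Field


class SystemVariableSketch(BaseModel):
    # Same alias configuration as the model above, reduced to one field.
    model_config = ConfigDict(serialize_by_alias=True, validate_by_alias=True)

    workflow_execution_id: str | None = Field(
        validation_alias=AliasChoices("workflow_execution_id", "workflow_run_id"),
        serialization_alias="workflow_run_id",
        default=None,
    )


# Legacy payloads that still use the old key keep validating...
sv = SystemVariableSketch.model_validate({"workflow_run_id": "run-123"})
assert sv.workflow_execution_id == "run-123"

# ...and serialization emits the legacy key for compatibility.
assert sv.model_dump(by_alias=True) == {"workflow_run_id": "run-123"}
```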

View File

@ -26,6 +26,7 @@ from core.workflow.entities.workflow_node_execution import (
from core.workflow.enums import SystemVariableKey
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from core.workflow.system_variable import SystemVariable
from core.workflow.workflow_entry import WorkflowEntry
from libs.datetime_utils import naive_utc_now
@ -43,7 +44,7 @@ class WorkflowCycleManager:
self,
*,
application_generate_entity: Union[AdvancedChatAppGenerateEntity, WorkflowAppGenerateEntity],
workflow_system_variables: dict[SystemVariableKey, Any],
workflow_system_variables: SystemVariable,
workflow_info: CycleManagerWorkflowInfo,
workflow_execution_repository: WorkflowExecutionRepository,
workflow_node_execution_repository: WorkflowNodeExecutionRepository,
@ -56,17 +57,22 @@ class WorkflowCycleManager:
def handle_workflow_run_start(self) -> WorkflowExecution:
inputs = {**self._application_generate_entity.inputs}
for key, value in (self._workflow_system_variables or {}).items():
if key.value == "conversation":
continue
inputs[f"sys.{key.value}"] = value
# Iterate over SystemVariable fields using Pydantic's model_fields
if self._workflow_system_variables:
for field_name, value in self._workflow_system_variables.to_dict().items():
if field_name == SystemVariableKey.CONVERSATION_ID:
continue
inputs[f"sys.{field_name}"] = value
# handle special values
inputs = dict(WorkflowEntry.handle_special_values(inputs) or {})
# init workflow run
# TODO: This workflow_run_id should always not be None, maybe we can use a more elegant way to handle this
execution_id = str(self._workflow_system_variables.get(SystemVariableKey.WORKFLOW_EXECUTION_ID) or uuid4())
execution_id = str(
self._workflow_system_variables.workflow_execution_id if self._workflow_system_variables else None
) or str(uuid4())
execution = WorkflowExecution.new(
id_=execution_id,
workflow_id=self._workflow_info.workflow_id,

View File

@ -21,6 +21,7 @@ from core.workflow.nodes import NodeType
from core.workflow.nodes.base import BaseNode
from core.workflow.nodes.event import NodeEvent
from core.workflow.nodes.node_mapping import NODE_TYPE_CLASSES_MAPPING
from core.workflow.system_variable import SystemVariable
from core.workflow.variable_loader import DUMMY_VARIABLE_LOADER, VariableLoader, load_into_variable_pool
from factories import file_factory
from models.enums import UserFrom
@ -254,7 +255,7 @@ class WorkflowEntry:
# init variable pool
variable_pool = VariablePool(
system_variables={},
system_variables=SystemVariable.empty(),
user_inputs={},
environment_variables=[],
)

View File

@ -91,9 +91,13 @@ def _build_variable_from_mapping(*, mapping: Mapping[str, Any], selector: Sequen
result = StringVariable.model_validate(mapping)
case SegmentType.SECRET:
result = SecretVariable.model_validate(mapping)
case SegmentType.NUMBER if isinstance(value, int):
case SegmentType.NUMBER | SegmentType.INTEGER if isinstance(value, int):
mapping = dict(mapping)
mapping["value_type"] = SegmentType.INTEGER
result = IntegerVariable.model_validate(mapping)
case SegmentType.NUMBER if isinstance(value, float):
case SegmentType.NUMBER | SegmentType.FLOAT if isinstance(value, float):
mapping = dict(mapping)
mapping["value_type"] = SegmentType.FLOAT
result = FloatVariable.model_validate(mapping)
case SegmentType.NUMBER if not isinstance(value, float | int):
raise VariableError(f"invalid number value {value}")
@ -119,6 +123,8 @@ def infer_segment_type_from_value(value: Any, /) -> SegmentType:
def build_segment(value: Any, /) -> Segment:
# NOTE: If you have runtime type information available, consider using the `build_segment_with_type`
# below
if value is None:
return NoneSegment()
if isinstance(value, str):
@ -134,12 +140,17 @@ def build_segment(value: Any, /) -> Segment:
if isinstance(value, list):
items = [build_segment(item) for item in value]
types = {item.value_type for item in items}
if len(types) != 1 or all(isinstance(item, ArraySegment) for item in items):
if all(isinstance(item, ArraySegment) for item in items):
return ArrayAnySegment(value=value)
elif len(types) != 1:
if types.issubset({SegmentType.NUMBER, SegmentType.INTEGER, SegmentType.FLOAT}):
return ArrayNumberSegment(value=value)
return ArrayAnySegment(value=value)
match types.pop():
case SegmentType.STRING:
return ArrayStringSegment(value=value)
case SegmentType.NUMBER:
case SegmentType.NUMBER | SegmentType.INTEGER | SegmentType.FLOAT:
return ArrayNumberSegment(value=value)
case SegmentType.OBJECT:
return ArrayObjectSegment(value=value)
@ -153,6 +164,22 @@ def build_segment(value: Any, /) -> Segment:
raise ValueError(f"not supported value {value}")
_segment_factory: Mapping[SegmentType, type[Segment]] = {
SegmentType.NONE: NoneSegment,
SegmentType.STRING: StringSegment,
SegmentType.INTEGER: IntegerSegment,
SegmentType.FLOAT: FloatSegment,
SegmentType.FILE: FileSegment,
SegmentType.OBJECT: ObjectSegment,
# Array types
SegmentType.ARRAY_ANY: ArrayAnySegment,
SegmentType.ARRAY_STRING: ArrayStringSegment,
SegmentType.ARRAY_NUMBER: ArrayNumberSegment,
SegmentType.ARRAY_OBJECT: ArrayObjectSegment,
SegmentType.ARRAY_FILE: ArrayFileSegment,
}
def build_segment_with_type(segment_type: SegmentType, value: Any) -> Segment:
"""
Build a segment with explicit type checking.
@ -190,7 +217,7 @@ def build_segment_with_type(segment_type: SegmentType, value: Any) -> Segment:
if segment_type == SegmentType.NONE:
return NoneSegment()
else:
raise TypeMismatchError(f"Expected {segment_type}, but got None")
raise TypeMismatchError(f"Type mismatch: expected {segment_type}, but got None")
# Handle empty list special case for array types
if isinstance(value, list) and len(value) == 0:
@ -205,21 +232,25 @@ def build_segment_with_type(segment_type: SegmentType, value: Any) -> Segment:
elif segment_type == SegmentType.ARRAY_FILE:
return ArrayFileSegment(value=value)
else:
raise TypeMismatchError(f"Expected {segment_type}, but got empty list")
# Build segment using existing logic to infer actual type
inferred_segment = build_segment(value)
inferred_type = inferred_segment.value_type
raise TypeMismatchError(f"Type mismatch: expected {segment_type}, but got empty list")
inferred_type = SegmentType.infer_segment_type(value)
# Type compatibility checking
if inferred_type is None:
raise TypeMismatchError(
f"Type mismatch: expected {segment_type}, but got python object, type={type(value)}, value={value}"
)
if inferred_type == segment_type:
return inferred_segment
# Type mismatch - raise error with descriptive message
raise TypeMismatchError(
f"Type mismatch: expected {segment_type}, but value '{value}' "
f"(type: {type(value).__name__}) corresponds to {inferred_type}"
)
segment_class = _segment_factory[segment_type]
return segment_class(value_type=segment_type, value=value)
elif segment_type == SegmentType.NUMBER and inferred_type in (
SegmentType.INTEGER,
SegmentType.FLOAT,
):
segment_class = _segment_factory[inferred_type]
return segment_class(value_type=inferred_type, value=value)
else:
raise TypeMismatchError(f"Type mismatch: expected {segment_type}, but got {inferred_type}, value={value}")
def segment_to_variable(
@ -247,6 +278,6 @@ def segment_to_variable(
name=name,
description=description,
value=segment.value,
selector=selector,
selector=list(selector),
),
)
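
A self-contained sketch of the compatibility rule added to `build_segment_with_type` above: a requested `SegmentType.NUMBER` now accepts an inferred `INTEGER` or `FLOAT` and builds the segment for the concrete inferred type. Names are illustrative, not the actual factory code.

```python
from enum import StrEnum


class SegmentType(StrEnum):
    NUMBER = "number"
    INTEGER = "integer"
    FLOAT = "float"


def infer(value: int | float) -> SegmentType:
    # bool is technically a subclass of int; ignored here for brevity.
    return SegmentType.INTEGER if isinstance(value, int) else SegmentType.FLOAT


def resolve_type(requested: SegmentType, value: int | float) -> SegmentType:
    inferred = infer(value)
    if requested == inferred:
        return inferred
    # The generic NUMBER tag is satisfied by either concrete numeric type.
    if requested == SegmentType.NUMBER and inferred in (SegmentType.INTEGER, SegmentType.FLOAT):
        return inferred
    raise TypeError(f"Type mismatch: expected {requested}, but got {inferred}, value={value}")


assert resolve_type(SegmentType.NUMBER, 3) is SegmentType.INTEGER
assert resolve_type(SegmentType.NUMBER, 3.5) is SegmentType.FLOAT
```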

View File

@ -0,0 +1,15 @@
from typing import TypedDict
from core.variables.segments import Segment
from core.variables.types import SegmentType
class _VarTypedDict(TypedDict, total=False):
value_type: SegmentType
def serialize_value_type(v: _VarTypedDict | Segment) -> str:
if isinstance(v, Segment):
return v.value_type.exposed_type().value
else:
return v["value_type"].exposed_type().value
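
A hedged sketch of what `serialize_value_type` accomplishes for API responses: internal `INTEGER`/`FLOAT` tags are collapsed back to the public `"number"` tag. The `exposed_type()` behaviour below is an assumption inferred from the environment-variable field handling later in this diff, not the actual `core.variables.types` implementation.

```python
from enum import StrEnum


class SegmentType(StrEnum):
    NUMBER = "number"
    INTEGER = "integer"
    FLOAT = "float"
    STRING = "string"

    def exposed_type(self) -> "SegmentType":
        # Assumed mapping: numeric subtypes are exposed as the legacy "number" tag.
        if self in (SegmentType.INTEGER, SegmentType.FLOAT):
            return SegmentType.NUMBER
        return self


assert SegmentType.INTEGER.exposed_type().value == "number"
assert SegmentType.STRING.exposed_type().value == "string"
```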

View File

@ -188,6 +188,7 @@ app_detail_fields_with_site = {
"site": fields.Nested(site_fields),
"api_base_url": fields.String,
"use_icon_as_answer_icon": fields.Boolean,
"max_active_requests": fields.Integer,
"created_by": fields.String,
"created_at": TimestampField,
"updated_by": fields.String,

View File

@ -2,10 +2,12 @@ from flask_restful import fields
from libs.helper import TimestampField
from ._value_type_serializer import serialize_value_type
conversation_variable_fields = {
"id": fields.String,
"name": fields.String,
"value_type": fields.String(attribute="value_type.value"),
"value_type": fields.String(attribute=serialize_value_type),
"value": fields.String,
"description": fields.String,
"created_at": TimestampField,

View File

@ -5,6 +5,8 @@ from core.variables import SecretVariable, SegmentType, Variable
from fields.member_fields import simple_account_fields
from libs.helper import TimestampField
from ._value_type_serializer import serialize_value_type
ENVIRONMENT_VARIABLE_SUPPORTED_TYPES = (SegmentType.STRING, SegmentType.NUMBER, SegmentType.SECRET)
@ -24,11 +26,16 @@ class EnvironmentVariableField(fields.Raw):
"id": value.id,
"name": value.name,
"value": value.value,
"value_type": value.value_type.value,
"value_type": value.value_type.exposed_type().value,
"description": value.description,
}
if isinstance(value, dict):
value_type = value.get("value_type")
value_type_str = value.get("value_type")
if not isinstance(value_type_str, str):
raise TypeError(
f"unexpected type for value_type field, value={value_type_str}, type={type(value_type_str)}"
)
value_type = SegmentType(value_type_str).exposed_type()
if value_type not in ENVIRONMENT_VARIABLE_SUPPORTED_TYPES:
raise ValueError(f"Unsupported environment variable value type: {value_type}")
return value
@ -37,7 +44,7 @@ class EnvironmentVariableField(fields.Raw):
conversation_variable_fields = {
"id": fields.String,
"name": fields.String,
"value_type": fields.String(attribute="value_type.value"),
"value_type": fields.String(attribute=serialize_value_type),
"value": fields.Raw,
"description": fields.String,
}

View File

@ -14,9 +14,11 @@ class PassportService:
def verify(self, token):
try:
return jwt.decode(token, self.sk, algorithms=["HS256"])
except jwt.exceptions.ExpiredSignatureError:
raise Unauthorized("Token has expired.")
except jwt.exceptions.InvalidSignatureError:
raise Unauthorized("Invalid token signature.")
except jwt.exceptions.DecodeError:
raise Unauthorized("Invalid token.")
except jwt.exceptions.ExpiredSignatureError:
raise Unauthorized("Token has expired.")
except jwt.exceptions.PyJWTError: # Catch-all for other JWT errors
raise Unauthorized("Invalid token.")
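
The reordering matters because of PyJWT's exception hierarchy: `InvalidSignatureError` is a subclass of `DecodeError`, and both, like `ExpiredSignatureError`, ultimately derive from `PyJWTError`, so the more specific handlers must come first and the catch-all last. A quick check of that hierarchy:

```python
import jwt

# InvalidSignatureError would be swallowed by an earlier DecodeError handler.
assert issubclass(jwt.exceptions.InvalidSignatureError, jwt.exceptions.DecodeError)
# Every handler above is covered by the trailing PyJWTError catch-all.
assert issubclass(jwt.exceptions.ExpiredSignatureError, jwt.exceptions.PyJWTError)
assert issubclass(jwt.exceptions.DecodeError, jwt.exceptions.PyJWTError)
```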

164 api/libs/uuid_utils.py Normal file
View File

@ -0,0 +1,164 @@
import secrets
import struct
import time
import uuid
# Reference for UUIDv7 specification:
# RFC 9562, Section 5.7 - https://www.rfc-editor.org/rfc/rfc9562.html#section-5.7
# Define the format for packing the timestamp as an unsigned 64-bit integer (big-endian).
#
# For details on the `struct.pack` format, refer to:
# https://docs.python.org/3/library/struct.html#byte-order-size-and-alignment
_PACK_TIMESTAMP = ">Q"
# Define the format for packing the 12-bit random data A (as specified in RFC 9562 Section 5.7)
# into an unsigned 16-bit integer (big-endian).
_PACK_RAND_A = ">H"
def _create_uuidv7_bytes(timestamp_ms: int, random_bytes: bytes) -> bytes:
"""Create UUIDv7 byte structure with given timestamp and random bytes.
This is a private helper function that handles the common logic for creating
UUIDv7 byte structure according to RFC 9562 specification.
UUIDv7 Structure:
- 48 bits: timestamp (milliseconds since Unix epoch)
- 12 bits: random data A (with version bits)
- 62 bits: random data B (with variant bits)
The function performs the following operations:
1. Creates a 128-bit (16-byte) UUID structure
2. Packs the timestamp into the first 48 bits (6 bytes)
3. Sets the version bits to 7 (0111) in the correct position
4. Sets the variant bits to 10 (binary) in the correct position
5. Fills the remaining bits with the provided random bytes
Args:
timestamp_ms: The timestamp in milliseconds since Unix epoch (48 bits).
random_bytes: Random bytes to use for the random portions (must be 10 bytes).
First 2 bytes are used for random data A (12 bits after version).
Last 8 bytes are used for random data B (62 bits after variant).
Returns:
A 16-byte bytes object representing the complete UUIDv7 structure.
Note:
This function assumes the random_bytes parameter is exactly 10 bytes.
The caller is responsible for providing appropriate random data.
"""
# Create the 128-bit UUID structure
uuid_bytes = bytearray(16)
# Pack timestamp (48 bits) into first 6 bytes
uuid_bytes[0:6] = struct.pack(_PACK_TIMESTAMP, timestamp_ms)[2:8] # Take last 6 bytes of 8-byte big-endian
# Next 16 bits: random data A (12 bits) + version (4 bits)
# Take first 2 random bytes and set version to 7
rand_a = struct.unpack(_PACK_RAND_A, random_bytes[0:2])[0]
# Clear the highest 4 bits to make room for the version field
# by performing a bitwise AND with 0x0FFF (binary: 0b0000_1111_1111_1111).
rand_a = rand_a & 0x0FFF
# Set the version field to 7 (binary: 0111) by performing a bitwise OR with 0x7000 (binary: 0b0111_0000_0000_0000).
rand_a = rand_a | 0x7000
uuid_bytes[6:8] = struct.pack(_PACK_RAND_A, rand_a)
# Last 64 bits: random data B (62 bits) + variant (2 bits)
# Use remaining 8 random bytes and set variant to 10 (binary)
uuid_bytes[8:16] = random_bytes[2:10]
# Set variant bits (first 2 bits of byte 8 should be '10')
uuid_bytes[8] = (uuid_bytes[8] & 0x3F) | 0x80 # Set variant to 10xxxxxx
return bytes(uuid_bytes)
def uuidv7(timestamp_ms: int | None = None) -> uuid.UUID:
"""Generate a UUID version 7 according to RFC 9562 specification.
UUIDv7 features a time-ordered value field derived from the widely
implemented and well known Unix Epoch timestamp source, the number of
milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded.
Structure:
- 48 bits: timestamp (milliseconds since Unix epoch)
- 12 bits: random data A (with version bits)
- 62 bits: random data B (with variant bits)
Args:
timestamp_ms: The timestamp used when generating UUID, use the current time if unspecified.
Should be an integer representing milliseconds since Unix epoch.
Returns:
A UUID object representing a UUIDv7.
Example:
>>> import time
>>> # Generate UUIDv7 with current time
>>> uuid_current = uuidv7()
>>> # Generate UUIDv7 with specific timestamp
>>> uuid_specific = uuidv7(int(time.time() * 1000))
"""
if timestamp_ms is None:
timestamp_ms = int(time.time() * 1000)
# Generate 10 random bytes for the random portions
random_bytes = secrets.token_bytes(10)
# Create UUIDv7 bytes using the helper function
uuid_bytes = _create_uuidv7_bytes(timestamp_ms, random_bytes)
return uuid.UUID(bytes=uuid_bytes)
def uuidv7_timestamp(id_: uuid.UUID) -> int:
"""Extract the timestamp from a UUIDv7.
UUIDv7 contains a 48-bit timestamp field representing milliseconds since
the Unix epoch (1970-01-01 00:00:00 UTC). This function extracts and
returns that timestamp as an integer representing milliseconds since the epoch.
Args:
id_: A UUID object that should be a UUIDv7 (version 7).
Returns:
The timestamp as an integer representing milliseconds since Unix epoch.
Raises:
ValueError: If the provided UUID is not version 7.
Example:
>>> uuid_v7 = uuidv7()
>>> timestamp = uuidv7_timestamp(uuid_v7)
>>> print(f"UUID was created at: {timestamp} ms")
"""
# Verify this is a UUIDv7
if id_.version != 7:
raise ValueError(f"Expected UUIDv7 (version 7), got version {id_.version}")
# Extract the UUID bytes
uuid_bytes = id_.bytes
# Extract the first 48 bits (6 bytes) as the timestamp in milliseconds
# Pad with 2 zero bytes at the beginning to make it 8 bytes for unpacking as Q (unsigned long long)
timestamp_bytes = b"\x00\x00" + uuid_bytes[0:6]
ts_in_ms = struct.unpack(_PACK_TIMESTAMP, timestamp_bytes)[0]
# Return timestamp directly in milliseconds as integer
assert isinstance(ts_in_ms, int)
return ts_in_ms
def uuidv7_boundary(timestamp_ms: int) -> uuid.UUID:
"""Generate a non-random uuidv7 with the given timestamp (first 48 bits) and
all random bits to 0. As the smallest possible uuidv7 for that timestamp,
it may be used as a boundary for partitions.
"""
# Use zero bytes for all random portions
zero_random_bytes = b"\x00" * 10
# Create UUIDv7 bytes using the helper function
uuid_bytes = _create_uuidv7_bytes(timestamp_ms, zero_random_bytes)
return uuid.UUID(bytes=uuid_bytes)
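
A short usage sketch for the helpers above; the import path assumes the `api/` package root, matching the `libs.*` imports used elsewhere in this diff.

```python
import time

from libs.uuid_utils import uuidv7, uuidv7_boundary, uuidv7_timestamp

now_ms = int(time.time() * 1000)
id_ = uuidv7(now_ms)

assert id_.version == 7
assert uuidv7_timestamp(id_) == now_ms

# The zero-random boundary is the smallest UUIDv7 for that millisecond, so it
# sorts before (or equal to) any generated id and can bound range queries.
assert uuidv7_boundary(now_ms) <= id_
```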

View File

@ -0,0 +1,86 @@
"""add uuidv7 function in SQL
Revision ID: 1c9ba48be8e4
Revises: 58eb7bdb93fe
Create Date: 2025-07-02 23:32:38.484499
"""
"""
The functions in this file come from https://github.com/dverite/postgres-uuidv7-sql/, with minor modifications.

LICENSE:
# Copyright and License
Copyright (c) 2024, Daniel Vérité
Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.
In no event shall Daniel Vérité be liable to any party for direct, indirect, special, incidental, or consequential damages, including lost profits, arising out of the use of this software and its documentation, even if Daniel Vérité has been advised of the possibility of such damage.
Daniel Vérité specifically disclaims any warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The software provided hereunder is on an "AS IS" basis, and Daniel Vérité has no obligations to provide maintenance, support, updates, enhancements, or modifications.
"""
from alembic import op
import models as models
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = '1c9ba48be8e4'
down_revision = '58eb7bdb93fe'
branch_labels: None = None
depends_on: None = None
def upgrade():
# This implementation differs slightly from the original uuidv7 function in
# https://github.com/dverite/postgres-uuidv7-sql/.
# The ability to specify source timestamp has been removed because its type signature is incompatible with
# PostgreSQL 18's `uuidv7` function. This capability is rarely needed in practice, as IDs can be
# generated and controlled within the application layer.
op.execute(sa.text(r"""
/* Main function to generate a uuidv7 value with millisecond precision */
CREATE FUNCTION uuidv7() RETURNS uuid
AS
$$
-- Replace the first 48 bits of a uuidv4 with the current
-- number of milliseconds since 1970-01-01 UTC
-- and set the "ver" field to 7 by setting additional bits
SELECT encode(
set_bit(
set_bit(
overlay(uuid_send(gen_random_uuid()) placing
substring(int8send((extract(epoch from clock_timestamp()) * 1000)::bigint) from
3)
from 1 for 6),
52, 1),
53, 1), 'hex')::uuid;
$$ LANGUAGE SQL VOLATILE PARALLEL SAFE;
COMMENT ON FUNCTION uuidv7 IS
'Generate a uuid-v7 value with a 48-bit timestamp (millisecond precision) and 74 bits of randomness';
"""))
op.execute(sa.text(r"""
CREATE FUNCTION uuidv7_boundary(timestamptz) RETURNS uuid
AS
$$
/* uuid fields: version=0b0111, variant=0b10 */
SELECT encode(
overlay('\x00000000000070008000000000000000'::bytea
placing substring(int8send(floor(extract(epoch from $1) * 1000)::bigint) from 3)
from 1 for 6),
'hex')::uuid;
$$ LANGUAGE SQL STABLE STRICT PARALLEL SAFE;
COMMENT ON FUNCTION uuidv7_boundary(timestamptz) IS
'Generate a non-random uuidv7 with the given timestamp (first 48 bits) and all random bits to 0. As the smallest possible uuidv7 for that timestamp, it may be used as a boundary for partitions.';
"""
))
def downgrade():
op.execute(sa.text("DROP FUNCTION uuidv7"))
op.execute(sa.text("DROP FUNCTION uuidv7_boundary"))
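
As a cross-check, the literal byte template in the SQL `uuidv7_boundary` above (`'\x00000000000070008000000000000000'`, before the timestamp is overlaid) is the same 16-byte pattern the Python helper from this PR produces for timestamp 0:

```python
import uuid

from libs.uuid_utils import uuidv7_boundary

assert uuidv7_boundary(0) == uuid.UUID("00000000-0000-7000-8000-000000000000")
```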

View File

@ -50,7 +50,6 @@ class AppMode(StrEnum):
CHAT = "chat"
ADVANCED_CHAT = "advanced-chat"
AGENT_CHAT = "agent-chat"
CHANNEL = "channel"
@classmethod
def value_of(cls, value: str) -> "AppMode":
@ -934,7 +933,7 @@ class Message(Base):
created_at: Mapped[datetime] = mapped_column(db.DateTime, server_default=func.current_timestamp())
updated_at = db.Column(db.DateTime, nullable=False, server_default=func.current_timestamp())
agent_based = db.Column(db.Boolean, nullable=False, server_default=db.text("false"))
workflow_run_id = db.Column(StringUUID)
workflow_run_id: Mapped[str] = db.Column(StringUUID)
@property
def inputs(self):

View File

@ -12,6 +12,7 @@ from sqlalchemy import orm
from core.file.constants import maybe_file_object
from core.file.models import File
from core.variables import utils as variable_utils
from core.variables.variables import FloatVariable, IntegerVariable, StringVariable
from core.workflow.constants import CONVERSATION_VARIABLE_NODE_ID, SYSTEM_VARIABLE_NODE_ID
from core.workflow.nodes.enums import NodeType
from factories.variable_factory import TypeMismatchError, build_segment_with_type
@ -347,7 +348,7 @@ class Workflow(Base):
)
@property
def environment_variables(self) -> Sequence[Variable]:
def environment_variables(self) -> Sequence[StringVariable | IntegerVariable | FloatVariable | SecretVariable]:
# TODO: find some way to init `self._environment_variables` when instance created.
if self._environment_variables is None:
self._environment_variables = "{}"
@ -367,11 +368,15 @@ class Workflow(Base):
def decrypt_func(var):
if isinstance(var, SecretVariable):
return var.model_copy(update={"value": encrypter.decrypt_token(tenant_id=tenant_id, token=var.value)})
else:
elif isinstance(var, (StringVariable, IntegerVariable, FloatVariable)):
return var
else:
raise AssertionError("this statement should be unreachable.")
results = list(map(decrypt_func, results))
return results
decrypted_results: list[SecretVariable | StringVariable | IntegerVariable | FloatVariable] = list(
map(decrypt_func, results)
)
return decrypted_results
@environment_variables.setter
def environment_variables(self, value: Sequence[Variable]):

View File

@ -108,7 +108,7 @@ dev = [
"faker~=32.1.0",
"lxml-stubs~=0.5.1",
"mypy~=1.16.0",
"ruff~=0.11.5",
"ruff~=0.12.3",
"pytest~=8.3.2",
"pytest-benchmark~=4.0.0",
"pytest-cov~=4.1.0",

View File

@ -0,0 +1,197 @@
"""
Service-layer repository protocol for WorkflowNodeExecutionModel operations.
This module provides a protocol interface for service-layer operations on WorkflowNodeExecutionModel
that abstracts database queries currently done directly in service classes. This repository is
specifically designed for service-layer needs and is separate from the core domain repository.
The service repository handles operations that require access to database-specific fields like
tenant_id, app_id, triggered_from, etc., which are not part of the core domain model.
"""
from collections.abc import Sequence
from datetime import datetime
from typing import Optional, Protocol
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from models.workflow import WorkflowNodeExecutionModel
class DifyAPIWorkflowNodeExecutionRepository(WorkflowNodeExecutionRepository, Protocol):
"""
Protocol for service-layer operations on WorkflowNodeExecutionModel.
This repository provides database access patterns specifically needed by service classes,
handling queries that involve database-specific fields and multi-tenancy concerns.
Key responsibilities:
- Manages database operations for workflow node executions
- Handles multi-tenant data isolation
- Provides batch processing capabilities
- Supports execution lifecycle management
Implementation notes:
- Returns database models directly (WorkflowNodeExecutionModel)
- Handles tenant/app filtering automatically
- Provides service-specific query patterns
- Focuses on database operations without domain logic
- Supports cleanup and maintenance operations
"""
def get_node_last_execution(
self,
tenant_id: str,
app_id: str,
workflow_id: str,
node_id: str,
) -> Optional[WorkflowNodeExecutionModel]:
"""
Get the most recent execution for a specific node.
This method finds the latest execution of a specific node within a workflow,
ordered by creation time. Used primarily for debugging and inspection purposes.
Args:
tenant_id: The tenant identifier
app_id: The application identifier
workflow_id: The workflow identifier
node_id: The node identifier
Returns:
The most recent WorkflowNodeExecutionModel for the node, or None if not found
"""
...
def get_executions_by_workflow_run(
self,
tenant_id: str,
app_id: str,
workflow_run_id: str,
) -> Sequence[WorkflowNodeExecutionModel]:
"""
Get all node executions for a specific workflow run.
This method retrieves all node executions that belong to a specific workflow run,
ordered by index in descending order for proper trace visualization.
Args:
tenant_id: The tenant identifier
app_id: The application identifier
workflow_run_id: The workflow run identifier
Returns:
A sequence of WorkflowNodeExecutionModel instances ordered by index (desc)
"""
...
def get_execution_by_id(
self,
execution_id: str,
tenant_id: Optional[str] = None,
) -> Optional[WorkflowNodeExecutionModel]:
"""
Get a workflow node execution by its ID.
This method retrieves a specific execution by its unique identifier.
Tenant filtering is optional for cases where the execution ID is globally unique.
When `tenant_id` is None, it's the caller's responsibility to ensure proper data isolation between tenants.
If the `execution_id` comes from untrusted sources (e.g., retrieved from an API request), the caller should
set `tenant_id` to prevent horizontal privilege escalation.
Args:
execution_id: The execution identifier
tenant_id: Optional tenant identifier for additional filtering
Returns:
The WorkflowNodeExecutionModel if found, or None if not found
"""
...
def delete_expired_executions(
self,
tenant_id: str,
before_date: datetime,
batch_size: int = 1000,
) -> int:
"""
Delete workflow node executions that are older than the specified date.
This method is used for cleanup operations to remove expired executions
in batches to avoid overwhelming the database.
Args:
tenant_id: The tenant identifier
before_date: Delete executions created before this date
batch_size: Maximum number of executions to delete in one batch
Returns:
The number of executions deleted
"""
...
def delete_executions_by_app(
self,
tenant_id: str,
app_id: str,
batch_size: int = 1000,
) -> int:
"""
Delete all workflow node executions for a specific app.
This method is used when removing an app and all its related data.
Executions are deleted in batches to avoid overwhelming the database.
Args:
tenant_id: The tenant identifier
app_id: The application identifier
batch_size: Maximum number of executions to delete in one batch
Returns:
The total number of executions deleted
"""
...
def get_expired_executions_batch(
self,
tenant_id: str,
before_date: datetime,
batch_size: int = 1000,
) -> Sequence[WorkflowNodeExecutionModel]:
"""
Get a batch of expired workflow node executions for backup purposes.
This method retrieves expired executions without deleting them,
allowing the caller to backup the data before deletion.
Args:
tenant_id: The tenant identifier
before_date: Get executions created before this date
batch_size: Maximum number of executions to retrieve
Returns:
A sequence of WorkflowNodeExecutionModel instances
"""
...
def delete_executions_by_ids(
self,
execution_ids: Sequence[str],
) -> int:
"""
Delete workflow node executions by their IDs.
This method deletes specific executions by their IDs,
typically used after backing up the data.
This method does not perform tenant isolation checks. The caller is responsible for ensuring proper
data isolation between tenants. When execution IDs come from untrusted sources (e.g., API requests),
additional tenant validation should be implemented to prevent unauthorized access.
Args:
execution_ids: List of execution IDs to delete
Returns:
The number of executions deleted
"""
...
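
Because this is a `typing.Protocol`, service code can depend on the interface while tests substitute a lightweight fake; structural typing means the fake only has to provide matching method signatures. A partial sketch (only one method stubbed for brevity, names illustrative):

```python
from typing import Optional

from models.workflow import WorkflowNodeExecutionModel


class InMemoryNodeExecutionRepo:
    """Test double satisfying part of DifyAPIWorkflowNodeExecutionRepository."""

    def __init__(self, executions: dict[str, WorkflowNodeExecutionModel]):
        self._executions = executions

    def get_execution_by_id(
        self,
        execution_id: str,
        tenant_id: Optional[str] = None,
    ) -> Optional[WorkflowNodeExecutionModel]:
        execution = self._executions.get(execution_id)
        if execution is None:
            return None
        if tenant_id is not None and execution.tenant_id != tenant_id:
            return None
        return execution
```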

View File

@ -0,0 +1,181 @@
"""
API WorkflowRun Repository Protocol
This module defines the protocol for service-layer WorkflowRun operations.
The repository provides an abstraction layer for WorkflowRun database operations
used by service classes, separating service-layer concerns from core domain logic.
Key Features:
- Paginated workflow run queries with filtering
- Bulk deletion operations with OSS backup support
- Multi-tenant data isolation
- Expired record cleanup with data retention
- Service-layer specific query patterns
Usage:
This protocol should be used by service classes that need to perform
WorkflowRun database operations. It provides a clean interface that
hides implementation details and supports dependency injection.
Example:
```python
from repositories.dify_api_repository_factory import DifyAPIRepositoryFactory
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(session_maker)
# Get paginated workflow runs
runs = repo.get_paginated_workflow_runs(
tenant_id="tenant-123",
app_id="app-456",
triggered_from="debugging",
limit=20
)
```
"""
from collections.abc import Sequence
from datetime import datetime
from typing import Optional, Protocol
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
from libs.infinite_scroll_pagination import InfiniteScrollPagination
from models.workflow import WorkflowRun
class APIWorkflowRunRepository(WorkflowExecutionRepository, Protocol):
"""
Protocol for service-layer WorkflowRun repository operations.
This protocol defines the interface for WorkflowRun database operations
that are specific to service-layer needs, including pagination, filtering,
and bulk operations with data backup support.
"""
def get_paginated_workflow_runs(
self,
tenant_id: str,
app_id: str,
triggered_from: str,
limit: int = 20,
last_id: Optional[str] = None,
) -> InfiniteScrollPagination:
"""
Get paginated workflow runs with filtering.
Retrieves workflow runs for a specific app and trigger source with
cursor-based pagination support. Used primarily for debugging and
workflow run listing in the UI.
Args:
tenant_id: Tenant identifier for multi-tenant isolation
app_id: Application identifier
triggered_from: Filter by trigger source (e.g., "debugging", "app-run")
limit: Maximum number of records to return (default: 20)
last_id: Cursor for pagination - ID of the last record from previous page
Returns:
InfiniteScrollPagination object containing:
- data: List of WorkflowRun objects
- limit: Applied limit
- has_more: Boolean indicating if more records exist
Raises:
ValueError: If last_id is provided but the corresponding record doesn't exist
"""
...
def get_workflow_run_by_id(
self,
tenant_id: str,
app_id: str,
run_id: str,
) -> Optional[WorkflowRun]:
"""
Get a specific workflow run by ID.
Retrieves a single workflow run with tenant and app isolation.
Used for workflow run detail views and execution tracking.
Args:
tenant_id: Tenant identifier for multi-tenant isolation
app_id: Application identifier
run_id: Workflow run identifier
Returns:
WorkflowRun object if found, None otherwise
"""
...
def get_expired_runs_batch(
self,
tenant_id: str,
before_date: datetime,
batch_size: int = 1000,
) -> Sequence[WorkflowRun]:
"""
Get a batch of expired workflow runs for cleanup.
Retrieves workflow runs created before the specified date for
cleanup operations. Used by scheduled tasks to remove old data
while maintaining data retention policies.
Args:
tenant_id: Tenant identifier for multi-tenant isolation
before_date: Only return runs created before this date
batch_size: Maximum number of records to return
Returns:
Sequence of WorkflowRun objects to be processed for cleanup
"""
...
def delete_runs_by_ids(
self,
run_ids: Sequence[str],
) -> int:
"""
Delete workflow runs by their IDs.
Performs bulk deletion of workflow runs by ID. This method should
be used after backing up the data to OSS storage for retention.
Args:
run_ids: Sequence of workflow run IDs to delete
Returns:
Number of records actually deleted
Note:
This method performs hard deletion. Ensure data is backed up
to OSS storage before calling this method for compliance with
data retention policies.
"""
...
def delete_runs_by_app(
self,
tenant_id: str,
app_id: str,
batch_size: int = 1000,
) -> int:
"""
Delete all workflow runs for a specific app.
Performs bulk deletion of all workflow runs associated with an app.
Used during app cleanup operations. Processes records in batches
to avoid memory issues and long-running transactions.
Args:
tenant_id: Tenant identifier for multi-tenant isolation
app_id: Application identifier
batch_size: Number of records to process in each batch
Returns:
Total number of records deleted across all batches
Note:
This method performs hard deletion without backup. Use with caution
and ensure proper data retention policies are followed.
"""
...

103 api/repositories/factory.py Normal file
View File

@ -0,0 +1,103 @@
"""
DifyAPI Repository Factory for creating repository instances.
This factory is specifically designed for DifyAPI repositories that handle
service-layer operations with dependency injection patterns.
"""
import logging
from sqlalchemy.orm import sessionmaker
from configs import dify_config
from core.repositories import DifyCoreRepositoryFactory, RepositoryImportError
from repositories.api_workflow_node_execution_repository import DifyAPIWorkflowNodeExecutionRepository
from repositories.api_workflow_run_repository import APIWorkflowRunRepository
logger = logging.getLogger(__name__)
class DifyAPIRepositoryFactory(DifyCoreRepositoryFactory):
"""
Factory for creating DifyAPI repository instances based on configuration.
This factory handles the creation of repositories that are specifically designed
for service-layer operations and use dependency injection with sessionmaker
for better testability and separation of concerns.
"""
@classmethod
def create_api_workflow_node_execution_repository(
cls, session_maker: sessionmaker
) -> DifyAPIWorkflowNodeExecutionRepository:
"""
Create a DifyAPIWorkflowNodeExecutionRepository instance based on configuration.
This repository is designed for service-layer operations and uses dependency injection
with a sessionmaker for better testability and separation of concerns. It provides
database access patterns specifically needed by service classes, handling queries
that involve database-specific fields and multi-tenancy concerns.
Args:
session_maker: SQLAlchemy sessionmaker to inject for database session management.
Returns:
Configured DifyAPIWorkflowNodeExecutionRepository instance
Raises:
RepositoryImportError: If the configured repository cannot be imported or instantiated
"""
class_path = dify_config.API_WORKFLOW_NODE_EXECUTION_REPOSITORY
logger.debug(f"Creating DifyAPIWorkflowNodeExecutionRepository from: {class_path}")
try:
repository_class = cls._import_class(class_path)
cls._validate_repository_interface(repository_class, DifyAPIWorkflowNodeExecutionRepository)
# Service repository requires session_maker parameter
cls._validate_constructor_signature(repository_class, ["session_maker"])
return repository_class(session_maker=session_maker) # type: ignore[no-any-return]
except RepositoryImportError:
# Re-raise our custom errors as-is
raise
except Exception as e:
logger.exception("Failed to create DifyAPIWorkflowNodeExecutionRepository")
raise RepositoryImportError(
f"Failed to create DifyAPIWorkflowNodeExecutionRepository from '{class_path}': {e}"
) from e
@classmethod
def create_api_workflow_run_repository(cls, session_maker: sessionmaker) -> APIWorkflowRunRepository:
"""
Create an APIWorkflowRunRepository instance based on configuration.
This repository is designed for service-layer WorkflowRun operations and uses dependency
injection with a sessionmaker for better testability and separation of concerns. It provides
database access patterns specifically needed by service classes for workflow run management,
including pagination, filtering, and bulk operations.
Args:
session_maker: SQLAlchemy sessionmaker to inject for database session management.
Returns:
Configured APIWorkflowRunRepository instance
Raises:
RepositoryImportError: If the configured repository cannot be imported or instantiated
"""
class_path = dify_config.API_WORKFLOW_RUN_REPOSITORY
logger.debug(f"Creating APIWorkflowRunRepository from: {class_path}")
try:
repository_class = cls._import_class(class_path)
cls._validate_repository_interface(repository_class, APIWorkflowRunRepository)
# Service repository requires session_maker parameter
cls._validate_constructor_signature(repository_class, ["session_maker"])
return repository_class(session_maker=session_maker) # type: ignore[no-any-return]
except RepositoryImportError:
# Re-raise our custom errors as-is
raise
except Exception as e:
logger.exception("Failed to create APIWorkflowRunRepository")
raise RepositoryImportError(f"Failed to create APIWorkflowRunRepository from '{class_path}': {e}") from e

View File

@ -0,0 +1,290 @@
"""
SQLAlchemy implementation of WorkflowNodeExecutionServiceRepository.
This module provides a concrete implementation of the service repository protocol
using SQLAlchemy 2.0 style queries for WorkflowNodeExecutionModel operations.
"""
from collections.abc import Sequence
from datetime import datetime
from typing import Optional
from sqlalchemy import delete, desc, select
from sqlalchemy.orm import Session, sessionmaker
from models.workflow import WorkflowNodeExecutionModel
from repositories.api_workflow_node_execution_repository import DifyAPIWorkflowNodeExecutionRepository
class DifyAPISQLAlchemyWorkflowNodeExecutionRepository(DifyAPIWorkflowNodeExecutionRepository):
"""
SQLAlchemy implementation of DifyAPIWorkflowNodeExecutionRepository.
This repository provides service-layer database operations for WorkflowNodeExecutionModel
using SQLAlchemy 2.0 style queries. It implements the DifyAPIWorkflowNodeExecutionRepository
protocol with the following features:
- Multi-tenancy data isolation through tenant_id filtering
- Direct database model operations without domain conversion
- Batch processing for efficient large-scale operations
- Optimized query patterns for common access patterns
- Dependency injection for better testability and maintainability
- Session management and transaction handling with proper cleanup
- Maintenance operations for data lifecycle management
- Thread-safe database operations using session-per-request pattern
"""
def __init__(self, session_maker: sessionmaker[Session]):
"""
Initialize the repository with a sessionmaker.
Args:
session_maker: SQLAlchemy sessionmaker for creating database sessions
"""
self._session_maker = session_maker
def get_node_last_execution(
self,
tenant_id: str,
app_id: str,
workflow_id: str,
node_id: str,
) -> Optional[WorkflowNodeExecutionModel]:
"""
Get the most recent execution for a specific node.
This method replicates the query pattern from WorkflowService.get_node_last_run()
using SQLAlchemy 2.0 style syntax.
Args:
tenant_id: The tenant identifier
app_id: The application identifier
workflow_id: The workflow identifier
node_id: The node identifier
Returns:
The most recent WorkflowNodeExecutionModel for the node, or None if not found
"""
stmt = (
select(WorkflowNodeExecutionModel)
.where(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.app_id == app_id,
WorkflowNodeExecutionModel.workflow_id == workflow_id,
WorkflowNodeExecutionModel.node_id == node_id,
)
.order_by(desc(WorkflowNodeExecutionModel.created_at))
.limit(1)
)
with self._session_maker() as session:
return session.scalar(stmt)
def get_executions_by_workflow_run(
self,
tenant_id: str,
app_id: str,
workflow_run_id: str,
) -> Sequence[WorkflowNodeExecutionModel]:
"""
Get all node executions for a specific workflow run.
This method replicates the query pattern from WorkflowRunService.get_workflow_run_node_executions()
using SQLAlchemy 2.0 style syntax.
Args:
tenant_id: The tenant identifier
app_id: The application identifier
workflow_run_id: The workflow run identifier
Returns:
A sequence of WorkflowNodeExecutionModel instances ordered by index (desc)
"""
stmt = (
select(WorkflowNodeExecutionModel)
.where(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.app_id == app_id,
WorkflowNodeExecutionModel.workflow_run_id == workflow_run_id,
)
.order_by(desc(WorkflowNodeExecutionModel.index))
)
with self._session_maker() as session:
return session.execute(stmt).scalars().all()
def get_execution_by_id(
self,
execution_id: str,
tenant_id: Optional[str] = None,
) -> Optional[WorkflowNodeExecutionModel]:
"""
Get a workflow node execution by its ID.
This method replicates the query pattern from WorkflowDraftVariableService
and WorkflowService.single_step_run_workflow_node() using SQLAlchemy 2.0 style syntax.
When `tenant_id` is None, it's the caller's responsibility to ensure proper data isolation between tenants.
If the `execution_id` comes from untrusted sources (e.g., retrieved from an API request), the caller should
set `tenant_id` to prevent horizontal privilege escalation.
Args:
execution_id: The execution identifier
tenant_id: Optional tenant identifier for additional filtering
Returns:
The WorkflowNodeExecutionModel if found, or None if not found
"""
stmt = select(WorkflowNodeExecutionModel).where(WorkflowNodeExecutionModel.id == execution_id)
# Add tenant filtering if provided
if tenant_id is not None:
stmt = stmt.where(WorkflowNodeExecutionModel.tenant_id == tenant_id)
with self._session_maker() as session:
return session.scalar(stmt)
def delete_expired_executions(
self,
tenant_id: str,
before_date: datetime,
batch_size: int = 1000,
) -> int:
"""
Delete workflow node executions that are older than the specified date.
Args:
tenant_id: The tenant identifier
before_date: Delete executions created before this date
batch_size: Maximum number of executions to delete in one batch
Returns:
The number of executions deleted
"""
total_deleted = 0
while True:
with self._session_maker() as session:
# Find executions to delete in batches
stmt = (
select(WorkflowNodeExecutionModel.id)
.where(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.created_at < before_date,
)
.limit(batch_size)
)
execution_ids = session.execute(stmt).scalars().all()
if not execution_ids:
break
# Delete the batch
delete_stmt = delete(WorkflowNodeExecutionModel).where(WorkflowNodeExecutionModel.id.in_(execution_ids))
result = session.execute(delete_stmt)
session.commit()
total_deleted += result.rowcount
# If we deleted fewer than the batch size, we're done
if len(execution_ids) < batch_size:
break
return total_deleted
def delete_executions_by_app(
self,
tenant_id: str,
app_id: str,
batch_size: int = 1000,
) -> int:
"""
Delete all workflow node executions for a specific app.
Args:
tenant_id: The tenant identifier
app_id: The application identifier
batch_size: Maximum number of executions to delete in one batch
Returns:
The total number of executions deleted
"""
total_deleted = 0
while True:
with self._session_maker() as session:
# Find executions to delete in batches
stmt = (
select(WorkflowNodeExecutionModel.id)
.where(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.app_id == app_id,
)
.limit(batch_size)
)
execution_ids = session.execute(stmt).scalars().all()
if not execution_ids:
break
# Delete the batch
delete_stmt = delete(WorkflowNodeExecutionModel).where(WorkflowNodeExecutionModel.id.in_(execution_ids))
result = session.execute(delete_stmt)
session.commit()
total_deleted += result.rowcount
# If we deleted fewer than the batch size, we're done
if len(execution_ids) < batch_size:
break
return total_deleted
def get_expired_executions_batch(
self,
tenant_id: str,
before_date: datetime,
batch_size: int = 1000,
) -> Sequence[WorkflowNodeExecutionModel]:
"""
Get a batch of expired workflow node executions for backup purposes.
Args:
tenant_id: The tenant identifier
before_date: Get executions created before this date
batch_size: Maximum number of executions to retrieve
Returns:
A sequence of WorkflowNodeExecutionModel instances
"""
stmt = (
select(WorkflowNodeExecutionModel)
.where(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.created_at < before_date,
)
.limit(batch_size)
)
with self._session_maker() as session:
return session.execute(stmt).scalars().all()
def delete_executions_by_ids(
self,
execution_ids: Sequence[str],
) -> int:
"""
Delete workflow node executions by their IDs.
Args:
execution_ids: List of execution IDs to delete
Returns:
The number of executions deleted
"""
if not execution_ids:
return 0
with self._session_maker() as session:
stmt = delete(WorkflowNodeExecutionModel).where(WorkflowNodeExecutionModel.id.in_(execution_ids))
result = session.execute(stmt)
session.commit()
return result.rowcount

View File

@ -0,0 +1,203 @@
"""
SQLAlchemy API WorkflowRun Repository Implementation
This module provides the SQLAlchemy-based implementation of the APIWorkflowRunRepository
protocol. It handles service-layer WorkflowRun database operations using SQLAlchemy 2.0
style queries with proper session management and multi-tenant data isolation.
Key Features:
- SQLAlchemy 2.0 style queries for modern database operations
- Cursor-based pagination for efficient large dataset handling
- Bulk operations with batch processing for performance
- Multi-tenant data isolation and security
- Proper session management with dependency injection
Implementation Notes:
- Uses sessionmaker for consistent session management
- Implements cursor-based pagination using created_at timestamps
- Provides efficient bulk deletion with batch processing
- Maintains data consistency with proper transaction handling
"""
import logging
from collections.abc import Sequence
from datetime import datetime
from typing import Optional, cast
from sqlalchemy import delete, select
from sqlalchemy.orm import Session, sessionmaker
from libs.infinite_scroll_pagination import InfiniteScrollPagination
from models.workflow import WorkflowRun
from repositories.api_workflow_run_repository import APIWorkflowRunRepository
logger = logging.getLogger(__name__)
class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
"""
SQLAlchemy implementation of APIWorkflowRunRepository.
Provides service-layer WorkflowRun database operations using SQLAlchemy 2.0
style queries. Supports dependency injection through sessionmaker and
maintains proper multi-tenant data isolation.
Args:
session_maker: SQLAlchemy sessionmaker instance for database connections
"""
def __init__(self, session_maker: sessionmaker[Session]) -> None:
"""
Initialize the repository with a sessionmaker.
Args:
session_maker: SQLAlchemy sessionmaker for database connections
"""
self._session_maker = session_maker
def get_paginated_workflow_runs(
self,
tenant_id: str,
app_id: str,
triggered_from: str,
limit: int = 20,
last_id: Optional[str] = None,
) -> InfiniteScrollPagination:
"""
Get paginated workflow runs with filtering.
Implements cursor-based pagination using created_at timestamps for
efficient handling of large datasets. Filters by tenant, app, and
trigger source for proper data isolation.
"""
with self._session_maker() as session:
# Build base query with filters
base_stmt = select(WorkflowRun).where(
WorkflowRun.tenant_id == tenant_id,
WorkflowRun.app_id == app_id,
WorkflowRun.triggered_from == triggered_from,
)
if last_id:
# Get the last workflow run for cursor-based pagination
last_run_stmt = base_stmt.where(WorkflowRun.id == last_id)
last_workflow_run = session.scalar(last_run_stmt)
if not last_workflow_run:
raise ValueError("Last workflow run not exists")
# Get records created before the last run's timestamp
base_stmt = base_stmt.where(
WorkflowRun.created_at < last_workflow_run.created_at,
WorkflowRun.id != last_workflow_run.id,
)
# First page - get most recent records
workflow_runs = session.scalars(base_stmt.order_by(WorkflowRun.created_at.desc()).limit(limit + 1)).all()
# Check if there are more records for pagination
has_more = len(workflow_runs) > limit
if has_more:
workflow_runs = workflow_runs[:-1]
return InfiniteScrollPagination(data=workflow_runs, limit=limit, has_more=has_more)
def get_workflow_run_by_id(
self,
tenant_id: str,
app_id: str,
run_id: str,
) -> Optional[WorkflowRun]:
"""
Get a specific workflow run by ID with tenant and app isolation.
"""
with self._session_maker() as session:
stmt = select(WorkflowRun).where(
WorkflowRun.tenant_id == tenant_id,
WorkflowRun.app_id == app_id,
WorkflowRun.id == run_id,
)
return cast(Optional[WorkflowRun], session.scalar(stmt))
def get_expired_runs_batch(
self,
tenant_id: str,
before_date: datetime,
batch_size: int = 1000,
) -> Sequence[WorkflowRun]:
"""
Get a batch of expired workflow runs for cleanup operations.
"""
with self._session_maker() as session:
stmt = (
select(WorkflowRun)
.where(
WorkflowRun.tenant_id == tenant_id,
WorkflowRun.created_at < before_date,
)
.limit(batch_size)
)
return cast(Sequence[WorkflowRun], session.scalars(stmt).all())
def delete_runs_by_ids(
self,
run_ids: Sequence[str],
) -> int:
"""
Delete workflow runs by their IDs using bulk deletion.
"""
if not run_ids:
return 0
with self._session_maker() as session:
stmt = delete(WorkflowRun).where(WorkflowRun.id.in_(run_ids))
result = session.execute(stmt)
session.commit()
deleted_count = cast(int, result.rowcount)
logger.info(f"Deleted {deleted_count} workflow runs by IDs")
return deleted_count
def delete_runs_by_app(
self,
tenant_id: str,
app_id: str,
batch_size: int = 1000,
) -> int:
"""
Delete all workflow runs for a specific app in batches.
"""
total_deleted = 0
while True:
with self._session_maker() as session:
# Get a batch of run IDs to delete
stmt = (
select(WorkflowRun.id)
.where(
WorkflowRun.tenant_id == tenant_id,
WorkflowRun.app_id == app_id,
)
.limit(batch_size)
)
run_ids = session.scalars(stmt).all()
if not run_ids:
break
# Delete the batch
delete_stmt = delete(WorkflowRun).where(WorkflowRun.id.in_(run_ids))
result = session.execute(delete_stmt)
session.commit()
batch_deleted = result.rowcount
total_deleted += batch_deleted
logger.info(f"Deleted batch of {batch_deleted} workflow runs for app {app_id}")
# If we deleted fewer records than the batch size, we're done
if batch_deleted < batch_size:
break
logger.info(f"Total deleted {total_deleted} workflow runs for app {app_id}")
return total_deleted
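
A cursor-pagination usage sketch for the repository above: keep passing the id of the last returned run until `has_more` turns False. `run_repo` is assumed to be an instance created via the factory; the tenant/app/trigger values are illustrative.

```python
last_id = None
while True:
    page = run_repo.get_paginated_workflow_runs(
        tenant_id="tenant-123",
        app_id="app-456",
        triggered_from="debugging",
        limit=20,
        last_id=last_id,
    )
    for run in page.data:
        ...  # render or export the run
    if not page.has_more:
        break
    last_id = page.data[-1].id
```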

View File

@ -47,8 +47,6 @@ class AppService:
filters.append(App.mode == AppMode.ADVANCED_CHAT.value)
elif args["mode"] == "agent-chat":
filters.append(App.mode == AppMode.AGENT_CHAT.value)
elif args["mode"] == "channel":
filters.append(App.mode == AppMode.CHANNEL.value)
if args.get("is_created_by_me", False):
filters.append(App.created_by == user_id)
@ -235,6 +233,7 @@ class AppService:
app.icon = args.get("icon")
app.icon_background = args.get("icon_background")
app.use_icon_as_answer_icon = args.get("use_icon_as_answer_icon", False)
app.max_active_requests = args.get("max_active_requests")
app.updated_by = current_user.id
app.updated_at = datetime.now(UTC).replace(tzinfo=None)
db.session.commit()

View File

@ -6,7 +6,7 @@ from concurrent.futures import ThreadPoolExecutor
import click
from flask import Flask, current_app
from sqlalchemy.orm import Session
from sqlalchemy.orm import Session, sessionmaker
from configs import dify_config
from core.model_runtime.utils.encoders import jsonable_encoder
@ -14,7 +14,7 @@ from extensions.ext_database import db
from extensions.ext_storage import storage
from models.account import Tenant
from models.model import App, Conversation, Message
from models.workflow import WorkflowNodeExecutionModel, WorkflowRun
from repositories.factory import DifyAPIRepositoryFactory
from services.billing_service import BillingService
logger = logging.getLogger(__name__)
@ -105,84 +105,99 @@ class ClearFreePlanTenantExpiredLogs:
)
)
while True:
with Session(db.engine).no_autoflush as session:
workflow_node_executions = (
session.query(WorkflowNodeExecutionModel)
.filter(
WorkflowNodeExecutionModel.tenant_id == tenant_id,
WorkflowNodeExecutionModel.created_at
< datetime.datetime.now() - datetime.timedelta(days=days),
)
.limit(batch)
.all()
)
if len(workflow_node_executions) == 0:
break
# save workflow node executions
storage.save(
f"free_plan_tenant_expired_logs/"
f"{tenant_id}/workflow_node_executions/{datetime.datetime.now().strftime('%Y-%m-%d')}"
f"-{time.time()}.json",
json.dumps(
jsonable_encoder(workflow_node_executions),
).encode("utf-8"),
)
workflow_node_execution_ids = [
workflow_node_execution.id for workflow_node_execution in workflow_node_executions
]
# delete workflow node executions
session.query(WorkflowNodeExecutionModel).filter(
WorkflowNodeExecutionModel.id.in_(workflow_node_execution_ids),
).delete(synchronize_session=False)
session.commit()
click.echo(
click.style(
f"[{datetime.datetime.now()}] Processed {len(workflow_node_execution_ids)}"
f" workflow node executions for tenant {tenant_id}"
)
)
# Process expired workflow node executions with backup
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
node_execution_repo = DifyAPIRepositoryFactory.create_api_workflow_node_execution_repository(session_maker)
before_date = datetime.datetime.now() - datetime.timedelta(days=days)
total_deleted = 0
while True:
with Session(db.engine).no_autoflush as session:
workflow_runs = (
session.query(WorkflowRun)
.filter(
WorkflowRun.tenant_id == tenant_id,
WorkflowRun.created_at < datetime.datetime.now() - datetime.timedelta(days=days),
)
.limit(batch)
.all()
# Get a batch of expired executions for backup
workflow_node_executions = node_execution_repo.get_expired_executions_batch(
tenant_id=tenant_id,
before_date=before_date,
batch_size=batch,
)
if len(workflow_node_executions) == 0:
break
# Save workflow node executions to storage
storage.save(
f"free_plan_tenant_expired_logs/"
f"{tenant_id}/workflow_node_executions/{datetime.datetime.now().strftime('%Y-%m-%d')}"
f"-{time.time()}.json",
json.dumps(
jsonable_encoder(workflow_node_executions),
).encode("utf-8"),
)
# Extract IDs for deletion
workflow_node_execution_ids = [
workflow_node_execution.id for workflow_node_execution in workflow_node_executions
]
# Delete the backed up executions
deleted_count = node_execution_repo.delete_executions_by_ids(workflow_node_execution_ids)
total_deleted += deleted_count
click.echo(
click.style(
f"[{datetime.datetime.now()}] Processed {len(workflow_node_execution_ids)}"
f" workflow node executions for tenant {tenant_id}"
)
)
if len(workflow_runs) == 0:
break
# If we got fewer than the batch size, we're done
if len(workflow_node_executions) < batch:
break
# save workflow runs
# Process expired workflow runs with backup
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
workflow_run_repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(session_maker)
before_date = datetime.datetime.now() - datetime.timedelta(days=days)
total_deleted = 0
storage.save(
f"free_plan_tenant_expired_logs/"
f"{tenant_id}/workflow_runs/{datetime.datetime.now().strftime('%Y-%m-%d')}"
f"-{time.time()}.json",
json.dumps(
jsonable_encoder(
[workflow_run.to_dict() for workflow_run in workflow_runs],
),
).encode("utf-8"),
while True:
# Get a batch of expired workflow runs for backup
workflow_runs = workflow_run_repo.get_expired_runs_batch(
tenant_id=tenant_id,
before_date=before_date,
batch_size=batch,
)
if len(workflow_runs) == 0:
break
# Save workflow runs to storage
storage.save(
f"free_plan_tenant_expired_logs/"
f"{tenant_id}/workflow_runs/{datetime.datetime.now().strftime('%Y-%m-%d')}"
f"-{time.time()}.json",
json.dumps(
jsonable_encoder(
[workflow_run.to_dict() for workflow_run in workflow_runs],
),
).encode("utf-8"),
)
# Extract IDs for deletion
workflow_run_ids = [workflow_run.id for workflow_run in workflow_runs]
# Delete the backed up workflow runs
deleted_count = workflow_run_repo.delete_runs_by_ids(workflow_run_ids)
total_deleted += deleted_count
click.echo(
click.style(
f"[{datetime.datetime.now()}] Processed {len(workflow_run_ids)}"
f" workflow runs for tenant {tenant_id}"
)
)
workflow_run_ids = [workflow_run.id for workflow_run in workflow_runs]
# delete workflow runs
session.query(WorkflowRun).filter(
WorkflowRun.id.in_(workflow_run_ids),
).delete(synchronize_session=False)
session.commit()
# If we got fewer than the batch size, we're done
if len(workflow_runs) < batch:
break
@classmethod
def process(cls, days: int, batch: int, tenant_ids: list[str]):

View File

@ -29,7 +29,7 @@ class EnterpriseService:
raise ValueError("No data found.")
try:
# parse the UTC timestamp from the response
return datetime.fromisoformat(data.replace("Z", "+00:00"))
return datetime.fromisoformat(data)
except ValueError as e:
raise ValueError(f"Invalid date format: {data}") from e
@ -40,7 +40,7 @@ class EnterpriseService:
raise ValueError("No data found.")
try:
# parse the UTC timestamp from the response
return datetime.fromisoformat(data.replace("Z", "+00:00"))
return datetime.fromisoformat(data)
except ValueError as e:
raise ValueError(f"Invalid date format: {data}") from e
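
Dropping the `replace("Z", "+00:00")` shim relies on Python 3.11+, where `datetime.fromisoformat()` accepts the trailing "Z" designator directly:

```python
from datetime import datetime, timezone

# Accepted natively on Python 3.11+; on 3.10 and earlier this raises ValueError.
dt = datetime.fromisoformat("2025-07-16T05:07:08Z")
assert dt.tzinfo == timezone.utc
```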

Some files were not shown because too many files have changed in this diff.