Compare commits

...

74 Commits

Author SHA1 Message Date
5f7771bc47 fix: iteration node uses the main thread pool 2024-12-02 21:13:47 +08:00
286741e139 fix: iteration node uses the main thread pool 2024-12-02 21:13:39 +08:00
c4fad66f2a fix: dialogue_count incorrect in chatflow when there's... (#11175) 2024-12-02 16:09:26 +08:00
02572e8cca fix: Claude cannot handle empty strings (#11238)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-02 16:00:40 +08:00
1d8385f7ac Sync INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH between API and Web (#11230) 2024-12-02 15:29:25 +08:00
f8c966c39c fix(workflow_tool): Rename stream to streaming (#11258)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-02 15:00:26 +08:00
3c8efe7c0a fix(workflow_cycle_manage): Handle special values in the process_data. (#11253)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-02 13:53:43 +08:00
dbc10e0feb fix: license str parser. (#11248) 2024-12-02 11:38:18 +08:00
239bf97b47 fix: nvidia special embedding model payload close #11193 (#11239)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-02 10:25:15 +08:00
858db2f239 feat(api): include tags in app information response (#11242) 2024-12-02 10:25:01 +08:00
c34bdb74e6 Fix/type-error (#11240)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-02 10:24:21 +08:00
9601102885 fix(word_extractor): Fix type error and remove stream in ssrf_proxy (#11241)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-02 10:24:03 +08:00
56c2d1cc55 feat: add pagination support for Notion search (#11194) 2024-12-01 21:49:34 +08:00
a67b0d4771 chore(lint): extract ruff configs into .ruff.toml file keeping pyproject.toml clean (#11222) 2024-12-01 12:51:28 +08:00
ef204817ae chore(api/Dockerfile): Bump perl to 0.40.0-8 (#11234)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-01 09:39:02 +08:00
9bc5bc2548 feat: Increase the number of Opening Questions in the Conversation Opener (#11233) 2024-12-01 09:38:45 +08:00
fd4be36991 fix: total tokens wrongly reported as zero in iteration runs, close #11221 (#11224)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-30 23:18:24 +08:00
9b46b02717 refactor: assembling the app features in modular way (#9129)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
2024-11-30 23:05:22 +08:00
3bc4dc58d7 fix: search model does not work as expected (#11225) 2024-11-30 17:31:15 +08:00
594666eb61 fix: use Gemini response metadata for token counting (#11226) 2024-11-30 17:30:55 +08:00
e80f41a701 fix: support setting variables in url (#10676) 2024-11-30 11:15:17 +08:00
f9c2aa7689 feat: add retrieval_top_n to config in env (#11132) 2024-11-30 11:14:45 +08:00
9dd4bf5574 fix: Correct inputs field type in API documentation (#11198) 2024-11-30 11:13:32 +08:00
5a9b785773 fix: excel in node only read one sheet, close #9661 (#11215)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-30 11:11:08 +08:00
d96a28487a fix: 'validation error for ToolInvokeMessage' when blob_message meta is None (#11212) 2024-11-29 17:35:13 +08:00
0554898b5d fix(file_factory): Remove transfer_method validation (#11207)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-11-29 17:26:31 +08:00
6f9ce6a199 fix: Azure gpt-4o (2024-08-06) with JSON schema enabled can't process content = "" (#11204)
Co-authored-by: jiaming.liu <jiaming.liu@zkh.com>
2024-11-29 17:26:07 +08:00
e3119112a6 chore: add Thai GUI (#11201) 2024-11-29 14:20:48 +08:00
d3af0e9090 fix: handleLoadFileFromLink's transfer method incorrect (#11197) 2024-11-29 09:37:50 +08:00
2feb44e2c5 chore(dep): bump flask from 3.0.1 to 3.1.0 and flask-compress to 1.17 (#11195) 2024-11-29 09:28:53 +08:00
cc0b92bc75 Update aws tools (#11174)
Co-authored-by: Yuanbo Li <ybalbert@amazon.com>
2024-11-29 09:28:28 +08:00
e576d32fb6 chore: improve conversation list and rename docs (#11187) 2024-11-29 09:22:08 +08:00
2d6865d421 Ensure consistent float type for cached embedding return values (#10185) 2024-11-29 09:18:41 +08:00
0f1133729f feat: introduce a new environment variable intended to disable Scarf analytics (#11179) 2024-11-28 15:21:04 +08:00
d7160ee563 fix: typo in upstashVector where `if id` was always true; also fix some type hints (#11183)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-28 14:05:25 +08:00
18add94a31 chore: translate i18n files (#11182)
Co-authored-by: JzoNgKVO <27049666+JzoNgKVO@users.noreply.github.com>
2024-11-28 13:21:04 +08:00
18d3ffc194 Feat: new pagination (#11170) 2024-11-28 12:26:02 +08:00
0a30a5b077 Feat: remove GitHub star and community links in the enterprise version (#11180) 2024-11-28 11:02:25 +08:00
9049dd7725 fix: code linting (#11143)
Co-authored-by: 刘江波 <jiangbo721@163.com>
2024-11-27 23:44:51 +08:00
6f418da388 Fixes #11065: tenant_id not found when logging in via ADMIN_KEY (#11066) 2024-11-27 19:50:56 +08:00
41c6bf5fe4 update the scheduler of update_tidb_serverless_status_task to once per 10 minutes (#11135) 2024-11-27 17:41:00 +08:00
33d6d26bbf Adding AWS CDK deploy link in README in multi-language (#11166) 2024-11-27 17:40:40 +08:00
787285d58f fix(file_factory): convert tool file correctly. (#11167)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-11-27 17:28:01 +08:00
40fc6f529e fix: gitee ai wrong default model and better parameter handling (#11168)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-27 17:27:11 +08:00
baef18cedd fix: Incorrect iteration log display in workflows with multiple parallel-mode iteration nodes (#11158)
Co-authored-by: Novice Lee <novicelee@NovicedeMacBook-Pro.local>
2024-11-27 13:42:28 +08:00
a918cea2fe feat: add VTT file support to Document Extractor (#11148) 2024-11-27 11:42:42 +08:00
9789905a1f chore(*): Removes debugging print statements (#11145)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-11-26 22:03:19 +08:00
f458580dee fix: parameter extractor function call "Expected str" error (#11142) 2024-11-26 21:46:56 +08:00
223a30401c fix: LLM invoke error should not be raised (#11141)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-11-26 20:56:48 +08:00
2927493cf3 fix: better handling of GitHub DSL URLs, close #11113 (#11125)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-26 19:39:55 +08:00
79db920fa7 fix: user query not passed after memory is re-enabled (#11136) 2024-11-26 17:55:11 +08:00
b3d65cc7df Feat: Divider component now supports gradient background (#11130) 2024-11-26 17:44:56 +08:00
208d6d6d94 chore: bump to 0.12.1 (#11122) 2024-11-26 15:46:17 +08:00
aa135a3780 Add TTS to OpenAI_API_Compatible (#11071) 2024-11-26 15:14:02 +08:00
044e7b63c2 fix(llm_node): Ignore file if not supported. (#11114) 2024-11-26 14:14:14 +08:00
5b7b328193 feat: Allow files in the system prompt even when the model does not support them. (#11111) 2024-11-26 13:45:49 +08:00
8d5a1be227 fix: Cannot use files in the user inputs. (#11112) 2024-11-26 13:43:38 +08:00
90d5765fb6 fix: app copy raise error (#11108) 2024-11-26 13:42:13 +08:00
1db14793fa fix(anthropic_llm): Ignore non-text parts in the system prompt. (#11107) 2024-11-26 13:31:40 +08:00
cbb4e95928 fix(llm_node): Ignore user query when memory is disabled. (#11106) 2024-11-26 13:07:32 +08:00
20c091a5e7 fix: user query is ignored if query_prompt_template is an empty string (#11103) 2024-11-26 12:47:59 +08:00
e9c098d024 Fix regenerate themes (#11101) 2024-11-26 11:33:04 +08:00
9f75970347 fix: ops_trace_manager from_end_user_id (#11077) 2024-11-26 10:29:00 +08:00
f1366e8e19 fix #11091: raised redirect issue (#11092) 2024-11-26 10:25:42 +08:00
0f85e3557b fix: site icon not showing (#11094) 2024-11-26 10:23:03 +08:00
17ee731546 SearchApi - Return error message instead of raising a ValueError (#11083) 2024-11-26 09:34:51 +08:00
af2461cccc Add query_prefix + Return TED Transcript URL for Downstream Scraping Tasks (#11090) 2024-11-26 09:32:37 +08:00
60c1549771 fix: importing Explore Apps raises an error (#11091) 2024-11-26 09:32:08 +08:00
ab6dcf7032 fix: update the max tokens configuration for Azure GPT-4o (2024-08-06) to 16384 (#11074) 2024-11-25 21:13:02 +08:00
8aae235a71 fix: int(None) causes an error for context size (#11055)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-25 21:04:16 +08:00
c032574491 fix: timezone not imported in conversation service. (#11076) 2024-11-25 20:53:55 +08:00
1065917872 Add grok-vision-beta to xAI + Update grok-beta Features (#11004) 2024-11-25 20:53:03 +08:00
56e361ac44 fix: chart tool Chinese font display and error raising (#11058) 2024-11-25 19:50:33 +08:00
2e00829b1e fix: drop useless and wrong code for zhipu embedding (#11069)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-25 19:50:23 +08:00
247 changed files with 6399 additions and 1253 deletions

View File

@ -48,6 +48,8 @@ jobs:
cp .env.example .env
- name: Run DB Migration
env:
DEBUG: true
run: |
cd api
poetry run python -m flask upgrade-db
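(An inference from this compare, not stated in the commit: the api entrypoint hunk further down only applies gevent monkey-patching when DEBUG is off, so DEBUG: true keeps this one-shot migration command on plain, unpatched threads.)

# Sketch of the gate that the DEBUG flag above toggles, copied in shape
# from the api/app.py hunk later in this compare:
from configs import dify_config

if not dify_config.DEBUG:
    from gevent import monkey

    monkey.patch_all()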

View File

@ -147,6 +147,13 @@ Deploy Dify to Cloud Platform with a single click using [terraform](https://www.
##### Google Cloud
- [Google Cloud Terraform by @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### Using AWS CDK for Deployment
Deploy Dify to AWS with [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Contributing
For those who'd like to contribute code, see our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

View File

@ -190,6 +190,13 @@ docker compose up -d
##### Google Cloud
- [Google Cloud Terraform بواسطة @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### استخدام AWS CDK للنشر
انشر Dify على AWS باستخدام [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK بواسطة @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## المساهمة
لأولئك الذين يرغبون في المساهمة، انظر إلى [دليل المساهمة](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) لدينا.
@ -222,3 +229,10 @@ docker compose up -d
## الرخصة
هذا المستودع متاح تحت [رخصة البرنامج الحر Dify](LICENSE)، والتي تعتبر بشكل أساسي Apache 2.0 مع بعض القيود الإضافية.
## الكشف عن الأمان
لحماية خصوصيتك، يرجى تجنب نشر مشكلات الأمان على GitHub. بدلاً من ذلك، أرسل أسئلتك إلى security@dify.ai وسنقدم لك إجابة أكثر تفصيلاً.
## الرخصة
هذا المستودع متاح تحت [رخصة البرنامج الحر Dify](LICENSE)، والتي تعتبر بشكل أساسي Apache 2.0 مع بعض القيود الإضافية.

View File

@ -213,6 +213,13 @@ docker compose up -d
##### Google Cloud
- [Google Cloud Terraform by @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### 使用 AWS CDK 部署
使用 [CDK](https://aws.amazon.com/cdk/) 将 Dify 部署到 AWS
##### AWS
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)

View File

@ -215,6 +215,13 @@ Despliega Dify en una plataforma en la nube con un solo clic utilizando [terrafo
##### Google Cloud
- [Google Cloud Terraform por @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### Usando AWS CDK para el Despliegue
Despliegue Dify en AWS usando [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK por @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Contribuir
Para aquellos que deseen contribuir con código, consulten nuestra [Guía de contribución](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
@ -248,3 +255,10 @@ Para proteger tu privacidad, evita publicar problemas de seguridad en GitHub. En
## Licencia
Este repositorio está disponible bajo la [Licencia de Código Abierto de Dify](LICENSE), que es esencialmente Apache 2.0 con algunas restricciones adicionales.
## Divulgación de Seguridad
Para proteger tu privacidad, evita publicar problemas de seguridad en GitHub. En su lugar, envía tus preguntas a security@dify.ai y te proporcionaremos una respuesta más detallada.
## Licencia
Este repositorio está disponible bajo la [Licencia de Código Abierto de Dify](LICENSE), que es esencialmente Apache 2.0 con algunas restricciones adicionales.

View File

@ -213,6 +213,13 @@ Déployez Dify sur une plateforme cloud en un clic en utilisant [terraform](http
##### Google Cloud
- [Google Cloud Terraform par @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### Utilisation d'AWS CDK pour le déploiement
Déployez Dify sur AWS en utilisant [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK par @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Contribuer
Pour ceux qui souhaitent contribuer du code, consultez notre [Guide de contribution](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).
@ -246,3 +253,10 @@ Pour protéger votre vie privée, veuillez éviter de publier des problèmes de
## Licence
Ce référentiel est disponible sous la [Licence open source Dify](LICENSE), qui est essentiellement l'Apache 2.0 avec quelques restrictions supplémentaires.
## Divulgation de sécurité
Pour protéger votre vie privée, veuillez éviter de publier des problèmes de sécurité sur GitHub. Au lieu de cela, envoyez vos questions à security@dify.ai et nous vous fournirons une réponse plus détaillée.
## Licence
Ce référentiel est disponible sous la [Licence open source Dify](LICENSE), qui est essentiellement l'Apache 2.0 avec quelques restrictions supplémentaires.

View File

@ -212,6 +212,13 @@ docker compose up -d
##### Google Cloud
- [@sotazumによるGoogle Cloud Terraform](https://github.com/DeNA/dify-google-cloud-terraform)
#### AWS CDK を使用したデプロイ
[CDK](https://aws.amazon.com/cdk/) を使用して、DifyをAWSにデプロイします
##### AWS
- [@KevinZhaoによるAWS CDK](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## 貢献
コードに貢献したい方は、[Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md)を参照してください。

View File

@ -213,6 +213,13 @@ wa'logh nIqHom neH ghun deployment toy'wI' [terraform](https://www.terraform.io/
##### Google Cloud
- [Google Cloud Terraform qachlot @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### AWS CDK atorlugh pilersitsineq
wa'logh nIqHom neH ghun deployment toy'wI' [CDK](https://aws.amazon.com/cdk/) lo'laH.
##### AWS
- [AWS CDK qachlot @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Contributing
For those who'd like to contribute code, see our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

View File

@ -205,6 +205,13 @@ Dify를 Kubernetes에 배포하고 프리미엄 스케일링 설정을 구성했
##### Google Cloud
- [sotazum의 Google Cloud Terraform](https://github.com/DeNA/dify-google-cloud-terraform)
#### AWS CDK를 사용한 배포
[CDK](https://aws.amazon.com/cdk/)를 사용하여 AWS에 Dify 배포
##### AWS
- [KevinZhao의 AWS CDK](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## 기여
코드에 기여하고 싶은 분들은 [기여 가이드](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md)를 참조하세요.

View File

@ -211,6 +211,13 @@ Implante o Dify na Plataforma Cloud com um único clique usando [terraform](http
##### Google Cloud
- [Google Cloud Terraform por @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### Usando AWS CDK para Implantação
Implante o Dify na AWS usando [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK por @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Contribuindo
Para aqueles que desejam contribuir com código, veja nosso [Guia de Contribuição](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

View File

@ -145,6 +145,13 @@ namestite Dify v Cloud Platform z enim klikom z uporabo [terraform](https://www.
##### Google Cloud
- [Google Cloud Terraform by @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### Uporaba AWS CDK za uvajanje
Uvedite Dify v AWS z uporabo [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Prispevam
Za tiste, ki bi radi prispevali kodo, si oglejte naš vodnik za prispevke. Hkrati vas prosimo, da podprete Dify tako, da ga delite na družbenih medijih ter na dogodkih in konferencah.

View File

@ -211,6 +211,13 @@ Dify'ı bulut platformuna tek tıklamayla dağıtın [terraform](https://www.ter
##### Google Cloud
- [Google Cloud Terraform tarafından @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### AWS CDK ile Dağıtım
[CDK](https://aws.amazon.com/cdk/) kullanarak Dify'ı AWS'ye dağıtın
##### AWS
- [AWS CDK tarafından @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Katkıda Bulunma
Kod katkısında bulunmak isteyenler için [Katkı Kılavuzumuza](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) bakabilirsiniz.

View File

@ -207,6 +207,13 @@ Triển khai Dify lên nền tảng đám mây với một cú nhấp chuột b
##### Google Cloud
- [Google Cloud Terraform bởi @sotazum](https://github.com/DeNA/dify-google-cloud-terraform)
#### Sử dụng AWS CDK để Triển khai
Triển khai Dify trên AWS bằng [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK bởi @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
## Đóng góp
Đối với những người muốn đóng góp mã, xem [Hướng dẫn Đóng góp](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) của chúng tôi.

View File

@ -329,6 +329,7 @@ NOTION_INTERNAL_SECRET=you-internal-secret
ETL_TYPE=dify
UNSTRUCTURED_API_URL=
UNSTRUCTURED_API_KEY=
SCARF_NO_ANALYTICS=true
#ssrf
SSRF_PROXY_HTTP_URL=
@ -382,7 +383,7 @@ LOG_DATEFORMAT=%Y-%m-%d %H:%M:%S
LOG_TZ=UTC
# Indexing configuration
INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH=1000
INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH=4000
# Workflow runtime configuration
WORKFLOW_MAX_EXECUTION_STEPS=500
@ -410,4 +411,6 @@ POSITION_PROVIDER_EXCLUDES=
# Reset password token expiry minutes
RESET_PASSWORD_TOKEN_EXPIRY_MINUTES=5
CREATE_TIDB_SERVICE_JOB_ENABLED=false
CREATE_TIDB_SERVICE_JOB_ENABLED=false
RETRIEVAL_TOP_N=0
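These new keys map onto the pydantic-settings fields changed later in this compare (SCARF_NO_ANALYTICS in RagEtlConfig, RETRIEVAL_TOP_N in DataSetConfig, and the 1000 -> 4000 default in IndexingConfig). A minimal sketch of how they surface at runtime, assuming the .env above has been loaded:

from configs import dify_config

print(dify_config.SCARF_NO_ANALYTICS)                        # "true", per the .env above
print(dify_config.RETRIEVAL_TOP_N)                           # 0 (new key)
print(dify_config.INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH)   # 4000 (raised from 1000)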

api/.ruff.toml Normal file
View File

@ -0,0 +1,93 @@
exclude = [
"migrations/*",
]
line-length = 120
[format]
quote-style = "double"
[lint]
preview = true
select = [
"B", # flake8-bugbear rules
"C4", # flake8-comprehensions
"E", # pycodestyle E rules
"F", # pyflakes rules
"FURB", # refurb rules
"I", # isort rules
"N", # pep8-naming
"PT", # flake8-pytest-style rules
"PLC0208", # iteration-over-set
"PLC2801", # unnecessary-dunder-call
"PLC0414", # useless-import-alias
"PLR0402", # manual-from-import
"PLR1711", # useless-return
"PLR1714", # repeated-equality-comparison
"RUF013", # implicit-optional
"RUF019", # unnecessary-key-check
"RUF100", # unused-noqa
"RUF101", # redirected-noqa
"RUF200", # invalid-pyproject-toml
"S506", # unsafe-yaml-load
"SIM", # flake8-simplify rules
"TRY400", # error-instead-of-exception
"TRY401", # verbose-log-message
"UP", # pyupgrade rules
"W191", # tab-indentation
"W605", # invalid-escape-sequence
]
ignore = [
"E402", # module-import-not-at-top-of-file
"E711", # none-comparison
"E712", # true-false-comparison
"E721", # type-comparison
"E722", # bare-except
"E731", # lambda-assignment
"F821", # undefined-name
"F841", # unused-variable
"FURB113", # repeated-append
"FURB152", # math-constant
"UP007", # non-pep604-annotation
"UP032", # f-string
"B005", # strip-with-multi-characters
"B006", # mutable-argument-default
"B007", # unused-loop-control-variable
"B026", # star-arg-unpacking-after-keyword-arg
"B904", # raise-without-from-inside-except
"B905", # zip-without-explicit-strict
"N806", # non-lowercase-variable-in-function
"N815", # mixed-case-variable-in-class-scope
"PT011", # pytest-raises-too-broad
"SIM102", # collapsible-if
"SIM103", # needless-bool
"SIM105", # suppressible-exception
"SIM107", # return-in-try-except-finally
"SIM108", # if-else-block-instead-of-if-exp
"SIM113", # eumerate-for-loop
"SIM117", # multiple-with-statements
"SIM210", # if-expr-with-true-false
"SIM300", # yoda-conditions,
]
[lint.per-file-ignores]
"__init__.py" = [
"F401", # unused-import
"F811", # redefined-while-unused
]
"configs/*" = [
"N802", # invalid-function-name
]
"libs/gmpy2_pkcs10aep_cipher.py" = [
"N803", # invalid-argument-name
]
"tests/*" = [
"F811", # redefined-while-unused
"F401", # unused-import
]
[lint.pyflakes]
extend-generics = [
"_pytest.monkeypatch",
"tests.integration_tests",
]
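Ruff discovers a .ruff.toml placed next to the code it checks, so extracting the rules from pyproject.toml should be behavior-preserving. A quick sanity check, assuming Ruff is installed in the Poetry environment:

# Run from the repository root; Ruff picks up api/.ruff.toml automatically.
import subprocess

subprocess.run(["ruff", "check", "."], cwd="api", check=True)
subprocess.run(["ruff", "format", "--check", "."], cwd="api", check=True)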

View File

@ -55,7 +55,7 @@ RUN apt-get update \
&& echo "deb http://deb.debian.org/debian testing main" > /etc/apt/sources.list \
&& apt-get update \
# For Security
&& apt-get install -y --no-install-recommends expat=2.6.4-1 libldap-2.5-0=2.5.18+dfsg-3+b1 perl=5.40.0-7 libsqlite3-0=3.46.1-1 zlib1g=1:1.3.dfsg+really1.3.1-1+b1 \
&& apt-get install -y --no-install-recommends expat=2.6.4-1 libldap-2.5-0=2.5.18+dfsg-3+b1 perl=5.40.0-8 libsqlite3-0=3.46.1-1 zlib1g=1:1.3.dfsg+really1.3.1-1+b1 \
# install a chinese font to support the use of tools like matplotlib
&& apt-get install -y fonts-noto-cjk \
&& apt-get autoremove -y \

View File

@ -1,113 +1,13 @@
import os
import sys
python_version = sys.version_info
if not ((3, 11) <= python_version < (3, 13)):
print(f"Python 3.11 or 3.12 is required, current version is {python_version.major}.{python_version.minor}")
raise SystemExit(1)
from configs import dify_config
if not dify_config.DEBUG:
from gevent import monkey
monkey.patch_all()
import grpc.experimental.gevent
grpc.experimental.gevent.init_gevent()
import json
import threading
import time
import warnings
from flask import Response
from app_factory import create_app
from libs import threadings_utils, version_utils
# DO NOT REMOVE BELOW
from events import event_handlers # noqa: F401
from extensions.ext_database import db
# TODO: Find a way to avoid importing models here
from models import account, dataset, model, source, task, tool, tools, web # noqa: F401
# DO NOT REMOVE ABOVE
warnings.simplefilter("ignore", ResourceWarning)
os.environ["TZ"] = "UTC"
# windows platform not support tzset
if hasattr(time, "tzset"):
time.tzset()
# preparation before creating app
version_utils.check_supported_python_version()
threadings_utils.apply_gevent_threading_patch()
# create app
app = create_app()
celery = app.extensions["celery"]
if dify_config.TESTING:
print("App is running in TESTING mode")
@app.after_request
def after_request(response):
"""Add Version headers to the response."""
response.headers.add("X-Version", dify_config.CURRENT_VERSION)
response.headers.add("X-Env", dify_config.DEPLOY_ENV)
return response
@app.route("/health")
def health():
return Response(
json.dumps({"pid": os.getpid(), "status": "ok", "version": dify_config.CURRENT_VERSION}),
status=200,
content_type="application/json",
)
@app.route("/threads")
def threads():
num_threads = threading.active_count()
threads = threading.enumerate()
thread_list = []
for thread in threads:
thread_name = thread.name
thread_id = thread.ident
is_alive = thread.is_alive()
thread_list.append(
{
"name": thread_name,
"id": thread_id,
"is_alive": is_alive,
}
)
return {
"pid": os.getpid(),
"thread_num": num_threads,
"threads": thread_list,
}
@app.route("/db-pool-stat")
def pool_stat():
engine = db.engine
return {
"pid": os.getpid(),
"pool_size": engine.pool.size(),
"checked_in_connections": engine.pool.checkedin(),
"checked_out_connections": engine.pool.checkedout(),
"overflow_connections": engine.pool.overflow(),
"connection_timeout": engine.pool.timeout(),
"recycle_time": db.engine.pool._recycle,
}
if __name__ == "__main__":
app.run(host="0.0.0.0", port=5001)
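The hunk header says this file shrank from 113 lines to 13, so most of the lines above are deletions. Stitching together the kept lines visible in the hunk, the new entrypoint is roughly the following (a reconstruction, not the verbatim new file):

from libs import threadings_utils, version_utils

# preparation before creating app
version_utils.check_supported_python_version()
threadings_utils.apply_gevent_threading_patch()

# create app
from app_factory import create_app

app = create_app()
celery = app.extensions["celery"]

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)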

View File

@ -1,54 +1,15 @@
import logging
import os
import time
from configs import dify_config
if not dify_config.DEBUG:
from gevent import monkey
monkey.patch_all()
import grpc.experimental.gevent
grpc.experimental.gevent.init_gevent()
import json
from flask import Flask, Response, request
from flask_cors import CORS
from werkzeug.exceptions import Unauthorized
import contexts
from commands import register_commands
from configs import dify_config
from extensions import (
ext_celery,
ext_code_based_extension,
ext_compress,
ext_database,
ext_hosting_provider,
ext_logging,
ext_login,
ext_mail,
ext_migrate,
ext_proxy_fix,
ext_redis,
ext_sentry,
ext_storage,
)
from extensions.ext_database import db
from extensions.ext_login import login_manager
from libs.passport import PassportService
from services.account_service import AccountService
class DifyApp(Flask):
pass
from dify_app import DifyApp
# ----------------------------
# Application Factory Function
# ----------------------------
def create_flask_app_with_configs() -> Flask:
def create_flask_app_with_configs() -> DifyApp:
"""
create a raw flask app
with configs loaded from .env file
@ -68,111 +29,72 @@ def create_flask_app_with_configs() -> Flask:
return dify_app
def create_app() -> Flask:
def create_app() -> DifyApp:
start_time = time.perf_counter()
app = create_flask_app_with_configs()
app.secret_key = dify_config.SECRET_KEY
initialize_extensions(app)
register_blueprints(app)
register_commands(app)
end_time = time.perf_counter()
if dify_config.DEBUG:
logging.info(f"Finished create_app ({round((end_time - start_time) * 1000, 2)} ms)")
return app
def initialize_extensions(app):
# Since the application instance is now created, pass it to each Flask
# extension instance to bind it to the Flask application instance (app)
ext_logging.init_app(app)
ext_compress.init_app(app)
ext_code_based_extension.init()
ext_database.init_app(app)
ext_migrate.init(app, db)
ext_redis.init_app(app)
ext_storage.init_app(app)
ext_celery.init_app(app)
ext_login.init_app(app)
ext_mail.init_app(app)
ext_hosting_provider.init_app(app)
ext_sentry.init_app(app)
ext_proxy_fix.init_app(app)
# Flask-Login configuration
@login_manager.request_loader
def load_user_from_request(request_from_flask_login):
"""Load user based on the request."""
if request.blueprint not in {"console", "inner_api"}:
return None
# Check if the user_id contains a dot, indicating the old format
auth_header = request.headers.get("Authorization", "")
if not auth_header:
auth_token = request.args.get("_token")
if not auth_token:
raise Unauthorized("Invalid Authorization token.")
else:
if " " not in auth_header:
raise Unauthorized("Invalid Authorization header format. Expected 'Bearer <api-key>' format.")
auth_scheme, auth_token = auth_header.split(None, 1)
auth_scheme = auth_scheme.lower()
if auth_scheme != "bearer":
raise Unauthorized("Invalid Authorization header format. Expected 'Bearer <api-key>' format.")
decoded = PassportService().verify(auth_token)
user_id = decoded.get("user_id")
logged_in_account = AccountService.load_logged_in_account(account_id=user_id)
if logged_in_account:
contexts.tenant_id.set(logged_in_account.current_tenant_id)
return logged_in_account
@login_manager.unauthorized_handler
def unauthorized_handler():
"""Handle unauthorized requests."""
return Response(
json.dumps({"code": "unauthorized", "message": "Unauthorized."}),
status=401,
content_type="application/json",
def initialize_extensions(app: DifyApp):
from extensions import (
ext_app_metrics,
ext_blueprints,
ext_celery,
ext_code_based_extension,
ext_commands,
ext_compress,
ext_database,
ext_hosting_provider,
ext_import_modules,
ext_logging,
ext_login,
ext_mail,
ext_migrate,
ext_proxy_fix,
ext_redis,
ext_sentry,
ext_set_secretkey,
ext_storage,
ext_timezone,
ext_warnings,
)
extensions = [
ext_timezone,
ext_logging,
ext_warnings,
ext_import_modules,
ext_set_secretkey,
ext_compress,
ext_code_based_extension,
ext_database,
ext_app_metrics,
ext_migrate,
ext_redis,
ext_storage,
ext_celery,
ext_login,
ext_mail,
ext_hosting_provider,
ext_sentry,
ext_proxy_fix,
ext_blueprints,
ext_commands,
]
for ext in extensions:
short_name = ext.__name__.split(".")[-1]
is_enabled = ext.is_enabled() if hasattr(ext, "is_enabled") else True
if not is_enabled:
if dify_config.DEBUG:
logging.info(f"Skipped {short_name}")
continue
# register blueprint routers
def register_blueprints(app):
from controllers.console import bp as console_app_bp
from controllers.files import bp as files_bp
from controllers.inner_api import bp as inner_api_bp
from controllers.service_api import bp as service_api_bp
from controllers.web import bp as web_bp
CORS(
service_api_bp,
allow_headers=["Content-Type", "Authorization", "X-App-Code"],
methods=["GET", "PUT", "POST", "DELETE", "OPTIONS", "PATCH"],
)
app.register_blueprint(service_api_bp)
CORS(
web_bp,
resources={r"/*": {"origins": dify_config.WEB_API_CORS_ALLOW_ORIGINS}},
supports_credentials=True,
allow_headers=["Content-Type", "Authorization", "X-App-Code"],
methods=["GET", "PUT", "POST", "DELETE", "OPTIONS", "PATCH"],
expose_headers=["X-Version", "X-Env"],
)
app.register_blueprint(web_bp)
CORS(
console_app_bp,
resources={r"/*": {"origins": dify_config.CONSOLE_CORS_ALLOW_ORIGINS}},
supports_credentials=True,
allow_headers=["Content-Type", "Authorization"],
methods=["GET", "PUT", "POST", "DELETE", "OPTIONS", "PATCH"],
expose_headers=["X-Version", "X-Env"],
)
app.register_blueprint(console_app_bp)
CORS(files_bp, allow_headers=["Content-Type"], methods=["GET", "PUT", "POST", "DELETE", "OPTIONS", "PATCH"])
app.register_blueprint(files_bp)
app.register_blueprint(inner_api_bp)
start_time = time.perf_counter()
ext.init_app(app)
end_time = time.perf_counter()
if dify_config.DEBUG:
logging.info(f"Loaded {short_name} ({round((end_time - start_time) * 1000, 2)} ms)")

View File

@ -640,15 +640,3 @@ where sites.id is null limit 1000"""
break
click.echo(click.style("Fix for missing app-related sites completed successfully!", fg="green"))
def register_commands(app):
app.cli.add_command(reset_password)
app.cli.add_command(reset_email)
app.cli.add_command(reset_encrypt_key_pair)
app.cli.add_command(vdb_migrate)
app.cli.add_command(convert_to_agent_apps)
app.cli.add_command(add_qdrant_doc_id_index)
app.cli.add_command(create_tenant)
app.cli.add_command(upgrade_db)
app.cli.add_command(fix_app_site_missing)

View File

@ -17,11 +17,6 @@ class DeploymentConfig(BaseSettings):
default=False,
)
TESTING: bool = Field(
description="Enable testing mode for running automated tests",
default=False,
)
EDITION: str = Field(
description="Deployment edition of the application (e.g., 'SELF_HOSTED', 'CLOUD')",
default="SELF_HOSTED",

View File

@ -585,6 +585,11 @@ class RagEtlConfig(BaseSettings):
default=None,
)
SCARF_NO_ANALYTICS: Optional[str] = Field(
description="This is about whether to disable Scarf analytics in Unstructured library.",
default="false",
)
class DataSetConfig(BaseSettings):
"""
@ -621,6 +626,8 @@ class DataSetConfig(BaseSettings):
default=30,
)
RETRIEVAL_TOP_N: int = Field(description="number of retrieval top_n", default=0)
class WorkspaceConfig(BaseSettings):
"""
@ -640,7 +647,7 @@ class IndexingConfig(BaseSettings):
INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH: PositiveInt = Field(
description="Maximum token length for text segmentation during indexing",
default=1000,
default=4000,
)

View File

@ -9,7 +9,7 @@ class PackagingInfo(BaseSettings):
CURRENT_VERSION: str = Field(
description="Dify version",
default="0.12.0",
default="0.12.1",
)
COMMIT_SHA: str = Field(

View File

@ -18,6 +18,7 @@ language_timezone_mapping = {
"tr-TR": "Europe/Istanbul",
"fa-IR": "Asia/Tehran",
"sl-SI": "Europe/Ljubljana",
"th-TH": "Asia/Bangkok",
}
languages = list(language_timezone_mapping.keys())

View File

@ -190,7 +190,7 @@ class AppCopyApi(Resource):
)
session.commit()
stmt = select(App).where(App.id == result.app.id)
stmt = select(App).where(App.id == result.app_id)
app = session.scalar(stmt)
return app, 201

View File

@ -34,7 +34,6 @@ class OAuthDataSource(Resource):
OAUTH_DATASOURCE_PROVIDERS = get_oauth_providers()
with current_app.app_context():
oauth_provider = OAUTH_DATASOURCE_PROVIDERS.get(provider)
print(vars(oauth_provider))
if not oauth_provider:
return {"error": "Invalid provider"}, 400
if dify_config.NOTION_INTEGRATION_TYPE == "internal":

View File

@ -52,7 +52,6 @@ class OAuthLogin(Resource):
OAUTH_PROVIDERS = get_oauth_providers()
with current_app.app_context():
oauth_provider = OAUTH_PROVIDERS.get(provider)
print(vars(oauth_provider))
if not oauth_provider:
return {"error": "Invalid provider"}, 400

View File

@ -106,6 +106,7 @@ class GetProcessRuleApi(Resource):
# get default rules
mode = DocumentService.DEFAULT_RULES["mode"]
rules = DocumentService.DEFAULT_RULES["rules"]
limits = DocumentService.DEFAULT_RULES["limits"]
if document_id:
# get the latest process rule
document = Document.query.get_or_404(document_id)
@ -132,7 +133,7 @@ class GetProcessRuleApi(Resource):
mode = dataset_process_rule.mode
rules = dataset_process_rule.rules_dict
return {"mode": mode, "rules": rules}
return {"mode": mode, "rules": rules, "limits": limits}
class DatasetDocumentListApi(Resource):

View File

@ -48,7 +48,8 @@ class AppInfoApi(Resource):
@validate_app_token
def get(self, app_model: App):
"""Get app information"""
return {"name": app_model.name, "description": app_model.description}
tags = [tag.name for tag in app_model.tags]
return {"name": app_model.name, "description": app_model.description, "tags": tags}
api.add_resource(AppParameterApi, "/parameters")

View File

@ -1,3 +1,6 @@
from collections.abc import Mapping
from typing import Any
from core.app.app_config.entities import ModelConfigEntity
from core.model_runtime.entities.model_entities import ModelPropertyKey, ModelType
from core.model_runtime.model_providers import model_provider_factory
@ -36,7 +39,7 @@ class ModelConfigManager:
)
@classmethod
def validate_and_set_defaults(cls, tenant_id: str, config: dict) -> tuple[dict, list[str]]:
def validate_and_set_defaults(cls, tenant_id: str, config: Mapping[str, Any]) -> tuple[dict, list[str]]:
"""
Validate and set defaults for model config

View File

@ -2,8 +2,8 @@ import contextvars
import logging
import threading
import uuid
from collections.abc import Generator
from typing import Any, Literal, Optional, Union, overload
from collections.abc import Generator, Mapping
from typing import Any, Optional, Union
from flask import Flask, current_app
from pydantic import ValidationError
@ -23,6 +23,7 @@ from core.app.entities.app_invoke_entities import AdvancedChatAppGenerateEntity,
from core.app.entities.task_entities import ChatbotAppBlockingResponse, ChatbotAppStreamResponse
from core.model_runtime.errors.invoke import InvokeAuthorizationError, InvokeError
from core.ops.ops_trace_manager import TraceQueueManager
from core.prompt.utils.get_thread_messages_length import get_thread_messages_length
from extensions.ext_database import db
from factories import file_factory
from models.account import Account
@ -33,37 +34,17 @@ logger = logging.getLogger(__name__)
class AdvancedChatAppGenerator(MessageBasedAppGenerator):
@overload
def generate(
self,
app_model: App,
workflow: Workflow,
user: Union[Account, EndUser],
args: dict,
invoke_from: InvokeFrom,
stream: Literal[True] = True,
) -> Generator[str, None, None]: ...
@overload
def generate(
self,
app_model: App,
workflow: Workflow,
user: Union[Account, EndUser],
args: dict,
invoke_from: InvokeFrom,
stream: Literal[False] = False,
) -> dict: ...
_dialogue_count: int
def generate(
self,
app_model: App,
workflow: Workflow,
user: Union[Account, EndUser],
args: dict,
args: Mapping[str, Any],
invoke_from: InvokeFrom,
stream: bool = True,
) -> dict[str, Any] | Generator[str, Any, None]:
streaming: bool = True,
) -> Mapping[str, Any] | Generator[str, None, None]:
"""
Generate App response.
@ -127,12 +108,14 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
conversation_id=conversation.id if conversation else None,
inputs=conversation.inputs
if conversation
else self._prepare_user_inputs(user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.id),
else self._prepare_user_inputs(
user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.tenant_id
),
query=query,
files=file_objs,
parent_message_id=args.get("parent_message_id") if invoke_from != InvokeFrom.SERVICE_API else UUID_NIL,
user_id=user.id,
stream=stream,
stream=streaming,
invoke_from=invoke_from,
extras=extras,
trace_manager=trace_manager,
@ -146,12 +129,12 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
invoke_from=invoke_from,
application_generate_entity=application_generate_entity,
conversation=conversation,
stream=stream,
stream=streaming,
)
def single_iteration_generate(
self, app_model: App, workflow: Workflow, node_id: str, user: Account, args: dict, stream: bool = True
) -> dict[str, Any] | Generator[str, Any, None]:
self, app_model: App, workflow: Workflow, node_id: str, user: Account, args: dict, streaming: bool = True
) -> Mapping[str, Any] | Generator[str, None, None]:
"""
Generate App response.
@ -180,7 +163,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
query="",
files=[],
user_id=user.id,
stream=stream,
stream=streaming,
invoke_from=InvokeFrom.DEBUGGER,
extras={"auto_generate_conversation_name": False},
single_iteration_run=AdvancedChatAppGenerateEntity.SingleIterationRunEntity(
@ -195,7 +178,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
invoke_from=InvokeFrom.DEBUGGER,
application_generate_entity=application_generate_entity,
conversation=None,
stream=stream,
stream=streaming,
)
def _generate(
@ -207,7 +190,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
application_generate_entity: AdvancedChatAppGenerateEntity,
conversation: Optional[Conversation] = None,
stream: bool = True,
) -> dict[str, Any] | Generator[str, Any, None]:
) -> Mapping[str, Any] | Generator[str, None, None]:
"""
Generate App response.
@ -231,6 +214,9 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
db.session.commit()
db.session.refresh(conversation)
# get conversation dialogue count
self._dialogue_count = get_thread_messages_length(conversation.id)
# init queue manager
queue_manager = MessageBasedAppQueueManager(
task_id=application_generate_entity.task_id,
@ -301,6 +287,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
queue_manager=queue_manager,
conversation=conversation,
message=message,
dialogue_count=self._dialogue_count,
)
runner.run()
@ -354,6 +341,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
message=message,
user=user,
stream=stream,
dialogue_count=self._dialogue_count,
)
try:

View File

@ -39,12 +39,14 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
queue_manager: AppQueueManager,
conversation: Conversation,
message: Message,
dialogue_count: int,
) -> None:
super().__init__(queue_manager)
self.application_generate_entity = application_generate_entity
self.conversation = conversation
self.message = message
self._dialogue_count = dialogue_count
def run(self) -> None:
app_config = self.application_generate_entity.app_config
@ -122,19 +124,13 @@ class AdvancedChatAppRunner(WorkflowBasedAppRunner):
session.commit()
# Increment dialogue count.
self.conversation.dialogue_count += 1
conversation_dialogue_count = self.conversation.dialogue_count
db.session.commit()
# Create a variable pool.
system_inputs = {
SystemVariableKey.QUERY: query,
SystemVariableKey.FILES: files,
SystemVariableKey.CONVERSATION_ID: self.conversation.id,
SystemVariableKey.USER_ID: user_id,
SystemVariableKey.DIALOGUE_COUNT: conversation_dialogue_count,
SystemVariableKey.DIALOGUE_COUNT: self._dialogue_count,
SystemVariableKey.APP_ID: app_config.app_id,
SystemVariableKey.WORKFLOW_ID: app_config.workflow_id,
SystemVariableKey.WORKFLOW_RUN_ID: self.application_generate_entity.workflow_run_id,

View File

@ -88,6 +88,7 @@ class AdvancedChatAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCyc
message: Message,
user: Union[Account, EndUser],
stream: bool,
dialogue_count: int,
) -> None:
"""
Initialize AdvancedChatAppGenerateTaskPipeline.
@ -98,6 +99,7 @@ class AdvancedChatAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCyc
:param message: message
:param user: user
:param stream: stream
:param dialogue_count: dialogue count
"""
super().__init__(application_generate_entity, queue_manager, user, stream)
@ -114,7 +116,7 @@ class AdvancedChatAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCyc
SystemVariableKey.FILES: application_generate_entity.files,
SystemVariableKey.CONVERSATION_ID: conversation.id,
SystemVariableKey.USER_ID: user_id,
SystemVariableKey.DIALOGUE_COUNT: conversation.dialogue_count,
SystemVariableKey.DIALOGUE_COUNT: dialogue_count,
SystemVariableKey.APP_ID: application_generate_entity.app_config.app_id,
SystemVariableKey.WORKFLOW_ID: workflow.id,
SystemVariableKey.WORKFLOW_RUN_ID: application_generate_entity.workflow_run_id,
@ -125,6 +127,7 @@ class AdvancedChatAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCyc
self._conversation_name_generate_thread = None
self._recorded_files: list[Mapping[str, Any]] = []
self.total_tokens: int = 0
def process(self):
"""
@ -358,6 +361,8 @@ class AdvancedChatAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCyc
if not workflow_run:
raise Exception("Workflow run not initialized.")
# FIXME: quick fix for issue #11221; a better solution may exist
self.total_tokens += event.metadata.get("total_tokens", 0) if event.metadata else 0
yield self._workflow_iteration_completed_to_stream_response(
task_id=self._application_generate_entity.task_id, workflow_run=workflow_run, event=event
)
@ -371,7 +376,7 @@ class AdvancedChatAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCyc
workflow_run = self._handle_workflow_run_success(
workflow_run=workflow_run,
start_at=graph_runtime_state.start_at,
total_tokens=graph_runtime_state.total_tokens,
total_tokens=graph_runtime_state.total_tokens or self.total_tokens,
total_steps=graph_runtime_state.node_run_steps,
outputs=event.outputs,
conversation_id=self._conversation.id,

View File

@ -1,5 +1,6 @@
import uuid
from typing import Optional
from collections.abc import Mapping
from typing import Any, Optional
from core.agent.entities import AgentEntity
from core.app.app_config.base_app_config_manager import BaseAppConfigManager
@ -85,7 +86,7 @@ class AgentChatAppConfigManager(BaseAppConfigManager):
return app_config
@classmethod
def config_validate(cls, tenant_id: str, config: dict) -> dict:
def config_validate(cls, tenant_id: str, config: Mapping[str, Any]) -> dict:
"""
Validate for agent chat app model config

View File

@ -1,8 +1,8 @@
import logging
import threading
import uuid
from collections.abc import Generator
from typing import Any, Literal, Union, overload
from collections.abc import Generator, Mapping
from typing import Any, Union
from flask import Flask, current_app
from pydantic import ValidationError
@ -28,34 +28,15 @@ logger = logging.getLogger(__name__)
class AgentChatAppGenerator(MessageBasedAppGenerator):
@overload
def generate(
self,
*,
app_model: App,
user: Union[Account, EndUser],
args: dict,
args: Mapping[str, Any],
invoke_from: InvokeFrom,
stream: Literal[True] = True,
) -> Generator[dict, None, None]: ...
@overload
def generate(
self,
app_model: App,
user: Union[Account, EndUser],
args: dict,
invoke_from: InvokeFrom,
stream: Literal[False] = False,
) -> dict: ...
def generate(
self,
app_model: App,
user: Union[Account, EndUser],
args: Any,
invoke_from: InvokeFrom,
stream: bool = True,
) -> Union[dict, Generator[dict, None, None]]:
streaming: bool = True,
) -> Mapping[str, Any] | Generator[str, None, None]:
"""
Generate App response.
@ -65,7 +46,7 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
:param invoke_from: invoke from source
:param stream: is stream
"""
if not stream:
if not streaming:
raise ValueError("Agent Chat App does not support blocking mode")
if not args.get("query"):
@ -96,7 +77,8 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
# validate config
override_model_config_dict = AgentChatAppConfigManager.config_validate(
tenant_id=app_model.tenant_id, config=args.get("model_config")
tenant_id=app_model.tenant_id,
config=args["model_config"],
)
# always enable retriever resource in debugger mode
@ -134,12 +116,14 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
conversation_id=conversation.id if conversation else None,
inputs=conversation.inputs
if conversation
else self._prepare_user_inputs(user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.id),
else self._prepare_user_inputs(
user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.tenant_id
),
query=query,
files=file_objs,
parent_message_id=args.get("parent_message_id") if invoke_from != InvokeFrom.SERVICE_API else UUID_NIL,
user_id=user.id,
stream=stream,
stream=streaming,
invoke_from=invoke_from,
extras=extras,
call_depth=0,
@ -180,7 +164,7 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
conversation=conversation,
message=message,
user=user,
stream=stream,
stream=streaming,
)
return AgentChatAppGenerateResponseConverter.convert(response=response, invoke_from=invoke_from)

View File

@ -1,6 +1,6 @@
import logging
from abc import ABC, abstractmethod
from collections.abc import Generator
from collections.abc import Generator, Mapping
from typing import Any, Union
from core.app.entities.app_invoke_entities import InvokeFrom
@ -14,8 +14,10 @@ class AppGenerateResponseConverter(ABC):
@classmethod
def convert(
cls, response: Union[AppBlockingResponse, Generator[AppStreamResponse, Any, None]], invoke_from: InvokeFrom
) -> dict[str, Any] | Generator[str, Any, None]:
cls,
response: Union[AppBlockingResponse, Generator[AppStreamResponse, Any, None]],
invoke_from: InvokeFrom,
) -> Mapping[str, Any] | Generator[str, None, None]:
if invoke_from in {InvokeFrom.DEBUGGER, InvokeFrom.SERVICE_API}:
if isinstance(response, AppBlockingResponse):
return cls.convert_blocking_full_response(response)

View File

@ -55,7 +55,7 @@ class ChatAppGenerator(MessageBasedAppGenerator):
user: Union[Account, EndUser],
args: Any,
invoke_from: InvokeFrom,
stream: bool = True,
streaming: bool = True,
) -> Union[dict, Generator[str, None, None]]:
"""
Generate App response.
@ -132,7 +132,9 @@ class ChatAppGenerator(MessageBasedAppGenerator):
conversation_id=conversation.id if conversation else None,
inputs=conversation.inputs
if conversation
else self._prepare_user_inputs(user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.id),
else self._prepare_user_inputs(
user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.tenant_id
),
query=query,
files=file_objs,
parent_message_id=args.get("parent_message_id") if invoke_from != InvokeFrom.SERVICE_API else UUID_NIL,
@ -140,7 +142,7 @@ class ChatAppGenerator(MessageBasedAppGenerator):
invoke_from=invoke_from,
extras=extras,
trace_manager=trace_manager,
stream=stream,
stream=streaming,
)
# init generate records
@ -177,7 +179,7 @@ class ChatAppGenerator(MessageBasedAppGenerator):
conversation=conversation,
message=message,
user=user,
stream=stream,
stream=streaming,
)
return ChatAppGenerateResponseConverter.convert(response=response, invoke_from=invoke_from)

View File

@ -50,7 +50,7 @@ class CompletionAppGenerator(MessageBasedAppGenerator):
) -> dict: ...
def generate(
self, app_model: App, user: Union[Account, EndUser], args: Any, invoke_from: InvokeFrom, stream: bool = True
self, app_model: App, user: Union[Account, EndUser], args: Any, invoke_from: InvokeFrom, streaming: bool = True
) -> Union[dict, Generator[str, None, None]]:
"""
Generate App response.
@ -114,12 +114,12 @@ class CompletionAppGenerator(MessageBasedAppGenerator):
model_conf=ModelConfigConverter.convert(app_config),
file_upload_config=file_extra_config,
inputs=self._prepare_user_inputs(
user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.id
user_inputs=inputs, variables=app_config.variables, tenant_id=app_model.tenant_id
),
query=query,
files=file_objs,
user_id=user.id,
stream=stream,
stream=streaming,
invoke_from=invoke_from,
extras=extras,
trace_manager=trace_manager,
@ -158,7 +158,7 @@ class CompletionAppGenerator(MessageBasedAppGenerator):
conversation=conversation,
message=message,
user=user,
stream=stream,
stream=streaming,
)
return CompletionAppGenerateResponseConverter.convert(response=response, invoke_from=invoke_from)

View File

@ -3,7 +3,7 @@ import logging
import threading
import uuid
from collections.abc import Generator, Mapping, Sequence
from typing import Any, Literal, Optional, Union, overload
from typing import Any, Optional, Union
from flask import Flask, current_app
from pydantic import ValidationError
@ -30,43 +30,18 @@ logger = logging.getLogger(__name__)
class WorkflowAppGenerator(BaseAppGenerator):
@overload
def generate(
self,
*,
app_model: App,
workflow: Workflow,
user: Union[Account, EndUser],
args: dict,
invoke_from: InvokeFrom,
stream: Literal[True] = True,
call_depth: int = 0,
workflow_thread_pool_id: Optional[str] = None,
) -> Generator[str, None, None]: ...
@overload
def generate(
self,
app_model: App,
workflow: Workflow,
user: Union[Account, EndUser],
args: dict,
invoke_from: InvokeFrom,
stream: Literal[False] = False,
call_depth: int = 0,
workflow_thread_pool_id: Optional[str] = None,
) -> dict: ...
def generate(
self,
app_model: App,
workflow: Workflow,
user: Union[Account, EndUser],
user: Account | EndUser,
args: Mapping[str, Any],
invoke_from: InvokeFrom,
stream: bool = True,
streaming: bool = True,
call_depth: int = 0,
workflow_thread_pool_id: Optional[str] = None,
):
) -> Mapping[str, Any] | Generator[str, None, None]:
files: Sequence[Mapping[str, Any]] = args.get("files") or []
# parse files
@ -101,7 +76,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
),
files=system_files,
user_id=user.id,
stream=stream,
stream=streaming,
invoke_from=invoke_from,
call_depth=call_depth,
trace_manager=trace_manager,
@ -115,7 +90,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
user=user,
application_generate_entity=application_generate_entity,
invoke_from=invoke_from,
stream=stream,
streaming=streaming,
workflow_thread_pool_id=workflow_thread_pool_id,
)
@ -127,20 +102,9 @@ class WorkflowAppGenerator(BaseAppGenerator):
user: Union[Account, EndUser],
application_generate_entity: WorkflowAppGenerateEntity,
invoke_from: InvokeFrom,
stream: bool = True,
streaming: bool = True,
workflow_thread_pool_id: Optional[str] = None,
) -> dict[str, Any] | Generator[str, None, None]:
"""
Generate App response.
:param app_model: App
:param workflow: Workflow
:param user: account or end user
:param application_generate_entity: application generate entity
:param invoke_from: invoke from source
:param stream: is stream
:param workflow_thread_pool_id: workflow thread pool id
"""
) -> Mapping[str, Any] | Generator[str, None, None]:
# init queue manager
queue_manager = WorkflowAppQueueManager(
task_id=application_generate_entity.task_id,
@ -169,14 +133,20 @@ class WorkflowAppGenerator(BaseAppGenerator):
workflow=workflow,
queue_manager=queue_manager,
user=user,
stream=stream,
stream=streaming,
)
return WorkflowAppGenerateResponseConverter.convert(response=response, invoke_from=invoke_from)
def single_iteration_generate(
self, app_model: App, workflow: Workflow, node_id: str, user: Account, args: dict, stream: bool = True
) -> dict[str, Any] | Generator[str, Any, None]:
self,
app_model: App,
workflow: Workflow,
node_id: str,
user: Account,
args: Mapping[str, Any],
streaming: bool = True,
) -> Mapping[str, Any] | Generator[str, None, None]:
"""
Generate App response.
@ -203,7 +173,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
inputs={},
files=[],
user_id=user.id,
stream=stream,
stream=streaming,
invoke_from=InvokeFrom.DEBUGGER,
extras={"auto_generate_conversation_name": False},
single_iteration_run=WorkflowAppGenerateEntity.SingleIterationRunEntity(
@ -218,7 +188,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
user=user,
invoke_from=InvokeFrom.DEBUGGER,
application_generate_entity=application_generate_entity,
stream=stream,
streaming=streaming,
)
def _generate_worker(
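With the @overload pairs gone, callers pick blocking vs. streaming purely via the renamed flag, and the return type narrows to Mapping | Generator. An illustrative call (the argument values are placeholders):

response = WorkflowAppGenerator().generate(
    app_model=app_model,              # placeholders: an App, a Workflow, an Account/EndUser
    workflow=workflow,
    user=user,
    args={"inputs": {}, "files": []},
    invoke_from=InvokeFrom.DEBUGGER,
    streaming=True,                   # was `stream=` before this change
)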

View File

@ -106,6 +106,7 @@ class WorkflowAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCycleMa
self._task_state = WorkflowTaskState()
self._wip_workflow_node_executions = {}
self.total_tokens: int = 0
def process(self) -> Union[WorkflowAppBlockingResponse, Generator[WorkflowAppStreamResponse, None, None]]:
"""
@ -319,6 +320,8 @@ class WorkflowAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCycleMa
if not workflow_run:
raise Exception("Workflow run not initialized.")
# FIXME: quick fix for issue #11221; a better solution may exist
self.total_tokens += event.metadata.get("total_tokens", 0) if event.metadata else 0
yield self._workflow_iteration_completed_to_stream_response(
task_id=self._application_generate_entity.task_id, workflow_run=workflow_run, event=event
)
@ -332,7 +335,7 @@ class WorkflowAppGenerateTaskPipeline(BasedGenerateTaskPipeline, WorkflowCycleMa
workflow_run = self._handle_workflow_run_success(
workflow_run=workflow_run,
start_at=graph_runtime_state.start_at,
total_tokens=graph_runtime_state.total_tokens,
total_tokens=graph_runtime_state.total_tokens or self.total_tokens,
total_steps=graph_runtime_state.node_run_steps,
outputs=event.outputs,
conversation_id=None,

View File

@ -1,9 +1,9 @@
import logging
import time
import uuid
from collections.abc import Generator
from collections.abc import Generator, Mapping
from datetime import timedelta
from typing import Optional, Union
from typing import Any, Optional, Union
from core.errors.error import AppInvokeQuotaExceededError
from extensions.ext_redis import redis_client
@ -88,20 +88,17 @@ class RateLimit:
def gen_request_key() -> str:
return str(uuid.uuid4())
def generate(self, generator: Union[Generator, callable, dict], request_id: str):
if isinstance(generator, dict):
def generate(self, generator: Union[Generator[str, None, None], Mapping[str, Any]], request_id: str):
if isinstance(generator, Mapping):
return generator
else:
return RateLimitGenerator(self, generator, request_id)
return RateLimitGenerator(rate_limit=self, generator=generator, request_id=request_id)
class RateLimitGenerator:
def __init__(self, rate_limit: RateLimit, generator: Union[Generator, callable], request_id: str):
def __init__(self, rate_limit: RateLimit, generator: Generator[str, None, None], request_id: str):
self.rate_limit = rate_limit
if callable(generator):
self.generator = generator()
else:
self.generator = generator
self.generator = generator
self.request_id = request_id
self.closed = False
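The tightened types make the pass-through rule explicit: blocking responses (mappings) skip wrapping, while streamed responses are wrapped in a RateLimitGenerator, which presumably releases the rate-limit slot once the stream is exhausted or closed. Illustrative use (RateLimit construction is outside this hunk):

request_id = RateLimit.gen_request_key()
wrapped = rate_limit.generate(app_response, request_id)
# Mapping in -> same Mapping out; Generator in -> RateLimitGenerator out.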

View File

@ -340,7 +340,7 @@ class WorkflowCycleManage:
WorkflowNodeExecution.status: WorkflowNodeExecutionStatus.FAILED.value,
WorkflowNodeExecution.error: event.error,
WorkflowNodeExecution.inputs: json.dumps(inputs) if inputs else None,
WorkflowNodeExecution.process_data: json.dumps(event.process_data) if event.process_data else None,
WorkflowNodeExecution.process_data: json.dumps(process_data) if process_data else None,
WorkflowNodeExecution.outputs: json.dumps(outputs) if outputs else None,
WorkflowNodeExecution.finished_at: finished_at,
WorkflowNodeExecution.elapsed_time: elapsed_time,

View File

@ -53,8 +53,6 @@ def make_request(method, url, max_retries=SSRF_DEFAULT_MAX_RETRIES, **kwargs):
response = client.request(method=method, url=url, **kwargs)
if response.status_code not in STATUS_FORCELIST:
if stream:
return response.iter_bytes()
return response
else:
logging.warning(f"Received status code {response.status_code} for URL {url} which is in the force list")
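After this change make_request returns the httpx Response itself even for streaming requests, leaving consumption to the caller (the companion word_extractor commit in this compare stops passing stream). A hedged usage sketch, with url and handle() as placeholders:

response = make_request("GET", url)
for chunk in response.iter_bytes():   # the caller now drives streaming
    handle(chunk)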

View File

@ -15,6 +15,5 @@ class SuggestedQuestionsAfterAnswerOutputParser:
json_obj = json.loads(action_match.group(0).strip())
else:
json_obj = []
print(f"Could not parse LLM output: {text}")
return json_obj

View File

@ -453,7 +453,7 @@ class AnthropicLargeLanguageModel(LargeLanguageModel):
return credentials_kwargs
def _convert_prompt_messages(self, prompt_messages: list[PromptMessage]) -> tuple[str, list[dict]]:
def _convert_prompt_messages(self, prompt_messages: Sequence[PromptMessage]) -> tuple[str, list[dict]]:
"""
Convert prompt messages to dict list and system
"""
@ -461,7 +461,15 @@ class AnthropicLargeLanguageModel(LargeLanguageModel):
first_loop = True
for message in prompt_messages:
if isinstance(message, SystemPromptMessage):
message.content = message.content.strip()
if isinstance(message.content, str):
message.content = message.content.strip()
elif isinstance(message.content, list):
# System prompt only supports text
message.content = "".join(
c.data.strip() for c in message.content if isinstance(c, TextPromptMessageContent)
)
else:
raise ValueError(f"Unknown system prompt message content type {type(message.content)}")
if first_loop:
system = message.content
first_loop = False
@ -475,6 +483,10 @@ class AnthropicLargeLanguageModel(LargeLanguageModel):
if isinstance(message, UserPromptMessage):
message = cast(UserPromptMessage, message)
if isinstance(message.content, str):
# Handle empty user prompts (see #10013, #10520): skip prompts containing
# only whitespace, since the Claude API cannot handle them.
if not message.content.strip():
continue
message_dict = {"role": "user", "content": message.content}
prompt_message_dicts.append(message_dict)
else:
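An illustrative before/after for this conversion, assuming behavior from the code above rather than test output from the PR:

messages = [
    SystemPromptMessage(content="  Be terse.  "),
    UserPromptMessage(content="   "),      # whitespace-only: now skipped
    UserPromptMessage(content="hello"),
]
system, dicts = model._convert_prompt_messages(messages)  # model: an AnthropicLargeLanguageModel
# system == "Be terse."
# dicts == [{"role": "user", "content": "hello"}]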

View File

@ -779,7 +779,7 @@ LLM_BASE_MODELS = [
name="frequency_penalty",
**PARAMETER_RULE_TEMPLATE[DefaultParameterName.FREQUENCY_PENALTY],
),
_get_max_tokens(default=512, min_val=1, max_val=4096),
_get_max_tokens(default=512, min_val=1, max_val=16384),
ParameterRule(
name="seed",
label=I18nObject(zh_Hans="种子", en_US="Seed"),

View File

@ -598,6 +598,9 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
# message = cast(AssistantPromptMessage, message)
message_dict = {"role": "assistant", "content": message.content}
if message.tool_calls:
# fix: Azure with JSON schema enabled can't process an assistant message whose content = ""; replace it with None
if not message.content:
message_dict["content"] = None
message_dict["tool_calls"] = [helper.dump_model(tool_call) for tool_call in message.tool_calls]
elif isinstance(message, SystemPromptMessage):
message = cast(SystemPromptMessage, message)

View File
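
The rule above, reduced to the essentials: when tool calls are present and the assistant content is empty, the content field is set to None so Azure's JSON-schema mode accepts the message. An illustrative sketch with simplified shapes, not the runtime's real types:

def assistant_message_to_dict(content: str, tool_calls: list) -> dict:
    message_dict = {"role": "assistant", "content": content}
    if tool_calls:
        # Azure rejects content == "" on tool-calling assistant messages
        # when JSON schema output is enabled, but accepts content == None
        if not content:
            message_dict["content"] = None
        message_dict["tool_calls"] = tool_calls
    return message_dict


assert assistant_message_to_dict("", [{"id": "call_1"}])["content"] is None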

@ -14,7 +14,7 @@ from core.model_runtime.model_providers.azure_openai._constant import TTS_BASE_M
class AzureOpenAIText2SpeechModel(_CommonAzureOpenAI, TTSModel):
"""
Model class for OpenAI Speech to text model.
Model class for OpenAI text2speech model.
"""
def _invoke(

View File

@ -32,12 +32,12 @@ class GiteeAILargeLanguageModel(OAIAPICompatLargeLanguageModel):
return super()._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
def validate_credentials(self, model: str, credentials: dict) -> None:
self._add_custom_parameters(credentials, model, None)
self._add_custom_parameters(credentials, None)
super().validate_credentials(model, credentials)
def _add_custom_parameters(self, credentials: dict, model: str, model_parameters: dict) -> None:
def _add_custom_parameters(self, credentials: dict, model: Optional[str]) -> None:
if model is None:
model = "bge-large-zh-v1.5"
model = "Qwen2-72B-Instruct"
model_identity = GiteeAILargeLanguageModel.MODEL_TO_IDENTITY.get(model, model)
credentials["endpoint_url"] = f"https://ai.gitee.com/api/serverless/{model_identity}/"
@ -47,5 +47,7 @@ class GiteeAILargeLanguageModel(OAIAPICompatLargeLanguageModel):
credentials["mode"] = LLMMode.CHAT.value
schema = self.get_model_schema(model, credentials)
assert schema is not None, f"Model schema not found for model {model}"
assert schema.features is not None, f"Model features not found for model {model}"
if ModelFeature.TOOL_CALL in schema.features or ModelFeature.MULTI_TOOL_CALL in schema.features:
credentials["function_calling_type"] = "tool_call"

View File

@ -122,7 +122,7 @@ class GiteeAIRerankModel(RerankModel):
label=I18nObject(en_US=model),
model_type=ModelType.RERANK,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512))},
)
return entity

View File
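
This context_size change repeats across several providers below (GPUStack, Jina, Ollama, Voyage, the OpenAI-compatible models). The reason, in two lines: int(None) raises a TypeError when the credential is absent, while a per-provider default keeps schema construction working:

credentials: dict = {}
# int(credentials.get("context_size")) would raise TypeError: int(None)
context_size = int(credentials.get("context_size", 512))  # falls back to 512
assert context_size == 512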

@ -10,7 +10,7 @@ from core.model_runtime.model_providers.gitee_ai._common import _CommonGiteeAI
class GiteeAIText2SpeechModel(_CommonGiteeAI, TTSModel):
"""
Model class for OpenAI Speech to text model.
Model class for OpenAI text2speech model.
"""
def _invoke(

View File

@ -254,8 +254,12 @@ class GoogleLargeLanguageModel(LargeLanguageModel):
assistant_prompt_message = AssistantPromptMessage(content=response.text)
# calculate num tokens
prompt_tokens = self.get_num_tokens(model, credentials, prompt_messages)
completion_tokens = self.get_num_tokens(model, credentials, [assistant_prompt_message])
if response.usage_metadata:
prompt_tokens = response.usage_metadata.prompt_token_count
completion_tokens = response.usage_metadata.candidates_token_count
else:
prompt_tokens = self.get_num_tokens(model, credentials, prompt_messages)
completion_tokens = self.get_num_tokens(model, credentials, [assistant_prompt_message])
# transform usage
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)

View File
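
A sketch of the fallback logic above, assuming a response shaped like the google-generativeai SDK's (usage_metadata carrying prompt_token_count and candidates_token_count); the estimator callbacks stand in for the runtime's get_num_tokens:

def count_tokens(response, estimate_prompt_tokens, estimate_completion_tokens):
    if getattr(response, "usage_metadata", None):
        # prefer the authoritative counts reported by the Gemini API
        prompt_tokens = response.usage_metadata.prompt_token_count
        completion_tokens = response.usage_metadata.candidates_token_count
    else:
        # fall back to local estimation when usage metadata is missing
        prompt_tokens = estimate_prompt_tokens()
        completion_tokens = estimate_completion_tokens()
    return prompt_tokens, completion_tokens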

@ -140,7 +140,7 @@ class GPUStackRerankModel(RerankModel):
label=I18nObject(en_US=model),
model_type=ModelType.RERANK,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512))},
)
return entity

View File

@ -128,7 +128,7 @@ class JinaRerankModel(RerankModel):
label=I18nObject(en_US=model),
model_type=ModelType.RERANK,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 8000))},
)
return entity

View File

@ -193,7 +193,7 @@ class JinaTextEmbeddingModel(TextEmbeddingModel):
label=I18nObject(en_US=model),
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 8000))},
)
return entity

View File

@ -139,7 +139,7 @@ class OllamaEmbeddingModel(TextEmbeddingModel):
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size")),
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512)),
ModelPropertyKey.MAX_CHUNKS: 1,
},
parameter_rules=[],

View File

@ -943,6 +943,9 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
}
elif isinstance(message, SystemPromptMessage):
message = cast(SystemPromptMessage, message)
if isinstance(message.content, list):
text_contents = filter(lambda c: isinstance(c, TextPromptMessageContent), message.content)
message.content = "".join(c.data for c in text_contents)
message_dict = {"role": "system", "content": message.content}
elif isinstance(message, ToolPromptMessage):
message = cast(ToolPromptMessage, message)

View File

@ -11,7 +11,7 @@ from core.model_runtime.model_providers.openai._common import _CommonOpenAI
class OpenAIText2SpeechModel(_CommonOpenAI, TTSModel):
"""
Model class for OpenAI Speech to text model.
Model class for OpenAI text2speech model.
"""
def _invoke(

View File

@ -9,6 +9,7 @@ supported_model_types:
- text-embedding
- speech2text
- rerank
- tts
configurate_methods:
- customizable-model
model_credential_schema:
@ -67,7 +68,7 @@ model_credential_schema:
- variable: __model_type
value: llm
type: text-input
default: '4096'
default: "4096"
placeholder:
zh_Hans: 在此输入您的模型上下文长度
en_US: Enter your Model context size
@ -80,7 +81,7 @@ model_credential_schema:
- variable: __model_type
value: text-embedding
type: text-input
default: '4096'
default: "4096"
placeholder:
zh_Hans: 在此输入您的模型上下文长度
en_US: Enter your Model context size
@ -93,7 +94,7 @@ model_credential_schema:
- variable: __model_type
value: rerank
type: text-input
default: '4096'
default: "4096"
placeholder:
zh_Hans: 在此输入您的模型上下文长度
en_US: Enter your Model context size
@ -104,7 +105,7 @@ model_credential_schema:
show_on:
- variable: __model_type
value: llm
default: '4096'
default: "4096"
type: text-input
- variable: function_calling_type
show_on:
@ -174,3 +175,19 @@ model_credential_schema:
value: llm
default: '\n\n'
type: text-input
- variable: voices
show_on:
- variable: __model_type
value: tts
label:
en_US: Available Voices (comma-separated)
zh_Hans: 可用声音(用英文逗号分隔)
type: text-input
required: false
default: "alloy"
placeholder:
en_US: "alloy,echo,fable,onyx,nova,shimmer"
zh_Hans: "alloy,echo,fable,onyx,nova,shimmer"
help:
en_US: "List voice names separated by commas. First voice will be used as default."
zh_Hans: "用英文逗号分隔的声音列表。第一个声音将作为默认值。"

View File

@ -139,13 +139,17 @@ class OAICompatEmbeddingModel(_CommonOaiApiCompat, TextEmbeddingModel):
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
endpoint_url = credentials.get("endpoint_url")
endpoint_url = credentials.get("endpoint_url", "")
if not endpoint_url.endswith("/"):
endpoint_url += "/"
endpoint_url = urljoin(endpoint_url, "embeddings")
payload = {"input": "ping", "model": model}
# For nvidia models, "input_type": "query" must be included in the payload;
# see issue #11193 or NvidiaTextEmbeddingModel for details
if model.startswith("nvidia/"):
payload["input_type"] = "query"
response = requests.post(url=endpoint_url, headers=headers, data=json.dumps(payload), timeout=(10, 300))
@ -176,7 +180,7 @@ class OAICompatEmbeddingModel(_CommonOaiApiCompat, TextEmbeddingModel):
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size")),
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512)),
ModelPropertyKey.MAX_CHUNKS: 1,
},
parameter_rules=[],

View File

@ -0,0 +1,145 @@
from collections.abc import Iterable
from typing import Optional
from urllib.parse import urljoin
import requests
from core.model_runtime.entities.common_entities import I18nObject
from core.model_runtime.entities.model_entities import AIModelEntity, FetchFrom, ModelPropertyKey, ModelType
from core.model_runtime.errors.invoke import InvokeBadRequestError
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.tts_model import TTSModel
from core.model_runtime.model_providers.openai_api_compatible._common import _CommonOaiApiCompat
class OAICompatText2SpeechModel(_CommonOaiApiCompat, TTSModel):
"""
Model class for OpenAI-compatible text2speech model.
"""
def _invoke(
self,
model: str,
tenant_id: str,
credentials: dict,
content_text: str,
voice: str,
user: Optional[str] = None,
) -> Iterable[bytes]:
"""
Invoke TTS model
:param model: model name
:param tenant_id: user tenant id
:param credentials: model credentials
:param content_text: text content to be translated
:param voice: model voice/speaker
:param user: unique user id
:return: audio data as bytes iterator
"""
# Set up headers with authentication if provided
headers = {}
if api_key := credentials.get("api_key"):
headers["Authorization"] = f"Bearer {api_key}"
# Construct endpoint URL
endpoint_url = credentials.get("endpoint_url", "")
if not endpoint_url.endswith("/"):
endpoint_url += "/"
endpoint_url = urljoin(endpoint_url, "audio/speech")
# Get audio format from model properties
audio_format = self._get_model_audio_type(model, credentials)
# Split text into chunks if needed based on word limit
word_limit = self._get_model_word_limit(model, credentials)
sentences = self._split_text_into_sentences(content_text, word_limit)
for sentence in sentences:
# Prepare request payload
payload = {"model": model, "input": sentence, "voice": voice, "response_format": audio_format}
# Make POST request
response = requests.post(endpoint_url, headers=headers, json=payload, stream=True)
if response.status_code != 200:
raise InvokeBadRequestError(response.text)
# Stream the audio data
for chunk in response.iter_content(chunk_size=4096):
if chunk:
yield chunk
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
Validate model credentials
:param model: model name
:param credentials: model credentials
:return:
"""
try:
# Get default voice for validation
voice = self._get_model_default_voice(model, credentials)
# Test with a simple text
next(
self._invoke(
model=model, tenant_id="validate", credentials=credentials, content_text="Test.", voice=voice
)
)
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
"""
Get customizable model schema
"""
# Parse voices from comma-separated string
voice_names = credentials.get("voices", "alloy").strip().split(",")
voices = []
for voice in voice_names:
voice = voice.strip()
if not voice:
continue
# Use en-US for all voices
voices.append(
{
"name": voice,
"mode": voice,
"language": "en-US",
}
)
# If no voices provided or all voices were empty strings, use 'alloy' as default
if not voices:
voices = [{"name": "Alloy", "mode": "alloy", "language": "en-US"}]
return AIModelEntity(
model=model,
label=I18nObject(en_US=model),
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_type=ModelType.TTS,
model_properties={
ModelPropertyKey.AUDIO_TYPE: credentials.get("audio_type", "mp3"),
ModelPropertyKey.WORD_LIMIT: int(credentials.get("word_limit", 4096)),
ModelPropertyKey.DEFAULT_VOICE: voices[0]["mode"],
ModelPropertyKey.VOICES: voices,
},
)
def get_tts_model_voices(self, model: str, credentials: dict, language: Optional[str] = None) -> list:
"""
Override base get_tts_model_voices to handle customizable voices
"""
model_schema = self.get_customizable_model_schema(model, credentials)
if not model_schema or ModelPropertyKey.VOICES not in model_schema.model_properties:
raise ValueError("this model does not support voice")
voices = model_schema.model_properties[ModelPropertyKey.VOICES]
# Always return all voices regardless of language
return [{"name": d["name"], "value": d["mode"]} for d in voices]

View File
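
How the voices credential above behaves for a typical value, as a quick usage sketch (whitespace and empty entries are dropped; the first surviving entry becomes the default voice):

credentials = {"voices": " alloy , echo ,, nova "}
voice_names = credentials.get("voices", "alloy").strip().split(",")
voices = [v.strip() for v in voice_names if v.strip()]
assert voices == ["alloy", "echo", "nova"]  # voices[0] is used as the default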

@ -182,7 +182,7 @@ class OAICompatEmbeddingModel(_CommonOaiApiCompat, TextEmbeddingModel):
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size")),
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512)),
ModelPropertyKey.MAX_CHUNKS: 1,
},
parameter_rules=[],

View File

@ -173,7 +173,7 @@ class VertexAiTextEmbeddingModel(_CommonVertexAi, TextEmbeddingModel):
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size")),
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512)),
ModelPropertyKey.MAX_CHUNKS: 1,
},
parameter_rules=[],

View File

@ -166,7 +166,7 @@ class VoyageTextEmbeddingModel(TextEmbeddingModel):
label=I18nObject(en_US=model),
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size"))},
model_properties={ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512))},
)
return entity

View File

@ -1,9 +1,12 @@
model: grok-beta
label:
en_US: Grok beta
en_US: Grok Beta
model_type: llm
features:
- agent-thought
- tool-call
- multi-tool-call
- stream-tool-call
model_properties:
mode: chat
context_size: 131072

View File

@ -0,0 +1,64 @@
model: grok-vision-beta
label:
en_US: Grok Vision Beta
model_type: llm
features:
- agent-thought
- vision
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
label:
en_US: "Temperature"
zh_Hans: "采样温度"
type: float
default: 0.7
min: 0.0
max: 2.0
precision: 1
required: true
help:
en_US: "The randomness of the sampling temperature control output. The temperature value is within the range of [0.0, 1.0]. The higher the value, the more random and creative the output; the lower the value, the more stable it is. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样温度控制输出的随机性。温度值在 [0.0, 1.0] 范围内,值越高,输出越随机和创造性;值越低,输出越稳定。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: top_p
label:
en_US: "Top P"
zh_Hans: "Top P"
type: float
default: 0.7
min: 0.0
max: 1.0
precision: 1
required: true
help:
en_US: "The value range of the sampling method is [0.0, 1.0]. The top_p value determines that the model selects tokens from the top p% of candidate words with the highest probability; when top_p is 0, this parameter is invalid. It is recommended to adjust either top_p or temperature parameters according to your needs to avoid adjusting both at the same time."
zh_Hans: "采样方法的取值范围为 [0.0,1.0]。top_p 值确定模型从概率最高的前p%的候选词中选取 tokens当 top_p 为 0 时,此参数无效。建议根据需求调整 top_p 或 temperature 参数,避免同时调整两者。"
- name: frequency_penalty
use_template: frequency_penalty
label:
en_US: "Frequency Penalty"
zh_Hans: "频率惩罚"
type: float
default: 0
min: 0
max: 2.0
precision: 1
required: false
help:
en_US: "Number between 0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim."
zh_Hans: "介于0和2.0之间的数字。正值会根据新标记在文本中迄今为止的现有频率来惩罚它们,从而降低模型一字不差地重复同一句话的可能性。"
- name: user
use_template: text
label:
en_US: "User"
zh_Hans: "用户"
type: string
required: false
help:
en_US: "Used to track and differentiate conversation requests from different users."
zh_Hans: "用于追踪和区分不同用户的对话请求。"

View File

@ -35,3 +35,5 @@ class XAILargeLanguageModel(OAIAPICompatLargeLanguageModel):
credentials["endpoint_url"] = str(URL(credentials["endpoint_url"])) or "https://api.x.ai/v1"
credentials["mode"] = LLMMode.CHAT.value
credentials["function_calling_type"] = "tool_call"
credentials["stream_function_calling"] = "support"
credentials["vision_support"] = "support"

View File

@ -105,17 +105,6 @@ class ZhipuAITextEmbeddingModel(_CommonZhipuaiAI, TextEmbeddingModel):
return [list(map(float, e)) for e in embeddings], embedding_used_tokens
def embed_query(self, text: str) -> list[float]:
"""Call out to ZhipuAI's embedding endpoint.
Args:
text: The text to embed.
Returns:
Embeddings for the text.
"""
return self.embed_documents([text])[0]
def _calc_response_usage(self, model: str, credentials: dict, tokens: int) -> EmbeddingUsage:
"""
Calculate response usage

View File

@ -445,7 +445,7 @@ class TraceTask:
"ls_provider": message_data.model_provider,
"ls_model_name": message_data.model_id,
"status": message_data.status,
"from_end_user_id": message_data.from_account_id,
"from_end_user_id": message_data.from_end_user_id,
"from_account_id": message_data.from_account_id,
"agent_based": message_data.agent_based,
"workflow_run_id": message_data.workflow_run_id,
@ -521,7 +521,7 @@ class TraceTask:
"ls_provider": message_data.model_provider,
"ls_model_name": message_data.model_id,
"status": message_data.status,
"from_end_user_id": message_data.from_account_id,
"from_end_user_id": message_data.from_end_user_id,
"from_account_id": message_data.from_account_id,
"agent_based": message_data.agent_based,
"workflow_run_id": message_data.workflow_run_id,
@ -570,7 +570,7 @@ class TraceTask:
"ls_provider": message_data.model_provider,
"ls_model_name": message_data.model_id,
"status": message_data.status,
"from_end_user_id": message_data.from_account_id,
"from_end_user_id": message_data.from_end_user_id,
"from_account_id": message_data.from_account_id,
"agent_based": message_data.agent_based,
"workflow_run_id": message_data.workflow_run_id,

View File

@ -0,0 +1,32 @@
from core.prompt.utils.extract_thread_messages import extract_thread_messages
from extensions.ext_database import db
from models.model import Message
def get_thread_messages_length(conversation_id: str) -> int:
"""
Get the number of thread messages based on the parent message id.
"""
# Fetch all messages related to the conversation
query = (
db.session.query(
Message.id,
Message.parent_message_id,
Message.answer,
)
.filter(
Message.conversation_id == conversation_id,
)
.order_by(Message.created_at.desc())
)
messages = query.all()
# Extract thread messages
thread_messages = extract_thread_messages(messages)
# Exclude the newly created message with an empty answer
if thread_messages and not thread_messages[0].answer:
thread_messages.pop(0)
return len(thread_messages)

View File
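
extract_thread_messages itself is outside this diff; a hedged sketch of its assumed behavior, walking parent_message_id links from the newest message (the query above orders messages newest-first):

def extract_thread_messages(messages):
    """Assumed behavior: keep only the active thread from a newest-first list."""
    thread = []
    next_id = None
    for message in messages:
        if not thread:
            thread.append(message)  # the newest message starts the thread
            next_id = message.parent_message_id
        elif next_id and message.id == next_id:
            thread.append(message)  # follow the parent link to the older message
            next_id = message.parent_message_id
    return thread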

@ -3,6 +3,7 @@ from typing import Optional
from flask import Flask, current_app
from configs import DifyConfig
from core.rag.data_post_processor.data_post_processor import DataPostProcessor
from core.rag.datasource.keyword.keyword_factory import Keyword
from core.rag.datasource.vdb.vector_factory import Vector
@ -110,8 +111,12 @@ class RetrievalService:
str(dataset.tenant_id), reranking_mode, reranking_model, weights, False
)
all_documents = data_post_processor.invoke(
query=query, documents=all_documents, score_threshold=score_threshold, top_n=top_k
query=query,
documents=all_documents,
score_threshold=score_threshold,
top_n=DifyConfig.RETRIEVAL_TOP_N or top_k,
)
return all_documents
@classmethod
@ -178,7 +183,10 @@ class RetrievalService:
)
all_documents.extend(
data_post_processor.invoke(
query=query, documents=documents, score_threshold=score_threshold, top_n=len(documents)
query=query,
documents=documents,
score_threshold=score_threshold,
top_n=DifyConfig.RETRIEVAL_TOP_N or len(documents),
)
)
else:
@ -220,7 +228,10 @@ class RetrievalService:
)
all_documents.extend(
data_post_processor.invoke(
query=query, documents=documents, score_threshold=score_threshold, top_n=len(documents)
query=query,
documents=documents,
score_threshold=score_threshold,
top_n=DifyConfig.RETRIEVAL_TOP_N or len(documents),
)
)
else:

View File
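
The semantics of the DifyConfig.RETRIEVAL_TOP_N or ... expression above, spelled out: a truthy configured value overrides the caller's top_n, while 0 or None falls back to the previous behavior (top_k, or len(documents)):

def effective_top_n(configured, fallback: int) -> int:
    # `or` picks the override only when it is truthy; 0/None defer to fallback
    return configured or fallback


assert effective_top_n(10, 4) == 10  # RETRIEVAL_TOP_N set: override wins
assert effective_top_n(0, 4) == 4    # unset/zero: keep top_k / len(documents)
assert effective_top_n(None, 7) == 7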

@ -230,7 +230,6 @@ class OracleVector(BaseVector):
except LookupError:
nltk.download("punkt")
nltk.download("stopwords")
print("run download")
e_str = re.sub(r"[^\w ]", "", query)
all_tokens = nltk.word_tokenize(e_str)
stop_words = stopwords.words("english")

View File

@ -64,7 +64,7 @@ class UpstashVector(BaseVector):
item_ids = []
for doc_id in ids:
ids = self.get_ids_by_metadata_field("doc_id", doc_id)
if id:
if ids:
item_ids += ids
self._delete_by_ids(ids=item_ids)
@ -95,9 +95,10 @@ class UpstashVector(BaseVector):
metadata = record.metadata
text = record.data
score = record.score
metadata["score"] = score
if score > score_threshold:
docs.append(Document(page_content=text, metadata=metadata))
if metadata is not None and text is not None:
metadata["score"] = score
if score > score_threshold:
docs.append(Document(page_content=text, metadata=metadata))
return docs
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
@ -123,7 +124,7 @@ class UpstashVectorFactory(AbstractVectorFactory):
return UpstashVector(
collection_name=collection_name,
config=UpstashVectorConfig(
url=dify_config.UPSTASH_VECTOR_URL,
token=dify_config.UPSTASH_VECTOR_TOKEN,
url=dify_config.UPSTASH_VECTOR_URL or "",
token=dify_config.UPSTASH_VECTOR_TOKEN or "",
),
)

View File

@ -102,7 +102,8 @@ class CacheEmbedding(Embeddings):
embedding = redis_client.get(embedding_cache_key)
if embedding:
redis_client.expire(embedding_cache_key, 600)
return list(np.frombuffer(base64.b64decode(embedding), dtype="float"))
decoded_embedding = np.frombuffer(base64.b64decode(embedding), dtype="float")
return [float(x) for x in decoded_embedding]
try:
embedding_result = self._model_instance.invoke_text_embedding(
texts=[text], user=self._user, input_type=EmbeddingInputType.QUERY

View File
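
Why the decode change above matters: the Redis cache stores embeddings as base64-encoded raw float64 bytes, and np.frombuffer yields numpy scalars rather than plain floats, which can trip up downstream consumers. The round trip in miniature:

import base64

import numpy as np

embedding = [0.1, 0.2, 0.3]
encoded = base64.b64encode(np.array(embedding, dtype="float").tobytes())

decoded = np.frombuffer(base64.b64decode(encoded), dtype="float")
result = [float(x) for x in decoded]  # plain Python floats, JSON-serializable
assert result == embedding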

@ -86,7 +86,7 @@ class WordExtractor(BaseExtractor):
image_count += 1
if rel.is_external:
url = rel.reltype
response = ssrf_proxy.get(url, stream=True)
response = ssrf_proxy.get(url)
if response.status_code == 200:
image_ext = mimetypes.guess_extension(response.headers["Content-Type"])
file_uuid = str(uuid.uuid4())

View File

@ -12,7 +12,7 @@ class LambdaTranslateUtilsTool(BuiltinTool):
def _invoke_lambda(self, text_content, src_lang, dest_lang, model_id, dictionary_name, request_type, lambda_name):
msg = {
"src_content": text_content,
"src_contents": [text_content],
"src_lang": src_lang,
"dest_lang": dest_lang,
"dictionary_id": dictionary_name,

View File

@ -8,9 +8,9 @@ identity:
icon: icon.svg
description:
human:
en_US: A util tools for LLM translation, extra deployment is needed on AWS. Please refer Github Repo - https://github.com/ybalbert001/dynamodb-rag
zh_Hans: 大语言模型翻译工具(专词映射获取)需要在AWS上进行额外部署可参考Github Repo - https://github.com/ybalbert001/dynamodb-rag
pt_BR: A util tools for LLM translation, specific Lambda Function deployment is needed on AWS. Please refer Github Repo - https://github.com/ybalbert001/dynamodb-rag
en_US: A util tools for LLM translation, extra deployment is needed on AWS. Please refer Github Repo - https://github.com/aws-samples/rag-based-translation-with-dynamodb-and-bedrock
zh_Hans: 大语言模型翻译工具(专词映射获取)需要在AWS上进行额外部署可参考Github Repo - https://github.com/aws-samples/rag-based-translation-with-dynamodb-and-bedrock
pt_BR: A util tools for LLM translation, specific Lambda Function deployment is needed on AWS. Please refer Github Repo - https://github.com/aws-samples/rag-based-translation-with-dynamodb-and-bedrock
llm: A util tools for translation.
parameters:
- name: text_content

View File

@ -0,0 +1,67 @@
import json
from typing import Any, Union
import boto3
from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.tool.builtin_tool import BuiltinTool
# define the label mapping
LABEL_MAPPING = {"LABEL_0": "SAFE", "LABEL_1": "NO_SAFE"}
class ContentModerationTool(BuiltinTool):
sagemaker_client: Any = None
sagemaker_endpoint: Union[str, None] = None
def _invoke_sagemaker(self, payload: dict, endpoint: str):
response = self.sagemaker_client.invoke_endpoint(
EndpointName=endpoint,
Body=json.dumps(payload),
ContentType="application/json",
)
# Parse response
response_body = response["Body"].read().decode("utf8")
json_obj = json.loads(response_body)
# Handle nested JSON if present
if isinstance(json_obj, dict) and "body" in json_obj:
body_content = json.loads(json_obj["body"])
raw_label = body_content.get("label")
else:
raw_label = json_obj.get("label")
# map the label and return
result = LABEL_MAPPING.get(raw_label, "NO_SAFE")  # default to NO_SAFE if the label is not found in the mapping
return result
def _invoke(
self,
user_id: str,
tool_parameters: dict[str, Any],
) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
"""
invoke tools
"""
try:
if not self.sagemaker_client:
aws_region = tool_parameters.get("aws_region")
if aws_region:
self.sagemaker_client = boto3.client("sagemaker-runtime", region_name=aws_region)
else:
self.sagemaker_client = boto3.client("sagemaker-runtime")
if not self.sagemaker_endpoint:
self.sagemaker_endpoint = tool_parameters.get("sagemaker_endpoint")
content_text = tool_parameters.get("content_text")
payload = {"text": content_text}
result = self._invoke_sagemaker(payload, self.sagemaker_endpoint)
return self.create_text_message(text=result)
except Exception as e:
return self.create_text_message(f"Exception {str(e)}")

View File

@ -0,0 +1,46 @@
identity:
name: chinese_toxicity_detector
author: AWS
label:
en_US: Chinese Toxicity Detector
zh_Hans: 中文有害内容检测
icon: icon.svg
description:
human:
en_US: A tool to detect Chinese toxicity
zh_Hans: 检测中文有害内容的工具
llm: A tool that checks if Chinese content is safe for work
parameters:
- name: sagemaker_endpoint
type: string
required: true
label:
en_US: sagemaker endpoint for moderation
zh_Hans: 内容审核的SageMaker端点
human_description:
en_US: sagemaker endpoint for content moderation
zh_Hans: 内容审核的SageMaker端点
llm_description: sagemaker endpoint for content moderation
form: form
- name: content_text
type: string
required: true
label:
en_US: content text
zh_Hans: 待审核文本
human_description:
en_US: text content to be moderated
zh_Hans: 需要审核的文本内容
llm_description: text content to be moderated
form: llm
- name: aws_region
type: string
required: false
label:
en_US: region of sagemaker endpoint
zh_Hans: SageMaker 端点所在的region
human_description:
en_US: region of sagemaker endpoint
zh_Hans: SageMaker 端点所在的region
llm_description: region of sagemaker endpoint
form: form

View File

@ -0,0 +1,418 @@
import json
import logging
import os
import re
import time
import uuid
from typing import Any, Union
from urllib.parse import urlparse
import boto3
import requests
from botocore.exceptions import ClientError
from requests.exceptions import RequestException
from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.tool.builtin_tool import BuiltinTool
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
LanguageCodeOptions = [
"af-ZA",
"ar-AE",
"ar-SA",
"da-DK",
"de-CH",
"de-DE",
"en-AB",
"en-AU",
"en-GB",
"en-IE",
"en-IN",
"en-US",
"en-WL",
"es-ES",
"es-US",
"fa-IR",
"fr-CA",
"fr-FR",
"he-IL",
"hi-IN",
"id-ID",
"it-IT",
"ja-JP",
"ko-KR",
"ms-MY",
"nl-NL",
"pt-BR",
"pt-PT",
"ru-RU",
"ta-IN",
"te-IN",
"tr-TR",
"zh-CN",
"zh-TW",
"th-TH",
"en-ZA",
"en-NZ",
"vi-VN",
"sv-SE",
"ab-GE",
"ast-ES",
"az-AZ",
"ba-RU",
"be-BY",
"bg-BG",
"bn-IN",
"bs-BA",
"ca-ES",
"ckb-IQ",
"ckb-IR",
"cs-CZ",
"cy-WL",
"el-GR",
"et-ET",
"eu-ES",
"fi-FI",
"gl-ES",
"gu-IN",
"ha-NG",
"hr-HR",
"hu-HU",
"hy-AM",
"is-IS",
"ka-GE",
"kab-DZ",
"kk-KZ",
"kn-IN",
"ky-KG",
"lg-IN",
"lt-LT",
"lv-LV",
"mhr-RU",
"mi-NZ",
"mk-MK",
"ml-IN",
"mn-MN",
"mr-IN",
"mt-MT",
"no-NO",
"or-IN",
"pa-IN",
"pl-PL",
"ps-AF",
"ro-RO",
"rw-RW",
"si-LK",
"sk-SK",
"sl-SI",
"so-SO",
"sr-RS",
"su-ID",
"sw-BI",
"sw-KE",
"sw-RW",
"sw-TZ",
"sw-UG",
"tl-PH",
"tt-RU",
"ug-CN",
"uk-UA",
"uz-UZ",
"wo-SN",
"zu-ZA",
]
MediaFormat = ["mp3", "mp4", "wav", "flac", "ogg", "amr", "webm", "m4a"]
def is_url(text):
if not text:
return False
text = text.strip()
# Regular expression pattern for URL validation
pattern = re.compile(
r"^" # Start of the string
r"(?:http|https)://" # Protocol (http or https)
r"(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|" # Domain
r"localhost|" # localhost
r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" # IP address
r"(?::\d+)?" # Optional port
r"(?:/?|[/?]\S+)" # Path
r"$", # End of the string
re.IGNORECASE,
)
return bool(pattern.match(text))
def upload_file_from_url_to_s3(s3_client, url, bucket_name, s3_key=None, max_retries=3):
"""
Upload a file from a URL to an S3 bucket with retries and better error handling.
Parameters:
- s3_client
- url (str): The URL of the file to upload
- bucket_name (str): The name of the S3 bucket
- s3_key (str): The desired key (path) in S3. If None, will use the filename from URL
- max_retries (int): Maximum number of retry attempts
Returns:
- tuple: (s3_uri or None, message) - the S3 URI on success, otherwise None and an error message
"""
# Validate inputs
if not url or not bucket_name:
return False, "URL and bucket name are required"
retry_count = 0
while retry_count < max_retries:
try:
# Download the file from URL
response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()
# If s3_key is not provided, try to get filename from URL
if not s3_key:
parsed_url = urlparse(url)
filename = os.path.basename(parsed_url.path.split("/file-preview")[0])
s3_key = "transcribe-files/" + filename
# Upload the file to S3
s3_client.upload_fileobj(
response.raw,
bucket_name,
s3_key,
ExtraArgs={
"ContentType": response.headers.get("content-type"),
"ACL": "private", # Ensure the uploaded file is private
},
)
return f"s3://{bucket_name}/{s3_key}", f"Successfully uploaded file to s3://{bucket_name}/{s3_key}"
except RequestException as e:
retry_count += 1
if retry_count == max_retries:
return None, f"Failed to download file from URL after {max_retries} attempts: {str(e)}"
continue
except ClientError as e:
return None, f"AWS S3 error: {str(e)}"
except Exception as e:
return None, f"Unexpected error: {str(e)}"
return None, "Maximum retries exceeded"
class TranscribeTool(BuiltinTool):
s3_client: Any = None
transcribe_client: Any = None
"""
Note that you must include one of LanguageCode, IdentifyLanguage,
or IdentifyMultipleLanguages in your request.
If you include more than one of these parameters, your transcription job fails.
"""
def _transcribe_audio(self, audio_file_uri, file_type, **extra_args):
uuid_str = str(uuid.uuid4())
job_name = f"{int(time.time())}-{uuid_str}"
try:
# Start transcription job
response = self.transcribe_client.start_transcription_job(
TranscriptionJobName=job_name, Media={"MediaFileUri": audio_file_uri}, **extra_args
)
# Wait for the job to complete
while True:
status = self.transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED", "FAILED"]:
break
time.sleep(5)
if status["TranscriptionJob"]["TranscriptionJobStatus"] == "COMPLETED":
return status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"], None
else:
return None, f"Error: TranscriptionJobStatus:{status['TranscriptionJob']['TranscriptionJobStatus']} "
except Exception as e:
return None, f"Error: {str(e)}"
def _download_and_read_transcript(self, transcript_file_uri: str, max_retries: int = 3) -> tuple[str, str]:
"""
Download and read the transcript file from the given URI.
Parameters:
- transcript_file_uri (str): The URI of the transcript file
- max_retries (int): Maximum number of retry attempts
Returns:
- tuple: (text, error) - (Transcribed text if successful, error message if failed)
"""
retry_count = 0
while retry_count < max_retries:
try:
# Download the transcript file
response = requests.get(transcript_file_uri, timeout=30)
response.raise_for_status()
# Parse the JSON content
transcript_data = response.json()
# Check if speaker labels are present and enabled
has_speaker_labels = (
"results" in transcript_data
and "speaker_labels" in transcript_data["results"]
and "segments" in transcript_data["results"]["speaker_labels"]
)
if has_speaker_labels:
# Get speaker segments
segments = transcript_data["results"]["speaker_labels"]["segments"]
items = transcript_data["results"]["items"]
# Create a mapping of start_time -> speaker_label
time_to_speaker = {}
for segment in segments:
speaker_label = segment["speaker_label"]
for item in segment["items"]:
time_to_speaker[item["start_time"]] = speaker_label
# Build transcript with speaker labels
current_speaker = None
transcript_parts = []
for item in items:
# Skip non-pronunciation items (like punctuation)
if item["type"] == "punctuation":
transcript_parts.append(item["alternatives"][0]["content"])
continue
start_time = item["start_time"]
speaker = time_to_speaker.get(start_time)
if speaker != current_speaker:
current_speaker = speaker
transcript_parts.append(f"\n[{speaker}]: ")
transcript_parts.append(item["alternatives"][0]["content"])
return " ".join(transcript_parts).strip(), None
else:
# Extract the transcription text
# The transcript text is typically in the 'results' -> 'transcripts' array
if "results" in transcript_data and "transcripts" in transcript_data["results"]:
transcripts = transcript_data["results"]["transcripts"]
if transcripts:
# Combine all transcript segments
full_text = " ".join(t.get("transcript", "") for t in transcripts)
return full_text, None
return None, "No transcripts found in the response"
except requests.exceptions.RequestException as e:
retry_count += 1
if retry_count == max_retries:
return None, f"Failed to download transcript file after {max_retries} attempts: {str(e)}"
continue
except json.JSONDecodeError as e:
return None, f"Failed to parse transcript JSON: {str(e)}"
except Exception as e:
return None, f"Unexpected error while processing transcript: {str(e)}"
return None, "Maximum retries exceeded"
def _invoke(
self,
user_id: str,
tool_parameters: dict[str, Any],
) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
"""
invoke tools
"""
try:
if not self.transcribe_client:
aws_region = tool_parameters.get("aws_region")
if aws_region:
self.transcribe_client = boto3.client("transcribe", region_name=aws_region)
self.s3_client = boto3.client("s3", region_name=aws_region)
else:
self.transcribe_client = boto3.client("transcribe")
self.s3_client = boto3.client("s3")
file_url = tool_parameters.get("file_url")
file_type = tool_parameters.get("file_type")
language_code = tool_parameters.get("language_code")
identify_language = tool_parameters.get("identify_language", True)
identify_multiple_languages = tool_parameters.get("identify_multiple_languages", False)
language_options_str = tool_parameters.get("language_options")
s3_bucket_name = tool_parameters.get("s3_bucket_name")
ShowSpeakerLabels = tool_parameters.get("ShowSpeakerLabels", True)
MaxSpeakerLabels = tool_parameters.get("MaxSpeakerLabels", 2)
# Check the input params
if not s3_bucket_name:
return self.create_text_message(text="s3_bucket_name is required")
language_options = None
if language_options_str:
language_options = language_options_str.split("|")
for lang in language_options:
if lang not in LanguageCodeOptions:
return self.create_text_message(
text=f"{lang} is not supported, should be one of {LanguageCodeOptions}"
)
if language_code and language_code not in LanguageCodeOptions:
err_msg = f"language_code:{language_code} is not supported, should be one of {LanguageCodeOptions}"
return self.create_text_message(text=err_msg)
err_msg = f"identify_language:{identify_language}, \
identify_multiple_languages:{identify_multiple_languages}, \
Note that you must include one of LanguageCode, IdentifyLanguage, \
or IdentifyMultipleLanguages in your request. \
If you include more than one of these parameters, \
your transcription job fails."
if not language_code:
if identify_language and identify_multiple_languages:
return self.create_text_message(text=err_msg)
else:
if identify_language or identify_multiple_languages:
return self.create_text_message(text=err_msg)
extra_args = {
"IdentifyLanguage": identify_language,
"IdentifyMultipleLanguages": identify_multiple_languages,
}
if language_code:
extra_args["LanguageCode"] = language_code
if language_options:
extra_args["LanguageOptions"] = language_options
if ShowSpeakerLabels:
extra_args["Settings"] = {"ShowSpeakerLabels": ShowSpeakerLabels, "MaxSpeakerLabels": MaxSpeakerLabels}
# upload to s3 bucket
s3_path_result, error = upload_file_from_url_to_s3(self.s3_client, url=file_url, bucket_name=s3_bucket_name)
if not s3_path_result:
return self.create_text_message(text=error)
transcript_file_uri, error = self._transcribe_audio(
audio_file_uri=s3_path_result,
file_type=file_type,
**extra_args,
)
if not transcript_file_uri:
return self.create_text_message(text=error)
# Download and read the transcript
transcript_text, error = self._download_and_read_transcript(transcript_file_uri)
if not transcript_text:
return self.create_text_message(text=error)
return self.create_text_message(text=transcript_text)
except Exception as e:
return self.create_text_message(f"Exception {str(e)}")

View File

@ -0,0 +1,133 @@
identity:
name: transcribe_asr
author: AWS
label:
en_US: TranscribeASR
zh_Hans: Transcribe语音识别转录
pt_BR: TranscribeASR
icon: icon.svg
description:
human:
en_US: A tool for ASR (Automatic Speech Recognition) - https://github.com/aws-samples/dify-aws-tool
zh_Hans: AWS 语音识别转录服务, 请参考 https://aws.amazon.com/cn/pm/transcribe/#Learn_More_About_Amazon_Transcribe
pt_BR: A tool for ASR (Automatic Speech Recognition).
llm: A tool for ASR (Automatic Speech Recognition).
parameters:
- name: file_url
type: string
required: true
label:
en_US: video or audio file url for transcribe
zh_Hans: 语音或者视频文件url
pt_BR: video or audio file url for transcribe
human_description:
en_US: video or audio file url for transcribe
zh_Hans: 语音或者视频文件url
pt_BR: video or audio file url for transcribe
llm_description: video or audio file url for transcribe
form: llm
- name: language_code
type: string
required: false
label:
en_US: Language Code
zh_Hans: 语言编码
pt_BR: Language Code
human_description:
en_US: The language code used to create your transcription job. refer to :https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html
zh_Hans: 语言编码,例如zh-CN, en-US 可参考 https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html
pt_BR: The language code used to create your transcription job. refer to :https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html
llm_description: The language code used to create your transcription job.
form: llm
- name: identify_language
type: boolean
default: true
required: false
label:
en_US: Automatically Identify Language
zh_Hans: 自动识别语言
pt_BR: Automatically Identify Language
human_description:
en_US: Automatically Identify Language
zh_Hans: 自动识别语言
pt_BR: Automatically Identify Language
llm_description: Enable automatic language identification
form: form
- name: identify_multiple_languages
type: boolean
required: false
label:
en_US: Automatically Identify Multiple Languages
zh_Hans: 自动识别多种语言
pt_BR: Automatically Identify Multiple Languages
human_description:
en_US: Automatically Identify Multiple Languages
zh_Hans: 自动识别多种语言
pt_BR: Automatically Identify Multiple Languages
llm_description: Enable automatic identification of multiple languages
form: form
- name: language_options
type: string
required: false
label:
en_US: Language Options
zh_Hans: 语言种类选项
pt_BR: Language Options
human_description:
en_US: Separated by |, e.g. zh-CN|en-US. You can specify two or more language codes that represent the languages you think may be present in your media
zh_Hans: 您可以指定两个或更多的语言代码来表示您认为可能出现在媒体中的语言。用|分隔,如 zh-CN|en-US
pt_BR: Separated by |, e.g. zh-CN|en-US. You can specify two or more language codes that represent the languages you think may be present in your media
llm_description: Separated by |, e.g. zh-CN|en-US. You can specify two or more language codes that represent the languages you think may be present in your media
form: llm
- name: s3_bucket_name
type: string
required: true
label:
en_US: s3 bucket name
zh_Hans: s3 存储桶名称
pt_BR: s3 bucket name
human_description:
en_US: s3 bucket name to store transcribe files (don't add prefix s3://)
zh_Hans: s3 存储桶名称,用于存储转录文件 (不需要前缀 s3://)
pt_BR: s3 bucket name to store transcribe files (don't add prefix s3://)
llm_description: s3 bucket name to store transcribe files
form: form
- name: ShowSpeakerLabels
type: boolean
required: true
default: true
label:
en_US: ShowSpeakerLabels
zh_Hans: 显示说话人标签
pt_BR: ShowSpeakerLabels
human_description:
en_US: Enables speaker partitioning (diarization) in your transcription output
zh_Hans: 在转录输出中启用说话人分区(说话人分离)
pt_BR: Enables speaker partitioning (diarization) in your transcription output
llm_description: Enables speaker partitioning (diarization) in your transcription output
form: form
- name: MaxSpeakerLabels
type: number
required: true
default: 2
label:
en_US: MaxSpeakerLabels
zh_Hans: 说话人标签数量
pt_BR: MaxSpeakerLabels
human_description:
en_US: Specify the maximum number of speakers you want to partition in your media
zh_Hans: 指定您希望在媒体中划分的最多演讲者数量。
pt_BR: Specify the maximum number of speakers you want to partition in your media
llm_description: Specify the maximum number of speakers you want to partition in your media
form: form
- name: aws_region
type: string
required: false
label:
en_US: AWS Region
zh_Hans: AWS 区域
human_description:
en_US: Please enter the AWS region for the transcribe service, for example 'us-east-1'.
zh_Hans: 请输入Transcribe的 AWS 区域,例如 'us-east-1'。
llm_description: Please enter the AWS region for the transcribe service, for example 'us-east-1'.
form: form

View File

@ -1,3 +1,4 @@
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties, fontManager
@ -5,7 +6,7 @@ from core.tools.provider.builtin_tool_provider import BuiltinToolProviderControl
def set_chinese_font():
font_list = [
to_find_fonts = [
"PingFang SC",
"SimHei",
"Microsoft YaHei",
@ -15,16 +16,16 @@ def set_chinese_font():
"Noto Sans CJK SC",
"Noto Sans CJK JP",
]
for font in font_list:
if font in fontManager.ttflist:
chinese_font = FontProperties(font)
if chinese_font.get_name() == font:
return chinese_font
installed_fonts = frozenset(fontInfo.name for fontInfo in fontManager.ttflist)
for font in to_find_fonts:
if font in installed_fonts:
return FontProperties(font)
return FontProperties()
# use non-interactive backend to prevent `RuntimeError: main thread is not in main loop`
matplotlib.use("Agg")
# use a business theme
plt.style.use("seaborn-v0_8-darkgrid")
plt.rcParams["axes.unicode_minus"] = False

View File

@ -15,7 +15,7 @@ class ComfyUIProvider(BuiltinToolProviderController):
try:
ws.connect(ws_address)
except Exception as e:
except Exception:
raise ToolProviderCredentialValidationError(f"can not connect to {ws_address}")
finally:
ws.close()

View File

@ -18,6 +18,12 @@ class DuckDuckGoImageSearchTool(BuiltinTool):
"size": tool_parameters.get("size"),
"max_results": tool_parameters.get("max_results"),
}
# Add query_prefix handling
query_prefix = tool_parameters.get("query_prefix", "").strip()
final_query = f"{query_prefix} {query_dict['keywords']}".strip()
query_dict["keywords"] = final_query
response = DDGS().images(**query_dict)
markdown_result = "\n\n"
json_result = []

View File
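
The same query_prefix pattern recurs in the news, text, and video search tools below. Its composition rule in isolation (strip() keeps the query untouched when no prefix is configured):

def build_query(query: str, tool_parameters: dict) -> str:
    # prepend an optional search operator, e.g. "site:unsplash.com"
    query_prefix = tool_parameters.get("query_prefix", "").strip()
    return f"{query_prefix} {query}".strip()


assert build_query("sunset", {"query_prefix": "site:unsplash.com"}) == "site:unsplash.com sunset"
assert build_query("sunset", {}) == "sunset"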

@ -86,3 +86,14 @@ parameters:
en_US: The size of the image to be searched.
zh_Hans: 要搜索的图片的大小
form: form
- name: query_prefix
label:
en_US: Query Prefix
zh_Hans: 查询前缀
type: string
required: false
default: ""
form: form
human_description:
en_US: Specific Search e.g. "site:unsplash.com"
zh_Hans: 定向搜索 e.g. "site:unsplash.com"

View File

@ -7,7 +7,7 @@ from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.tool.builtin_tool import BuiltinTool
SUMMARY_PROMPT = """
User's query:
User's query:
{query}
Here are the news results:
@ -30,6 +30,12 @@ class DuckDuckGoNewsSearchTool(BuiltinTool):
"safesearch": "moderate",
"region": "wt-wt",
}
# Add query_prefix handling
query_prefix = tool_parameters.get("query_prefix", "").strip()
final_query = f"{query_prefix} {query_dict['keywords']}".strip()
query_dict["keywords"] = final_query
try:
response = list(DDGS().news(**query_dict))
if not response:

View File

@ -69,3 +69,14 @@ parameters:
en_US: Whether to pass the news results to llm for summarization.
zh_Hans: 是否需要将新闻结果传给大模型总结
form: form
- name: query_prefix
label:
en_US: Query Prefix
zh_Hans: 查询前缀
type: string
required: false
default: ""
form: form
human_description:
en_US: Specific Search e.g. "site:msn.com"
zh_Hans: 定向搜索 e.g. "site:msn.com"

View File

@ -7,7 +7,7 @@ from core.tools.entities.tool_entities import ToolInvokeMessage
from core.tools.tool.builtin_tool import BuiltinTool
SUMMARY_PROMPT = """
User's query:
User's query:
{query}
Here is the search engine result:
@ -26,7 +26,12 @@ class DuckDuckGoSearchTool(BuiltinTool):
query = tool_parameters.get("query")
max_results = tool_parameters.get("max_results", 5)
require_summary = tool_parameters.get("require_summary", False)
response = DDGS().text(query, max_results=max_results)
# Add query_prefix handling
query_prefix = tool_parameters.get("query_prefix", "").strip()
final_query = f"{query_prefix} {query}".strip()
response = DDGS().text(final_query, max_results=max_results)
if require_summary:
results = "\n".join([res.get("body") for res in response])
results = self.summary_results(user_id=user_id, content=results, query=query)

View File

@ -39,3 +39,14 @@ parameters:
en_US: Whether to pass the search results to llm for summarization.
zh_Hans: 是否需要将搜索结果传给大模型总结
form: form
- name: query_prefix
label:
en_US: Query Prefix
zh_Hans: 查询前缀
type: string
required: false
default: ""
form: form
human_description:
en_US: Specific Search e.g. "site:wikipedia.org"
zh_Hans: 定向搜索 e.g. "site:wikipedia.org"

View File

@ -24,7 +24,7 @@ max-width: 100%; border-radius: 8px;">
def _invoke(self, user_id: str, tool_parameters: dict[str, Any]) -> list[ToolInvokeMessage]:
query_dict = {
"keywords": tool_parameters.get("query"),
"keywords": tool_parameters.get("query"), # LLM's query
"region": tool_parameters.get("region", "wt-wt"),
"safesearch": tool_parameters.get("safesearch", "moderate"),
"timelimit": tool_parameters.get("timelimit"),
@ -40,6 +40,12 @@ max-width: 100%; border-radius: 8px;">
# Get proxy URL from parameters
proxy_url = tool_parameters.get("proxy_url", "").strip()
query_prefix = tool_parameters.get("query_prefix", "").strip()
final_query = f"{query_prefix} {query_dict['keywords']}".strip()
# Update the keywords in query_dict with the final_query
query_dict["keywords"] = final_query
response = DDGS().videos(**query_dict)
# Create HTML result with embedded iframes
@ -51,9 +57,13 @@ max-width: 100%; border-radius: 8px;">
embed_html = res.get("embed_html", "")
description = res.get("description", "")
content_url = res.get("content", "")
transcript_url = None
# Handle TED.com videos
if not embed_html and "ted.com/talks" in content_url:
if "ted.com/talks" in content_url:
# Create transcript URL
transcript_url = f"{content_url}/transcript"
# Create embed URL
embed_url = content_url.replace("www.ted.com", "embed.ted.com")
if proxy_url:
embed_url = f"{proxy_url}{embed_url}"
@ -68,8 +78,14 @@ max-width: 100%; border-radius: 8px;">
markdown_result += f"{title}\n\n"
markdown_result += f"{embed_html}\n\n"
if description:
markdown_result += f"{description}\n\n"
markdown_result += "---\n\n"
json_result.append(self.create_json_message(res))
# Add transcript_url to the JSON result if available
result_dict = res.copy()
if transcript_url:
result_dict["transcript_url"] = transcript_url
json_result.append(self.create_json_message(result_dict))
return [self.create_text_message(markdown_result)] + json_result

View File

@ -95,3 +95,14 @@ parameters:
en_US: Proxy URL
zh_Hans: 视频代理地址
form: form
- name: query_prefix
label:
en_US: Query Prefix
zh_Hans: 查询前缀
type: string
required: false
default: ""
form: form
human_description:
en_US: Specific Search e.g. "site:www.ted.com"
zh_Hans: 定向搜索 e.g. "site:www.ted.com"

View File

@ -45,7 +45,7 @@ class SearchAPI:
def _process_response(res: dict, type: str) -> str:
"""Process response from SearchAPI."""
if "error" in res:
raise ValueError(f"Got error from SearchApi: {res['error']}")
return res["error"]
toret = ""
if type == "text":

View File

@ -45,7 +45,7 @@ class SearchAPI:
def _process_response(res: dict, type: str) -> str:
"""Process response from SearchAPI."""
if "error" in res:
raise ValueError(f"Got error from SearchApi: {res['error']}")
return res["error"]
toret = ""
if type == "text":

View File

@ -45,7 +45,7 @@ class SearchAPI:
def _process_response(res: dict, type: str) -> str:
"""Process response from SearchAPI."""
if "error" in res:
raise ValueError(f"Got error from SearchApi: {res['error']}")
return res["error"]
toret = ""
if type == "text":

View File

@ -45,7 +45,7 @@ class SearchAPI:
def _process_response(res: dict) -> str:
"""Process response from SearchAPI."""
if "error" in res:
raise ValueError(f"Got error from SearchApi: {res['error']}")
return res["error"]
toret = ""
if "transcripts" in res and "text" in res["transcripts"][0]:

View File

@ -324,7 +324,12 @@ class Tool(BaseModel, ABC):
:param blob: the blob
:return: the blob message
"""
return ToolInvokeMessage(type=ToolInvokeMessage.MessageType.BLOB, message=blob, meta=meta, save_as=save_as)
return ToolInvokeMessage(
type=ToolInvokeMessage.MessageType.BLOB,
message=blob,
meta=meta or {},
save_as=save_as,
)
def create_json_message(self, object: dict) -> ToolInvokeMessage:
"""

View File

@ -58,11 +58,11 @@ class WorkflowTool(Tool):
user=self._get_user(user_id),
args={"inputs": tool_parameters, "files": files},
invoke_from=self.runtime.invoke_from,
stream=False,
streaming=False,
call_depth=self.workflow_call_depth + 1,
workflow_thread_pool_id=self.thread_pool_id,
)
assert isinstance(result, dict)
data = result.get("data", {})
if data.get("error"):

View File

@ -64,7 +64,6 @@ class GraphEngineThreadPool(ThreadPoolExecutor):
self.submit_count -= 1
def check_is_full(self) -> None:
print(f"submit_count: {self.submit_count}, max_submit_count: {self.max_submit_count}")
if self.submit_count > self.max_submit_count:
raise ValueError(f"Max submit count {self.max_submit_count} of workflow thread pool reached.")

View File

@ -4,8 +4,8 @@ import json
import docx
import pandas as pd
import pypdfium2
import yaml
import pypdfium2 # type: ignore
import yaml # type: ignore
from unstructured.partition.api import partition_via_api
from unstructured.partition.email import partition_email
from unstructured.partition.epub import partition_epub
@ -113,7 +113,7 @@ def _extract_text_by_mime_type(*, file_content: bytes, mime_type: str) -> str:
def _extract_text_by_file_extension(*, file_content: bytes, file_extension: str) -> str:
"""Extract text from a file based on its file extension."""
match file_extension:
case ".txt" | ".markdown" | ".md" | ".html" | ".htm" | ".xml":
case ".txt" | ".markdown" | ".md" | ".html" | ".htm" | ".xml" | ".vtt":
return _extract_text_from_plain_text(file_content)
case ".json":
return _extract_text_from_json(file_content)
@ -237,15 +237,17 @@ def _extract_text_from_csv(file_content: bytes) -> str:
def _extract_text_from_excel(file_content: bytes) -> str:
"""Extract text from an Excel file using pandas."""
try:
df = pd.read_excel(io.BytesIO(file_content))
# Drop rows where all elements are NaN
df.dropna(how="all", inplace=True)
# Convert DataFrame to Markdown table
markdown_table = df.to_markdown(index=False)
excel_file = pd.ExcelFile(io.BytesIO(file_content))
markdown_table = ""
for sheet_name in excel_file.sheet_names:
try:
df = excel_file.parse(sheet_name=sheet_name)
df.dropna(how="all", inplace=True)
# append each sheet as a Markdown table, separated by a blank line
markdown_table += df.to_markdown(index=False) + "\n\n"
except Exception:
continue
return markdown_table
except Exception as e:
raise TextExtractionError(f"Failed to extract text from Excel file: {str(e)}") from e

View File
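
The multi-sheet extraction above as a self-contained helper, for reference (pandas' to_markdown needs the tabulate package installed):

import io

import pandas as pd


def excel_to_markdown(file_content: bytes) -> str:
    excel_file = pd.ExcelFile(io.BytesIO(file_content))
    parts = []
    for sheet_name in excel_file.sheet_names:
        try:
            df = excel_file.parse(sheet_name=sheet_name)
            df.dropna(how="all", inplace=True)  # drop fully-empty rows
            parts.append(df.to_markdown(index=False))
        except Exception:
            continue  # skip unreadable sheets instead of failing the whole file
    return "\n\n".join(parts)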

@ -107,6 +107,7 @@ class HttpRequestNode(BaseNode[HttpRequestNodeData]):
node_data: HttpRequestNodeData,
) -> Mapping[str, Sequence[str]]:
selectors: list[VariableSelector] = []
selectors += variable_template_parser.extract_selectors_from_template(node_data.url)
selectors += variable_template_parser.extract_selectors_from_template(node_data.headers)
selectors += variable_template_parser.extract_selectors_from_template(node_data.params)
if node_data.body:

Some files were not shown because too many files have changed in this diff.