Mirror of https://github.com/langgenius/dify.git
synced 2026-01-26 06:45:45 +08:00
Compare commits
40 Commits
| SHA1 |
|---|
| f7939c758f |
| bf7045566d |
| ebd11e7482 |
| 94626487db |
| 24bdedf802 |
| 0025ba4921 |
| 7c0676343f |
| 1fe4e3afde |
| 9dee9e7ade |
| 33901384c6 |
| 7a221d0858 |
| 60ee98f578 |
| 5b24d7129e |
| b8592ad412 |
| e696b72f08 |
| 344821ed35 |
| 126b4c332f |
| c32c177e15 |
| 853cdd741f |
| 69d42ae95b |
| 5ff701ca3f |
| 9f58912fd7 |
| 0c746f5c5a |
| a8cedea15a |
| 87832ede17 |
| 4d99c689f0 |
| 28b26f67e2 |
| b934232411 |
| 2f120786fd |
| 6075fee556 |
| de584807e1 |
| a1285cbf15 |
| cf1f6f3961 |
| f4d97ef9fa |
| 28883e80d4 |
| a0f74cdd9d |
| 296bf443a8 |
| af7be9bdd7 |
| 2cfd5568e1 |
| faf40a42bc |
.github/ISSUE_TEMPLATE/bug_report.yml (vendored): 94 changed lines
@@ -1,56 +1,56 @@
 name: "🕷️ Bug report"
 description: Report errors or unexpected behavior
 labels:
   - bug
 body:
   - type: checkboxes
     attributes:
       label: Self Checks
       description: "To make sure we get to you in time, please check the following :)"
       options:
         - label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
           required: true
         - label: I confirm that I am using English to file this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
           required: true
   - type: input
     attributes:
       label: Dify version
       placeholder: 0.3.21
       description: See about section in Dify console
     validations:
       required: true
   - type: dropdown
     attributes:
       label: Cloud or Self Hosted
       description: How / Where was Dify installed from?
       multiple: true
       options:
         - Cloud
         - Self Hosted (Docker)
         - Self Hosted (Source)
     validations:
       required: true
   - type: textarea
     attributes:
       label: Steps to reproduce
       description: We highly suggest including screenshots and a bug report log.
       placeholder: Having detailed steps helps us reproduce the bug.
     validations:
       required: true
   - type: textarea
     attributes:
       label: ✔️ Expected Behavior
       placeholder: What were you expecting?
     validations:
       required: false
   - type: textarea
     attributes:
       label: ❌ Actual Behavior
       placeholder: What happened instead?
     validations:
       required: false
.github/ISSUE_TEMPLATE/config.yml (vendored): 2 changed lines
@@ -5,4 +5,4 @@ contact_links:
     about: Documentation for users of Dify
   - name: "\U0001F4DA Dify dev documentation"
     url: https://docs.dify.ai/getting-started/install-self-hosted
     about: Documentation for people interested in developing and contributing for Dify
.github/ISSUE_TEMPLATE/document_issue.yml (vendored): 33 changed lines
@@ -1,19 +1,20 @@
 name: "📚 Documentation Issue"
 description: Report issues in our documentation
 labels:
   - ducumentation
 body:
   - type: checkboxes
     attributes:
       label: Self Checks
       description: "To make sure we get to you in time, please check the following :)"
+      options:
         - label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
           required: true
         - label: I confirm that I am using English to file this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
           required: true
   - type: textarea
     attributes:
       label: Provide a description of requested docs changes
       placeholder: Briefly describe which document needs to be corrected and why.
     validations:
       required: true
.github/ISSUE_TEMPLATE/feature_request.yml (vendored): 62 changed lines
@@ -1,35 +1,35 @@
 name: "⭐ Feature or enhancement request"
 description: Propose something new.
 labels:
   - enhancement
 body:
   - type: checkboxes
     attributes:
       label: Self Checks
       description: "To make sure we get to you in time, please check the following :)"
       options:
         - label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
           required: true
         - label: I confirm that I am using English to file this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
           required: true
   - type: textarea
     attributes:
       label: Description of the new feature / enhancement
       placeholder: What is the expected behavior of the proposed feature?
     validations:
       required: true
   - type: textarea
     attributes:
       label: Scenario when this would be used?
       placeholder: What is the scenario this would be used? Why is this important to your workflow as a dify user?
     validations:
       required: true
   - type: textarea
     attributes:
       label: Supporting information
       placeholder: "Having additional evidence, data, tweets, blog posts, research, ... anything is extremely helpful. This information provides context to the scenario that may otherwise be lost."
     validations:
       required: false
   - type: markdown
     attributes:
       value: Please limit one request per issue.
.github/ISSUE_TEMPLATE/help_wanted.yml (vendored): 34 changed lines
@@ -1,20 +1,20 @@
 name: "🤝 Help Wanted"
-description: "Request help from the community" [please use English :)]
+description: "Request help from the community [please use English :)]"
 labels:
   - help-wanted
 body:
   - type: checkboxes
     attributes:
       label: Self Checks
       description: "To make sure we get to you in time, please check the following :)"
       options:
         - label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
           required: true
         - label: I confirm that I am using English to file this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
           required: true
   - type: textarea
     attributes:
       label: Provide a description of the help you need
       placeholder: Briefly describe what you need help with.
     validations:
       required: true
.github/ISSUE_TEMPLATE/translation_issue.yml (vendored): 96 changed lines
@@ -1,52 +1,52 @@
 name: "🌐 Localization/Translation issue"
 description: Report incorrect translations. [please use English :)]
 labels:
   - translation
 body:
   - type: checkboxes
     attributes:
       label: Self Checks
       description: "To make sure we get to you in time, please check the following :)"
       options:
         - label: I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
           required: true
         - label: I confirm that I am using English to file this report (我已阅读并同意 [Language Policy](https://github.com/langgenius/dify/issues/1542)).
           required: true
   - type: input
     attributes:
       label: Dify version
       placeholder: 0.3.21
       description: Hover over system tray icon or look at Settings
     validations:
       required: true
   - type: input
     attributes:
       label: Utility with translation issue
       placeholder: Some area
       description: Please input here the utility with the translation issue
     validations:
       required: true
   - type: input
     attributes:
       label: 🌐 Language affected
       placeholder: "German"
     validations:
       required: true
   - type: textarea
     attributes:
       label: ❌ Actual phrase(s)
       placeholder: What is there? Please include a screenshot as that is extremely helpful.
     validations:
       required: true
   - type: textarea
     attributes:
       label: ✔️ Expected phrase(s)
       placeholder: What was expected?
     validations:
       required: true
   - type: textarea
     attributes:
       label: ℹ Why is the current translation wrong
       placeholder: Why do you feel this is incorrect?
     validations:
       required: true
.github/linters/.hadolint.yaml (vendored, new file): 1 line
@@ -0,0 +1 @@
+failure-threshold: "error"
.github/linters/.yaml-lint.yml (vendored, new file): 11 lines
@@ -0,0 +1,11 @@
+---
+
+extends: default
+
+rules:
+  brackets:
+    max-spaces-inside: 1
+  comments-indentation: disable
+  document-start: disable
+  line-length: disable
+  truthy: disable
.github/workflows/api-model-runtime-tests.yml (vendored): 24 changed lines
@@ -32,18 +32,18 @@ jobs:
       MOCK_SWITCH: true

     steps:
       - name: Checkout code
         uses: actions/checkout@v4

       - name: Set up Python
         uses: actions/setup-python@v5
         with:
           python-version: '3.10'
           cache: 'pip'
           cache-dependency-path: ./api/requirements.txt

       - name: Install dependencies
         run: pip install -r ./api/requirements.txt

       - name: Run pytest
         run: pytest api/tests/integration_tests/model_runtime/anthropic api/tests/integration_tests/model_runtime/azure_openai api/tests/integration_tests/model_runtime/openai api/tests/integration_tests/model_runtime/chatglm api/tests/integration_tests/model_runtime/google api/tests/integration_tests/model_runtime/xinference api/tests/integration_tests/model_runtime/huggingface_hub/test_llm.py
.github/workflows/build-api-image.yml (vendored): 82 changed lines
@@ -6,55 +6,55 @@ on:
       - 'main'
       - 'deploy/dev'
   release:
-    types: [published]
+    types: [ published ]

 jobs:
   build-and-push:
     runs-on: ubuntu-latest
     if: github.event.pull_request.draft == false
     steps:
       - name: Set up QEMU
         uses: docker/setup-qemu-action@v3

       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v3

       - name: Login to Docker Hub
         uses: docker/login-action@v2
         with:
           username: ${{ secrets.DOCKERHUB_USER }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}

       - name: Extract metadata (tags, labels) for Docker
         id: meta
         uses: docker/metadata-action@v5
         with:
           images: langgenius/dify-api
           tags: |
             type=raw,value=latest,enable=${{ startsWith(github.ref, 'refs/tags/') }}
             type=ref,event=branch
             type=sha,enable=true,priority=100,prefix=,suffix=,format=long
             type=raw,value=${{ github.ref_name }},enable=${{ startsWith(github.ref, 'refs/tags/') }}

       - name: Build and push
         uses: docker/build-push-action@v5
         with:
           context: "{{defaultContext}}:api"
           platforms: ${{ startsWith(github.ref, 'refs/tags/') && 'linux/amd64,linux/arm64' || 'linux/amd64' }}
           build-args: |
             COMMIT_SHA=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.revision'] }}
           push: true
           tags: ${{ steps.meta.outputs.tags }}
           labels: ${{ steps.meta.outputs.labels }}
           cache-from: type=gha
           cache-to: type=gha,mode=max

       - name: Deploy to server
         if: github.ref == 'refs/heads/deploy/dev'
         uses: appleboy/ssh-action@v0.1.8
         with:
           host: ${{ secrets.SSH_HOST }}
           username: ${{ secrets.SSH_USER }}
           key: ${{ secrets.SSH_PRIVATE_KEY }}
           script: |
             ${{ secrets.SSH_SCRIPT }}
.github/workflows/build-web-image.yml (vendored): 82 changed lines
@@ -6,55 +6,55 @@ on:
       - 'main'
       - 'deploy/dev'
   release:
-    types: [published]
+    types: [ published ]

 jobs:
   build-and-push:
     runs-on: ubuntu-latest
     if: github.event.pull_request.draft == false
     steps:
       - name: Set up QEMU
         uses: docker/setup-qemu-action@v3

       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v3

       - name: Login to Docker Hub
         uses: docker/login-action@v2
         with:
           username: ${{ secrets.DOCKERHUB_USER }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}

       - name: Extract metadata (tags, labels) for Docker
         id: meta
         uses: docker/metadata-action@v5
         with:
           images: langgenius/dify-web
           tags: |
             type=raw,value=latest,enable=${{ startsWith(github.ref, 'refs/tags/') }}
             type=ref,event=branch
             type=sha,enable=true,priority=100,prefix=,suffix=,format=long
             type=raw,value=${{ github.ref_name }},enable=${{ startsWith(github.ref, 'refs/tags/') }}

       - name: Build and push
         uses: docker/build-push-action@v5
         with:
           context: "{{defaultContext}}:web"
           platforms: ${{ startsWith(github.ref, 'refs/tags/') && 'linux/amd64,linux/arm64' || 'linux/amd64' }}
           build-args: |
             COMMIT_SHA=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.revision'] }}
           push: true
           tags: ${{ steps.meta.outputs.tags }}
           labels: ${{ steps.meta.outputs.labels }}
           cache-from: type=gha
           cache-to: type=gha,mode=max

       - name: Deploy to server
         if: github.ref == 'refs/heads/deploy/dev'
         uses: appleboy/ssh-action@v0.1.8
         with:
           host: ${{ secrets.SSH_HOST }}
           username: ${{ secrets.SSH_USER }}
           key: ${{ secrets.SSH_PRIVATE_KEY }}
           script: |
             ${{ secrets.SSH_SCRIPT }}
.github/workflows/stale.yml (vendored): 22 changed lines
@@ -7,7 +7,7 @@ name: Mark stale issues and pull requests

 on:
   schedule:
     - cron: '0 3 * * *'

 jobs:
   stale:
@@ -18,13 +18,13 @@ jobs:
       pull-requests: write

     steps:
       - uses: actions/stale@v5
         with:
           days-before-issue-stale: 15
           days-before-issue-close: 3
           repo-token: ${{ secrets.GITHUB_TOKEN }}
           stale-issue-message: "Close due to it's no longer active, if you have any questions, you can reopen it."
           stale-pr-message: "Close due to it's no longer active, if you have any questions, you can reopen it."
           stale-issue-label: 'no-issue-activity'
           stale-pr-label: 'no-pr-activity'
           any-of-labels: 'duplicate,question,invalid,wontfix,no-issue-activity,no-pr-activity,enhancement,cant-reproduce,help-wanted'
.github/workflows/style.yml (vendored): 52 changed lines
@@ -8,27 +8,47 @@ on:
   branches:
     - deploy/dev

+concurrency:
+  group: dep-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
 jobs:
   test:
     name: ESLint and SuperLinter
     runs-on: ubuntu-latest

     steps:
       - name: Checkout code
         uses: actions/checkout@v4

       - name: Setup NodeJS
         uses: actions/setup-node@v4
         with:
           node-version: 18
           cache: yarn
           cache-dependency-path: ./web/package.json

       - name: Web dependencies
         run: |
           cd ./web
           yarn install --frozen-lockfile

       - name: Web style check
         run: |
           cd ./web
           yarn run lint

+      - name: Super-linter
+        uses: super-linter/super-linter/slim@v5
+        env:
+          BASH_SEVERITY: warning
+          DEFAULT_BRANCH: main
+          ERROR_ON_MISSING_EXEC_BIT: true
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          IGNORE_GENERATED_FILES: true
+          IGNORE_GITIGNORED_FILES: true
+          VALIDATE_BASH: true
+          VALIDATE_BASH_EXEC: true
+          VALIDATE_GITHUB_ACTIONS: true
+          VALIDATE_DOCKERFILE_HADOLINT: true
+          VALIDATE_YAML: true
@@ -86,6 +86,7 @@ MULTIMODAL_SEND_IMAGE_FORMAT=base64
 MAIL_TYPE=
 MAIL_DEFAULT_SEND_FROM=no-reply <no-reply@dify.ai>
 RESEND_API_KEY=
+RESEND_API_URL=https://api.resend.com

 # Sentry configuration
 SENTRY_DSN=
@@ -88,7 +88,7 @@ class Config:
         # ------------------------
         # General Configurations.
         # ------------------------
-        self.CURRENT_VERSION = "0.4.3"
+        self.CURRENT_VERSION = "0.4.5"
         self.COMMIT_SHA = get_env('COMMIT_SHA')
         self.EDITION = "SELF_HOSTED"
         self.DEPLOY_ENV = get_env('DEPLOY_ENV')
@@ -219,6 +219,7 @@ class Config:
         self.MAIL_TYPE = get_env('MAIL_TYPE')
         self.MAIL_DEFAULT_SEND_FROM = get_env('MAIL_DEFAULT_SEND_FROM')
         self.RESEND_API_KEY = get_env('RESEND_API_KEY')
+        self.RESEND_API_URL = get_env('RESEND_API_URL')

         # ------------------------
         # Workpace Configurations.
@@ -156,6 +156,9 @@ class DatasetDocumentSegmentApi(Resource):
         if not segment:
             raise NotFound('Segment not found.')

+        if segment.status != 'completed':
+            raise NotFound('Segment is not completed, enable or disable function is not allowed')
+
         document_indexing_cache_key = 'document_{}_indexing'.format(segment.document_id)
         cache_result = redis_client.get(document_indexing_cache_key)
         if cache_result is not None:
@@ -31,7 +31,7 @@ class CompletionApi(AppApiResource):
         parser.add_argument('query', type=str, location='json', default='')
         parser.add_argument('files', type=list, required=False, location='json')
         parser.add_argument('response_mode', type=str, choices=['blocking', 'streaming'], location='json')
-        parser.add_argument('user', type=str, location='json')
+        parser.add_argument('user', required=True, nullable=False, type=str, location='json')
         parser.add_argument('retriever_from', type=str, required=False, default='dev', location='json')

         args = parser.parse_args()
@@ -96,7 +96,7 @@ class ChatApi(AppApiResource):
         parser.add_argument('files', type=list, required=False, location='json')
         parser.add_argument('response_mode', type=str, choices=['blocking', 'streaming'], location='json')
         parser.add_argument('conversation_id', type=uuid_value, location='json')
-        parser.add_argument('user', type=str, location='json')
+        parser.add_argument('user', type=str, required=True, nullable=False, location='json')
         parser.add_argument('retriever_from', type=str, required=False, default='dev', location='json')
         parser.add_argument('auto_generate_name', type=bool, required=False, default=True, location='json')
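Both service-API hunks above make the same change: `user` goes from optional to required in the request body. Below is a minimal, self-contained sketch of how flask-restful's reqparse treats `required=True, nullable=False` (a toy resource for illustration, not Dify's actual endpoint):

```python
# Toy resource demonstrating reqparse's required/nullable semantics.
# Illustrative only; CompletionApi/ChatApi in Dify carry many more arguments.
from flask import Flask
from flask_restful import Api, Resource, reqparse

app = Flask(__name__)
api = Api(app)


class Echo(Resource):
    def post(self):
        parser = reqparse.RequestParser()
        # required=True: a missing 'user' key fails parsing with HTTP 400.
        # nullable=False: an explicit JSON null for 'user' is rejected too.
        parser.add_argument('user', type=str, required=True, nullable=False, location='json')
        args = parser.parse_args()
        return {'user': args['user']}


api.add_resource(Echo, '/echo')

# POST /echo {"user": "abc"}  -> 200 {"user": "abc"}
# POST /echo {}               -> 400 Bad Request (user is required)
# POST /echo {"user": null}   -> 400 Bad Request (user must not be null)
```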
@@ -1,7 +1,7 @@
 import time
 from typing import cast, Optional, List, Tuple, Generator, Union

-from core.application_queue_manager import ApplicationQueueManager
+from core.application_queue_manager import ApplicationQueueManager, PublishFrom
 from core.entities.application_entities import ModelConfigEntity, PromptTemplateEntity, AppOrchestrationConfigEntity
 from core.file.file_obj import FileObj
 from core.memory.token_buffer_memory import TokenBufferMemory
@@ -183,7 +183,7 @@ class AppRunner:
                 index=index,
                 message=AssistantPromptMessage(content=token)
             )
-        ))
+        ), PublishFrom.APPLICATION_MANAGER)
         index += 1
         time.sleep(0.01)
@@ -193,7 +193,8 @@ class AppRunner:
                 prompt_messages=prompt_messages,
                 message=AssistantPromptMessage(content=text),
                 usage=usage if usage else LLMUsage.empty_usage()
-            )
+            ),
+            pub_from=PublishFrom.APPLICATION_MANAGER
         )

     def _handle_invoke_result(self, invoke_result: Union[LLMResult, Generator],
@@ -226,7 +227,8 @@ class AppRunner:
         :return:
         """
         queue_manager.publish_message_end(
-            llm_result=invoke_result
+            llm_result=invoke_result,
+            pub_from=PublishFrom.APPLICATION_MANAGER
         )

     def _handle_invoke_result_stream(self, invoke_result: Generator,
@@ -242,7 +244,7 @@ class AppRunner:
         text = ''
         usage = None
         for result in invoke_result:
-            queue_manager.publish_chunk_message(result)
+            queue_manager.publish_chunk_message(result, PublishFrom.APPLICATION_MANAGER)

             text += result.delta.message.content
@@ -263,5 +265,6 @@ class AppRunner:
         )

         queue_manager.publish_message_end(
-            llm_result=llm_result
+            llm_result=llm_result,
+            pub_from=PublishFrom.APPLICATION_MANAGER
         )
@@ -5,7 +5,7 @@ from core.app_runner.app_runner import AppRunner
 from core.callback_handler.index_tool_callback_handler import DatasetIndexToolCallbackHandler
 from core.entities.application_entities import ApplicationGenerateEntity, ModelConfigEntity, \
     AppOrchestrationConfigEntity, InvokeFrom, ExternalDataVariableEntity, DatasetEntity
-from core.application_queue_manager import ApplicationQueueManager
+from core.application_queue_manager import ApplicationQueueManager, PublishFrom
 from core.features.annotation_reply import AnnotationReplyFeature
 from core.features.dataset_retrieval import DatasetRetrievalFeature
 from core.features.external_data_fetch import ExternalDataFetchFeature
@@ -121,7 +121,8 @@ class BasicApplicationRunner(AppRunner):

         if annotation_reply:
             queue_manager.publish_annotation_reply(
-                message_annotation_id=annotation_reply.id
+                message_annotation_id=annotation_reply.id,
+                pub_from=PublishFrom.APPLICATION_MANAGER
             )
             self.direct_output(
                 queue_manager=queue_manager,
@@ -132,16 +133,16 @@ class BasicApplicationRunner(AppRunner):
             )
             return

         # fill in variable inputs from external data tools if exists
         external_data_tools = app_orchestration_config.external_data_variables
         if external_data_tools:
             inputs = self.fill_in_inputs_from_external_data_tools(
                 tenant_id=app_record.tenant_id,
                 app_id=app_record.id,
                 external_data_tools=external_data_tools,
                 inputs=inputs,
                 query=query
             )

         # get context from datasets
         context = None
@@ -7,7 +7,7 @@ from pydantic import BaseModel

 from core.app_runner.moderation_handler import OutputModerationHandler, ModerationRule
 from core.entities.application_entities import ApplicationGenerateEntity
-from core.application_queue_manager import ApplicationQueueManager
+from core.application_queue_manager import ApplicationQueueManager, PublishFrom
 from core.entities.queue_entities import QueueErrorEvent, QueueStopEvent, QueueMessageEndEvent, \
     QueueRetrieverResourcesEvent, QueueAgentThoughtEvent, QueuePingEvent, QueueMessageEvent, QueueMessageReplaceEvent, \
     AnnotationReplyEvent
@@ -312,8 +312,11 @@ class GenerateTaskPipeline:
                         index=0,
                         message=AssistantPromptMessage(content=self._task_state.llm_result.message.content)
                     )
-                ))
-                self._queue_manager.publish(QueueStopEvent(stopped_by=QueueStopEvent.StopBy.OUTPUT_MODERATION))
+                ), PublishFrom.TASK_PIPELINE)
+                self._queue_manager.publish(
+                    QueueStopEvent(stopped_by=QueueStopEvent.StopBy.OUTPUT_MODERATION),
+                    PublishFrom.TASK_PIPELINE
+                )
                 continue
             else:
                 self._output_moderation_handler.append_new_token(delta_text)
@@ -6,6 +6,7 @@ from typing import Any, Optional, Dict
 from flask import current_app, Flask
 from pydantic import BaseModel

+from core.application_queue_manager import PublishFrom
 from core.moderation.base import ModerationAction, ModerationOutputsResult
 from core.moderation.factory import ModerationFactory
@@ -66,7 +67,7 @@ class OutputModerationHandler(BaseModel):
             final_output = result.text

         if public_event:
-            self.on_message_replace_func(final_output)
+            self.on_message_replace_func(final_output, PublishFrom.TASK_PIPELINE)

         return final_output
@@ -23,7 +23,7 @@ from core.model_runtime.errors.invoke import InvokeAuthorizationError, InvokeErr
 from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
 from core.prompt.prompt_template import PromptTemplateParser
 from core.provider_manager import ProviderManager
-from core.application_queue_manager import ApplicationQueueManager, ConversationTaskStoppedException
+from core.application_queue_manager import ApplicationQueueManager, ConversationTaskStoppedException, PublishFrom
 from extensions.ext_database import db
 from models.account import Account
 from models.model import EndUser, Conversation, Message, MessageFile, App
@@ -169,15 +169,18 @@ class ApplicationManager:
         except ConversationTaskStoppedException:
             pass
         except InvokeAuthorizationError:
-            queue_manager.publish_error(InvokeAuthorizationError('Incorrect API key provided'))
+            queue_manager.publish_error(
+                InvokeAuthorizationError('Incorrect API key provided'),
+                PublishFrom.APPLICATION_MANAGER
+            )
         except ValidationError as e:
             logger.exception("Validation Error when generating")
-            queue_manager.publish_error(e)
+            queue_manager.publish_error(e, PublishFrom.APPLICATION_MANAGER)
         except (ValueError, InvokeError) as e:
-            queue_manager.publish_error(e)
+            queue_manager.publish_error(e, PublishFrom.APPLICATION_MANAGER)
         except Exception as e:
             logger.exception("Unknown Error when generating")
-            queue_manager.publish_error(e)
+            queue_manager.publish_error(e, PublishFrom.APPLICATION_MANAGER)
         finally:
             db.session.remove()
@@ -1,5 +1,6 @@
 import queue
 import time
+from enum import Enum
 from typing import Generator, Any

 from sqlalchemy.orm import DeclarativeMeta
@@ -13,6 +14,11 @@ from extensions.ext_redis import redis_client
 from models.model import MessageAgentThought


+class PublishFrom(Enum):
+    APPLICATION_MANAGER = 1
+    TASK_PIPELINE = 2
+
+
 class ApplicationQueueManager:
     def __init__(self, task_id: str,
                  user_id: str,
@@ -61,11 +67,14 @@ class ApplicationQueueManager:
             if elapsed_time >= listen_timeout or self._is_stopped():
                 # publish two messages to make sure the client can receive the stop signal
                 # and stop listening after the stop signal processed
-                self.publish(QueueStopEvent(stopped_by=QueueStopEvent.StopBy.USER_MANUAL))
+                self.publish(
+                    QueueStopEvent(stopped_by=QueueStopEvent.StopBy.USER_MANUAL),
+                    PublishFrom.TASK_PIPELINE
+                )
                 self.stop_listen()

             if elapsed_time // 10 > last_ping_time:
-                self.publish(QueuePingEvent())
+                self.publish(QueuePingEvent(), PublishFrom.TASK_PIPELINE)
                 last_ping_time = elapsed_time // 10

     def stop_listen(self) -> None:
@@ -75,76 +84,83 @@ class ApplicationQueueManager:
         """
         self._q.put(None)

-    def publish_chunk_message(self, chunk: LLMResultChunk) -> None:
+    def publish_chunk_message(self, chunk: LLMResultChunk, pub_from: PublishFrom) -> None:
         """
         Publish chunk message to channel

         :param chunk: chunk
+        :param pub_from: publish from
         :return:
         """
         self.publish(QueueMessageEvent(
             chunk=chunk
-        ))
+        ), pub_from)

-    def publish_message_replace(self, text: str) -> None:
+    def publish_message_replace(self, text: str, pub_from: PublishFrom) -> None:
         """
         Publish message replace
         :param text: text
+        :param pub_from: publish from
         :return:
         """
         self.publish(QueueMessageReplaceEvent(
             text=text
-        ))
+        ), pub_from)

-    def publish_retriever_resources(self, retriever_resources: list[dict]) -> None:
+    def publish_retriever_resources(self, retriever_resources: list[dict], pub_from: PublishFrom) -> None:
         """
         Publish retriever resources
         :return:
         """
-        self.publish(QueueRetrieverResourcesEvent(retriever_resources=retriever_resources))
+        self.publish(QueueRetrieverResourcesEvent(retriever_resources=retriever_resources), pub_from)

-    def publish_annotation_reply(self, message_annotation_id: str) -> None:
+    def publish_annotation_reply(self, message_annotation_id: str, pub_from: PublishFrom) -> None:
         """
         Publish annotation reply
         :param message_annotation_id: message annotation id
+        :param pub_from: publish from
         :return:
         """
-        self.publish(AnnotationReplyEvent(message_annotation_id=message_annotation_id))
+        self.publish(AnnotationReplyEvent(message_annotation_id=message_annotation_id), pub_from)

-    def publish_message_end(self, llm_result: LLMResult) -> None:
+    def publish_message_end(self, llm_result: LLMResult, pub_from: PublishFrom) -> None:
         """
         Publish message end
         :param llm_result: llm result
+        :param pub_from: publish from
         :return:
         """
-        self.publish(QueueMessageEndEvent(llm_result=llm_result))
+        self.publish(QueueMessageEndEvent(llm_result=llm_result), pub_from)
         self.stop_listen()

-    def publish_agent_thought(self, message_agent_thought: MessageAgentThought) -> None:
+    def publish_agent_thought(self, message_agent_thought: MessageAgentThought, pub_from: PublishFrom) -> None:
         """
         Publish agent thought
         :param message_agent_thought: message agent thought
+        :param pub_from: publish from
         :return:
         """
         self.publish(QueueAgentThoughtEvent(
             agent_thought_id=message_agent_thought.id
-        ))
+        ), pub_from)

-    def publish_error(self, e) -> None:
+    def publish_error(self, e, pub_from: PublishFrom) -> None:
         """
         Publish error
         :param e: error
+        :param pub_from: publish from
         :return:
         """
         self.publish(QueueErrorEvent(
             error=e
-        ))
+        ), pub_from)
         self.stop_listen()

-    def publish(self, event: AppQueueEvent) -> None:
+    def publish(self, event: AppQueueEvent, pub_from: PublishFrom) -> None:
         """
         Publish event to queue
         :param event:
+        :param pub_from:
         :return:
         """
         self._check_for_sqlalchemy_models(event.dict())
@@ -162,6 +178,9 @@ class ApplicationQueueManager:
         if isinstance(event, QueueStopEvent):
             self.stop_listen()

+        if pub_from == PublishFrom.APPLICATION_MANAGER and self._is_stopped():
+            raise ConversationTaskStoppedException()
+
     @classmethod
     def set_stop_flag(cls, task_id: str, invoke_from: InvokeFrom, user_id: str) -> None:
         """
@@ -187,7 +206,6 @@ class ApplicationQueueManager:
         stopped_cache_key = ApplicationQueueManager._generate_stopped_cache_key(self._task_id)
         result = redis_client.get(stopped_cache_key)
         if result is not None:
-            redis_client.delete(stopped_cache_key)
            return True

         return False
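The queue-manager hunks above implement one refactor: every publish call now declares its origin through `PublishFrom`, and `publish()` raises `ConversationTaskStoppedException` only when the application-manager side publishes after the stop flag is set, letting the task pipeline still flush its final stop and end events. A condensed sketch of that control flow (simplified names, and a plain boolean standing in for the Redis stop flag):

```python
# Illustrative sketch of the PublishFrom control flow added in this change set.
# Simplified: a boolean stands in for the Redis-backed stop flag.
import queue
from enum import Enum


class ConversationTaskStoppedException(Exception):
    pass


class PublishFrom(Enum):
    APPLICATION_MANAGER = 1
    TASK_PIPELINE = 2


class QueueManagerSketch:
    def __init__(self) -> None:
        self._q: queue.Queue = queue.Queue()
        self._stopped = False

    def set_stop_flag(self) -> None:
        self._stopped = True

    def publish(self, event: object, pub_from: PublishFrom) -> None:
        self._q.put(event)
        # The new guard: only the application-manager side aborts on stop,
        # so the task pipeline can still flush its final stop/end events.
        if pub_from == PublishFrom.APPLICATION_MANAGER and self._stopped:
            raise ConversationTaskStoppedException()


qm = QueueManagerSketch()
qm.publish("chunk-1", PublishFrom.APPLICATION_MANAGER)  # ok
qm.set_stop_flag()
qm.publish("stop-event", PublishFrom.TASK_PIPELINE)     # still allowed
try:
    qm.publish("chunk-2", PublishFrom.APPLICATION_MANAGER)
except ConversationTaskStoppedException:
    print("generation loop interrupted")
```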
@@ -8,7 +8,7 @@ from langchain.agents import openai_functions_agent, openai_functions_multi_agen
 from langchain.callbacks.base import BaseCallbackHandler
 from langchain.schema import AgentAction, AgentFinish, LLMResult, ChatGeneration, BaseMessage

-from core.application_queue_manager import ApplicationQueueManager
+from core.application_queue_manager import ApplicationQueueManager, PublishFrom
 from core.callback_handler.entity.agent_loop import AgentLoop
 from core.entities.application_entities import ModelConfigEntity
 from core.model_runtime.entities.llm_entities import LLMResult as RuntimeLLMResult
@@ -232,7 +232,7 @@ class AgentLoopGatherCallbackHandler(BaseCallbackHandler):
         db.session.add(message_agent_thought)
         db.session.commit()

-        self.queue_manager.publish_agent_thought(message_agent_thought)
+        self.queue_manager.publish_agent_thought(message_agent_thought, PublishFrom.APPLICATION_MANAGER)

         return message_agent_thought
@@ -2,7 +2,7 @@ from typing import List, Union

 from langchain.schema import Document

-from core.application_queue_manager import ApplicationQueueManager
+from core.application_queue_manager import ApplicationQueueManager, PublishFrom
 from core.entities.application_entities import InvokeFrom
 from extensions.ext_database import db
 from models.dataset import DocumentSegment, DatasetQuery
@@ -80,4 +80,4 @@ class DatasetIndexToolCallbackHandler:
         db.session.add(dataset_retriever_resource)
         db.session.commit()

-        self._queue_manager.publish_retriever_resources(resource)
+        self._queue_manager.publish_retriever_resources(resource, PublishFrom.APPLICATION_MANAGER)
@@ -1,7 +1,7 @@
 import datetime
 import json
 import logging
 import time

 from json import JSONDecodeError
 from typing import Optional, List, Dict, Tuple, Iterator
@@ -11,8 +11,9 @@ from core.entities.model_entities import ModelWithProviderEntity, ModelStatus, S
 from core.entities.provider_entities import SystemConfiguration, CustomConfiguration, SystemConfigurationStatus
 from core.helper import encrypter
 from core.helper.model_provider_cache import ProviderCredentialsCache, ProviderCredentialsCacheType
-from core.model_runtime.entities.model_entities import ModelType
-from core.model_runtime.entities.provider_entities import ProviderEntity, CredentialFormSchema, FormType
+from core.model_runtime.entities.model_entities import ModelType, FetchFrom
+from core.model_runtime.entities.provider_entities import ProviderEntity, CredentialFormSchema, FormType, \
+    ConfigurateMethod
 from core.model_runtime.model_providers import model_provider_factory
 from core.model_runtime.model_providers.__base.ai_model import AIModel
 from core.model_runtime.model_providers.__base.model_provider import ModelProvider
@@ -22,6 +23,8 @@ from models.provider import ProviderType, Provider, ProviderModel, TenantPreferr

 logger = logging.getLogger(__name__)

+original_provider_configurate_methods = {}
+

 class ProviderConfiguration(BaseModel):
     """
@@ -34,6 +37,20 @@ class ProviderConfiguration(BaseModel):
     system_configuration: SystemConfiguration
     custom_configuration: CustomConfiguration

+    def __init__(self, **data):
+        super().__init__(**data)
+
+        if self.provider.provider not in original_provider_configurate_methods:
+            original_provider_configurate_methods[self.provider.provider] = []
+            for configurate_method in self.provider.configurate_methods:
+                original_provider_configurate_methods[self.provider.provider].append(configurate_method)
+
+        if original_provider_configurate_methods[self.provider.provider] == [ConfigurateMethod.CUSTOMIZABLE_MODEL]:
+            if (any([len(quota_configuration.restrict_models) > 0
+                     for quota_configuration in self.system_configuration.quota_configurations])
+                    and ConfigurateMethod.PREDEFINED_MODEL not in self.provider.configurate_methods):
+                self.provider.configurate_methods.append(ConfigurateMethod.PREDEFINED_MODEL)
+
     def get_current_credentials(self, model_type: ModelType, model: str) -> Optional[dict]:
         """
         Get current credentials.
@@ -43,7 +60,22 @@ class ProviderConfiguration(BaseModel):
         :return:
         """
         if self.using_provider_type == ProviderType.SYSTEM:
-            return self.system_configuration.credentials
+            restrict_models = []
+            for quota_configuration in self.system_configuration.quota_configurations:
+                if self.system_configuration.current_quota_type != quota_configuration.quota_type:
+                    continue
+
+                restrict_models = quota_configuration.restrict_models
+
+            copy_credentials = self.system_configuration.credentials.copy()
+            if restrict_models:
+                for restrict_model in restrict_models:
+                    if (restrict_model.model_type == model_type
+                            and restrict_model.model == model
+                            and restrict_model.base_model_name):
+                        copy_credentials['base_model_name'] = restrict_model.base_model_name
+
+            return copy_credentials
         else:
             if self.custom_configuration.models:
                 for model_configuration in self.custom_configuration.models:
@@ -123,7 +155,8 @@ class ProviderConfiguration(BaseModel):

         if provider_record:
             try:
-                original_credentials = json.loads(provider_record.encrypted_config) if provider_record.encrypted_config else {}
+                original_credentials = json.loads(
+                    provider_record.encrypted_config) if provider_record.encrypted_config else {}
             except JSONDecodeError:
                 original_credentials = {}
@@ -265,7 +298,8 @@ class ProviderConfiguration(BaseModel):

         if provider_model_record:
             try:
-                original_credentials = json.loads(provider_model_record.encrypted_config) if provider_model_record.encrypted_config else {}
+                original_credentials = json.loads(
+                    provider_model_record.encrypted_config) if provider_model_record.encrypted_config else {}
             except JSONDecodeError:
                 original_credentials = {}
@@ -520,7 +554,13 @@ class ProviderConfiguration(BaseModel):
             provider_models.extend(
                 [
                     ModelWithProviderEntity(
-                        **m.dict(),
+                        model=m.model,
+                        label=m.label,
+                        model_type=m.model_type,
+                        features=m.features,
+                        fetch_from=m.fetch_from,
+                        model_properties=m.model_properties,
+                        deprecated=m.deprecated,
                         provider=SimpleModelProviderEntity(self.provider),
                         status=ModelStatus.ACTIVE
                     )
@@ -528,21 +568,70 @@ class ProviderConfiguration(BaseModel):
                 ]
             )

+        if self.provider.provider not in original_provider_configurate_methods:
+            original_provider_configurate_methods[self.provider.provider] = []
+            for configurate_method in provider_instance.get_provider_schema().configurate_methods:
+                original_provider_configurate_methods[self.provider.provider].append(configurate_method)
+
+        should_use_custom_model = False
+        if original_provider_configurate_methods[self.provider.provider] == [ConfigurateMethod.CUSTOMIZABLE_MODEL]:
+            should_use_custom_model = True
+
         for quota_configuration in self.system_configuration.quota_configurations:
             if self.system_configuration.current_quota_type != quota_configuration.quota_type:
                 continue

-            restrict_llms = quota_configuration.restrict_llms
-            if not restrict_llms:
+            restrict_models = quota_configuration.restrict_models
+            if len(restrict_models) == 0:
                 break

+            if should_use_custom_model:
+                if original_provider_configurate_methods[self.provider.provider] == [ConfigurateMethod.CUSTOMIZABLE_MODEL]:
+                    # only customizable model
+                    for restrict_model in restrict_models:
+                        copy_credentials = self.system_configuration.credentials.copy()
+                        if restrict_model.base_model_name:
+                            copy_credentials['base_model_name'] = restrict_model.base_model_name
+
+                        try:
+                            custom_model_schema = (
+                                provider_instance.get_model_instance(restrict_model.model_type)
+                                .get_customizable_model_schema_from_credentials(
+                                    restrict_model.model,
+                                    copy_credentials
+                                )
+                            )
+                        except Exception as ex:
+                            logger.warning(f'get custom model schema failed, {ex}')
+                            continue
+
+                        if not custom_model_schema:
+                            continue
+
+                        if custom_model_schema.model_type not in model_types:
+                            continue
+
+                        provider_models.append(
+                            ModelWithProviderEntity(
+                                model=custom_model_schema.model,
+                                label=custom_model_schema.label,
+                                model_type=custom_model_schema.model_type,
+                                features=custom_model_schema.features,
+                                fetch_from=FetchFrom.PREDEFINED_MODEL,
+                                model_properties=custom_model_schema.model_properties,
+                                deprecated=custom_model_schema.deprecated,
+                                provider=SimpleModelProviderEntity(self.provider),
+                                status=ModelStatus.ACTIVE
+                            )
+                        )
+
             # if llm name not in restricted llm list, remove it
+            restrict_model_names = [rm.model for rm in restrict_models]
             for m in provider_models:
-                if m.model_type == ModelType.LLM and m.model not in restrict_llms:
+                if m.model_type == ModelType.LLM and m.model not in restrict_model_names:
                     m.status = ModelStatus.NO_PERMISSION
                 elif not quota_configuration.is_valid:
                     m.status = ModelStatus.QUOTA_EXCEEDED

         return provider_models

     def _get_custom_provider_models(self,
@@ -569,7 +658,13 @@ class ProviderConfiguration(BaseModel):
         for m in models:
             provider_models.append(
                 ModelWithProviderEntity(
-                    **m.dict(),
+                    model=m.model,
+                    label=m.label,
+                    model_type=m.model_type,
+                    features=m.features,
+                    fetch_from=m.fetch_from,
+                    model_properties=m.model_properties,
+                    deprecated=m.deprecated,
                     provider=SimpleModelProviderEntity(self.provider),
                     status=ModelStatus.ACTIVE if credentials else ModelStatus.NO_CONFIGURE
                 )
@@ -597,7 +692,13 @@ class ProviderConfiguration(BaseModel):

             provider_models.append(
                 ModelWithProviderEntity(
-                    **custom_model_schema.dict(),
+                    model=custom_model_schema.model,
+                    label=custom_model_schema.label,
+                    model_type=custom_model_schema.model_type,
+                    features=custom_model_schema.features,
+                    fetch_from=custom_model_schema.fetch_from,
+                    model_properties=custom_model_schema.model_properties,
+                    deprecated=custom_model_schema.deprecated,
                     provider=SimpleModelProviderEntity(self.provider),
                     status=ModelStatus.ACTIVE
                 )
@@ -21,6 +21,12 @@ class SystemConfigurationStatus(Enum):
     UNSUPPORTED = 'unsupported'


+class RestrictModel(BaseModel):
+    model: str
+    base_model_name: Optional[str] = None
+    model_type: ModelType
+
+
 class QuotaConfiguration(BaseModel):
     """
     Model class for provider quota configuration.

@@ -30,7 +36,7 @@ class QuotaConfiguration(BaseModel):
     quota_limit: int
     quota_used: int
     is_valid: bool
-    restrict_llms: list[str] = []
+    restrict_models: list[RestrictModel] = []


 class SystemConfiguration(BaseModel):
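Replacing `restrict_llms: list[str]` with `restrict_models: list[RestrictModel]` lets a hosted quota pin a model type and an Azure-style `base_model_name` instead of a bare LLM name. A small sketch of the new filtering semantics, using toy data rather than Dify's real provider objects:

```python
# Toy sketch of the restrict_models filtering semantics; not Dify's real classes.
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class ModelType(Enum):
    LLM = 'llm'
    TEXT_EMBEDDING = 'text-embedding'


class RestrictModel(BaseModel):
    model: str
    base_model_name: Optional[str] = None
    model_type: ModelType


restrict_models = [
    RestrictModel(model="gpt-4", base_model_name="gpt-4", model_type=ModelType.LLM),
    RestrictModel(model="gpt-35-turbo", base_model_name="gpt-35-turbo", model_type=ModelType.LLM),
]

# Old behavior: compare a bare LLM name against a list[str].
# New behavior: build the permitted-name list from the richer objects; the
# base_model_name field is what patches credentials for Azure deployments.
restrict_model_names = [rm.model for rm in restrict_models]
available = ["gpt-4", "gpt-4-32k", "gpt-35-turbo"]
permitted = [m for m in available if m in restrict_model_names]
print(permitted)  # ['gpt-4', 'gpt-35-turbo']
```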
@@ -58,7 +58,7 @@ class ApiExternalDataTool(ExternalDataTool):
         if not api_based_extension:
             raise ValueError("[External data tool] API query failed, variable: {}, "
                              "error: api_based_extension_id is invalid"
-                             .format(self.config.get('variable')))
+                             .format(self.variable))

         # decrypt api_key
         api_key = encrypter.decrypt_token(

@@ -74,7 +74,7 @@ class ApiExternalDataTool(ExternalDataTool):
             )
         except Exception as e:
             raise ValueError("[External data tool] API query failed, variable: {}, error: {}".format(
-                self.config.get('variable'),
+                self.variable,
                 e
             ))

@@ -87,6 +87,10 @@ class ApiExternalDataTool(ExternalDataTool):

         if 'result' not in response_json:
             raise ValueError("[External data tool] API query failed, variable: {}, error: result not found in response"
-                             .format(self.config.get('variable')))
+                             .format(self.variable))

+        if not isinstance(response_json['result'], str):
+            raise ValueError("[External data tool] API query failed, variable: {}, error: result is not string"
+                             .format(self.variable))
+
         return response_json['result']
@ -40,7 +40,7 @@ class ProviderCredentialsCache:
|
||||
:param credentials: provider credentials
|
||||
:return:
|
||||
"""
|
||||
redis_client.setex(self.cache_key, 3600, json.dumps(credentials))
|
||||
redis_client.setex(self.cache_key, 86400, json.dumps(credentials))
|
||||
|
||||
def delete(self) -> None:
|
||||
"""
|
||||
|
||||
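The cache change above only lengthens the TTL passed to Redis `SETEX` from one hour (3600 s) to one day (86400 s). A hedged sketch of the same pattern; it assumes a local Redis, and the key and payload below are made up:

```python
import json

import redis

redis_client = redis.Redis(host='localhost', port=6379)

cache_key = 'provider_credentials:tenant-1:openai'  # hypothetical key
credentials = {'openai_api_key': 'sk-...'}

# SETEX stores the value with an expiry in seconds: 86400 = 24h (was 3600 = 1h).
redis_client.setex(cache_key, 86400, json.dumps(credentials))
```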
@@ -4,13 +4,14 @@ from typing import Optional
from flask import Flask
from pydantic import BaseModel

from core.entities.provider_entities import QuotaUnit
from core.entities.provider_entities import QuotaUnit, RestrictModel
from core.model_runtime.entities.model_entities import ModelType
from models.provider import ProviderQuotaType


class HostingQuota(BaseModel):
    quota_type: ProviderQuotaType
    restrict_llms: list[str] = []
    restrict_models: list[RestrictModel] = []


class TrialHostingQuota(HostingQuota):

@@ -47,10 +48,11 @@ class HostingConfiguration:
    provider_map: dict[str, HostingProvider] = {}
    moderation_config: HostedModerationConfig = None

    def init_app(self, app: Flask):
    def init_app(self, app: Flask) -> None:
        if app.config.get('EDITION') != 'CLOUD':
            return

        self.provider_map["azure_openai"] = self.init_azure_openai()
        self.provider_map["openai"] = self.init_openai()
        self.provider_map["anthropic"] = self.init_anthropic()
        self.provider_map["minimax"] = self.init_minimax()
@@ -59,6 +61,47 @@ class HostingConfiguration:

        self.moderation_config = self.init_moderation_config()

    def init_azure_openai(self) -> HostingProvider:
        quota_unit = QuotaUnit.TIMES
        if os.environ.get("HOSTED_AZURE_OPENAI_ENABLED") and os.environ.get("HOSTED_AZURE_OPENAI_ENABLED").lower() == 'true':
            credentials = {
                "openai_api_key": os.environ.get("HOSTED_AZURE_OPENAI_API_KEY"),
                "openai_api_base": os.environ.get("HOSTED_AZURE_OPENAI_API_BASE"),
                "base_model_name": "gpt-35-turbo"
            }

            quotas = []
            hosted_quota_limit = int(os.environ.get("HOSTED_AZURE_OPENAI_QUOTA_LIMIT", "1000"))
            if hosted_quota_limit != -1 or hosted_quota_limit > 0:
                trial_quota = TrialHostingQuota(
                    quota_limit=hosted_quota_limit,
                    restrict_models=[
                        RestrictModel(model="gpt-4", base_model_name="gpt-4", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-4-32k", base_model_name="gpt-4-32k", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-4-1106-preview", base_model_name="gpt-4-1106-preview", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-4-vision-preview", base_model_name="gpt-4-vision-preview", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-35-turbo", base_model_name="gpt-35-turbo", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-35-turbo-1106", base_model_name="gpt-35-turbo-1106", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-35-turbo-instruct", base_model_name="gpt-35-turbo-instruct", model_type=ModelType.LLM),
                        RestrictModel(model="gpt-35-turbo-16k", base_model_name="gpt-35-turbo-16k", model_type=ModelType.LLM),
                        RestrictModel(model="text-davinci-003", base_model_name="text-davinci-003", model_type=ModelType.LLM),
                        RestrictModel(model="text-embedding-ada-002", base_model_name="text-embedding-ada-002", model_type=ModelType.TEXT_EMBEDDING),
                    ]
                )
                quotas.append(trial_quota)

            return HostingProvider(
                enabled=True,
                credentials=credentials,
                quota_unit=quota_unit,
                quotas=quotas
            )

        return HostingProvider(
            enabled=False,
            quota_unit=quota_unit,
        )
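The pattern above gates a hosted provider on an environment flag and builds a trial quota from env values. A compact sketch of that flow under assumed variable names; it mirrors, but is not, the project's `HostingProvider` types:

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrialQuota:
    quota_limit: int


@dataclass
class HostingProvider:
    enabled: bool
    credentials: Optional[dict] = None
    quotas: list = field(default_factory=list)


def init_hosted_provider(prefix: str) -> HostingProvider:
    # e.g. prefix = "HOSTED_AZURE_OPENAI"; the env names here are assumptions.
    if os.environ.get(f"{prefix}_ENABLED", "").lower() != 'true':
        return HostingProvider(enabled=False)

    credentials = {"api_key": os.environ.get(f"{prefix}_API_KEY")}
    quota_limit = int(os.environ.get(f"{prefix}_QUOTA_LIMIT", "1000"))
    quotas = [TrialQuota(quota_limit=quota_limit)] if quota_limit != -1 else []
    return HostingProvider(enabled=True, credentials=credentials, quotas=quotas)


print(init_hosted_provider("HOSTED_AZURE_OPENAI").enabled)
```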
def init_openai(self) -> HostingProvider:
    quota_unit = QuotaUnit.TIMES
    if os.environ.get("HOSTED_OPENAI_ENABLED") and os.environ.get("HOSTED_OPENAI_ENABLED").lower() == 'true':
@@ -77,12 +120,12 @@ class HostingConfiguration:
        if hosted_quota_limit != -1 or hosted_quota_limit > 0:
            trial_quota = TrialHostingQuota(
                quota_limit=hosted_quota_limit,
                restrict_llms=[
                    "gpt-3.5-turbo",
                    "gpt-3.5-turbo-1106",
                    "gpt-3.5-turbo-instruct",
                    "gpt-3.5-turbo-16k",
                    "text-davinci-003"
                restrict_models=[
                    RestrictModel(model="gpt-3.5-turbo", model_type=ModelType.LLM),
                    RestrictModel(model="gpt-3.5-turbo-1106", model_type=ModelType.LLM),
                    RestrictModel(model="gpt-3.5-turbo-instruct", model_type=ModelType.LLM),
                    RestrictModel(model="gpt-3.5-turbo-16k", model_type=ModelType.LLM),
                    RestrictModel(model="text-davinci-003", model_type=ModelType.LLM),
                ]
            )
            quotas.append(trial_quota)

@@ -136,6 +136,7 @@ class KeywordTableIndex(BaseIndex):
page_content=segment.content,
metadata={
    "doc_id": chunk_index,
    "doc_hash": segment.index_node_hash,
    "document_id": segment.document_id,
    "dataset_id": segment.dataset_id,
}

@@ -221,12 +221,18 @@ class IndexingRunner:
if not dataset:
    raise ValueError('Dataset not found.')
if dataset.indexing_technique == 'high_quality' or indexing_technique == 'high_quality':
    embedding_model_instance = self.model_manager.get_model_instance(
        tenant_id=tenant_id,
        provider=dataset.embedding_model_provider,
        model_type=ModelType.TEXT_EMBEDDING,
        model=dataset.embedding_model
    )
    if dataset.embedding_model_provider:
        embedding_model_instance = self.model_manager.get_model_instance(
            tenant_id=tenant_id,
            provider=dataset.embedding_model_provider,
            model_type=ModelType.TEXT_EMBEDDING,
            model=dataset.embedding_model
        )
    else:
        embedding_model_instance = self.model_manager.get_default_model_instance(
            tenant_id=tenant_id,
            model_type=ModelType.TEXT_EMBEDDING,
        )
else:
    if indexing_technique == 'high_quality':
        embedding_model_instance = self.model_manager.get_default_model_instance(
@@ -328,12 +334,18 @@ class IndexingRunner:
if not dataset:
    raise ValueError('Dataset not found.')
if dataset.indexing_technique == 'high_quality' or indexing_technique == 'high_quality':
    embedding_model_instance = self.model_manager.get_model_instance(
        tenant_id=tenant_id,
        provider=dataset.embedding_model_provider,
        model_type=ModelType.TEXT_EMBEDDING,
        model=dataset.embedding_model
    )
    if dataset.embedding_model_provider:
        embedding_model_instance = self.model_manager.get_model_instance(
            tenant_id=tenant_id,
            provider=dataset.embedding_model_provider,
            model_type=ModelType.TEXT_EMBEDDING,
            model=dataset.embedding_model
        )
    else:
        embedding_model_instance = self.model_manager.get_default_model_instance(
            tenant_id=tenant_id,
            model_type=ModelType.TEXT_EMBEDDING,
        )
else:
    if indexing_technique == 'high_quality':
        embedding_model_instance = self.model_manager.get_default_model_instance(
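Both IndexingRunner hunks make the same change: when high-quality indexing is requested but the dataset has no `embedding_model_provider` recorded, fall back to the tenant's default embedding model instead of calling `get_model_instance` with an empty provider. A sketch of the decision with stand-in types (the stub manager below is invented for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    indexing_technique: str
    embedding_model_provider: Optional[str] = None
    embedding_model: Optional[str] = None


class StubModelManager:
    def get_model_instance(self, tenant_id, provider, model_type, model):
        return f"{provider}/{model}"

    def get_default_model_instance(self, tenant_id, model_type):
        return "default-embedding"


def select_embedding_model(mm, tenant_id, dataset, indexing_technique):
    # Prefer the embedding model pinned on the dataset; otherwise use the default.
    if dataset.indexing_technique == 'high_quality' or indexing_technique == 'high_quality':
        if dataset.embedding_model_provider:
            return mm.get_model_instance(tenant_id, dataset.embedding_model_provider,
                                         'text-embedding', dataset.embedding_model)
        return mm.get_default_model_instance(tenant_id, 'text-embedding')
    return None


print(select_embedding_model(StubModelManager(), 't1',
                             Dataset('high_quality'), 'economy'))  # default-embedding
```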
@@ -144,7 +144,7 @@ class ModelInstance:
    user=user
)

def invoke_speech2text(self, file: IO[bytes], user: Optional[str] = None) \
def invoke_speech2text(self, file: IO[bytes], user: Optional[str] = None, **params) \
        -> str:
    """
    Invoke large language model
@@ -161,7 +161,8 @@ class ModelInstance:
    model=self.model,
    credentials=self.credentials,
    file=file,
    user=user
    user=user,
    **params
)


@@ -178,6 +179,8 @@ class ModelManager:
:param model: model name
:return:
"""
if not provider:
    return self.get_default_model_instance(tenant_id, model_type)
provider_model_bundle = self._provider_manager.get_provider_model_bundle(
    tenant_id=tenant_id,
    provider=provider,
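Two small behaviors land here: `invoke_speech2text` now forwards arbitrary keyword arguments to the underlying runtime, and `get_model_instance` falls back to the tenant default when no provider is given. A sketch of both together, with invented stand-in classes:

```python
from typing import Optional


class SpeechModel:
    def invoke(self, file_bytes: bytes, user: Optional[str] = None, **params) -> str:
        # `params` can carry runtime-specific options, e.g. language='en'.
        return f"transcript(user={user}, params={params})"


class ModelManager:
    def get_default_model_instance(self, tenant_id: str, model_type: str):
        return SpeechModel()

    def get_model_instance(self, tenant_id: str, provider: Optional[str], model_type: str):
        if not provider:
            # New behavior: no provider means "use the tenant default".
            return self.get_default_model_instance(tenant_id, model_type)
        return SpeechModel()


manager = ModelManager()
model = manager.get_model_instance('tenant-1', None, 'speech2text')
print(model.invoke(b'...', user='u-1', language='en'))  # kwargs pass straight through
```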
@@ -30,7 +30,7 @@
```yaml
provider: xinference #确定供应商标识
label: # 供应商展示名称,可设置 en_US 英文、zh_Hans 中文两种语言,zh_Hans 不设置将默认使用 en_US。
  en_US: Xorbots Inference
  en_US: Xorbits Inference
icon_small: # 小图标,可以参考其他供应商的图标,存储在对应供应商实现目录下的 _assets 目录,中英文策略同 label
  en_US: icon_s_en.svg
icon_large: # 大图标

@@ -260,7 +260,7 @@ provider_credential_schema:
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_type=model_type,
model_properties={
    'mode': ModelType.LLM,
    ModelPropertyKey.MODE: ModelType.LLM,
},
parameter_rules=rules
)

@@ -32,7 +32,7 @@ class ModelType(Enum):
    return cls.TEXT_EMBEDDING
elif origin_model_type == 'reranking' or origin_model_type == cls.RERANK.value:
    return cls.RERANK
elif origin_model_type == cls.SPEECH2TEXT.value:
elif origin_model_type == 'speech2text' or origin_model_type == cls.SPEECH2TEXT.value:
    return cls.SPEECH2TEXT
elif origin_model_type == cls.MODERATION.value:
    return cls.MODERATION

@@ -8,6 +8,9 @@ class InvokeError(Exception):
def __init__(self, description: Optional[str] = None) -> None:
    self.description = description

def __str__(self):
    return self.description or self.__class__.__name__


class InvokeConnectionError(InvokeError):
    """Raised when the Invoke returns connection error."""

@@ -148,7 +148,9 @@ class AIModel(ABC):
position_map = {}
if os.path.exists(position_file_path):
    with open(position_file_path, 'r', encoding='utf-8') as f:
        position_map = yaml.safe_load(f)
        positions = yaml.safe_load(f)
        # convert list to dict with key as model provider name, value as index
        position_map = {position: index for index, position in enumerate(positions)}

# traverse all model_schema_yaml_paths
for model_schema_yaml_path in model_schema_yaml_paths:
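The `_position.yaml` format changes from an explicit name-to-index mapping to a plain ordered list (see the file diff further below), so the loader now builds the index map itself. A minimal sketch of that conversion, assuming PyYAML as the diff does:

```python
import yaml

# What the new-style _position.yaml contains: an ordered list of provider names.
position_yaml = """
- openai
- anthropic
- azure_openai
"""

positions = yaml.safe_load(position_yaml)
# convert list to dict with key as model provider name, value as index
position_map = {position: index for index, position in enumerate(positions)}
print(position_map)  # {'openai': 0, 'anthropic': 1, 'azure_openai': 2}
```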
@@ -165,7 +165,7 @@ class LargeLanguageModel(AIModel):
    model=real_model,
    prompt_messages=prompt_messages,
    message=prompt_message,
    usage=usage,
    usage=usage if usage else LLMUsage.empty_usage(),
    system_fingerprint=system_fingerprint
),
credentials=credentials,

@@ -112,7 +112,7 @@ class ModelProvider(ABC):
model_class = None
for name, obj in vars(mod).items():
    if (isinstance(obj, type) and issubclass(obj, AIModel) and not obj.__abstractmethods__
            and obj != AIModel):
            and obj != AIModel and obj.__module__ == mod.__name__):
        model_class = obj
        break

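The added `__module__` check prevents picking up `AIModel` subclasses that were merely imported into the module being scanned. A sketch of the discovery loop against an assumed module object (names here are illustrative):

```python
import types
from abc import ABC, abstractmethod


class AIModel(ABC):
    @abstractmethod
    def invoke(self): ...


def find_model_class(mod: types.ModuleType):
    for name, obj in vars(mod).items():
        if (isinstance(obj, type) and issubclass(obj, AIModel) and not obj.__abstractmethods__
                and obj != AIModel and obj.__module__ == mod.__name__):
            # Only concrete subclasses *defined in* this module qualify;
            # classes re-imported from elsewhere are skipped.
            return obj
    return None
```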
@@ -1,19 +1,20 @@
openai: 0
anthropic: 1
azure_openai: 2
google: 3
replicate: 4
huggingface_hub: 5
cohere: 6
zhipuai: 7
baichuan: 8
spark: 9
minimax: 10
tongyi: 11
wenxin: 12
jina: 13
chatglm: 14
xinference: 15
openllm: 16
localai: 17
openai_api_compatible: 18
- openai
- anthropic
- azure_openai
- google
- replicate
- huggingface_hub
- cohere
- togetherai
- zhipuai
- baichuan
- spark
- minimax
- tongyi
- wenxin
- jina
- chatglm
- xinference
- openllm
- localai
- openai_api_compatible

@@ -16,24 +16,24 @@ help:
  url:
    en_US: https://console.anthropic.com/account/keys
supported_model_types:
- llm
- llm
configurate_methods:
- predefined-model
- predefined-model
provider_credential_schema:
  credential_form_schemas:
  - variable: anthropic_api_key
    label:
      en_US: API Key
    type: secret-input
    required: true
    placeholder:
      zh_Hans: 在此输入您的 API Key
      en_US: Enter your API Key
  - variable: anthropic_api_url
    label:
      en_US: API URL
    type: text-input
    required: false
    placeholder:
      zh_Hans: 在此输入您的 API URL
      en_US: Enter your API URL
  - variable: anthropic_api_key
    label:
      en_US: API Key
    type: secret-input
    required: true
    placeholder:
      zh_Hans: 在此输入您的 API Key
      en_US: Enter your API Key
  - variable: anthropic_api_url
    label:
      en_US: API URL
    type: text-input
    required: false
    placeholder:
      zh_Hans: 在此输入您的 API URL
      en_US: Enter your API URL

@@ -3,32 +3,32 @@ label:
  en_US: claude-2.1
model_type: llm
features:
- agent-thought
- agent-thought
model_properties:
  mode: chat
  context_size: 200000
parameter_rules:
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens_to_sample
  use_template: max_tokens
  required: true
  default: 4096
  min: 1
  max: 4096
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens_to_sample
  use_template: max_tokens
  required: true
  default: 4096
  min: 1
  max: 4096
pricing:
  input: '8.00'
  output: '24.00'
  unit: '0.000001'
  currency: USD
  currency: USD

@@ -3,32 +3,32 @@ label:
  en_US: claude-2
model_type: llm
features:
- agent-thought
- agent-thought
model_properties:
  mode: chat
  context_size: 100000
parameter_rules:
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens_to_sample
  use_template: max_tokens
  required: true
  default: 4096
  min: 1
  max: 4096
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens_to_sample
  use_template: max_tokens
  required: true
  default: 4096
  min: 1
  max: 4096
pricing:
  input: '8.00'
  output: '24.00'
  unit: '0.000001'
  currency: USD
  currency: USD

@@ -2,32 +2,32 @@ model: claude-instant-1
label:
  en_US: claude-instant-1
model_type: llm
features: []
features: [ ]
model_properties:
  mode: chat
  context_size: 100000
parameter_rules:
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens_to_sample
  use_template: max_tokens
  required: true
  default: 4096
  min: 1
  max: 4096
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens_to_sample
  use_template: max_tokens
  required: true
  default: 4096
  min: 1
  max: 4096
pricing:
  input: '1.63'
  output: '5.51'
  unit: '0.000001'
  currency: USD
  currency: USD

@@ -2,7 +2,7 @@ from pydantic import BaseModel

from core.model_runtime.entities.llm_entities import LLMMode
from core.model_runtime.entities.model_entities import ModelFeature, ModelType, FetchFrom, ParameterRule, \
    DefaultParameterName, PriceConfig
    DefaultParameterName, PriceConfig, ModelPropertyKey
from core.model_runtime.entities.model_entities import AIModelEntity, I18nObject
from core.model_runtime.entities.defaults import PARAMETER_RULE_TEMPLATE

@@ -40,8 +40,8 @@ LLM_BASE_MODELS = [
    ],
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.CHAT.value,
        'context_size': 4096,
        ModelPropertyKey.MODE: LLMMode.CHAT.value,
        ModelPropertyKey.CONTEXT_SIZE: 4096,
    },
    parameter_rules=[
        ParameterRule(
@@ -84,8 +84,8 @@ LLM_BASE_MODELS = [
    ],
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.CHAT.value,
        'context_size': 16385,
        ModelPropertyKey.MODE: LLMMode.CHAT.value,
        ModelPropertyKey.CONTEXT_SIZE: 16385,
    },
    parameter_rules=[
        ParameterRule(
@@ -128,8 +128,8 @@ LLM_BASE_MODELS = [
    ],
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.CHAT.value,
        'context_size': 8192,
        ModelPropertyKey.MODE: LLMMode.CHAT.value,
        ModelPropertyKey.CONTEXT_SIZE: 8192,
    },
    parameter_rules=[
        ParameterRule(
@@ -202,8 +202,8 @@ LLM_BASE_MODELS = [
    ],
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.CHAT.value,
        'context_size': 32768,
        ModelPropertyKey.MODE: LLMMode.CHAT.value,
        ModelPropertyKey.CONTEXT_SIZE: 32768,
    },
    parameter_rules=[
        ParameterRule(
@@ -276,8 +276,8 @@ LLM_BASE_MODELS = [
    ],
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.CHAT.value,
        'context_size': 128000,
        ModelPropertyKey.MODE: LLMMode.CHAT.value,
        ModelPropertyKey.CONTEXT_SIZE: 128000,
    },
    parameter_rules=[
        ParameterRule(
@@ -296,7 +296,7 @@ LLM_BASE_MODELS = [
        name='frequency_penalty',
        **PARAMETER_RULE_TEMPLATE[DefaultParameterName.FREQUENCY_PENALTY],
    ),
    _get_max_tokens(default=512, min_val=1, max_val=128000),
    _get_max_tokens(default=512, min_val=1, max_val=4096),
    ParameterRule(
        name='seed',
        label=I18nObject(
@@ -349,8 +349,8 @@ LLM_BASE_MODELS = [
    ],
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.CHAT.value,
        'context_size': 128000,
        ModelPropertyKey.MODE: LLMMode.CHAT.value,
        ModelPropertyKey.CONTEXT_SIZE: 128000,
    },
    parameter_rules=[
        ParameterRule(
@@ -369,7 +369,7 @@ LLM_BASE_MODELS = [
        name='frequency_penalty',
        **PARAMETER_RULE_TEMPLATE[DefaultParameterName.FREQUENCY_PENALTY],
    ),
    _get_max_tokens(default=512, min_val=1, max_val=128000),
    _get_max_tokens(default=512, min_val=1, max_val=4096),
    ParameterRule(
        name='seed',
        label=I18nObject(
@@ -419,8 +419,8 @@ LLM_BASE_MODELS = [
    model_type=ModelType.LLM,
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.COMPLETION.value,
        'context_size': 4096,
        ModelPropertyKey.MODE: LLMMode.COMPLETION.value,
        ModelPropertyKey.CONTEXT_SIZE: 4096,
    },
    parameter_rules=[
        ParameterRule(
@@ -459,8 +459,8 @@ LLM_BASE_MODELS = [
    model_type=ModelType.LLM,
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_properties={
        'mode': LLMMode.COMPLETION.value,
        'context_size': 4096,
        ModelPropertyKey.MODE: LLMMode.COMPLETION.value,
        ModelPropertyKey.CONTEXT_SIZE: 4096,
    },
    parameter_rules=[
        ParameterRule(
@@ -502,8 +502,8 @@ EMBEDDING_BASE_MODELS = [
    fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
    model_type=ModelType.TEXT_EMBEDDING,
    model_properties={
        'context_size': 8097,
        'max_chunks': 32,
        ModelPropertyKey.CONTEXT_SIZE: 8097,
        ModelPropertyKey.MAX_CHUNKS: 32,
    },
    pricing=PriceConfig(
        input=0.0001,

@@ -13,10 +13,10 @@ help:
  url:
    en_US: https://azure.microsoft.com/en-us/products/ai-services/openai-service
supported_model_types:
- llm
- text-embedding
- llm
- text-embedding
configurate_methods:
- customizable-model
- customizable-model
model_credential_schema:
  model:
    label:
@@ -26,79 +26,79 @@ model_credential_schema:
      en_US: Enter your Deployment Name here, matching the Azure deployment name.
      zh_Hans: 在此输入您的部署名称,与 Azure 部署名称匹配。
  credential_form_schemas:
  - variable: openai_api_base
    label:
      en_US: API Endpoint URL
      zh_Hans: API 域名
    type: text-input
    required: true
    placeholder:
      zh_Hans: '在此输入您的 API 域名,如:https://example.com/xxx'
      en_US: 'Enter your API Endpoint, eg: https://example.com/xxx'
  - variable: openai_api_key
    label:
      en_US: API Key
      zh_Hans: API Key
    type: secret-input
    required: true
    placeholder:
      zh_Hans: 在此输入您的 API Key
      en_US: Enter your API key here
  - variable: base_model_name
    label:
      en_US: Base Model
      zh_Hans: 基础模型
    type: select
    required: true
    options:
    - label:
        en_US: gpt-35-turbo
      value: gpt-35-turbo
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-35-turbo-16k
      value: gpt-35-turbo-16k
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4
      value: gpt-4
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4-32k
      value: gpt-4-32k
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4-1106-preview
      value: gpt-4-1106-preview
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4-vision-preview
      value: gpt-4-vision-preview
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-35-turbo-instruct
      value: gpt-35-turbo-instruct
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: text-embedding-ada-002
      value: text-embedding-ada-002
      show_on:
      - variable: __model_type
        value: text-embedding
    placeholder:
      zh_Hans: 在此输入您的模型版本
      en_US: Enter your model version
  - variable: openai_api_base
    label:
      en_US: API Endpoint URL
      zh_Hans: API 域名
    type: text-input
    required: true
    placeholder:
      zh_Hans: '在此输入您的 API 域名,如:https://example.com/xxx'
      en_US: 'Enter your API Endpoint, eg: https://example.com/xxx'
  - variable: openai_api_key
    label:
      en_US: API Key
      zh_Hans: API Key
    type: secret-input
    required: true
    placeholder:
      zh_Hans: 在此输入您的 API Key
      en_US: Enter your API key here
  - variable: base_model_name
    label:
      en_US: Base Model
      zh_Hans: 基础模型
    type: select
    required: true
    options:
    - label:
        en_US: gpt-35-turbo
      value: gpt-35-turbo
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-35-turbo-16k
      value: gpt-35-turbo-16k
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4
      value: gpt-4
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4-32k
      value: gpt-4-32k
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4-1106-preview
      value: gpt-4-1106-preview
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-4-vision-preview
      value: gpt-4-vision-preview
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: gpt-35-turbo-instruct
      value: gpt-35-turbo-instruct
      show_on:
      - variable: __model_type
        value: llm
    - label:
        en_US: text-embedding-ada-002
      value: text-embedding-ada-002
      show_on:
      - variable: __model_type
        value: text-embedding
    placeholder:
      zh_Hans: 在此输入您的模型版本
      en_US: Enter your model version

@@ -30,7 +30,7 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
             stream: bool = True, user: Optional[str] = None) \
        -> Union[LLMResult, Generator]:

    ai_model_entity = self._get_ai_model_entity(credentials['base_model_name'], model)
    ai_model_entity = self._get_ai_model_entity(credentials.get('base_model_name'), model)

    if ai_model_entity.entity.model_properties.get(ModelPropertyKey.MODE) == LLMMode.CHAT.value:
        # chat model
@@ -59,7 +59,7 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                   tools: Optional[list[PromptMessageTool]] = None) -> int:

    model_mode = self._get_ai_model_entity(credentials['base_model_name'], model).entity.model_properties.get(
    model_mode = self._get_ai_model_entity(credentials.get('base_model_name'), model).entity.model_properties.get(
        ModelPropertyKey.MODE)

    if model_mode == LLMMode.CHAT.value:
@@ -79,7 +79,7 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
    if 'base_model_name' not in credentials:
        raise CredentialsValidateFailedError('Base Model Name is required')

    ai_model_entity = self._get_ai_model_entity(credentials['base_model_name'], model)
    ai_model_entity = self._get_ai_model_entity(credentials.get('base_model_name'), model)

    if not ai_model_entity:
        raise CredentialsValidateFailedError(f'Base Model Name {credentials["base_model_name"]} is invalid')
@@ -109,8 +109,8 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
        raise CredentialsValidateFailedError(str(ex))

def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
    ai_model_entity = self._get_ai_model_entity(credentials['base_model_name'], model)
    return ai_model_entity.entity
    ai_model_entity = self._get_ai_model_entity(credentials.get('base_model_name'), model)
    return ai_model_entity.entity if ai_model_entity else None

def _generate(self, model: str, credentials: dict,
              prompt_messages: list[PromptMessage], model_parameters: dict, stop: Optional[List[str]] = None,
@@ -309,7 +309,7 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):

    # transform response
    response = LLMResult(
        model=response.model,
        model=response.model or model,
        prompt_messages=prompt_messages,
        message=assistant_prompt_message,
        usage=usage,

@@ -8,30 +8,30 @@ icon_large:
background: "#FFF6F2"
help:
  title:
    en_US: Get your API Key from BAICHUAN AI
    en_US: Get your API Key from BAICHUAN AI
    zh_Hans: 从百川智能获取您的 API Key
  url:
    en_US: https://www.baichuan-ai.com
supported_model_types:
- llm
- text-embedding
- llm
- text-embedding
configurate_methods:
- predefined-model
- predefined-model
provider_credential_schema:
  credential_form_schemas:
  - variable: api_key
    label:
      en_US: API Key
    type: secret-input
    required: true
    placeholder:
      zh_Hans: 在此输入您的 API Key
      en_US: Enter your API Key
  - variable: secret_key
    label:
      en_US: Secret Key
    type: secret-input
    required: false
    placeholder:
      zh_Hans: 在此输入您的 Secret Key
      en_US: Enter your Secret Key
  - variable: api_key
    label:
      en_US: API Key
    type: secret-input
    required: true
    placeholder:
      zh_Hans: 在此输入您的 API Key
      en_US: Enter your API Key
  - variable: secret_key
    label:
      en_US: Secret Key
    type: secret-input
    required: false
    placeholder:
      zh_Hans: 在此输入您的 Secret Key
      en_US: Enter your Secret Key

@@ -3,40 +3,40 @@ label:
  en_US: Baichuan2-53B
model_type: llm
features:
- agent-thought
- agent-thought
model_properties:
  mode: chat
  context_size: 4000
parameter_rules:
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens
  use_template: max_tokens
  required: true
  default: 1000
  min: 1
  max: 4000
- name: presence_penalty
  use_template: presence_penalty
- name: frequency_penalty
  use_template: frequency_penalty
- name: with_search_enhance
  label:
    zh_Hans: 搜索增强
    en_US: Search Enhance
  type: boolean
  help:
    zh_Hans: 允许模型自行进行外部搜索,以增强生成结果。
    en_US: Allow the model to perform external search to enhance the generation results.
  required: false
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens
  use_template: max_tokens
  required: true
  default: 1000
  min: 1
  max: 4000
- name: presence_penalty
  use_template: presence_penalty
- name: frequency_penalty
  use_template: frequency_penalty
- name: with_search_enhance
  label:
    zh_Hans: 搜索增强
    en_US: Search Enhance
  type: boolean
  help:
    zh_Hans: 允许模型自行进行外部搜索,以增强生成结果。
    en_US: Allow the model to perform external search to enhance the generation results.
  required: false

@@ -3,40 +3,40 @@ label:
  en_US: Baichuan2-Turbo-192K
model_type: llm
features:
- agent-thought
- agent-thought
model_properties:
  mode: chat
  context_size: 192000
parameter_rules:
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens
  use_template: max_tokens
  required: true
  default: 8000
  min: 1
  max: 192000
- name: presence_penalty
  use_template: presence_penalty
- name: frequency_penalty
  use_template: frequency_penalty
- name: with_search_enhance
  label:
    zh_Hans: 搜索增强
    en_US: Search Enhance
  type: boolean
  help:
    zh_Hans: 允许模型自行进行外部搜索,以增强生成结果。
    en_US: Allow the model to perform external search to enhance the generation results.
  required: false
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens
  use_template: max_tokens
  required: true
  default: 8000
  min: 1
  max: 192000
- name: presence_penalty
  use_template: presence_penalty
- name: frequency_penalty
  use_template: frequency_penalty
- name: with_search_enhance
  label:
    zh_Hans: 搜索增强
    en_US: Search Enhance
  type: boolean
  help:
    zh_Hans: 允许模型自行进行外部搜索,以增强生成结果。
    en_US: Allow the model to perform external search to enhance the generation results.
  required: false

@@ -3,40 +3,40 @@ label:
  en_US: Baichuan2-Turbo
model_type: llm
features:
- agent-thought
- agent-thought
model_properties:
  mode: chat
  context_size: 192000
parameter_rules:
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens
  use_template: max_tokens
  required: true
  default: 8000
  min: 1
  max: 192000
- name: presence_penalty
  use_template: presence_penalty
- name: frequency_penalty
  use_template: frequency_penalty
- name: with_search_enhance
  label:
    zh_Hans: 搜索增强
    en_US: Search Enhance
  type: boolean
  help:
    zh_Hans: 允许模型自行进行外部搜索,以增强生成结果。
    en_US: Allow the model to perform external search to enhance the generation results.
  required: false
- name: temperature
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:
    zh_Hans: 取样数量
    en_US: Top k
  type: int
  help:
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false
- name: max_tokens
  use_template: max_tokens
  required: true
  default: 8000
  min: 1
  max: 192000
- name: presence_penalty
  use_template: presence_penalty
- name: frequency_penalty
  use_template: frequency_penalty
- name: with_search_enhance
  label:
    zh_Hans: 搜索增强
    en_US: Search Enhance
  type: boolean
  help:
    zh_Hans: 允许模型自行进行外部搜索,以增强生成结果。
    en_US: Allow the model to perform external search to enhance the generation results.
  required: false

@@ -2,4 +2,4 @@ model: baichuan-text-embedding
model_type: text-embedding
model_properties:
  context_size: 512
  max_chunks: 16
  max_chunks: 16

@@ -1,4 +1,4 @@
from typing import Optional
from typing import Optional, Tuple

from core.model_runtime.entities.model_entities import PriceType
from core.model_runtime.entities.text_embedding_entities import TextEmbeddingResult, EmbeddingUsage
@@ -38,6 +38,50 @@ class BaichuanTextEmbeddingModel(TextEmbeddingModel):
    raise ValueError('Invalid model name')
if not api_key:
    raise CredentialsValidateFailedError('api_key is required')

# split into chunks of batch size 16
chunks = []
for i in range(0, len(texts), 16):
    chunks.append(texts[i:i + 16])

embeddings = []
token_usage = 0

for chunk in chunks:
    # embeding chunk
    chunk_embeddings, chunk_usage = self.embedding(
        model=model,
        api_key=api_key,
        texts=chunk,
        user=user
    )

    embeddings.extend(chunk_embeddings)
    token_usage += chunk_usage

result = TextEmbeddingResult(
    model=model,
    embeddings=embeddings,
    usage=self._calc_response_usage(
        model=model,
        credentials=credentials,
        tokens=token_usage
    )
)

return result

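The new `_invoke` body above batches texts in groups of 16 (the batch size named in the diff's own comment) and sums token usage across calls. A generic sketch of that chunking, with a fake embedder standing in for the provider's HTTP call:

```python
from typing import List, Tuple


def fake_embed_batch(texts: List[str]) -> Tuple[List[List[float]], int]:
    # Stand-in for the provider call: one 4-dim vector per text, 1 token each.
    return [[0.0, 0.0, 0.0, 0.0] for _ in texts], len(texts)


def embed_in_chunks(texts: List[str], batch_size: int = 16):
    embeddings: List[List[float]] = []
    token_usage = 0
    for i in range(0, len(texts), batch_size):
        chunk = texts[i:i + batch_size]
        chunk_embeddings, chunk_usage = fake_embed_batch(chunk)
        embeddings.extend(chunk_embeddings)
        token_usage += chunk_usage
    return embeddings, token_usage


vectors, used = embed_in_chunks([f"doc {n}" for n in range(40)])
print(len(vectors), used)  # 40 40
```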
def embedding(self, model: str, api_key, texts: list[str], user: Optional[str] = None) \
|
||||
-> Tuple[list[list[float]], int]:
|
||||
"""
|
||||
Embed given texts
|
||||
|
||||
:param model: model name
|
||||
:param credentials: model credentials
|
||||
:param texts: texts to embed
|
||||
:param user: unique user id
|
||||
:return: embeddings result
|
||||
"""
|
||||
url = self.api_base
|
||||
headers = {
|
||||
'Authorization': 'Bearer ' + api_key,
|
||||
@ -69,9 +113,9 @@ class BaichuanTextEmbeddingModel(TextEmbeddingModel):
|
||||
raise InsufficientAccountBalance(msg)
|
||||
elif err == 'invalid_authentication':
|
||||
raise InvalidAuthenticationError(msg)
|
||||
elif 'rate' in err:
|
||||
elif err and 'rate' in err:
|
||||
raise RateLimitReachedError(msg)
|
||||
elif 'internal' in err:
|
||||
elif err and 'internal' in err:
|
||||
raise InternalServerError(msg)
|
||||
elif err == 'api_key_empty':
|
||||
raise InvalidAPIKeyError(msg)
|
||||
@ -85,17 +129,10 @@ class BaichuanTextEmbeddingModel(TextEmbeddingModel):
|
||||
except Exception as e:
|
||||
raise InternalServerError(f"Failed to convert response to json: {e} with text: {response.text}")
|
||||
|
||||
usage = self._calc_response_usage(model=model, credentials=credentials, tokens=usage['total_tokens'])
|
||||
return [
|
||||
data['embedding'] for data in embeddings
|
||||
], usage['total_tokens']
|
||||
|
||||
result = TextEmbeddingResult(
|
||||
model=model,
|
||||
embeddings=[[
|
||||
float(data) for data in x['embedding']
|
||||
] for x in embeddings],
|
||||
usage=usage
|
||||
)
|
||||
|
||||
return result
|
||||
|
||||
def get_num_tokens(self, model: str, credentials: dict, texts: list[str]) -> int:
|
||||
"""
|
||||
|
||||
@ -13,16 +13,16 @@ help:
|
||||
url:
|
||||
en_US: https://github.com/THUDM/ChatGLM3
|
||||
supported_model_types:
|
||||
- llm
|
||||
- llm
|
||||
configurate_methods:
|
||||
- predefined-model
|
||||
- predefined-model
|
||||
provider_credential_schema:
|
||||
credential_form_schemas:
|
||||
- variable: api_base
|
||||
label:
|
||||
en_US: API URL
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入您的 API URL
|
||||
en_US: Enter your API URL
|
||||
- variable: api_base
|
||||
label:
|
||||
en_US: API URL
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入您的 API URL
|
||||
en_US: Enter your API URL
|
||||
|
||||
@ -3,19 +3,19 @@ label:
|
||||
en_US: ChatGLM2-6B-32K
|
||||
model_type: llm
|
||||
features:
|
||||
- agent-thought
|
||||
- agent-thought
|
||||
model_properties:
|
||||
mode: chat
|
||||
context_size: 32000
|
||||
parameter_rules:
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 2000
|
||||
min: 1
|
||||
max: 32000
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 2000
|
||||
min: 1
|
||||
max: 32000
|
||||
|
||||
@ -3,19 +3,19 @@ label:
|
||||
en_US: ChatGLM2-6B
|
||||
model_type: llm
|
||||
features:
|
||||
- agent-thought
|
||||
- agent-thought
|
||||
model_properties:
|
||||
mode: chat
|
||||
context_size: 2000
|
||||
parameter_rules:
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 256
|
||||
min: 1
|
||||
max: 2000
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 256
|
||||
min: 1
|
||||
max: 2000
|
||||
|
||||
@ -3,20 +3,20 @@ label:
|
||||
en_US: ChatGLM3-6B-32K
|
||||
model_type: llm
|
||||
features:
|
||||
- tool-call
|
||||
- agent-thought
|
||||
- tool-call
|
||||
- agent-thought
|
||||
model_properties:
|
||||
mode: chat
|
||||
context_size: 32000
|
||||
parameter_rules:
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 8000
|
||||
min: 1
|
||||
max: 32000
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 8000
|
||||
min: 1
|
||||
max: 32000
|
||||
|
||||
@ -3,20 +3,20 @@ label:
|
||||
en_US: ChatGLM3-6B
|
||||
model_type: llm
|
||||
features:
|
||||
- tool-call
|
||||
- agent-thought
|
||||
- tool-call
|
||||
- agent-thought
|
||||
model_properties:
|
||||
mode: chat
|
||||
context_size: 8000
|
||||
parameter_rules:
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 256
|
||||
min: 1
|
||||
max: 8000
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
required: false
|
||||
- name: max_tokens
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 256
|
||||
min: 1
|
||||
max: 8000
|
||||
|
||||
@ -14,18 +14,18 @@ help:
|
||||
url:
|
||||
en_US: https://dashboard.cohere.com/api-keys
|
||||
supported_model_types:
|
||||
- rerank
|
||||
- rerank
|
||||
configurate_methods:
|
||||
- predefined-model
|
||||
- predefined-model
|
||||
provider_credential_schema:
|
||||
credential_form_schemas:
|
||||
- variable: api_key
|
||||
label:
|
||||
zh_Hans: API Key
|
||||
en_US: API Key
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 请填写 API Key
|
||||
en_US: Please fill in API Key
|
||||
show_on: []
|
||||
- variable: api_key
|
||||
label:
|
||||
zh_Hans: API Key
|
||||
en_US: API Key
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 请填写 API Key
|
||||
en_US: Please fill in API Key
|
||||
show_on: [ ]
|
||||
|
||||
@ -1,4 +1,4 @@
|
||||
model: rerank-multilingual-v2.0
|
||||
model_type: rerank
|
||||
model_properties:
|
||||
context_size: 5120
|
||||
context_size: 5120
|
||||
|
||||
@ -16,17 +16,16 @@ help:
|
||||
url:
|
||||
en_US: https://ai.google.dev/
|
||||
supported_model_types:
|
||||
- llm
|
||||
- llm
|
||||
configurate_methods:
|
||||
- predefined-model
|
||||
- predefined-model
|
||||
provider_credential_schema:
|
||||
credential_form_schemas:
|
||||
- variable: google_api_key
|
||||
label:
|
||||
en_US: API Key
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入您的 API Key
|
||||
en_US: Enter your API Key
|
||||
|
||||
- variable: google_api_key
|
||||
label:
|
||||
en_US: API Key
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入您的 API Key
|
||||
en_US: Enter your API Key
|
||||
|
||||
@ -3,32 +3,32 @@ label:
|
||||
en_US: Gemini Pro Vision
|
||||
model_type: llm
|
||||
features:
|
||||
- vision
|
||||
- vision
|
||||
model_properties:
|
||||
mode: chat
|
||||
context_size: 12288
|
||||
parameter_rules:
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
- name: top_k
|
||||
label:
|
||||
zh_Hans: 取样数量
|
||||
en_US: Top k
|
||||
type: int
|
||||
help:
|
||||
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
|
||||
en_US: Only sample from the top K options for each subsequent token.
|
||||
required: false
|
||||
- name: max_tokens_to_sample
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 4096
|
||||
min: 1
|
||||
max: 4096
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
- name: top_k
|
||||
label:
|
||||
zh_Hans: 取样数量
|
||||
en_US: Top k
|
||||
type: int
|
||||
help:
|
||||
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
|
||||
en_US: Only sample from the top K options for each subsequent token.
|
||||
required: false
|
||||
- name: max_tokens_to_sample
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 4096
|
||||
min: 1
|
||||
max: 4096
|
||||
pricing:
|
||||
input: '0.00'
|
||||
output: '0.00'
|
||||
unit: '0.000001'
|
||||
currency: USD
|
||||
currency: USD
|
||||
|
||||
@ -3,32 +3,32 @@ label:
|
||||
en_US: Gemini Pro
|
||||
model_type: llm
|
||||
features:
|
||||
- agent-thought
|
||||
- agent-thought
|
||||
model_properties:
|
||||
mode: chat
|
||||
context_size: 30720
|
||||
parameter_rules:
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
- name: top_k
|
||||
label:
|
||||
zh_Hans: 取样数量
|
||||
en_US: Top k
|
||||
type: int
|
||||
help:
|
||||
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
|
||||
en_US: Only sample from the top K options for each subsequent token.
|
||||
required: false
|
||||
- name: max_tokens_to_sample
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 2048
|
||||
min: 1
|
||||
max: 2048
|
||||
- name: temperature
|
||||
use_template: temperature
|
||||
- name: top_p
|
||||
use_template: top_p
|
||||
- name: top_k
|
||||
label:
|
||||
zh_Hans: 取样数量
|
||||
en_US: Top k
|
||||
type: int
|
||||
help:
|
||||
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
|
||||
en_US: Only sample from the top K options for each subsequent token.
|
||||
required: false
|
||||
- name: max_tokens_to_sample
|
||||
use_template: max_tokens
|
||||
required: true
|
||||
default: 2048
|
||||
min: 1
|
||||
max: 2048
|
||||
pricing:
|
||||
input: '0.00'
|
||||
output: '0.00'
|
||||
unit: '0.000001'
|
||||
currency: USD
|
||||
currency: USD
|
||||
|
||||
@ -2,9 +2,9 @@ provider: huggingface_hub
|
||||
label:
|
||||
en_US: Hugging Face Model
|
||||
icon_small:
|
||||
en_US: icon_s_en.svg
|
||||
en_US: icon_s_en.svg
|
||||
icon_large:
|
||||
en_US: icon_l_en.svg
|
||||
en_US: icon_l_en.svg
|
||||
background: "#FFF8DC"
|
||||
help:
|
||||
title:
|
||||
@ -13,90 +13,90 @@ help:
|
||||
url:
|
||||
en_US: https://huggingface.co/settings/tokens
|
||||
supported_model_types:
|
||||
- llm
|
||||
- text-embedding
|
||||
- llm
|
||||
- text-embedding
|
||||
configurate_methods:
|
||||
- customizable-model
|
||||
- customizable-model
|
||||
model_credential_schema:
|
||||
model:
|
||||
label:
|
||||
en_US: Model Name
|
||||
zh_Hans: 模型名称
|
||||
credential_form_schemas:
|
||||
- variable: huggingfacehub_api_type
|
||||
label:
|
||||
en_US: Endpoint Type
|
||||
zh_Hans: 端点类型
|
||||
type: radio
|
||||
required: true
|
||||
default: hosted_inference_api
|
||||
options:
|
||||
- value: hosted_inference_api
|
||||
label:
|
||||
en_US: Hosted Inference API
|
||||
- value: inference_endpoints
|
||||
label:
|
||||
en_US: Inference Endpoints
|
||||
- variable: huggingfacehub_api_token
|
||||
label:
|
||||
en_US: API Token
|
||||
zh_Hans: API Token
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
en_US: Enter your Hugging Face Hub API Token here
|
||||
zh_Hans: 在此输入您的 Hugging Face Hub API Token
|
||||
- variable: huggingface_namespace
|
||||
label:
|
||||
en_US: 'User Name / Organization Name'
|
||||
zh_Hans: '用户名 / 组织名称'
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
en_US: 'Enter your User Name / Organization Name here'
|
||||
zh_Hans: '在此输入您的用户名 / 组织名称'
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: text-embedding
|
||||
- variable: huggingfacehub_api_type
|
||||
value: inference_endpoints
|
||||
- variable: huggingfacehub_endpoint_url
|
||||
label:
|
||||
en_US: Endpoint URL
|
||||
zh_Hans: 端点 URL
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
en_US: Enter your Endpoint URL here
|
||||
zh_Hans: 在此输入您的端点 URL
|
||||
show_on:
|
||||
- variable: huggingfacehub_api_type
|
||||
value: inference_endpoints
|
||||
- variable: task_type
|
||||
label:
|
||||
en_US: Task
|
||||
zh_Hans: Task
|
||||
type: select
|
||||
options:
|
||||
- value: text2text-generation
|
||||
label:
|
||||
en_US: Text-to-Text Generation
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: llm
|
||||
- value: text-generation
|
||||
en_US: Endpoint Type
|
||||
zh_Hans: 端点类型
|
||||
type: radio
|
||||
required: true
|
||||
default: hosted_inference_api
|
||||
options:
|
||||
- value: hosted_inference_api
|
||||
label:
|
||||
en_US: Hosted Inference API
|
||||
- value: inference_endpoints
|
||||
label:
|
||||
en_US: Inference Endpoints
|
||||
- variable: huggingfacehub_api_token
|
||||
label:
|
||||
en_US: Text Generation
|
||||
zh_Hans: 文本生成
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: llm
|
||||
- value: feature-extraction
|
||||
en_US: API Token
|
||||
zh_Hans: API Token
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
en_US: Enter your Hugging Face Hub API Token here
|
||||
zh_Hans: 在此输入您的 Hugging Face Hub API Token
|
||||
- variable: huggingface_namespace
|
||||
label:
|
||||
en_US: Feature Extraction
|
||||
en_US: 'User Name / Organization Name'
|
||||
zh_Hans: '用户名 / 组织名称'
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
en_US: 'Enter your User Name / Organization Name here'
|
||||
zh_Hans: '在此输入您的用户名 / 组织名称'
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: text-embedding
|
||||
show_on:
|
||||
- variable: huggingfacehub_api_type
|
||||
value: inference_endpoints
|
||||
- variable: __model_type
|
||||
value: text-embedding
|
||||
- variable: huggingfacehub_api_type
|
||||
value: inference_endpoints
|
||||
- variable: huggingfacehub_endpoint_url
|
||||
label:
|
||||
en_US: Endpoint URL
|
||||
zh_Hans: 端点 URL
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
en_US: Enter your Endpoint URL here
|
||||
zh_Hans: 在此输入您的端点 URL
|
||||
show_on:
|
||||
- variable: huggingfacehub_api_type
|
||||
value: inference_endpoints
|
||||
- variable: task_type
|
||||
label:
|
||||
en_US: Task
|
||||
zh_Hans: Task
|
||||
type: select
|
||||
options:
|
||||
- value: text2text-generation
|
||||
label:
|
||||
en_US: Text-to-Text Generation
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: llm
|
||||
- value: text-generation
|
||||
label:
|
||||
en_US: Text Generation
|
||||
zh_Hans: 文本生成
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: llm
|
||||
- value: feature-extraction
|
||||
label:
|
||||
en_US: Feature Extraction
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: text-embedding
|
||||
show_on:
|
||||
- variable: huggingfacehub_api_type
|
||||
value: inference_endpoints
|
||||
|
||||
@ -10,7 +10,7 @@ from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk,
|
||||
from core.model_runtime.entities.message_entities import PromptMessage, PromptMessageTool, AssistantPromptMessage, \
|
||||
UserPromptMessage, SystemPromptMessage
|
||||
from core.model_runtime.entities.model_entities import ParameterRule, DefaultParameterName, AIModelEntity, ModelType, \
|
||||
FetchFrom
|
||||
FetchFrom, ModelPropertyKey
|
||||
from core.model_runtime.errors.validate import CredentialsValidateFailedError
|
||||
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
|
||||
from core.model_runtime.model_providers.huggingface_hub._common import _CommonHuggingfaceHub
|
||||
@ -97,7 +97,7 @@ class HuggingfaceHubLargeLanguageModel(_CommonHuggingfaceHub, LargeLanguageModel
|
||||
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
|
||||
model_type=ModelType.LLM,
|
||||
model_properties={
|
||||
'mode': LLMMode.COMPLETION.value
|
||||
ModelPropertyKey.MODE: LLMMode.COMPLETION.value
|
||||
},
|
||||
parameter_rules=self._get_customizable_model_parameter_rules()
|
||||
)
|
||||
|
||||
@ -2,7 +2,7 @@ provider: jina
|
||||
label:
|
||||
en_US: Jina
|
||||
description:
|
||||
en_US: Embedding Model Supported
|
||||
en_US: Embedding Model Supported
|
||||
icon_small:
|
||||
en_US: icon_s_en.svg
|
||||
icon_large:
|
||||
@ -15,16 +15,16 @@ help:
|
||||
url:
|
||||
en_US: https://jina.ai/embeddings/
|
||||
supported_model_types:
|
||||
- text-embedding
|
||||
- text-embedding
|
||||
configurate_methods:
|
||||
- predefined-model
|
||||
- predefined-model
|
||||
provider_credential_schema:
|
||||
credential_form_schemas:
|
||||
- variable: api_key
|
||||
label:
|
||||
en_US: API Key
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入您的 API Key
|
||||
en_US: Enter your API Key
|
||||
- variable: api_key
|
||||
label:
|
||||
en_US: API Key
|
||||
type: secret-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入您的 API Key
|
||||
en_US: Enter your API Key
|
||||
|
||||
@ -6,4 +6,4 @@ model_properties:
|
||||
pricing:
|
||||
input: '0.001'
|
||||
unit: '0.001'
|
||||
currency: USD
|
||||
currency: USD
|
||||
|
||||
@ -6,4 +6,4 @@ model_properties:
|
||||
pricing:
|
||||
input: '0.001'
|
||||
unit: '0.001'
|
||||
currency: USD
|
||||
currency: USD
|
||||
|
||||
@ -13,10 +13,10 @@ help:
|
||||
url:
|
||||
en_US: https://github.com/go-skynet/LocalAI
|
||||
supported_model_types:
|
||||
- llm
|
||||
- text-embedding
|
||||
- llm
|
||||
- text-embedding
|
||||
configurate_methods:
|
||||
- customizable-model
|
||||
- customizable-model
|
||||
model_credential_schema:
|
||||
model:
|
||||
label:
|
||||
@ -26,33 +26,33 @@ model_credential_schema:
|
||||
en_US: Enter your model name
|
||||
zh_Hans: 输入模型名称
|
||||
credential_form_schemas:
|
||||
- variable: completion_type
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: llm
|
||||
label:
|
||||
en_US: Completion type
|
||||
type: select
|
||||
required: false
|
||||
default: chat_completion
|
||||
placeholder:
|
||||
zh_Hans: 选择对话类型
|
||||
en_US: Select completion type
|
||||
options:
|
||||
- value: completion
|
||||
label:
|
||||
en_US: Completion
|
||||
zh_Hans: 补全
|
||||
- value: chat_completion
|
||||
label:
|
||||
en_US: ChatCompletion
|
||||
zh_Hans: 对话
|
||||
- variable: server_url
|
||||
label:
|
||||
zh_Hans: 服务器URL
|
||||
en_US: Server url
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入LocalAI的服务器地址,如 https://example.com/xxx
|
||||
en_US: Enter the url of your LocalAI, for example https://example.com/xxx
|
||||
- variable: completion_type
|
||||
show_on:
|
||||
- variable: __model_type
|
||||
value: llm
|
||||
label:
|
||||
en_US: Completion type
|
||||
type: select
|
||||
required: false
|
||||
default: chat_completion
|
||||
placeholder:
|
||||
zh_Hans: 选择对话类型
|
||||
en_US: Select completion type
|
||||
options:
|
||||
- value: completion
|
||||
label:
|
||||
en_US: Completion
|
||||
zh_Hans: 补全
|
||||
- value: chat_completion
|
||||
label:
|
||||
en_US: ChatCompletion
|
||||
zh_Hans: 对话
|
||||
- variable: server_url
|
||||
label:
|
||||
zh_Hans: 服务器URL
|
||||
en_US: Server url
|
||||
type: text-input
|
||||
required: true
|
||||
placeholder:
|
||||
zh_Hans: 在此输入LocalAI的服务器地址,如 https://example.com/xxx
|
||||
en_US: Enter the url of your LocalAI, for example https://example.com/xxx
|
||||
|
||||
@@ -3,27 +3,27 @@ label:
  en_US: Abab5-Chat
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 6144
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    required: true
    default: 6144
    min: 1
    max: 6144
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
pricing:
  input: '0.00'
  output: '0.015'
  unit: '0.001'
  currency: RMB
@@ -3,34 +3,34 @@ label:
  en_US: Abab5.5-Chat
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 16384
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    required: true
    default: 6144
    min: 1
    max: 16384
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: plugin_web_search
    required: false
    default: false
    type: boolean
    label:
      en_US: Enable Web Search
      zh_Hans: 开启网页搜索
pricing:
  input: '0.00'
  output: '0.015'
  unit: '0.001'
  currency: RMB
@@ -13,25 +13,25 @@ help:
  url:
    en_US: https://api.minimax.chat/user-center/basic-information/interface-key
supported_model_types:
  - llm
  - text-embedding
configurate_methods:
  - predefined-model
provider_credential_schema:
  credential_form_schemas:
    - variable: minimax_api_key
      label:
        en_US: API Key
      type: secret-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 API Key
        en_US: Enter your API Key
    - variable: minimax_group_id
      label:
        en_US: Group ID
      type: text-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 Group ID
        en_US: Enter your group ID
@@ -6,4 +6,4 @@ model_properties:
pricing:
  input: '0.0005'
  unit: '0.001'
  currency: RMB
@@ -217,7 +217,9 @@ class ModelProviderFactory:
         position_map = {}
         if os.path.exists(position_file_path):
             with open(position_file_path, 'r', encoding='utf-8') as f:
-                position_map = yaml.safe_load(f)
+                positions = yaml.safe_load(f)
+                # convert list to dict with key as model provider name, value as index
+                position_map = {position: index for index, position in enumerate(positions)}
 
         # traverse all model_provider_dir_paths
         for model_provider_dir_path in model_provider_dir_paths:
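The hunk above stops treating the loaded YAML as a ready-made mapping and instead derives the map from a plain ordered list of provider names, so the position file only has to list names in order. A minimal runnable sketch of that pattern follows; the file path and provider names are hypothetical, not taken from the repository.

# Sketch of the list-to-index-map conversion introduced above.
# 'position_file_path' and the provider names are illustrative only.
import os
import yaml

position_file_path = 'model_providers/_position.yaml'  # hypothetical path

position_map = {}
if os.path.exists(position_file_path):
    with open(position_file_path, 'r', encoding='utf-8') as f:
        positions = yaml.safe_load(f)  # e.g. ['openai', 'anthropic', 'minimax']
        # convert list to dict: provider name -> its sort index
        position_map = {position: index for index, position in enumerate(positions)}

# Providers can then be ordered by their index, with unlisted ones sorted last.
providers = ['minimax', 'openai', 'unlisted']
providers.sort(key=lambda name: position_map.get(name, len(position_map)))
print(providers)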
@@ -1,9 +1,11 @@
-gpt-4: 0
-gpt-4-32k: 1
-gpt-4-1106-preview: 2
-gpt-4-vision-preview: 3
-gpt-3.5-turbo: 4
-gpt-3.5-turbo-16k: 5
-gpt-3.5-turbo-1106: 6
-gpt-3.5-turbo-instruct: 7
-text-davinci-003: 8
+- gpt-4
+- gpt-4-32k
+- gpt-4-1106-preview
+- gpt-4-vision-preview
+- gpt-3.5-turbo
+- gpt-3.5-turbo-16k
+- gpt-3.5-turbo-16k-0613
+- gpt-3.5-turbo-1106
+- gpt-3.5-turbo-0613
+- gpt-3.5-turbo-instruct
+- text-davinci-003
@@ -4,27 +4,27 @@ label:
  en_US: gpt-3.5-turbo-0613
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 4096
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 4096
pricing:
  input: '0.0015'
  output: '0.002'
  unit: '0.001'
  currency: USD
@@ -4,27 +4,27 @@ label:
  en_US: gpt-3.5-turbo-1106
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 16385
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 16385
pricing:
  input: '0.001'
  output: '0.002'
  unit: '0.001'
  currency: USD
@@ -4,27 +4,27 @@ label:
  en_US: gpt-3.5-turbo-16k-0613
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 16385
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 16385
pricing:
  input: '0.003'
  output: '0.004'
  unit: '0.001'
  currency: USD
@@ -4,27 +4,27 @@ label:
  en_US: gpt-3.5-turbo-16k
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 16385
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 16385
pricing:
  input: '0.003'
  output: '0.004'
  unit: '0.001'
  currency: USD
@@ -3,26 +3,26 @@ label:
   zh_Hans: gpt-3.5-turbo-instruct
   en_US: gpt-3.5-turbo-instruct
 model_type: llm
-features: []
+features: [ ]
 model_properties:
   mode: completion
   context_size: 4096
 parameter_rules:
   - name: temperature
     use_template: temperature
   - name: top_p
     use_template: top_p
   - name: presence_penalty
     use_template: presence_penalty
   - name: frequency_penalty
     use_template: frequency_penalty
   - name: max_tokens
     use_template: max_tokens
     default: 512
     min: 1
     max: 4096
 pricing:
   input: '0.0015'
   output: '0.002'
   unit: '0.001'
   currency: USD
@@ -4,27 +4,27 @@ label:
  en_US: gpt-3.5-turbo
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 4096
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 4096
pricing:
  input: '0.001'
  output: '0.002'
  unit: '0.001'
  currency: USD
@@ -4,55 +4,55 @@ label:
   en_US: gpt-4-1106-preview
 model_type: llm
 features:
   - multi-tool-call
   - agent-thought
 model_properties:
   mode: chat
   context_size: 128000
 parameter_rules:
   - name: temperature
     use_template: temperature
   - name: top_p
     use_template: top_p
   - name: presence_penalty
     use_template: presence_penalty
   - name: frequency_penalty
     use_template: frequency_penalty
   - name: max_tokens
     use_template: max_tokens
     default: 512
     min: 1
-    max: 128000
+    max: 4096
   - name: seed
     label:
       zh_Hans: 种子
       en_US: Seed
     type: int
     help:
       zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。
       en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
     required: false
     precision: 2
     min: 0
     max: 1
   - name: response_format
     label:
       zh_Hans: 回复格式
       en_US: response_format
     type: string
     help:
       zh_Hans: 指定模型必须输出的格式
       en_US: specifying the format that the model must output
     required: false
     options:
       - text
       - json_object
 pricing:
   input: '0.01'
   output: '0.03'
   unit: '0.001'
   currency: USD
@@ -4,55 +4,55 @@ label:
  en_US: gpt-4-32k
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 32768
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 32768
  - name: seed
    label:
      zh_Hans: 种子
      en_US: Seed
    type: int
    help:
      zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。
      en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
    required: false
    precision: 2
    min: 0
    max: 1
  - name: response_format
    label:
      zh_Hans: 回复格式
      en_US: response_format
    type: string
    help:
      zh_Hans: 指定模型必须输出的格式
      en_US: specifying the format that the model must output
    required: false
    options:
      - text
      - json_object
pricing:
  input: '0.06'
  output: '0.12'
  unit: '0.001'
  currency: USD
@@ -4,54 +4,54 @@ label:
   en_US: gpt-4-vision-preview
 model_type: llm
 features:
   - vision
 model_properties:
   mode: chat
   context_size: 128000
 parameter_rules:
   - name: temperature
     use_template: temperature
   - name: top_p
     use_template: top_p
   - name: presence_penalty
     use_template: presence_penalty
   - name: frequency_penalty
     use_template: frequency_penalty
   - name: max_tokens
     use_template: max_tokens
     default: 512
     min: 1
-    max: 128000
+    max: 4096
   - name: seed
     label:
       zh_Hans: 种子
       en_US: Seed
     type: int
     help:
       zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。
       en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
     required: false
     precision: 2
     min: 0
     max: 1
   - name: response_format
     label:
       zh_Hans: 回复格式
       en_US: response_format
     type: string
     help:
       zh_Hans: 指定模型必须输出的格式
       en_US: specifying the format that the model must output
     required: false
     options:
       - text
       - json_object
 pricing:
   input: '0.01'
   output: '0.03'
   unit: '0.001'
   currency: USD
@@ -4,55 +4,55 @@ label:
  en_US: gpt-4
model_type: llm
features:
  - multi-tool-call
  - agent-thought
model_properties:
  mode: chat
  context_size: 8192
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: presence_penalty
    use_template: presence_penalty
  - name: frequency_penalty
    use_template: frequency_penalty
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 8192
  - name: seed
    label:
      zh_Hans: 种子
      en_US: Seed
    type: int
    help:
      zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。
      en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
    required: false
    precision: 2
    min: 0
    max: 1
  - name: response_format
    label:
      zh_Hans: 回复格式
      en_US: response_format
    type: string
    help:
      zh_Hans: 指定模型必须输出的格式
      en_US: specifying the format that the model must output
    required: false
    options:
      - text
      - json_object
pricing:
  input: '0.03'
  output: '0.06'
  unit: '0.001'
  currency: USD
@@ -3,26 +3,26 @@ label:
   zh_Hans: text-davinci-003
   en_US: text-davinci-003
 model_type: llm
-features: []
+features: [ ]
 model_properties:
   mode: completion
   context_size: 4096
 parameter_rules:
   - name: temperature
     use_template: temperature
   - name: top_p
     use_template: top_p
   - name: presence_penalty
     use_template: presence_penalty
   - name: frequency_penalty
     use_template: frequency_penalty
   - name: max_tokens
     use_template: max_tokens
     default: 512
     min: 1
     max: 4096
 pricing:
   input: '0.001'
   output: '0.002'
   unit: '0.001'
   currency: USD
@@ -2,4 +2,4 @@ model: text-moderation-stable
model_type: moderation
model_properties:
  max_chunks: 32
  max_characters_per_chunk: 2000
@@ -2,8 +2,8 @@ provider: openai
label:
  en_US: OpenAI
description:
  en_US: Models provided by OpenAI, such as GPT-3.5-Turbo and GPT-4.
  zh_Hans: OpenAI 提供的模型,例如 GPT-3.5-Turbo 和 GPT-4。
icon_small:
  en_US: icon_s_en.svg
icon_large:
@@ -16,13 +16,13 @@ help:
  url:
    en_US: https://platform.openai.com/account/api-keys
supported_model_types:
  - llm
  - text-embedding
  - speech2text
  - moderation
configurate_methods:
  - predefined-model
  - customizable-model
model_credential_schema:
  model:
    label:
@@ -32,57 +32,57 @@ model_credential_schema:
      en_US: Enter your model name
      zh_Hans: 输入模型名称
  credential_form_schemas:
    - variable: openai_api_key
      label:
        en_US: API Key
      type: secret-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 API Key
        en_US: Enter your API Key
    - variable: openai_organization
      label:
        zh_Hans: 组织 ID
        en_US: Organization
      type: text-input
      required: false
      placeholder:
        zh_Hans: 在此输入您的组织 ID
        en_US: Enter your Organization ID
    - variable: openai_api_base
      label:
        zh_Hans: API Base
        en_US: API Base
      type: text-input
      required: false
      placeholder:
        zh_Hans: 在此输入您的 API Base
        en_US: Enter your API Base
provider_credential_schema:
  credential_form_schemas:
    - variable: openai_api_key
      label:
        en_US: API Key
      type: secret-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 API Key
        en_US: Enter your API Key
    - variable: openai_organization
      label:
        zh_Hans: 组织 ID
        en_US: Organization
      type: text-input
      required: false
      placeholder:
        zh_Hans: 在此输入您的组织 ID
        en_US: Enter your Organization ID
    - variable: openai_api_base
      label:
        zh_Hans: API Base
        en_US: API Base
      type: text-input
      required: false
      placeholder:
        zh_Hans: 在此输入您的 API Base
        en_US: Enter your API Base
@@ -2,4 +2,4 @@ model: whisper-1
model_type: speech2text
model_properties:
  file_upload_limit: 25
  supported_file_extensions: mp3,mp4,mpeg,mpga,m4a,wav,webm
@@ -6,4 +6,4 @@ model_properties:
pricing:
  input: '0.0001'
  unit: '0.001'
  currency: USD
@@ -40,87 +40,4 @@ class _CommonOAI_API_Compat:
                 requests.exceptions.ConnectTimeout,  # Timeout
                 requests.exceptions.ReadTimeout  # Timeout
             ]
         }
-
-    def get_customizable_model_schema(self, model: str, credentials: dict) -> AIModelEntity:
-        """
-        generate custom model entities from credentials
-        """
-        model_type = ModelType.LLM if credentials.get('__model_type') == 'llm' else ModelType.TEXT_EMBEDDING
-
-        entity = AIModelEntity(
-            model=model,
-            label=I18nObject(en_US=model),
-            model_type=model_type,
-            fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
-            model_properties={
-                ModelPropertyKey.CONTEXT_SIZE: credentials.get('context_size', 16000),
-                ModelPropertyKey.MAX_CHUNKS: credentials.get('max_chunks', 1),
-            },
-            parameter_rules=[
-                ParameterRule(
-                    name=DefaultParameterName.TEMPERATURE.value,
-                    label=I18nObject(en_US="Temperature"),
-                    type=ParameterType.FLOAT,
-                    default=float(credentials.get('temperature', 1)),
-                    min=0,
-                    max=2
-                ),
-                ParameterRule(
-                    name=DefaultParameterName.TOP_P.value,
-                    label=I18nObject(en_US="Top P"),
-                    type=ParameterType.FLOAT,
-                    default=float(credentials.get('top_p', 1)),
-                    min=0,
-                    max=1
-                ),
-                ParameterRule(
-                    name="top_k",
-                    label=I18nObject(en_US="Top K"),
-                    type=ParameterType.INT,
-                    default=int(credentials.get('top_k', 1)),
-                    min=1,
-                    max=100
-                ),
-                ParameterRule(
-                    name=DefaultParameterName.FREQUENCY_PENALTY.value,
-                    label=I18nObject(en_US="Frequency Penalty"),
-                    type=ParameterType.FLOAT,
-                    default=float(credentials.get('frequency_penalty', 0)),
-                    min=-2,
-                    max=2
-                ),
-                ParameterRule(
-                    name=DefaultParameterName.PRESENCE_PENALTY.value,
-                    label=I18nObject(en_US="PRESENCE Penalty"),
-                    type=ParameterType.FLOAT,
-                    default=float(credentials.get('PRESENCE_penalty', 0)),
-                    min=-2,
-                    max=2
-                ),
-                ParameterRule(
-                    name=DefaultParameterName.MAX_TOKENS.value,
-                    label=I18nObject(en_US="Max Tokens"),
-                    type=ParameterType.INT,
-                    default=1024,
-                    min=1,
-                    max=int(credentials.get('max_tokens_to_sample', 4096)),
-                )
-            ],
-            pricing=PriceConfig(
-                input=Decimal(credentials.get('input_price', 0)),
-                output=Decimal(credentials.get('output_price', 0)),
-                unit=Decimal(credentials.get('unit', 0)),
-                currency=credentials.get('currency', "USD")
-            )
-        )
-
-        if model_type == ModelType.LLM:
-            if credentials['mode'] == 'chat':
-                entity.model_properties[ModelPropertyKey.MODE] = LLMMode.CHAT.value
-            elif credentials['mode'] == 'completion':
-                entity.model_properties[ModelPropertyKey.MODE] = LLMMode.COMPLETION.value
-            else:
-                raise ValueError(f"Unknown completion type {credentials['completion_type']}")
-
-        return entity
@@ -158,7 +158,7 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
             model_type=ModelType.LLM,
             fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
             model_properties={
-                ModelPropertyKey.CONTEXT_SIZE: int(credentials.get('context_size')),
+                ModelPropertyKey.CONTEXT_SIZE: int(credentials.get('context_size', "4096")),
                 ModelPropertyKey.MODE: credentials.get('mode'),
             },
             parameter_rules=[
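The only change in the hunk above is the "4096" fallback. Without it, a credential dict that omits context_size makes credentials.get(...) return None, and int(None) raises TypeError. A tiny illustrative snippet (the credential dict here is hypothetical):

# Why the default matters: int(None) raises, so a missing 'context_size'
# credential would crash schema construction without the fallback.
credentials = {}  # hypothetical credentials with no context_size set

try:
    int(credentials.get('context_size'))
except TypeError as e:
    print(f"fails without a default: {e}")

print(int(credentials.get('context_size', "4096")))  # falls back to 4096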
@@ -168,7 +168,8 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                     type=ParameterType.FLOAT,
                     default=float(credentials.get('temperature', 0.7)),
                     min=0,
-                    max=2
+                    max=2,
+                    precision=2
                 ),
                 ParameterRule(
                     name=DefaultParameterName.TOP_P.value,
@@ -176,7 +177,8 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                     type=ParameterType.FLOAT,
                     default=float(credentials.get('top_p', 1)),
                     min=0,
-                    max=1
+                    max=1,
+                    precision=2
                 ),
                 ParameterRule(
                     name="top_k",
@@ -196,9 +198,9 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                 ),
                 ParameterRule(
                     name=DefaultParameterName.PRESENCE_PENALTY.value,
-                    label=I18nObject(en_US="PRESENCE Penalty"),
+                    label=I18nObject(en_US="Presence Penalty"),
                     type=ParameterType.FLOAT,
-                    default=float(credentials.get('PRESENCE_penalty', 0)),
+                    default=float(credentials.get('presence_penalty', 0)),
                     min=-2,
                     max=2
                 ),
@@ -219,6 +221,13 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                 )
             )
 
+        if credentials['mode'] == 'chat':
+            entity.model_properties[ModelPropertyKey.MODE] = LLMMode.CHAT.value
+        elif credentials['mode'] == 'completion':
+            entity.model_properties[ModelPropertyKey.MODE] = LLMMode.COMPLETION.value
+        else:
+            raise ValueError(f"Unknown completion type {credentials['completion_type']}")
+
         return entity
 
     # validate_credentials method has been rewritten to use the requests library for compatibility with all providers following OpenAI's API standard.
@@ -239,7 +248,8 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
         :return: full response or stream response chunk generator result
         """
         headers = {
-            'Content-Type': 'application/json'
+            'Content-Type': 'application/json',
+            'Accept-Charset': 'utf-8',
         }
 
         api_key = credentials.get('api_key')
@@ -261,7 +271,7 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
         if completion_type is LLMMode.CHAT:
             endpoint_url = urljoin(endpoint_url, 'chat/completions')
             data['messages'] = [self._convert_prompt_message_to_dict(m) for m in prompt_messages]
-        elif completion_type == LLMMode.COMPLETION:
+        elif completion_type is LLMMode.COMPLETION:
             endpoint_url = urljoin(endpoint_url, 'completions')
             data['prompt'] = prompt_messages[0].content
         else:
@@ -291,9 +301,8 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
             stream=stream
         )
 
-        # Debug: Print request headers and json data
-        logger.debug(f"Request headers: {headers}")
-        logger.debug(f"Request JSON data: {data}")
+        if response.encoding is None or response.encoding == 'ISO-8859-1':
+            response.encoding = 'utf-8'
 
         if response.status_code != 200:
             raise InvokeError(f"API request failed with status code {response.status_code}: {response.text}")
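The debug logging above is dropped in favor of an encoding guard: requests falls back to ISO-8859-1 when a text/* response carries no charset parameter, which garbles multi-byte UTF-8 payloads once response.text or decode_unicode is used. A minimal sketch of the same guard, against a placeholder endpoint:

# Sketch of the encoding pitfall the hunk above guards against.
# The URL is a placeholder; any endpoint that omits charset will do.
import requests

response = requests.get('https://example.com/stream')  # hypothetical endpoint
# Without a charset in Content-Type, requests may assume ISO-8859-1,
# so force UTF-8 before reading text out of the response.
if response.encoding is None or response.encoding == 'ISO-8859-1':
    response.encoding = 'utf-8'
print(response.text[:80])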
@@ -337,9 +346,9 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                 )
             )
 
-        for chunk in response.iter_content(chunk_size=2048):
+        for chunk in response.iter_lines(decode_unicode=True, delimiter='\n\n'):
             if chunk:
-                decoded_chunk = chunk.decode('utf-8').strip().lstrip('data: ').lstrip()
+                decoded_chunk = chunk.strip().lstrip('data: ').lstrip()
 
                 chunk_json = None
                 try:
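Switching from iter_content(chunk_size=2048) to iter_lines(decode_unicode=True, delimiter='\n\n') aligns iteration with server-sent-event boundaries (events are separated by blank lines) and removes the manual .decode('utf-8'), since decode_unicode returns str. A self-contained sketch of that parsing loop, with a hypothetical endpoint and payload:

# Minimal sketch of parsing an OpenAI-style SSE stream the way the new
# loop does; endpoint, model name, and payload are illustrative only.
import json
import requests

response = requests.post(
    'https://example.com/v1/chat/completions',  # hypothetical endpoint
    json={'model': 'some-model', 'messages': [], 'stream': True},
    stream=True,
)
response.encoding = 'utf-8'

# Events arrive as 'data: {...}' blocks separated by blank lines, so
# splitting on '\n\n' yields one complete event per iteration.
for chunk in response.iter_lines(decode_unicode=True, delimiter='\n\n'):
    if not chunk:
        continue
    # Mirrors the diff; note lstrip takes a character set, not a prefix,
    # which works here because JSON payloads start with '{'.
    decoded_chunk = chunk.strip().lstrip('data: ').lstrip()
    if decoded_chunk == '[DONE]':
        break
    try:
        event = json.loads(decoded_chunk)
    except json.JSONDecodeError:
        continue  # skip keep-alives and partial fragments
    print(event['choices'][0].get('delta', {}))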
@@ -356,7 +365,7 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                     continue
 
                 choice = chunk_json['choices'][0]
-                chunk_index = choice['index'] if 'index' in choice else chunk_index
+                chunk_index += 1
 
                 if 'delta' in choice:
                     delta = choice['delta']
@@ -408,12 +417,6 @@ class OAIAPICompatLargeLanguageModel(_CommonOAI_API_Compat, LargeLanguageModel):
                     message=assistant_prompt_message,
                 )
             )
-        else:
-            yield create_final_llm_result_chunk(
-                index=chunk_index + 1,
-                message=AssistantPromptMessage(content=""),
-                finish_reason="End of stream."
-            )
 
         chunk_index += 1
@@ -2,76 +2,76 @@ provider: openai_api_compatible
 label:
   en_US: OpenAI-API-compatible
 description:
-  en_US: All model providers compatible with OpenAI's API standard, such as Together.ai.
-  zh_Hans: 兼容 OpenAI API 的模型供应商,例如 Together.ai。
+  en_US: Model providers compatible with OpenAI's API standard, such as LM Studio.
+  zh_Hans: 兼容 OpenAI API 的模型供应商,例如 LM Studio 。
 supported_model_types:
   - llm
   - text-embedding
 configurate_methods:
   - customizable-model
 model_credential_schema:
   model:
     label:
       en_US: Model Name
       zh_Hans: 模型名称
     placeholder:
       en_US: Enter full model name
       zh_Hans: 输入模型全称
   credential_form_schemas:
     - variable: api_key
       label:
         en_US: API Key
       type: secret-input
       required: false
       placeholder:
         zh_Hans: 在此输入您的 API Key
         en_US: Enter your API Key
     - variable: endpoint_url
       label:
         zh_Hans: API endpoint URL
         en_US: API endpoint URL
       type: text-input
       required: true
       placeholder:
         zh_Hans: Base URL, eg. https://api.openai.com/v1
         en_US: Base URL, eg. https://api.openai.com/v1
     - variable: mode
       show_on:
         - variable: __model_type
           value: llm
       label:
         en_US: Completion mode
       type: select
       required: false
       default: chat
       placeholder:
         zh_Hans: 选择对话类型
         en_US: Select completion mode
       options:
         - value: completion
           label:
             en_US: Completion
             zh_Hans: 补全
         - value: chat
           label:
             en_US: Chat
             zh_Hans: 对话
     - variable: context_size
       label:
         zh_Hans: 模型上下文长度
         en_US: Model context size
       required: true
       type: text-input
       default: '4096'
       placeholder:
         zh_Hans: 在此输入您的模型上下文长度
         en_US: Enter your Model context size
     - variable: max_tokens_to_sample
       label:
         zh_Hans: 最大 token 上限
         en_US: Upper bound for max tokens
       show_on:
         - variable: __model_type
           value: llm
       default: '4096'
       type: text-input
@@ -112,7 +112,7 @@ class OAICompatEmbeddingModel(_CommonOAI_API_Compat, TextEmbeddingModel):
            credentials=credentials,
            tokens=used_tokens
        )

        return TextEmbeddingResult(
            embeddings=batched_embeddings,
            usage=usage,
@@ -13,10 +13,10 @@ help:
  url:
    en_US: https://github.com/bentoml/OpenLLM
supported_model_types:
  - llm
  - text-embedding
configurate_methods:
  - customizable-model
model_credential_schema:
  model:
    label:
@@ -26,12 +26,12 @@ model_credential_schema:
      en_US: Enter your model name
      zh_Hans: 输入模型名称
  credential_form_schemas:
    - variable: server_url
      label:
        zh_Hans: 服务器URL
        en_US: Server url
      type: text-input
      required: true
      placeholder:
        zh_Hans: 在此输入OpenLLM的服务器地址,如 https://example.com/xxx
        en_US: Enter the url of your OpenLLM, for example https://example.com/xxx
@@ -13,29 +13,29 @@ help:
  url:
    en_US: https://replicate.com/account/api-tokens
supported_model_types:
  - llm
  - text-embedding
configurate_methods:
  - customizable-model
model_credential_schema:
  model:
    label:
      en_US: Model Name
      zh_Hans: 模型名称
  credential_form_schemas:
    - variable: replicate_api_token
      label:
        en_US: API Key
      type: secret-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 Replicate API Key
        en_US: Enter your Replicate API Key
    - variable: model_version
      label:
        en_US: Model Version
      type: text-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的模型版本
        en_US: Enter your model version
@@ -5,29 +5,29 @@ model_type: llm
model_properties:
  mode: chat
parameter_rules:
  - name: temperature
    use_template: temperature
    default: 0.5
    help:
      zh_Hans: 核采样阈值。用于决定结果随机性,取值越高随机性越强即相同的问题得到的不同答案的可能性越高。
      en_US: Kernel sampling threshold. Used to determine the randomness of the results. The higher the value, the stronger the randomness, that is, the higher the possibility of getting different answers to the same question.
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 4096
    help:
      zh_Hans: 模型回答的tokens的最大长度。
      en_US: 模型回答的tokens的最大长度。
  - name: top_k
    label:
      zh_Hans: 取样数量
      en_US: Top k
    type: int
    default: 4
    min: 1
    max: 6
    help:
      zh_Hans: 从 k 个候选中随机选择⼀个(⾮等概率)。
      en_US: Randomly select one from k candidates (non-equal probability).
    required: false
@@ -6,29 +6,29 @@ model_type: llm
model_properties:
  mode: chat
parameter_rules:
  - name: temperature
    use_template: temperature
    default: 0.5
    help:
      zh_Hans: 核采样阈值。用于决定结果随机性,取值越高随机性越强即相同的问题得到的不同答案的可能性越高。
      en_US: Kernel sampling threshold. Used to determine the randomness of the results. The higher the value, the stronger the randomness, that is, the higher the possibility of getting different answers to the same question.
  - name: max_tokens
    use_template: max_tokens
    default: 2048
    min: 1
    max: 8192
    help:
      zh_Hans: 模型回答的tokens的最大长度。
      en_US: 模型回答的tokens的最大长度。
  - name: top_k
    label:
      zh_Hans: 取样数量
      en_US: Top k
    type: int
    default: 4
    min: 1
    max: 6
    help:
      zh_Hans: 从 k 个候选中随机选择⼀个(⾮等概率)。
      en_US: Randomly select one from k candidates (non-equal probability).
    required: false
@@ -5,29 +5,29 @@ model_type: llm
model_properties:
  mode: chat
parameter_rules:
  - name: temperature
    use_template: temperature
    default: 0.5
    help:
      zh_Hans: 核采样阈值。用于决定结果随机性,取值越高随机性越强即相同的问题得到的不同答案的可能性越高。
      en_US: Kernel sampling threshold. Used to determine the randomness of the results. The higher the value, the stronger the randomness, that is, the higher the possibility of getting different answers to the same question.
  - name: max_tokens
    use_template: max_tokens
    default: 2048
    min: 1
    max: 8192
    help:
      zh_Hans: 模型回答的tokens的最大长度。
      en_US: 模型回答的tokens的最大长度。
  - name: top_k
    label:
      zh_Hans: 取样数量
      en_US: Top k
    type: int
    default: 4
    min: 1
    max: 6
    help:
      zh_Hans: 从 k 个候选中随机选择⼀个(⾮等概率)。
      en_US: Randomly select one from k candidates (non-equal probability).
    required: false
@@ -15,32 +15,32 @@ help:
  url:
    en_US: https://www.xfyun.cn/solutions/xinghuoAPI
supported_model_types:
  - llm
configurate_methods:
  - predefined-model
provider_credential_schema:
  credential_form_schemas:
    - variable: app_id
      label:
        en_US: APPID
      type: text-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 APPID
        en_US: Enter your APPID
    - variable: api_secret
      label:
        en_US: APISecret
      type: secret-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 APISecret
        en_US: Enter your APISecret
    - variable: api_key
      label:
        en_US: APIKey
      type: secret-input
      required: true
      placeholder:
        zh_Hans: 在此输入您的 APIKey
        en_US: Enter your APIKey
Some files were not shown because too many files have changed in this diff.