Compare commits

..

137 Commits

Author SHA1 Message Date
8c1bca3119 fix: eslint run failed 2025-02-14 15:01:02 +08:00
a8982a98f4 chore: update libs 2025-02-14 14:13:44 +08:00
130964d9a7 update eslint.config.mjs 2025-02-14 14:00:59 +08:00
1a8a1a9574 fix: ignore .storybook folder 2025-02-08 17:52:10 +08:00
20bcb49932 fix: ignore rule no-explicit-any 2025-02-08 17:50:35 +08:00
91e411bbaa wip: update eslint config and stash 2025-02-08 15:45:16 +08:00
55ce3618ce fix: Dollar Sign Handling in Markdown (#13178)
Co-authored-by: crazywoola <427733928@qq.com>
2025-02-05 11:00:56 +08:00
e9e34c1ab2 Install apt dependencies using bookworm source, consistent with base image. Remove unnecessary, error-prone pins (#13176) 2025-02-05 10:07:22 +08:00
d4c916b496 chore(pyproject): Add type stubs into pyproject.toml (#13145)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-02-04 12:01:28 +08:00
8fbc9c9342 Solve circular dependency issue between workflow/constants.ts file and default.ts file (#13165) 2025-02-04 09:26:01 +08:00
1b6fd9dfe8 fix: set indexing technique from dataset during update-by-text (#13155) 2025-02-03 11:06:03 +08:00
304467e3f5 fix: not install libmagic raise error (#13146) 2025-02-03 11:05:20 +08:00
7452032d81 add azure openai api version 2024-12-01-preview (#13135) 2025-02-03 11:04:20 +08:00
87e2048f1b nitpick: fix small typos in template.en.mdx (#13156) 2025-02-03 11:03:11 +08:00
d876084392 chore: upgrade libldap2 (#13158) 2025-02-03 11:02:14 +08:00
840729afa5 feat: the think tag display of siliconflow's deepseek r1 (#13153) 2025-02-02 21:55:13 +08:00
941ad03f3c pass model and cost so that langfuse can show cost (#13117) 2025-02-02 15:27:27 +08:00
d73d191f99 feature. add feat to modify metadata via dataset api (#13116) 2025-02-02 15:27:12 +08:00
c2664e0283 chore: fix wrong VectorType match case (#13123) 2025-02-02 15:26:59 +08:00
ee61cede4e test(huggingface_hub): Skip the failed test temporarily. (#13142)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-02-02 14:47:26 +08:00
b47669b80b fix: deduct LLM quota after processing invoke result (#13075)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-02-02 12:05:11 +08:00
c0d0c63592 feat: switch to chat messages before regenerated (#11301)
Co-authored-by: zuodongxu <192560071+zuodongxu@users.noreply.github.com>
2025-01-31 13:05:10 +08:00
b09c39c8dc refactor: avoid to use extra space when finding model by name (#13043) 2025-01-30 15:08:29 +08:00
b4b09ddc3c add tongyi qwen2.5-14b/7b-instruct-1m model (#13089) 2025-01-29 11:58:01 +08:00
d0a21086bd refactor: Update Firecrawl API parameters and default settings (#13082) 2025-01-29 11:21:05 +08:00
d44882c1b5 refactor: reduce duplciate code by inheritance (#13073) 2025-01-28 10:52:01 +08:00
23c68efa2d fix: fix the formatter is not applied on log file (#12704) 2025-01-28 10:49:58 +08:00
560c5de1b7 Fixed Novita AI color and added DeepSeek R1 model (#13074) 2025-01-28 10:38:54 +08:00
5d91dbd000 Set default LOG_LEVEL to INFO for celery workers and beat (#13066)
Co-authored-by: Abdullah AlOsaimi <189027247+osaimi@users.noreply.github.com>
2025-01-27 17:09:41 +08:00
6c31ee36cd fix qwen-vl blocking mode (#13052) 2025-01-27 11:35:23 +08:00
edc29780ed fix: "Model schema not found" error only in agents (#12655) (#12760) 2025-01-27 11:33:13 +08:00
aad7e4dd1c fix:Improve MIME type detection for remote URL uploads using python-magic (#12693) 2025-01-27 11:33:03 +08:00
a6a727e8a4 feat: add inner API to create workspace without requiring email (#13021) 2025-01-26 15:36:56 +08:00
d1fc65fabc fix: adjust iteration node dark style (#13051) 2025-01-26 11:19:41 +08:00
d4be5ef9de Update Novita AI predefined models (#13045) 2025-01-26 09:25:29 +08:00
1374be5a31 fix: Unexpected tag creation when pressing enter during tag conversion (#13041) 2025-01-25 19:30:26 +08:00
b2bbc28580 support bedrock kb: retrieve and generate (#13027) 2025-01-25 17:28:06 +08:00
59b3e672aa feat: add agent thinking content display of deepseek R1 (#12949) 2025-01-24 20:13:42 +08:00
a2f8bce8f5 chore: add Japanese translation: model_providers/bedrock (#13016) 2025-01-24 18:43:33 +08:00
a2b9adb3a2 Change typo in translation (#13004) 2025-01-24 13:48:21 +08:00
28067640b5 fix: wrong zh_Hans translation: Ohio (#13006) 2025-01-24 13:41:20 +08:00
da67916843 feat: add glm-4-air-0111 (#12997)
Co-authored-by: lowell <lowell.hu@zkteco.in>
2025-01-24 10:04:46 +08:00
e54ce479ad Feat/prompt editor dark theme (#12976) 2025-01-23 16:20:00 +08:00
6024d8a42d refactor: Update Firecrawl to use v1 API (#12574)
Co-authored-by: Ademílson Tonato <ademilson.tonato@refurbed.com>
2025-01-23 11:14:48 +08:00
f565f08aa0 fix: get property of string type variable caused page crash (#12969) 2025-01-23 11:02:29 +08:00
fd4afe09f8 fix: tools translate search (#12950)
Co-authored-by: lowell <lowell.hu@zkteco.in>
2025-01-22 19:27:02 +08:00
dd0904f95c feat: add giteeAI risk control identification. (#12946) 2025-01-22 19:26:25 +08:00
4c3076f2a4 feat: add pg vector index (#12338)
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-01-22 17:07:18 +08:00
1e73f63ff8 chore: update version to 0.15.2 in packaging and docker configurations (#12940)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-22 16:40:44 +08:00
d167d5b1be feat(ark): support doubao 1.5 series of models (#12935) 2025-01-22 15:25:57 +08:00
71fa14f791 fix: resolve clipboard.writeText failure under HTTP protocol (#12936) 2025-01-22 15:18:23 +08:00
8dd1873e76 feat: workflow note dark theme (#12932) 2025-01-22 14:22:33 +08:00
f91f5c7401 fix(batch_create_segment_to_index_task): count max_position in memory. (#12929) 2025-01-22 13:39:02 +08:00
c62b7cc679 chore(build): bump poetry from 1.x to 2.x (#12369) 2025-01-22 13:38:24 +08:00
3ee213ddca add milvus full text search setting (#12930) 2025-01-22 13:36:39 +08:00
8429877b02 fix: Agent is configured for ReAct inference mode, an error is reported when viewing the agent log (#12920)
Co-authored-by: crazywoola <427733928@qq.com>
2025-01-22 13:20:32 +08:00
05a0faff6a fix: app token's last_used_at can't be updated when last_used_at is null (#12770) 2025-01-22 11:01:45 +08:00
e09f6e4987 feat: support config chunk length by env (#12925) 2025-01-22 10:43:40 +08:00
e23f4b0265 feat: add gemini-2.0-flash-thinking-exp-01-21 (#12924) 2025-01-22 10:14:37 +08:00
f582d4a13e feat: Add ability to change profile avatar (#12642) 2025-01-22 10:11:31 +08:00
2f41bd495d fix:Fix a bug that returns null when the passed path is a file. (#12775)
Co-authored-by: 刘江波 <jiangbo721@163.com>
2025-01-22 10:10:03 +08:00
162a8c4393 fix update segment keyword with same content (#12908) 2025-01-21 19:19:32 +08:00
3d1ce4c53f bug: fixed bedrock rerank bug (#12774)
Co-authored-by: hobo.l <hobo.l@binance.com>
2025-01-21 19:09:36 +08:00
6db3ae9b8e chore: remove webapp ga (#12909) 2025-01-21 18:38:33 +08:00
6d0cb9dc33 fix: variable panel scrollable (#12769)
Co-authored-by: zhaoqingyu.1075 <zhaoqingyu.1075@bytedance.com>
2025-01-21 17:50:42 +08:00
46e95e8309 fix: OpenAI o1 Bad Request Error (#12839) 2025-01-21 15:29:13 +08:00
a7b9375877 Update deepseek model configuration (#12899) 2025-01-21 15:28:11 +08:00
0c6a8a130e fix: external dataset hit test display issue(#12564) (#12612)
Co-authored-by: zhuxinliang <zhuxinliang@didiglobal.com>
2025-01-21 14:31:45 +08:00
9903f1e703 add deepseek-reasoner (#12898) 2025-01-21 12:40:58 +08:00
6fad719e42 chore(fix): Invalid quotes for using Array[String] in HTTP request node as JSON body (#12761) 2025-01-21 10:38:44 +08:00
9aaee8ee47 fix: Issues related to the deletion of conversation_id (#12488) (#12665) 2025-01-21 10:25:35 +08:00
166221d784 chore(lint): fix quotes for f-string formatting by bumping ruff to 0.9.x (#12702) 2025-01-21 10:12:29 +08:00
925d69a2ee feat:Support Minimax-Text-01 (#12763) 2025-01-21 10:08:53 +08:00
5ff08e241a fix: serply credential check query might return empty records (#12784) 2025-01-21 09:38:56 +08:00
3defd24087 feat: allow updating chunk settings for the existing documents (#12833) 2025-01-21 09:25:40 +08:00
9d86147d20 fix: SparkLite API Auth error (#12781) (#12790) 2025-01-20 22:21:21 +08:00
80801ac4ab fix: "parmas" spelling mistake. (#12875) 2025-01-20 22:18:30 +08:00
210926cd91 Fix suggested_question_prompt (#12738) 2025-01-20 22:16:30 +08:00
677a69deed fix(i18n): correct typo in zh-Hant translation (#12852) 2025-01-20 22:15:41 +08:00
8dfdee21ce chore: fix chinese translation for 'recall' (#12772)
Co-authored-by: zhaoqingyu.1075 <zhaoqingyu.1075@bytedance.com>
2025-01-20 22:15:26 +08:00
6ea77ab4cd fix: DeepSeek API Error with response format active (text and json_object) (#12747) 2025-01-20 22:04:18 +08:00
e3c996688d feat: enhance credential extraction logic based on configurate method (#12853) 2025-01-20 21:59:22 +08:00
bc3a570dda fix: Fix rerank model switching issue (#12721)
ok
2025-01-14 15:42:45 +08:00
0800021a2d chore: translate i18n files (#12708)
Co-authored-by: JzoNgKVO <27049666+JzoNgKVO@users.noreply.github.com>
2025-01-14 13:35:23 +08:00
435eddd867 Feat: copyright modification (#12707) 2025-01-14 10:00:57 +08:00
6e0fb055d1 chore: bump version to 0.15.1 (#12690)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-13 19:21:06 +08:00
1e9ac7ffeb feat: add table of contents to Knowledge API doc (#12688) 2025-01-13 18:31:43 +08:00
b4873ecb43 [fix] support feature restore (#12563) 2025-01-13 18:29:06 +08:00
1859d57784 api tool support multiple env url (#12249)
Co-authored-by: mabo <mabo@aeyes.ai>
2025-01-13 17:49:30 +08:00
69d58fbb50 Add new integration with Opik Tracking tool (#11501) 2025-01-13 17:41:44 +08:00
cb34991663 fix: add type hints for App model and improve error handling in audio services (#12677)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-13 15:55:16 +08:00
c700364e1c fix: Update variable handling in VariableAssignerNode and clean up app_dsl_service (#12672)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-13 15:54:26 +08:00
9a6b1dc3a1 Revert "Feat/new saas billing" (#12673) 2025-01-13 15:17:43 +08:00
54b5b80a07 fix(workflow): fix answer node stream processing in conditional branches (#12510) 2025-01-13 14:54:21 +08:00
831459b895 fix: ruff with statements (#12578)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2025-01-13 09:55:55 +08:00
4e101604c3 fix: ruff check for True if ... else (#12576)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-01-13 09:38:48 +08:00
a6455269f0 chore: Adjust translations to align with Taiwanese Mandarin conventions (#12633) 2025-01-13 09:12:43 +08:00
cd257b91c5 Fix pandas indexing method for knowledge base imports (#12637) (#12638)
Co-authored-by: CN-P5 <heibai2006@qq.com>
2025-01-13 09:06:59 +08:00
d8f57bf899 Feat/new saas billing (#12591) 2025-01-12 14:50:46 +08:00
989fb11fd7 improve the readability of the function generate_api_key (#12552) 2025-01-09 21:30:17 +08:00
140965b738 chore: translate i18n files (#12543)
Co-authored-by: WTW0313 <30284043+WTW0313@users.noreply.github.com>
2025-01-09 20:30:06 +08:00
14ee51aead Feat/add knowledge include all filter (#12537) 2025-01-09 20:21:25 +08:00
2e97ba5700 fix: Add datasets list access control and fix datasets config display issue (#12533)
Co-authored-by: nite-knite <nkCoding@gmail.com>
2025-01-09 17:44:11 +08:00
f549d53b68 fix: sum costs return error value on overview page (#12534) 2025-01-09 16:04:14 +08:00
a085ad4719 feat: show workflow running status (#12531) 2025-01-09 15:36:13 +08:00
f230a9232e fix: Parsing OpenAPI spec for external tools (#12518) (#12530) 2025-01-09 15:30:43 +08:00
e84bf35e2a fix: same chunk insert deadlock (#12502)
Co-authored-by: huangzhuo <huangzhuo1@xiaomi.com>
2025-01-09 15:16:41 +08:00
20f090537f feat: add GET upload file API endpoint to dataset service api (#11899) 2025-01-09 14:52:09 +08:00
dbe7a7c4fd Fix: Add a INFO-level log when fallback to gpt2tokenizer (#12508) 2025-01-09 14:37:46 +08:00
b7a4e3903e fix: add last_refresh_time to track the validity of is_other_tab_refreshing (#12517) 2025-01-09 10:40:45 +08:00
b4c1c2f731 fix: Reverse sync docker-compose-template.yaml (#12509) 2025-01-09 10:21:22 +08:00
1b940e7daa feat: add ci job to test template for docker compose (#12514) 2025-01-09 00:04:58 +08:00
f4ee50a7ad chore: improve app doc (#12490) 2025-01-08 18:37:12 +08:00
bee32d960a fix #12453 #12482 (#12495) 2025-01-08 18:26:05 +08:00
040a3b782c FEAT: support milvus to full text search (#11430)
Signed-off-by: YoungLH <974840768@qq.com>
2025-01-08 17:39:53 +08:00
d649037c3e feat: support single run doc extractor node (#11318) 2025-01-08 15:20:15 +08:00
0a49d3dd52 fix: tiktoken cannot be loaded without internet (#12478)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-08 14:49:44 +08:00
53bb37b749 fix: fix the incorrect plaintext file key when saving (#10429) 2025-01-08 12:52:45 +08:00
d2586278d6 Feat elasticsearch japanese (#12194) 2025-01-08 12:35:41 +08:00
6635c393e9 fix: adjust opacity for model selector based on readonly state (#12472) 2025-01-08 12:11:45 +08:00
6222179a57 Revert "fix:deepseek tool call not working correctly" (#12463) 2025-01-08 10:50:34 +08:00
05bda6f38d add tidb on qdrant redis lock (#12462) 2025-01-08 08:55:44 +08:00
4295cefeb1 fix: allow fallback to remote_url when url is not provided (#12455) 2025-01-07 22:33:25 +08:00
67228c9b26 fix: url with variable not work (#12452) 2025-01-07 21:55:51 +08:00
fd2bfff023 remove knowledge admin role (#12450) 2025-01-07 21:30:23 +08:00
4e6c86341d Add 'document' feature to Sonnet 3.5 through OpenRouter (#12444) 2025-01-07 19:51:38 +08:00
2a14c67edc Fix #12448 - update bedrock retrieve tool, support hybrid search type and re… (#12446)
Co-authored-by: Yuanbo Li <ybalbert@amazon.com>
2025-01-07 19:51:23 +08:00
c236f05f4b chore: bump version to 0.15.0 (#12297)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-07 18:05:14 +08:00
0eeacdc80c refactor: enhance API token validation with session locking and last used timestamp update (#12426)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-07 18:04:41 +08:00
41f39bf3fc Fix newline characters in tables during document parsing (#12112)
Co-authored-by: hisir <admin@qq.com>
2025-01-07 17:26:24 +08:00
9677144015 fix:deepseek tool call not working correctly (#12437) 2025-01-07 17:25:38 +08:00
15797c556f add fish-speech-1.5 from siliconflow (#12425) 2025-01-07 15:27:34 +08:00
acacf35a2a chore(docker/.env.example): Add TOP_K_MAX_VALUE to the .env.example… (#12422)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-07 14:51:16 +08:00
d3f5b1cbb6 refactor: use tiktoken for token calculation (#12416)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2025-01-07 13:32:30 +08:00
196ed8101b fix: [PromptEditorHeightResizeWrap] Bug #12410 (#12406) 2025-01-07 12:21:54 +08:00
dc650c5368 Fixes #12414: Add cheaper model and long context model for Qwen2.5-72B-Instruct from siliconflow (#12415) 2025-01-07 11:28:24 +08:00
2bb521b135 Support TTS and Speech2Text for Model Provider GPUStack (#12381) 2025-01-07 09:42:11 +08:00
439 changed files with 14096 additions and 7035 deletions

View File

@@ -8,7 +8,7 @@ inputs:
poetry-version:
description: Poetry version to set up
required: true
default: '1.8.4'
default: '2.0.1'
poetry-lockfile:
description: Path to the Poetry lockfile to restore cache from
required: true

View File

@@ -42,25 +42,23 @@ jobs:
run: poetry install -C api --with dev
- name: Check dependencies in pyproject.toml
run: poetry run -C api bash dev/pytest/pytest_artifacts.sh
run: poetry run -P api bash dev/pytest/pytest_artifacts.sh
- name: Run Unit tests
run: poetry run -C api bash dev/pytest/pytest_unit_tests.sh
run: poetry run -P api bash dev/pytest/pytest_unit_tests.sh
- name: Run ModelRuntime
run: poetry run -C api bash dev/pytest/pytest_model_runtime.sh
run: poetry run -P api bash dev/pytest/pytest_model_runtime.sh
- name: Run dify config tests
run: poetry run -C api python dev/pytest/pytest_config_tests.py
run: poetry run -P api python dev/pytest/pytest_config_tests.py
- name: Run Tool
run: poetry run -C api bash dev/pytest/pytest_tools.sh
run: poetry run -P api bash dev/pytest/pytest_tools.sh
- name: Run mypy
run: |
pushd api
poetry run python -m mypy --install-types --non-interactive .
popd
poetry run -C api python -m mypy --install-types --non-interactive .
- name: Set up dotenvs
run: |
@@ -80,4 +78,4 @@ jobs:
ssrf_proxy
- name: Run Workflow
run: poetry run -C api bash dev/pytest/pytest_workflow.sh
run: poetry run -P api bash dev/pytest/pytest_workflow.sh

View File

@@ -38,12 +38,12 @@ jobs:
if: steps.changed-files.outputs.any_changed == 'true'
run: |
poetry run -C api ruff --version
poetry run -C api ruff check ./api
poetry run -C api ruff format --check ./api
poetry run -C api ruff check ./
poetry run -C api ruff format --check ./
- name: Dotenv check
if: steps.changed-files.outputs.any_changed == 'true'
run: poetry run -C api dotenv-linter ./api/.env.example ./web/.env.example
run: poetry run -P api dotenv-linter ./api/.env.example ./web/.env.example
- name: Lint hints
if: failure()
@@ -82,6 +82,33 @@ jobs:
if: steps.changed-files.outputs.any_changed == 'true'
run: yarn run lint
docker-compose-template:
name: Docker Compose Template
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Check changed files
id: changed-files
uses: tj-actions/changed-files@v45
with:
files: |
docker/generate_docker_compose
docker/.env.example
docker/docker-compose-template.yaml
docker/docker-compose.yaml
- name: Generate Docker Compose
if: steps.changed-files.outputs.any_changed == 'true'
run: |
cd docker
./generate_docker_compose
- name: Check for changes
if: steps.changed-files.outputs.any_changed == 'true'
run: git diff --exit-code
superlinter:
name: SuperLinter

View File

@@ -70,4 +70,4 @@ jobs:
tidb
- name: Test Vector Stores
run: poetry run -C api bash dev/pytest/pytest_vdb.sh
run: poetry run -P api bash dev/pytest/pytest_vdb.sh

View File

@@ -53,10 +53,12 @@ ignore = [
"FURB152", # math-constant
"UP007", # non-pep604-annotation
"UP032", # f-string
"UP045", # non-pep604-annotation-optional
"B005", # strip-with-multi-characters
"B006", # mutable-argument-default
"B007", # unused-loop-control-variable
"B026", # star-arg-unpacking-after-keyword-arg
"B903", # class-as-data-structure
"B904", # raise-without-from-inside-except
"B905", # zip-without-explicit-strict
"N806", # non-lowercase-variable-in-function

View File

@@ -4,7 +4,7 @@ FROM python:3.12-slim-bookworm AS base
WORKDIR /app/api
# Install Poetry
ENV POETRY_VERSION=1.8.4
ENV POETRY_VERSION=2.0.1
# if you located in China, you can use aliyun mirror to speed up
# RUN pip install --no-cache-dir poetry==${POETRY_VERSION} -i https://mirrors.aliyun.com/pypi/simple/
@@ -52,12 +52,14 @@ RUN apt-get update \
&& apt-get install -y --no-install-recommends curl nodejs libgmp-dev libmpfr-dev libmpc-dev \
# if you located in China, you can use aliyun mirror to speed up
# && echo "deb http://mirrors.aliyun.com/debian testing main" > /etc/apt/sources.list \
&& echo "deb http://deb.debian.org/debian testing main" > /etc/apt/sources.list \
&& echo "deb http://deb.debian.org/debian bookworm main" > /etc/apt/sources.list \
&& apt-get update \
# For Security
&& apt-get install -y --no-install-recommends expat=2.6.4-1 libldap-2.5-0=2.5.19+dfsg-1 perl=5.40.0-8 libsqlite3-0=3.46.1-1 zlib1g=1:1.3.dfsg+really1.3.1-1+b1 \
&& apt-get install -y --no-install-recommends expat libldap-2.5-0 perl libsqlite3-0 zlib1g \
# install a chinese font to support the use of tools like matplotlib
&& apt-get install -y fonts-noto-cjk \
# install libmagic to support the use of python-magic guess MIMETYPE
&& apt-get install -y libmagic1 \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/*

View File

@@ -79,5 +79,5 @@
2. Run the tests locally with mocked system environment variables in `tool.pytest_env` section in `pyproject.toml`
```bash
poetry run -C api bash dev/pytest/pytest_all_tests.sh
poetry run -P api bash dev/pytest/pytest_all_tests.sh
```

View File

@@ -146,7 +146,7 @@ class EndpointConfig(BaseSettings):
)
CONSOLE_WEB_URL: str = Field(
description="Base URL for the console web interface," "used for frontend references and CORS configuration",
description="Base URL for the console web interface,used for frontend references and CORS configuration",
default="",
)
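Note: this hunk (and several like it below) removes an accidental implicit string concatenation. A quick illustration of the pitfall, since adjacent literals merge at parse time without any joining space:

```python
# Adjacent string literals merge at parse time; the missing space in the
# original two-literal form is easy to overlook across a line break.
description = (
    "Base URL for the console web interface,"
    "used for frontend references and CORS configuration"
)
assert "interface,used" in description  # no space was ever inserted
```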

View File

@@ -181,7 +181,7 @@ class HostedFetchAppTemplateConfig(BaseSettings):
"""
HOSTED_FETCH_APP_TEMPLATES_MODE: str = Field(
description="Mode for fetching app templates: remote, db, or builtin" " default to remote,",
description="Mode for fetching app templates: remote, db, or builtin default to remote,",
default="remote",
)

View File

@@ -33,3 +33,9 @@ class MilvusConfig(BaseSettings):
description="Name of the Milvus database to connect to (default is 'default')",
default="default",
)
MILVUS_ENABLE_HYBRID_SEARCH: bool = Field(
description="Enable hybrid search features (requires Milvus >= 2.5.0). Set to false for compatibility with "
"older versions",
default=True,
)
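A minimal sketch of how the new flag is read, assuming the pydantic-settings pattern the surrounding config classes use; the environment variable name matches the field name:

```python
# Sketch: pydantic-settings reads MILVUS_ENABLE_HYBRID_SEARCH from the
# environment (or a .env file) and falls back to the default of True.
from pydantic import Field
from pydantic_settings import BaseSettings


class MilvusConfig(BaseSettings):
    MILVUS_ENABLE_HYBRID_SEARCH: bool = Field(
        description="Enable hybrid search features (requires Milvus >= 2.5.0).",
        default=True,
    )


print(MilvusConfig().MILVUS_ENABLE_HYBRID_SEARCH)  # True unless overridden
```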

View File

@@ -9,7 +9,7 @@ class PackagingInfo(BaseSettings):
CURRENT_VERSION: str = Field(
description="Dify version",
default="0.14.2",
default="0.15.2",
)
COMMIT_SHA: str = Field(

View File

@@ -1,12 +1,32 @@
import mimetypes
import os
import platform
import re
import urllib.parse
import warnings
from collections.abc import Mapping
from typing import Any
from uuid import uuid4
import httpx
try:
import magic
except ImportError:
if platform.system() == "Windows":
warnings.warn(
"To use python-magic guess MIMETYPE, you need to run `pip install python-magic-bin`", stacklevel=2
)
elif platform.system() == "Darwin":
warnings.warn("To use python-magic guess MIMETYPE, you need to run `brew install libmagic`", stacklevel=2)
elif platform.system() == "Linux":
warnings.warn(
"To use python-magic guess MIMETYPE, you need to run `sudo apt-get install libmagic1`", stacklevel=2
)
else:
warnings.warn("To use python-magic guess MIMETYPE, you need to install `libmagic`", stacklevel=2)
magic = None # type: ignore
from pydantic import BaseModel
from configs import dify_config
@@ -47,6 +67,13 @@ def guess_file_info_from_response(response: httpx.Response):
# If guessing fails, use Content-Type from response headers
mimetype = response.headers.get("Content-Type", "application/octet-stream")
# Use python-magic to guess MIME type if still unknown or generic
if mimetype == "application/octet-stream" and magic is not None:
try:
mimetype = magic.from_buffer(response.content[:1024], mime=True)
except magic.MagicException:
pass
extension = os.path.splitext(filename)[1]
# Ensure filename has an extension
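A self-contained sketch of the fallback this hunk adds, assuming python-magic with a native libmagic; `sniff_mimetype` is an illustrative name, not a function from the diff:

```python
import warnings

try:
    import magic  # needs libmagic (e.g. `apt-get install libmagic1`)
except ImportError:
    warnings.warn("libmagic not available; MIME sniffing disabled", stacklevel=2)
    magic = None  # type: ignore


def sniff_mimetype(payload: bytes, declared: str = "application/octet-stream") -> str:
    """Return a sniffed MIME type, keeping the declared one on failure."""
    if declared == "application/octet-stream" and magic is not None:
        try:
            # The first KiB is enough for magic-number detection.
            return magic.from_buffer(payload[:1024], mime=True)
        except magic.MagicException:
            pass
    return declared


print(sniff_mimetype(b"%PDF-1.7 ..."))  # "application/pdf" when libmagic is present
```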

View File

@@ -56,7 +56,7 @@ class InsertExploreAppListApi(Resource):
app = App.query.filter(App.id == args["app_id"]).first()
if not app:
raise NotFound(f'App \'{args["app_id"]}\' is not found')
raise NotFound(f"App '{args['app_id']}' is not found")
site = app.site
if not site:

View File

@@ -22,7 +22,7 @@ from controllers.console.wraps import account_initialization_required, setup_req
from core.errors.error import ModelCurrentlyNotSupportError, ProviderTokenNotInitError, QuotaExceededError
from core.model_runtime.errors.invoke import InvokeError
from libs.login import login_required
from models.model import AppMode
from models import App, AppMode
from services.audio_service import AudioService
from services.errors.audio import (
AudioTooLargeServiceError,
@@ -79,7 +79,7 @@ class ChatMessageTextApi(Resource):
@login_required
@account_initialization_required
@get_app_model
def post(self, app_model):
def post(self, app_model: App):
from werkzeug.exceptions import InternalServerError
try:
@@ -98,9 +98,13 @@
and app_model.workflow.features_dict
):
text_to_speech = app_model.workflow.features_dict.get("text_to_speech")
if text_to_speech is None:
raise ValueError("TTS is not enabled")
voice = args.get("voice") or text_to_speech.get("voice")
else:
try:
if app_model.app_model_config is None:
raise ValueError("AppModelConfig not found")
voice = args.get("voice") or app_model.app_model_config.text_to_speech_dict.get("voice")
except Exception:
voice = None

View File

@@ -52,12 +52,12 @@ class DatasetListApi(Resource):
# provider = request.args.get("provider", default="vendor")
search = request.args.get("keyword", default=None, type=str)
tag_ids = request.args.getlist("tag_ids")
include_all = request.args.get("include_all", default="false").lower() == "true"
if ids:
datasets, total = DatasetService.get_datasets_by_ids(ids, current_user.current_tenant_id)
else:
datasets, total = DatasetService.get_datasets(
page, limit, current_user.current_tenant_id, current_user, search, tag_ids
page, limit, current_user.current_tenant_id, current_user, search, tag_ids, include_all
)
# check embedding setting
@@ -457,7 +457,7 @@ class DatasetIndexingEstimateApi(Resource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider " "in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -619,9 +619,7 @@ class DatasetRetrievalSettingApi(Resource):
vector_type = dify_config.VECTOR_STORE
match vector_type:
case (
VectorType.MILVUS
| VectorType.RELYT
| VectorType.PGVECTOR
VectorType.RELYT
| VectorType.TIDB_VECTOR
| VectorType.CHROMA
| VectorType.TENCENT
@@ -640,10 +638,12 @@
| VectorType.MYSCALE
| VectorType.ORACLE
| VectorType.ELASTICSEARCH
| VectorType.ELASTICSEARCH_JA
| VectorType.PGVECTOR
| VectorType.TIDB_ON_QDRANT
| VectorType.LINDORM
| VectorType.COUCHBASE
| VectorType.MILVUS
):
return {
"retrieval_method": [
@@ -683,6 +683,7 @@ class DatasetRetrievalSettingMockApi(Resource):
| VectorType.MYSCALE
| VectorType.ORACLE
| VectorType.ELASTICSEARCH
| VectorType.ELASTICSEARCH_JA
| VectorType.COUCHBASE
| VectorType.PGVECTOR
| VectorType.LINDORM

View File

@@ -257,7 +257,8 @@ class DatasetDocumentListApi(Resource):
parser.add_argument("original_document_id", type=str, required=False, location="json")
parser.add_argument("doc_form", type=str, default="text_model", required=False, nullable=False, location="json")
parser.add_argument("retrieval_model", type=dict, required=False, nullable=False, location="json")
parser.add_argument("embedding_model", type=str, required=False, nullable=True, location="json")
parser.add_argument("embedding_model_provider", type=str, required=False, nullable=True, location="json")
parser.add_argument(
"doc_language", type=str, default="English", required=False, nullable=False, location="json"
)
@@ -349,8 +350,7 @@ class DatasetInitApi(Resource):
)
except InvokeAuthorizationError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -525,8 +525,7 @@ class DocumentBatchIndexingEstimateApi(DocumentResource):
return response.model_dump(), 200
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)

View File

@@ -168,8 +168,7 @@ class DatasetDocumentSegmentApi(Resource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -217,8 +216,7 @@ class DatasetDocumentSegmentAddApi(Resource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -267,8 +265,7 @@ class DatasetDocumentSegmentUpdateApi(Resource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -368,9 +365,9 @@ class DatasetDocumentSegmentBatchImportApi(Resource):
result = []
for index, row in df.iterrows():
if document.doc_form == "qa_model":
data = {"content": row[0], "answer": row[1]}
data = {"content": row.iloc[0], "answer": row.iloc[1]}
else:
data = {"content": row[0]}
data = {"content": row.iloc[0]}
result.append(data)
if len(result) == 0:
raise ValueError("The CSV file is empty.")
@@ -437,8 +434,7 @@ class ChildChunkAddApi(Resource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
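The batch-import hunk above switches `row[0]` to `row.iloc[0]`: plain indexing on a pandas Series is label-based, so positional access must go through `.iloc`. A small illustration:

```python
import io

import pandas as pd

df = pd.read_csv(io.StringIO("content,answer\nQ1,A1\n"))
for _, row in df.iterrows():
    # row[0] looks up a column *labeled* 0 (KeyError here); .iloc is positional.
    data = {"content": row.iloc[0], "answer": row.iloc[1]}
    print(data)
```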

View File

@@ -32,7 +32,7 @@ class ConversationListApi(InstalledAppResource):
pinned = None
if "pinned" in args and args["pinned"] is not None:
pinned = True if args["pinned"] == "true" else False
pinned = args["pinned"] == "true"
try:
with Session(db.engine) as session:

View File

@@ -50,7 +50,7 @@ class MessageListApi(InstalledAppResource):
try:
return MessageService.pagination_by_first_id(
app_model, current_user, args["conversation_id"], args["first_id"], args["limit"], "desc"
app_model, current_user, args["conversation_id"], args["first_id"], args["limit"]
)
except services.errors.conversation.ConversationNotExistsError:
raise NotFound("Conversation Not Exists.")

View File

@@ -1,3 +1,5 @@
import json
from flask_restful import Resource, reqparse # type: ignore
from controllers.console.wraps import setup_required
@@ -29,4 +31,34 @@ class EnterpriseWorkspace(Resource):
return {"message": "enterprise workspace created."}
class EnterpriseWorkspaceNoOwnerEmail(Resource):
@setup_required
@inner_api_only
def post(self):
parser = reqparse.RequestParser()
parser.add_argument("name", type=str, required=True, location="json")
args = parser.parse_args()
tenant = TenantService.create_tenant(args["name"], is_from_dashboard=True)
tenant_was_created.send(tenant)
resp = {
"id": tenant.id,
"name": tenant.name,
"encrypt_public_key": tenant.encrypt_public_key,
"plan": tenant.plan,
"status": tenant.status,
"custom_config": json.loads(tenant.custom_config) if tenant.custom_config else {},
"created_at": tenant.created_at.isoformat() if tenant.created_at else None,
"updated_at": tenant.updated_at.isoformat() if tenant.updated_at else None,
}
return {
"message": "enterprise workspace created.",
"tenant": resp,
}
api.add_resource(EnterpriseWorkspace, "/enterprise/workspace")
api.add_resource(EnterpriseWorkspaceNoOwnerEmail, "/enterprise/workspace/ownerless")
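A hypothetical client call for the new ownerless-workspace endpoint; only the relative route appears in this diff, so the host, the inner-API mount prefix, and the auth header below are assumptions:

```python
import requests

resp = requests.post(
    "https://dify.example.com/inner/api/enterprise/workspace/ownerless",  # prefix assumed
    headers={"X-Inner-Api-Key": "<inner-api-key>"},  # auth scheme assumed
    json={"name": "acme-workspace"},
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["tenant"]["id"])
```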

View File

@@ -7,4 +7,4 @@ api = ExternalApi(bp)
from . import index
from .app import app, audio, completion, conversation, file, message, workflow
from .dataset import dataset, document, hit_testing, segment
from .dataset import dataset, document, hit_testing, segment, upload_file

View File

@@ -31,8 +31,11 @@ class DatasetListApi(DatasetApiResource):
# provider = request.args.get("provider", default="vendor")
search = request.args.get("keyword", default=None, type=str)
tag_ids = request.args.getlist("tag_ids")
include_all = request.args.get("include_all", default="false").lower() == "true"
datasets, total = DatasetService.get_datasets(page, limit, tenant_id, current_user, search, tag_ids)
datasets, total = DatasetService.get_datasets(
page, limit, tenant_id, current_user, search, tag_ids, include_all
)
# check embedding setting
provider_manager = ProviderManager()
configurations = provider_manager.get_configurations(tenant_id=current_user.current_tenant_id)

View File

@@ -18,6 +18,7 @@ from controllers.service_api.app.error import (
from controllers.service_api.dataset.error import (
ArchivedDocumentImmutableError,
DocumentIndexingError,
InvalidMetadataError,
)
from controllers.service_api.wraps import DatasetApiResource, cloud_edition_billing_resource_check
from core.errors.error import ProviderTokenNotInitError
@@ -50,6 +51,9 @@ class DocumentAddByTextApi(DatasetApiResource):
"indexing_technique", type=str, choices=Dataset.INDEXING_TECHNIQUE_LIST, nullable=False, location="json"
)
parser.add_argument("retrieval_model", type=dict, required=False, nullable=False, location="json")
parser.add_argument("doc_type", type=str, required=False, nullable=True, location="json")
parser.add_argument("doc_metadata", type=dict, required=False, nullable=True, location="json")
args = parser.parse_args()
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)
@@ -61,6 +65,28 @@
if not dataset.indexing_technique and not args["indexing_technique"]:
raise ValueError("indexing_technique is required.")
# Validate metadata if provided
if args.get("doc_type") or args.get("doc_metadata"):
if not args.get("doc_type") or not args.get("doc_metadata"):
raise InvalidMetadataError("Both doc_type and doc_metadata must be provided when adding metadata")
if args["doc_type"] not in DocumentService.DOCUMENT_METADATA_SCHEMA:
raise InvalidMetadataError(
"Invalid doc_type. Must be one of: " + ", ".join(DocumentService.DOCUMENT_METADATA_SCHEMA.keys())
)
if not isinstance(args["doc_metadata"], dict):
raise InvalidMetadataError("doc_metadata must be a dictionary")
# Validate metadata schema based on doc_type
if args["doc_type"] != "others":
metadata_schema = DocumentService.DOCUMENT_METADATA_SCHEMA[args["doc_type"]]
for key, value in args["doc_metadata"].items():
if key in metadata_schema and not isinstance(value, metadata_schema[key]):
raise InvalidMetadataError(f"Invalid type for metadata field {key}")
# set to MetaDataConfig
args["metadata"] = {"doc_type": args["doc_type"], "doc_metadata": args["doc_metadata"]}
text = args.get("text")
name = args.get("name")
if text is None or name is None:
@@ -107,6 +133,8 @@ class DocumentUpdateByTextApi(DatasetApiResource):
"doc_language", type=str, default="English", required=False, nullable=False, location="json"
)
parser.add_argument("retrieval_model", type=dict, required=False, nullable=False, location="json")
parser.add_argument("doc_type", type=str, required=False, nullable=True, location="json")
parser.add_argument("doc_metadata", type=dict, required=False, nullable=True, location="json")
args = parser.parse_args()
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)
@@ -115,6 +143,32 @@
if not dataset:
raise ValueError("Dataset is not exist.")
# indexing_technique is already set in dataset since this is an update
args["indexing_technique"] = dataset.indexing_technique
# Validate metadata if provided
if args.get("doc_type") or args.get("doc_metadata"):
if not args.get("doc_type") or not args.get("doc_metadata"):
raise InvalidMetadataError("Both doc_type and doc_metadata must be provided when adding metadata")
if args["doc_type"] not in DocumentService.DOCUMENT_METADATA_SCHEMA:
raise InvalidMetadataError(
"Invalid doc_type. Must be one of: " + ", ".join(DocumentService.DOCUMENT_METADATA_SCHEMA.keys())
)
if not isinstance(args["doc_metadata"], dict):
raise InvalidMetadataError("doc_metadata must be a dictionary")
# Validate metadata schema based on doc_type
if args["doc_type"] != "others":
metadata_schema = DocumentService.DOCUMENT_METADATA_SCHEMA[args["doc_type"]]
for key, value in args["doc_metadata"].items():
if key in metadata_schema and not isinstance(value, metadata_schema[key]):
raise InvalidMetadataError(f"Invalid type for metadata field {key}")
# set to MetaDataConfig
args["metadata"] = {"doc_type": args["doc_type"], "doc_metadata": args["doc_metadata"]}
if args["text"]:
text = args.get("text")
name = args.get("name")
@@ -161,6 +215,30 @@ class DocumentAddByFileApi(DatasetApiResource):
args["doc_form"] = "text_model"
if "doc_language" not in args:
args["doc_language"] = "English"
# Validate metadata if provided
if args.get("doc_type") or args.get("doc_metadata"):
if not args.get("doc_type") or not args.get("doc_metadata"):
raise InvalidMetadataError("Both doc_type and doc_metadata must be provided when adding metadata")
if args["doc_type"] not in DocumentService.DOCUMENT_METADATA_SCHEMA:
raise InvalidMetadataError(
"Invalid doc_type. Must be one of: " + ", ".join(DocumentService.DOCUMENT_METADATA_SCHEMA.keys())
)
if not isinstance(args["doc_metadata"], dict):
raise InvalidMetadataError("doc_metadata must be a dictionary")
# Validate metadata schema based on doc_type
if args["doc_type"] != "others":
metadata_schema = DocumentService.DOCUMENT_METADATA_SCHEMA[args["doc_type"]]
for key, value in args["doc_metadata"].items():
if key in metadata_schema and not isinstance(value, metadata_schema[key]):
raise InvalidMetadataError(f"Invalid type for metadata field {key}")
# set to MetaDataConfig
args["metadata"] = {"doc_type": args["doc_type"], "doc_metadata": args["doc_metadata"]}
# get dataset info
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)
@@ -228,6 +306,29 @@ class DocumentUpdateByFileApi(DatasetApiResource):
if "doc_language" not in args:
args["doc_language"] = "English"
# Validate metadata if provided
if args.get("doc_type") or args.get("doc_metadata"):
if not args.get("doc_type") or not args.get("doc_metadata"):
raise InvalidMetadataError("Both doc_type and doc_metadata must be provided when adding metadata")
if args["doc_type"] not in DocumentService.DOCUMENT_METADATA_SCHEMA:
raise InvalidMetadataError(
"Invalid doc_type. Must be one of: " + ", ".join(DocumentService.DOCUMENT_METADATA_SCHEMA.keys())
)
if not isinstance(args["doc_metadata"], dict):
raise InvalidMetadataError("doc_metadata must be a dictionary")
# Validate metadata schema based on doc_type
if args["doc_type"] != "others":
metadata_schema = DocumentService.DOCUMENT_METADATA_SCHEMA[args["doc_type"]]
for key, value in args["doc_metadata"].items():
if key in metadata_schema and not isinstance(value, metadata_schema[key]):
raise InvalidMetadataError(f"Invalid type for metadata field {key}")
# set to MetaDataConfig
args["metadata"] = {"doc_type": args["doc_type"], "doc_metadata": args["doc_metadata"]}
# get dataset info
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)
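The same doc_type/doc_metadata validation block is pasted into all four document endpoints in this file; a hedged sketch of hoisting it into one helper (names are illustrative, not part of the change):

```python
def validate_doc_metadata(args: dict, schema_by_type: dict) -> dict | None:
    """Validate a doc_type/doc_metadata pair; return the metadata payload or None."""
    doc_type, doc_metadata = args.get("doc_type"), args.get("doc_metadata")
    if not doc_type and not doc_metadata:
        return None  # metadata is optional
    if not doc_type or not doc_metadata:
        raise ValueError("Both doc_type and doc_metadata must be provided")
    if doc_type not in schema_by_type:
        raise ValueError("Invalid doc_type. Must be one of: " + ", ".join(schema_by_type))
    if not isinstance(doc_metadata, dict):
        raise ValueError("doc_metadata must be a dictionary")
    if doc_type != "others":
        schema = schema_by_type[doc_type]
        for key, value in doc_metadata.items():
            if key in schema and not isinstance(value, schema[key]):
                raise ValueError(f"Invalid type for metadata field {key}")
    return {"doc_type": doc_type, "doc_metadata": doc_metadata}
```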

View File

@@ -53,8 +53,7 @@ class SegmentApi(DatasetApiResource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -95,8 +94,7 @@ class SegmentApi(DatasetApiResource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)
@@ -175,8 +173,7 @@ class DatasetSegmentApi(DatasetApiResource):
)
except LLMBadRequestError:
raise ProviderNotInitializeError(
"No Embedding Model available. Please configure a valid provider "
"in the Settings -> Model Provider."
"No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
)
except ProviderTokenNotInitError as ex:
raise ProviderNotInitializeError(ex.description)

View File

@@ -0,0 +1,54 @@
from werkzeug.exceptions import NotFound
from controllers.service_api import api
from controllers.service_api.wraps import (
DatasetApiResource,
)
from core.file import helpers as file_helpers
from extensions.ext_database import db
from models.dataset import Dataset
from models.model import UploadFile
from services.dataset_service import DocumentService
class UploadFileApi(DatasetApiResource):
def get(self, tenant_id, dataset_id, document_id):
"""Get upload file."""
# check dataset
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)
dataset = db.session.query(Dataset).filter(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
if not dataset:
raise NotFound("Dataset not found.")
# check document
document_id = str(document_id)
document = DocumentService.get_document(dataset.id, document_id)
if not document:
raise NotFound("Document not found.")
# check upload file
if document.data_source_type != "upload_file":
raise ValueError(f"Document data source type ({document.data_source_type}) is not upload_file.")
data_source_info = document.data_source_info_dict
if data_source_info and "upload_file_id" in data_source_info:
file_id = data_source_info["upload_file_id"]
upload_file = db.session.query(UploadFile).filter(UploadFile.id == file_id).first()
if not upload_file:
raise NotFound("UploadFile not found.")
else:
raise ValueError("Upload file id not found in document data source info.")
url = file_helpers.get_signed_file_url(upload_file_id=upload_file.id)
return {
"id": upload_file.id,
"name": upload_file.name,
"size": upload_file.size,
"extension": upload_file.extension,
"url": url,
"download_url": f"{url}&as_attachment=true",
"mime_type": upload_file.mime_type,
"created_by": upload_file.created_by,
"created_at": upload_file.created_at.timestamp(),
}, 200
api.add_resource(UploadFileApi, "/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/upload-file")
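A hypothetical call against the new GET upload-file endpoint; the host, IDs, and API key are placeholders, assuming the service API's usual `/v1` prefix:

```python
import requests

resp = requests.get(
    "https://api.dify.example/v1/datasets/<dataset_id>/documents/<document_id>/upload-file",
    headers={"Authorization": "Bearer <dataset-api-key>"},
    timeout=10,
)
resp.raise_for_status()
info = resp.json()
print(info["name"], info["mime_type"], info["download_url"])
```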

View File

@@ -1,5 +1,5 @@
from collections.abc import Callable
from datetime import UTC, datetime
from datetime import UTC, datetime, timedelta
from enum import Enum
from functools import wraps
from typing import Optional
@@ -8,6 +8,8 @@ from flask import current_app, request
from flask_login import user_logged_in # type: ignore
from flask_restful import Resource # type: ignore
from pydantic import BaseModel
from sqlalchemy import select, update
from sqlalchemy.orm import Session
from werkzeug.exceptions import Forbidden, Unauthorized
from extensions.ext_database import db
@@ -174,7 +176,7 @@ def validate_dataset_token(view=None):
return decorator
def validate_and_get_api_token(scope=None):
def validate_and_get_api_token(scope: str | None = None):
"""
Validate and get API token.
"""
@@ -188,20 +190,29 @@
if auth_scheme != "bearer":
raise Unauthorized("Authorization scheme must be 'Bearer'")
api_token = (
db.session.query(ApiToken)
.filter(
ApiToken.token == auth_token,
ApiToken.type == scope,
current_time = datetime.now(UTC).replace(tzinfo=None)
cutoff_time = current_time - timedelta(minutes=1)
with Session(db.engine, expire_on_commit=False) as session:
update_stmt = (
update(ApiToken)
.where(
ApiToken.token == auth_token,
(ApiToken.last_used_at.is_(None) | (ApiToken.last_used_at < cutoff_time)),
ApiToken.type == scope,
)
.values(last_used_at=current_time)
.returning(ApiToken)
)
.first()
)
result = session.execute(update_stmt)
api_token = result.scalar_one_or_none()
if not api_token:
raise Unauthorized("Access token is invalid")
api_token.last_used_at = datetime.now(UTC).replace(tzinfo=None)
db.session.commit()
if not api_token:
stmt = select(ApiToken).where(ApiToken.token == auth_token, ApiToken.type == scope)
api_token = session.scalar(stmt)
if not api_token:
raise Unauthorized("Access token is invalid")
else:
session.commit()
return api_token
@@ -229,7 +240,7 @@ def create_or_update_end_user_for_user_id(app_model: App, user_id: Optional[str]
tenant_id=app_model.tenant_id,
app_id=app_model.id,
type="service_api",
is_anonymous=True if user_id == "DEFAULT-USER" else False,
is_anonymous=user_id == "DEFAULT-USER",
session_id=user_id,
)
db.session.add(end_user)
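The token-validation hunk above replaces a read-then-write on every request with a single atomic `UPDATE ... RETURNING` that only fires when `last_used_at` is at least a minute stale; a minimal sketch of the pattern, assuming a SQLAlchemy model with `token` and `last_used_at` columns:

```python
from datetime import UTC, datetime, timedelta

from sqlalchemy import select, update
from sqlalchemy.orm import Session


def validate_token(engine, ApiToken, auth_token: str):
    current_time = datetime.now(UTC).replace(tzinfo=None)
    cutoff_time = current_time - timedelta(minutes=1)
    with Session(engine, expire_on_commit=False) as session:
        stmt = (
            update(ApiToken)
            .where(
                ApiToken.token == auth_token,
                (ApiToken.last_used_at.is_(None)) | (ApiToken.last_used_at < cutoff_time),
            )
            .values(last_used_at=current_time)
            .returning(ApiToken)
        )
        token = session.execute(stmt).scalar_one_or_none()
        if token is None:
            # Used within the last minute, or invalid: fall back to a plain read.
            token = session.scalar(select(ApiToken).where(ApiToken.token == auth_token))
        else:
            session.commit()
        return token
```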

View File

@@ -39,7 +39,7 @@ class ConversationListApi(WebApiResource):
pinned = None
if "pinned" in args and args["pinned"] is not None:
pinned = True if args["pinned"] == "true" else False
pinned = args["pinned"] == "true"
try:
with Session(db.engine) as session:

View File

@@ -91,7 +91,7 @@ class MessageListApi(WebApiResource):
try:
return MessageService.pagination_by_first_id(
app_model, end_user, args["conversation_id"], args["first_id"], args["limit"], "desc"
app_model, end_user, args["conversation_id"], args["first_id"], args["limit"]
)
except services.errors.conversation.ConversationNotExistsError:
raise NotFound("Conversation Not Exists.")

View File

@@ -172,7 +172,7 @@ class CotAgentRunner(BaseAgentRunner, ABC):
self.save_agent_thought(
agent_thought=agent_thought,
tool_name=scratchpad.action.action_name if scratchpad.action else "",
tool_name=(scratchpad.action.action_name if scratchpad.action and not scratchpad.is_final() else ""),
tool_input={scratchpad.action.action_name: scratchpad.action.action_input} if scratchpad.action else {},
tool_invoke_meta={},
thought=scratchpad.thought or "",

View File

@@ -202,7 +202,7 @@ class AgentChatAppRunner(AppRunner):
# change function call strategy based on LLM model
llm_model = cast(LargeLanguageModel, model_instance.model_type_instance)
model_schema = llm_model.get_model_schema(model_instance.model, model_instance.credentials)
if not model_schema or not model_schema.features:
if not model_schema:
raise ValueError("Model schema not found")
if {ModelFeature.MULTI_TOOL_CALL, ModelFeature.TOOL_CALL}.intersection(model_schema.features or []):

View File

@@ -167,8 +167,7 @@ class AppQueueManager:
else:
if isinstance(data, DeclarativeMeta) or hasattr(data, "_sa_instance_state"):
raise TypeError(
"Critical Error: Passing SQLAlchemy Model instances "
"that cause thread safety issues is not allowed."
"Critical Error: Passing SQLAlchemy Model instances that cause thread safety issues is not allowed."
)

View File

@@ -89,6 +89,7 @@ class MessageBasedAppGenerator(BaseAppGenerator):
Conversation.id == conversation_id,
Conversation.app_id == app_model.id,
Conversation.status == "normal",
Conversation.is_deleted.is_(False),
]
if isinstance(user, Account):

View File

@@ -145,7 +145,7 @@ class MessageCycleManage:
# get extension
if "." in message_file.url:
extension = f'.{message_file.url.split(".")[-1]}'
extension = f".{message_file.url.split('.')[-1]}"
if len(extension) > 10:
extension = ".bin"
else:

View File

@@ -62,8 +62,9 @@ class ApiExternalDataTool(ExternalDataTool):
if not api_based_extension:
raise ValueError(
"[External data tool] API query failed, variable: {}, "
"error: api_based_extension_id is invalid".format(self.variable)
"[External data tool] API query failed, variable: {}, error: api_based_extension_id is invalid".format(
self.variable
)
)
# decrypt api_key

View File

@@ -90,7 +90,7 @@ class File(BaseModel):
def markdown(self) -> str:
url = self.generate_url()
if self.type == FileType.IMAGE:
text = f'![{self.filename or ""}]({url})'
text = f"![{self.filename or ''}]({url})"
else:
text = f"[{self.filename or url}]({url})"

View File

@@ -530,7 +530,6 @@ class IndexingRunner:
# chunk nodes by chunk size
indexing_start_at = time.perf_counter()
tokens = 0
chunk_size = 10
if dataset_document.doc_form != IndexType.PARENT_CHILD_INDEX:
# create keyword index
create_keyword_thread = threading.Thread(
@@ -539,11 +538,22 @@
)
create_keyword_thread.start()
max_workers = 10
if dataset.indexing_technique == "high_quality":
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = []
for i in range(0, len(documents), chunk_size):
chunk_documents = documents[i : i + chunk_size]
# Distribute documents into multiple groups based on the hash values of page_content
# This is done to prevent multiple threads from processing the same document,
# Thereby avoiding potential database insertion deadlocks
document_groups: list[list[Document]] = [[] for _ in range(max_workers)]
for document in documents:
hash = helper.generate_text_hash(document.page_content)
group_index = int(hash, 16) % max_workers
document_groups[group_index].append(document)
for chunk_documents in document_groups:
if len(chunk_documents) == 0:
continue
futures.append(
executor.submit(
self._process_chunk,
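The IndexingRunner hunk above swaps fixed-size chunking for hash-based grouping so identical chunks always land on the same worker thread, avoiding duplicate-insert deadlocks; a standalone sketch (sha256 stands in for `helper.generate_text_hash`):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def shard_by_content(texts: list[str], max_workers: int = 10) -> list[list[str]]:
    """Group texts by content hash; duplicates share a group, hence one thread."""
    groups: list[list[str]] = [[] for _ in range(max_workers)]
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[int(digest, 16) % max_workers].append(text)
    return groups


texts = ["alpha", "beta", "alpha", "gamma"]
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [
        executor.submit(print, group)
        for group in shard_by_content(texts, max_workers=4)
        if group  # skip empty groups, as the diff does
    ]
    for future in futures:
        future.result()
```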

View File

@@ -131,7 +131,7 @@ JAVASCRIPT_CODE_GENERATOR_PROMPT_TEMPLATE = (
SUGGESTED_QUESTIONS_AFTER_ANSWER_INSTRUCTION_PROMPT = (
"Please help me predict the three most likely questions that human would ask, "
"and keeping each question under 20 characters.\n"
"MAKE SURE your output is the SAME language as the Assistant's latest response"
"MAKE SURE your output is the SAME language as the Assistant's latest response. "
"The output must be an array in JSON format following the specified schema:\n"
'["question1","question2","question3"]\n'
)

View File

@@ -221,13 +221,12 @@ class AIModel(ABC):
:param credentials: model credentials
:return: model schema
"""
# get predefined models (predefined_models)
models = self.predefined_models()
model_map = {model.model: model for model in models}
if model in model_map:
return model_map[model]
# Try to get model schema from predefined models
for predefined_model in self.predefined_models():
if model == predefined_model.model:
return predefined_model
# Try to get model schema from credentials
if credentials:
model_schema = self.get_customizable_model_schema_from_credentials(model, credentials)
if model_schema:

View File

@@ -1,13 +1,11 @@
from concurrent.futures import ProcessPoolExecutor
from os.path import abspath, dirname, join
import logging
from threading import Lock
from typing import Any, cast
from typing import Any
from transformers import GPT2Tokenizer as TransformerGPT2Tokenizer # type: ignore
logger = logging.getLogger(__name__)
_tokenizer: Any = None
_lock = Lock()
_executor = ProcessPoolExecutor(max_workers=1)
class GPT2Tokenizer:
@@ -17,22 +15,37 @@
use gpt2 tokenizer to get num tokens
"""
_tokenizer = GPT2Tokenizer.get_encoder()
tokens = _tokenizer.encode(text, verbose=False)
tokens = _tokenizer.encode(text)
return len(tokens)
@staticmethod
def get_num_tokens(text: str) -> int:
future = _executor.submit(GPT2Tokenizer._get_num_tokens_by_gpt2, text)
result = future.result()
return cast(int, result)
# Because this process needs more cpu resource, we turn this back before we find a better way to handle it.
#
# future = _executor.submit(GPT2Tokenizer._get_num_tokens_by_gpt2, text)
# result = future.result()
# return cast(int, result)
return GPT2Tokenizer._get_num_tokens_by_gpt2(text)
@staticmethod
def get_encoder() -> Any:
global _tokenizer, _lock
with _lock:
if _tokenizer is None:
base_path = abspath(__file__)
gpt2_tokenizer_path = join(dirname(base_path), "gpt2")
_tokenizer = TransformerGPT2Tokenizer.from_pretrained(gpt2_tokenizer_path)
# Try to use tiktoken to get the tokenizer because it is faster
#
try:
import tiktoken
_tokenizer = tiktoken.get_encoding("gpt2")
except Exception:
from os.path import abspath, dirname, join
from transformers import GPT2Tokenizer as TransformerGPT2Tokenizer # type: ignore
base_path = abspath(__file__)
gpt2_tokenizer_path = join(dirname(base_path), "gpt2")
_tokenizer = TransformerGPT2Tokenizer.from_pretrained(gpt2_tokenizer_path)
logger.info("Fallback to Transformers' GPT-2 tokenizer from tiktoken")
return _tokenizer
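A condensed sketch of the lazy, lock-guarded fallback this hunk introduces: try tiktoken's fast GPT-2 encoding first, and only load the heavier Transformers tokenizer when that fails (fetched from the Hub here, rather than the repo's bundled path):

```python
import logging
from threading import Lock
from typing import Any

logger = logging.getLogger(__name__)
_tokenizer: Any = None
_lock = Lock()


def get_encoder() -> Any:
    global _tokenizer
    with _lock:  # serialize first-time initialization across threads
        if _tokenizer is None:
            try:
                import tiktoken

                _tokenizer = tiktoken.get_encoding("gpt2")
            except Exception:
                from transformers import GPT2Tokenizer  # type: ignore

                _tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
                logger.info("Fallback to Transformers' GPT-2 tokenizer from tiktoken")
    return _tokenizer


print(len(get_encoder().encode("hello world")))  # token count either way
```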

View File

@@ -53,6 +53,9 @@ model_credential_schema:
type: select
required: true
options:
- label:
en_US: 2024-12-01-preview
value: 2024-12-01-preview
- label:
en_US: 2024-10-01-preview
value: 2024-10-01-preview

View File

@@ -108,7 +108,7 @@ class AzureOpenAILargeLanguageModel(_CommonAzureOpenAI, LargeLanguageModel):
ai_model_entity = self._get_ai_model_entity(base_model_name=base_model_name, model=model)
if not ai_model_entity:
raise CredentialsValidateFailedError(f'Base Model Name {credentials["base_model_name"]} is invalid')
raise CredentialsValidateFailedError(f"Base Model Name {credentials['base_model_name']} is invalid")
try:
client = AzureOpenAI(**self._to_credential_kwargs(credentials))

View File

@@ -130,7 +130,7 @@ class AzureOpenAITextEmbeddingModel(_CommonAzureOpenAI, TextEmbeddingModel):
raise CredentialsValidateFailedError("Base Model Name is required")
if not self._get_ai_model_entity(credentials["base_model_name"], model):
raise CredentialsValidateFailedError(f'Base Model Name {credentials["base_model_name"]} is invalid')
raise CredentialsValidateFailedError(f"Base Model Name {credentials['base_model_name']} is invalid")
try:
credentials_kwargs = self._to_credential_kwargs(credentials)

View File

@@ -44,6 +44,7 @@ provider_credential_schema:
label:
en_US: AWS Region
zh_Hans: AWS 地区
ja_JP: AWS リージョン
type: select
default: us-east-1
options:
@@ -51,62 +52,77 @@ provider_credential_schema:
label:
en_US: US East (N. Virginia)
zh_Hans: 美国东部 (弗吉尼亚北部)
ja_JP: 米国 (バージニア北部)
- value: us-east-2
label:
en_US: US East (Ohio)
zh_Hans: 美国东部 (弗吉尼亚北部)
zh_Hans: 美国东部 (俄亥俄)
ja_JP: 米国 (オハイオ)
- value: us-west-2
label:
en_US: US West (Oregon)
zh_Hans: 美国西部 (俄勒冈州)
ja_JP: 米国 (オレゴン)
- value: ap-south-1
label:
en_US: Asia Pacific (Mumbai)
zh_Hans: 亚太地区(孟买)
ja_JP: アジアパシフィック (ムンバイ)
- value: ap-southeast-1
label:
en_US: Asia Pacific (Singapore)
zh_Hans: 亚太地区 (新加坡)
ja_JP: アジアパシフィック (シンガポール)
- value: ap-southeast-2
label:
en_US: Asia Pacific (Sydney)
zh_Hans: 亚太地区 (悉尼)
ja_JP: アジアパシフィック (シドニー)
- value: ap-northeast-1
label:
en_US: Asia Pacific (Tokyo)
zh_Hans: 亚太地区 (东京)
ja_JP: アジアパシフィック (東京)
- value: ap-northeast-2
label:
en_US: Asia Pacific (Seoul)
zh_Hans: 亚太地区(首尔)
ja_JP: アジアパシフィック (ソウル)
- value: ca-central-1
label:
en_US: Canada (Central)
zh_Hans: 加拿大(中部)
ja_JP: カナダ (中部)
- value: eu-central-1
label:
en_US: Europe (Frankfurt)
zh_Hans: 欧洲 (法兰克福)
ja_JP: 欧州 (フランクフルト)
- value: eu-west-1
label:
en_US: Europe (Ireland)
zh_Hans: 欧洲(爱尔兰)
ja_JP: 欧州 (アイルランド)
- value: eu-west-2
label:
en_US: Europe (London)
zh_Hans: 欧洲西部 (伦敦)
ja_JP: 欧州 (ロンドン)
- value: eu-west-3
label:
en_US: Europe (Paris)
zh_Hans: 欧洲(巴黎)
ja_JP: 欧州 (パリ)
- value: sa-east-1
label:
en_US: South America (São Paulo)
zh_Hans: 南美洲(圣保罗)
ja_JP: 南米 (サンパウロ)
- value: us-gov-west-1
label:
en_US: AWS GovCloud (US-West)
zh_Hans: AWS GovCloud (US-West)
ja_JP: AWS GovCloud (米国西部)
- variable: model_for_validation
required: false
label:

View File

@@ -70,7 +70,7 @@ class BedrockRerankModel(RerankModel):
rerankingConfiguration = {
"type": "BEDROCK_RERANKING_MODEL",
"bedrockRerankingConfiguration": {
"numberOfResults": top_n,
"numberOfResults": min(top_n, len(text_sources)),
"modelConfiguration": {
"modelArn": model_package_arn,
},

View File

@@ -677,16 +677,17 @@ class CohereLargeLanguageModel(LargeLanguageModel):
:return: model schema
"""
# get model schema
models = self.predefined_models()
model_map = {model.model: model for model in models}
mode = credentials.get("mode")
base_model_schema = None
for predefined_model in self.predefined_models():
if (
mode == "chat" and predefined_model.model == "command-light-chat"
) or predefined_model.model == "command-light":
base_model_schema = predefined_model
break
if mode == "chat":
base_model_schema = model_map["command-light-chat"]
else:
base_model_schema = model_map["command-light"]
if not base_model_schema:
raise ValueError("Model not found")
base_model_schema = cast(AIModelEntity, base_model_schema)

View File

@@ -1,2 +1,3 @@
- deepseek-chat
- deepseek-coder
- deepseek-reasoner

View File

@@ -10,7 +10,7 @@ features:
- stream-tool-call
model_properties:
mode: chat
context_size: 128000
context_size: 64000
parameter_rules:
- name: temperature
use_template: temperature

View File

@@ -10,7 +10,7 @@ features:
- stream-tool-call
model_properties:
mode: chat
context_size: 128000
context_size: 64000
parameter_rules:
- name: temperature
use_template: temperature

View File

@@ -0,0 +1,21 @@
model: deepseek-reasoner
label:
zh_Hans: deepseek-reasoner
en_US: deepseek-reasoner
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 64000
parameter_rules:
- name: max_tokens
use_template: max_tokens
min: 1
max: 8192
default: 4096
pricing:
input: "4"
output: "16"
unit: "0.000001"
currency: RMB

View File

@@ -1,10 +1,13 @@
import json
from collections.abc import Generator
from typing import Optional, Union
import requests
from yarl import URL
from core.model_runtime.entities.llm_entities import LLMMode, LLMResult
from core.model_runtime.entities.llm_entities import LLMMode, LLMResult, LLMResultChunk, LLMResultChunkDelta
from core.model_runtime.entities.message_entities import (
AssistantPromptMessage,
PromptMessage,
PromptMessageTool,
)
@ -24,9 +27,6 @@ class DeepseekLargeLanguageModel(OAIAPICompatLargeLanguageModel):
user: Optional[str] = None,
) -> Union[LLMResult, Generator]:
self._add_custom_parameters(credentials)
# {"response_format": "xx"} need convert to {"response_format": {"type": "xx"}}
if "response_format" in model_parameters:
model_parameters["response_format"] = {"type": model_parameters.get("response_format")}
return super()._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream)
def validate_credentials(self, model: str, credentials: dict) -> None:
@ -39,3 +39,208 @@ class DeepseekLargeLanguageModel(OAIAPICompatLargeLanguageModel):
credentials["mode"] = LLMMode.CHAT.value
credentials["function_calling_type"] = "tool_call"
credentials["stream_function_calling"] = "support"
def _handle_generate_stream_response(
self, model: str, credentials: dict, response: requests.Response, prompt_messages: list[PromptMessage]
) -> Generator:
"""
Handle llm stream response
:param model: model name
:param credentials: model credentials
:param response: streamed response
:param prompt_messages: prompt messages
:return: llm response chunk generator
"""
full_assistant_content = ""
chunk_index = 0
is_reasoning_started = False # Add flag to track reasoning state
def create_final_llm_result_chunk(
id: Optional[str], index: int, message: AssistantPromptMessage, finish_reason: str, usage: dict
) -> LLMResultChunk:
# calculate num tokens
prompt_tokens = usage and usage.get("prompt_tokens")
if prompt_tokens is None:
prompt_tokens = self._num_tokens_from_string(model, prompt_messages[0].content)
completion_tokens = usage and usage.get("completion_tokens")
if completion_tokens is None:
completion_tokens = self._num_tokens_from_string(model, full_assistant_content)
# transform usage
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)
return LLMResultChunk(
id=id,
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(index=index, message=message, finish_reason=finish_reason, usage=usage),
)
# delimiter for stream response; needs unicode_escape decoding
import codecs
delimiter = credentials.get("stream_mode_delimiter", "\n\n")
delimiter = codecs.decode(delimiter, "unicode_escape")
tools_calls: list[AssistantPromptMessage.ToolCall] = []
def increase_tool_call(new_tool_calls: list[AssistantPromptMessage.ToolCall]):
def get_tool_call(tool_call_id: str):
if not tool_call_id:
return tools_calls[-1]
tool_call = next((tool_call for tool_call in tools_calls if tool_call.id == tool_call_id), None)
if tool_call is None:
tool_call = AssistantPromptMessage.ToolCall(
id=tool_call_id,
type="function",
function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="", arguments=""),
)
tools_calls.append(tool_call)
return tool_call
for new_tool_call in new_tool_calls:
# get tool call
tool_call = get_tool_call(new_tool_call.function.name)
# update tool call
if new_tool_call.id:
tool_call.id = new_tool_call.id
if new_tool_call.type:
tool_call.type = new_tool_call.type
if new_tool_call.function.name:
tool_call.function.name = new_tool_call.function.name
if new_tool_call.function.arguments:
tool_call.function.arguments += new_tool_call.function.arguments
finish_reason = None # The default value of finish_reason is None
message_id, usage = None, None
for chunk in response.iter_lines(decode_unicode=True, delimiter=delimiter):
chunk = chunk.strip()
if chunk:
# ignore sse comments
if chunk.startswith(":"):
continue
decoded_chunk = chunk.strip().removeprefix("data:").lstrip()
if decoded_chunk == "[DONE]": # Some provider returns "data: [DONE]"
continue
try:
chunk_json: dict = json.loads(decoded_chunk)
# stream ended
except json.JSONDecodeError as e:
yield create_final_llm_result_chunk(
id=message_id,
index=chunk_index + 1,
message=AssistantPromptMessage(content=""),
finish_reason="Non-JSON encountered.",
usage=usage,
)
break
# handle the error here (see issue #11629)
if chunk_json.get("error") and chunk_json.get("choices") is None:
raise ValueError(chunk_json.get("error"))
if chunk_json:
if u := chunk_json.get("usage"):
usage = u
if not chunk_json or len(chunk_json["choices"]) == 0:
continue
choice = chunk_json["choices"][0]
finish_reason = chunk_json["choices"][0].get("finish_reason")
message_id = chunk_json.get("id")
chunk_index += 1
if "delta" in choice:
delta = choice["delta"]
is_reasoning = delta.get("reasoning_content")
delta_content = delta.get("content") or delta.get("reasoning_content")
assistant_message_tool_calls = None
if "tool_calls" in delta and credentials.get("function_calling_type", "no_call") == "tool_call":
assistant_message_tool_calls = delta.get("tool_calls", None)
elif (
"function_call" in delta
and credentials.get("function_calling_type", "no_call") == "function_call"
):
assistant_message_tool_calls = [
{"id": "tool_call_id", "type": "function", "function": delta.get("function_call", {})}
]
# assistant_message_function_call = delta.delta.function_call
# extract tool calls from response
if assistant_message_tool_calls:
tool_calls = self._extract_response_tool_calls(assistant_message_tool_calls)
increase_tool_call(tool_calls)
if delta_content is None or delta_content == "":
continue
# Add markdown quote markers for reasoning content
if is_reasoning:
if not is_reasoning_started:
delta_content = "> 💭 " + delta_content
is_reasoning_started = True
elif "\n\n" in delta_content:
delta_content = delta_content.replace("\n\n", "\n> ")
elif "\n" in delta_content:
delta_content = delta_content.replace("\n", "\n> ")
elif is_reasoning_started:
# If we were in reasoning mode but now getting regular content,
# add \n\n to close the reasoning block
delta_content = "\n\n" + delta_content
is_reasoning_started = False
# transform assistant message to prompt message
assistant_prompt_message = AssistantPromptMessage(
content=delta_content,
)
# reset tool calls
tool_calls = []
full_assistant_content += delta_content
elif "text" in choice:
choice_text = choice.get("text", "")
if choice_text == "":
continue
# transform assistant message to prompt message
assistant_prompt_message = AssistantPromptMessage(content=choice_text)
full_assistant_content += choice_text
else:
continue
yield LLMResultChunk(
id=message_id,
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=chunk_index,
message=assistant_prompt_message,
),
)
chunk_index += 1
if tools_calls:
yield LLMResultChunk(
id=message_id,
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=chunk_index,
message=AssistantPromptMessage(tool_calls=tools_calls, content=""),
),
)
yield create_final_llm_result_chunk(
id=message_id,
index=chunk_index,
message=AssistantPromptMessage(content=""),
finish_reason=finish_reason,
usage=usage,
)
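
Two details in the handler above: the configured delimiter goes through codecs.decode(..., "unicode_escape") so a literal "\n\n" stored in credentials becomes real newlines before the SSE stream is split, and reasoning_content is rendered as a 💭-prefixed Markdown blockquote that closes once regular content resumes. A self-contained sketch of that second transformation, with chunks modeled as (is_reasoning, text) pairs:

# Minimal sketch of the reasoning-to-blockquote rewrite, decoupled from SSE.
def render(chunks):
    out, started = "", False
    for is_reasoning, text in chunks:
        if is_reasoning:
            if not started:
                text = "> 💭 " + text          # open the quote block
                started = True
            elif "\n\n" in text:
                text = text.replace("\n\n", "\n> ")
            elif "\n" in text:
                text = text.replace("\n", "\n> ")
        elif started:
            text = "\n\n" + text               # close the quote block
            started = False
        out += text
    return out

print(render([(True, "plan A"), (True, "\n\nplan B"), (False, "answer")]))
# > 💭 plan A
# > plan B
#
# answer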

View File

@ -1,5 +1,6 @@
- gemini-2.0-flash-exp
- gemini-2.0-flash-thinking-exp-1219
- gemini-2.0-flash-thinking-exp-01-21
- gemini-1.5-pro
- gemini-1.5-pro-latest
- gemini-1.5-pro-001

View File

@ -0,0 +1,39 @@
model: gemini-2.0-flash-thinking-exp-01-21
label:
en_US: Gemini 2.0 Flash Thinking Exp 01-21
model_type: llm
features:
- agent-thought
- vision
- document
- video
- audio
model_properties:
mode: chat
context_size: 32767
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -162,9 +162,9 @@ class HuggingfaceHubTextEmbeddingModel(_CommonHuggingfaceHub, TextEmbeddingModel
@staticmethod
def _check_endpoint_url_model_repository_name(credentials: dict, model_name: str):
try:
url = f'{HUGGINGFACE_ENDPOINT_API}{credentials["huggingface_namespace"]}'
url = f"{HUGGINGFACE_ENDPOINT_API}{credentials['huggingface_namespace']}"
headers = {
"Authorization": f'Bearer {credentials["huggingfacehub_api_token"]}',
"Authorization": f"Bearer {credentials['huggingfacehub_api_token']}",
"Content-Type": "application/json",
}
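
The quote swap above is cosmetic, a formatter normalizing to double-quoted strings; both forms parse on every supported Python. The nearby pitfall, for context, is quote reuse inside the braces, which only Python 3.12+ (PEP 701) permits; the URL below is a placeholder:

# Both quote styles in the diff are valid Python; what is NOT valid before
# Python 3.12 (PEP 701) is reusing the f-string's own quote inside braces:
# url = f"{credentials["huggingface_namespace"]}"  # SyntaxError before 3.12
credentials = {"huggingface_namespace": "my-org"}
url = f"https://api.example.test/{credentials['huggingface_namespace']}"
print(url)  # https://api.example.test/my-org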

View File

@ -34,6 +34,7 @@ from core.model_runtime.model_providers.minimax.llm.types import MinimaxMessage
class MinimaxLargeLanguageModel(LargeLanguageModel):
model_apis = {
"minimax-text-01": MinimaxChatCompletionPro,
"abab7-chat-preview": MinimaxChatCompletionPro,
"abab6.5t-chat": MinimaxChatCompletionPro,
"abab6.5s-chat": MinimaxChatCompletionPro,

View File

@ -0,0 +1,46 @@
model: minimax-text-01
label:
en_US: Minimax-Text-01
model_type: llm
features:
- agent-thought
- tool-call
- stream-tool-call
model_properties:
mode: chat
context_size: 1000192
parameter_rules:
- name: temperature
use_template: temperature
min: 0.01
max: 1
default: 0.1
- name: top_p
use_template: top_p
min: 0.01
max: 1
default: 0.95
- name: max_tokens
use_template: max_tokens
required: true
default: 2048
min: 1
max: 1000192
- name: mask_sensitive_info
type: boolean
default: true
label:
zh_Hans: 隐私保护
en_US: Moderate
help:
zh_Hans: 对输出中易涉及隐私问题的文本信息进行打码,目前包括但不限于邮箱、域名、链接、证件号、家庭住址等,默认 true,即开启打码
en_US: Mask sensitive information in the generated content, such as email/domain/link/address/phone/ID. Defaults to true (masking enabled).
- name: presence_penalty
use_template: presence_penalty
- name: frequency_penalty
use_template: frequency_penalty
pricing:
input: '0.001'
output: '0.008'
unit: '0.001'
currency: RMB

View File

@ -44,9 +44,6 @@ class MoonshotLargeLanguageModel(OAIAPICompatLargeLanguageModel):
self._add_custom_parameters(credentials)
self._add_function_call(model, credentials)
user = user[:32] if user else None
# {"response_format": "json_object"} need convert to {"response_format": {"type": "json_object"}}
if "response_format" in model_parameters:
model_parameters["response_format"] = {"type": model_parameters.get("response_format")}
return super()._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
def validate_credentials(self, model: str, credentials: dict) -> None:
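
The deleted lines wrapped a bare response_format string into the {"type": ...} object that OpenAI-compatible chat APIs expect; the same deletion appears in the DeepSeek and Siliconflow providers in this diff, which suggests the conversion moved into shared upstream code. The dropped transform, for reference:

# The shape conversion removed above, kept here for reference:
model_parameters = {"response_format": "json_object"}
if "response_format" in model_parameters:
    model_parameters["response_format"] = {"type": model_parameters["response_format"]}
print(model_parameters)  # {'response_format': {'type': 'json_object'}}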

View File

@ -1,19 +1,11 @@
<svg width="162" height="36" viewBox="0 0 162 36" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M2 0C0.895431 0 0 0.895432 0 2V29.1891C0 30.2937 0.895433 31.1891 2 31.1891H5.51171L16.0608 35.1377C16.7145 35.3824 17.4114 34.8991 17.4114 34.2012V11.3669C17.4114 10.533 16.894 9.78665 16.1131 9.49405L5.51171 5.52152H25.58V31.1891H29.0917C30.1963 31.1891 31.0917 30.2937 31.0917 29.1891V2C31.0917 0.895431 30.1963 0 29.0917 0H2ZM14.6022 23.7351C15.0558 23.956 15.4239 23.6812 15.4239 23.1185C15.4239 22.5557 15.0558 21.9204 14.6022 21.6995C14.1486 21.4775 13.7804 21.7545 13.7804 22.3161C13.7804 22.8777 14.1486 23.513 14.6022 23.7351Z" fill="white"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M2 0C0.895431 0 0 0.895432 0 2V29.1891C0 30.2937 0.895433 31.1891 2 31.1891H5.51171L16.0608 35.1377C16.7145 35.3824 17.4114 34.8991 17.4114 34.2012V11.3669C17.4114 10.533 16.894 9.78665 16.1131 9.49405L5.51171 5.52152H25.58V31.1891H29.0917C30.1963 31.1891 31.0917 30.2937 31.0917 29.1891V2C31.0917 0.895431 30.1963 0 29.0917 0H2ZM14.6022 23.7351C15.0558 23.956 15.4239 23.6812 15.4239 23.1185C15.4239 22.5557 15.0558 21.9204 14.6022 21.6995C14.1486 21.4775 13.7804 21.7545 13.7804 22.3161C13.7804 22.8777 14.1486 23.513 14.6022 23.7351Z" fill="url(#paint0_linear_1473_71)"/>
<path d="M55.9397 27.8804H59.0566V19.0803C59.0566 14.9105 56.381 12.7172 52.8228 12.7172C51.0023 12.7172 49.3197 13.4483 48.2991 14.6668V12.9609H45.1546V27.8804H48.2991V19.5406C48.2991 16.8059 49.8162 15.3978 52.1332 15.3978C54.4226 15.3978 55.9397 16.8059 55.9397 19.5406V27.8804Z" fill="#11101A"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M69.7881 12.7172C74.1187 12.7172 77.539 15.7228 77.539 20.4071C77.539 25.0915 74.0083 28.1241 69.6502 28.1241C65.3196 28.1241 62.0372 25.0915 62.0372 20.4071C62.0372 15.7228 65.4575 12.7172 69.7881 12.7172ZM69.7342 15.3979C67.362 15.3979 65.2381 17.0225 65.2381 20.4071C65.2381 23.7918 67.2793 25.4435 69.6514 25.4435C71.996 25.4435 74.313 23.7918 74.313 20.4071C74.313 17.0225 72.0788 15.3979 69.7342 15.3979Z" fill="#11101A"/>
<path d="M78.861 12.9609L84.6259 27.8804H88.3772L94.1697 12.9609H90.8321L86.5291 25.1185L82.2261 12.9609H78.861Z" fill="#11101A"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M100.13 9.00761C100.13 10.1178 99.2477 10.9842 98.1443 10.9842C97.0134 10.9842 96.1308 10.1178 96.1308 9.00761C96.1308 7.89745 97.0134 7.03098 98.1443 7.03098C99.2477 7.03098 100.13 7.89745 100.13 9.00761ZM99.6882 27.8804H96.5437V12.9609H99.6882V27.8804Z" fill="#11101A"/>
<path d="M104.322 23.7376C104.322 26.7702 106.004 27.8804 108.708 27.8804H111.19V25.308H109.259C107.935 25.308 107.494 24.8477 107.494 23.7376V15.479H111.19V12.9609H107.494V9.25128H104.322V12.9609H102.529V15.479H104.322V23.7376Z" fill="#11101A"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M120.154 28.1241C116.209 28.1241 113.037 24.9561 113.037 20.353C113.037 15.7498 116.209 12.7172 120.209 12.7172C122.774 12.7172 124.539 13.9086 125.477 15.1271V12.9609H128.649V27.8804H125.477V25.6601C124.512 26.9327 122.691 28.1241 120.154 28.1241ZM120.87 25.4435C123.242 25.4435 125.476 23.6293 125.476 20.4071C125.476 17.212 123.242 15.3979 120.87 15.3979C118.526 15.3979 116.264 17.1308 116.264 20.353C116.264 23.5752 118.526 25.4435 120.87 25.4435Z" fill="#11101A"/>
<path d="M136.043 26.0933C136.043 24.9832 135.16 24.1167 134.057 24.1167C132.926 24.1167 132.043 24.9832 132.043 26.0933C132.043 27.2035 132.926 28.07 134.057 28.07C135.16 28.07 136.043 27.2035 136.043 26.0933Z" fill="#11101A"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M145.502 28.1241C141.558 28.1241 138.386 24.9561 138.386 20.353C138.386 15.7498 141.558 12.7172 145.557 12.7172C148.123 12.7172 149.888 13.9086 150.826 15.1271V12.9609H153.998V27.8804H150.826V25.6601C149.86 26.9327 148.04 28.1241 145.502 28.1241ZM146.219 25.4435C148.591 25.4435 150.825 23.6293 150.825 20.4071C150.825 17.212 148.591 15.3979 146.219 15.3979C143.874 15.3979 141.612 17.1308 141.612 20.353C141.612 23.5752 143.874 25.4435 146.219 25.4435Z" fill="#11101A"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M161.722 9.00761C161.722 10.1178 160.84 10.9842 159.736 10.9842C158.605 10.9842 157.723 10.1178 157.723 9.00761C157.723 7.89745 158.605 7.03098 159.736 7.03098C160.84 7.03098 161.722 7.89745 161.722 9.00761ZM161.28 27.8804H158.136V12.9609H161.28V27.8804Z" fill="#11101A"/>
<svg width="88" height="24" viewBox="0 0 88 24" fill="none" xmlns="http://www.w3.org/2000/svg">
<g clip-path="url(#clip0_1923_1287)">
<path d="M24 18.8323V18.8326H14.3246L9.16716 13.6751V18.8326H0V18.8314L9.16716 9.66422V4H9.16774L24 18.8323Z" fill="black"/>
</g>
<path fill-rule="evenodd" clip-rule="evenodd" d="M73.2505 16.8061H76.5869V18.9145H73.9391C72.0857 18.9145 70.9202 17.8952 70.9202 15.9977V10.3921H69.0316V8.26609H70.9202L71.4677 5.47209H73.2329V8.26609H76.5869V10.3921H73.2505V16.8061ZM33.8133 4.85699L38.6679 15.681H38.809V4.85699H41.3333V18.9145H37.52L32.6654 8.09046H32.5243V18.9145H30V4.85699H33.8133ZM47.812 19.1254C44.7225 19.1254 42.7457 16.9641 42.7457 13.6079C42.7457 10.2517 44.6873 8.05518 47.812 8.05518C50.9367 8.05518 52.8429 10.1635 52.8429 13.6079C52.8429 17.0523 50.9014 19.1254 47.812 19.1254ZM47.812 17.017C49.1891 17.017 50.3363 16.5423 50.3715 15.1894V12.0265C50.3715 10.6383 49.2068 10.1635 47.812 10.1635C46.4172 10.1635 45.2171 10.6383 45.2171 12.0265V15.1894C45.2524 16.5599 46.4348 17.017 47.812 17.017ZM55.5444 8.24846L58.2979 16.6826H58.439L61.1926 8.24846H63.7346L59.9389 18.8968H56.7966L53.0186 8.24846H55.5429H55.5444ZM65.0419 8.26609H67.3722V18.9145H65.0419V8.26609ZM64.9001 4.85699H67.5126V6.86027H64.9001V4.85699ZM82.3064 19.143C79.4639 19.143 77.6458 16.9817 77.6458 13.6079C77.6458 10.2341 79.4286 8.07282 82.3064 8.07282C83.6483 8.07282 84.7425 8.59973 85.3958 9.58373H85.5369L85.9962 8.26609H87.7614V18.9145H85.9962L85.5369 17.6314H85.3958C84.6896 18.5625 83.5072 19.1423 82.3064 19.1423V19.143ZM82.7826 17.017C84.1774 17.017 85.3951 16.5776 85.4304 15.1894V12.0265C85.4304 10.603 84.159 10.1988 82.7297 10.1988C81.3004 10.1988 80.1172 10.6383 80.1172 12.0265V15.1894C80.1525 16.5952 81.3709 17.017 82.7826 17.017Z" fill="black"/>
<defs>
<linearGradient id="paint0_linear_1473_71" x1="31" y1="-2" x2="0.975591" y2="14.2625" gradientUnits="userSpaceOnUse">
<stop stop-color="#2622FF"/>
<stop offset="1" stop-color="#A717FF"/>
</linearGradient>
<clipPath id="clip0_1923_1287">
<rect width="24" height="14.8326" fill="white" transform="translate(0 4)"/>
</clipPath>
</defs>
</svg>

Before: 4.5 KiB → After: 1.9 KiB

View File

@ -1,10 +1,3 @@
<svg width="32" height="36" viewBox="0 0 32 36" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M2 0C0.895431 0 0 0.895432 0 2V29.1891C0 30.2937 0.895433 31.1891 2 31.1891H5.51171L16.0608 35.1377C16.7145 35.3824 17.4114 34.8991 17.4114 34.2012V11.3669C17.4114 10.533 16.894 9.78665 16.1131 9.49405L5.51171 5.52152H25.58V31.1891H29.0917C30.1963 31.1891 31.0917 30.2937 31.0917 29.1891V2C31.0917 0.895431 30.1963 0 29.0917 0H2ZM14.6022 23.7351C15.0558 23.956 15.4239 23.6812 15.4239 23.1185C15.4239 22.5557 15.0558 21.9204 14.6022 21.6995C14.1486 21.4775 13.7804 21.7545 13.7804 22.3161C13.7804 22.8777 14.1486 23.513 14.6022 23.7351Z" fill="white"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M2 0C0.895431 0 0 0.895432 0 2V29.1891C0 30.2937 0.895433 31.1891 2 31.1891H5.51171L16.0608 35.1377C16.7145 35.3824 17.4114 34.8991 17.4114 34.2012V11.3669C17.4114 10.533 16.894 9.78665 16.1131 9.49405L5.51171 5.52152H25.58V31.1891H29.0917C30.1963 31.1891 31.0917 30.2937 31.0917 29.1891V2C31.0917 0.895431 30.1963 0 29.0917 0H2ZM14.6022 23.7351C15.0558 23.956 15.4239 23.6812 15.4239 23.1185C15.4239 22.5557 15.0558 21.9204 14.6022 21.6995C14.1486 21.4775 13.7804 21.7545 13.7804 22.3161C13.7804 22.8777 14.1486 23.513 14.6022 23.7351Z" fill="url(#paint0_linear_1473_97)"/>
<defs>
<linearGradient id="paint0_linear_1473_97" x1="31" y1="-2" x2="0.975591" y2="14.2625" gradientUnits="userSpaceOnUse">
<stop stop-color="#2622FF"/>
<stop offset="1" stop-color="#A717FF"/>
</linearGradient>
</defs>
<svg width="24" height="15" viewBox="0 0 24 15" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M24 14.8323V14.8326H14.3246L9.16716 9.67507V14.8326H0V14.8314L9.16716 5.66422V0H9.16774L24 14.8323Z" fill="black"/>
</svg>

Before: 1.5 KiB → After: 228 B

View File

@ -0,0 +1,41 @@
model: Sao10K/L3-8B-Stheno-v3.2
label:
zh_Hans: L3 8B Stheno V3.2
en_US: L3 8B Stheno V3.2
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0005'
output: '0.0005'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
# DeepSeek Models
- deepseek/deepseek-r1
- deepseek/deepseek_v3
# LLaMA Models
- meta-llama/llama-3.3-70b-instruct
- meta-llama/llama-3.2-11b-vision-instruct
- meta-llama/llama-3.2-3b-instruct
- meta-llama/llama-3.2-1b-instruct
- meta-llama/llama-3.1-70b-instruct
- meta-llama/llama-3.1-8b-instruct
- meta-llama/llama-3.1-8b-instruct-max
- meta-llama/llama-3.1-8b-instruct-bf16
- meta-llama/llama-3-70b-instruct
- meta-llama/llama-3-8b-instruct
# Mistral Models
- mistralai/mistral-nemo
- mistralai/mistral-7b-instruct
# Qwen Models
- qwen/qwen-2.5-72b-instruct
- qwen/qwen-2-72b-instruct
- qwen/qwen-2-vl-72b-instruct
- qwen/qwen-2-7b-instruct
# Other Models
- sao10k/L3-8B-Stheno-v3.2
- sao10k/l3-70b-euryale-v2.1
- sao10k/l31-70b-euryale-v2.2
- sao10k/l3-8b-lunaris
- jondurbin/airoboros-l2-70b
- cognitivecomputations/dolphin-mixtral-8x22b
- google/gemma-2-9b-it
- nousresearch/hermes-2-pro-llama-3-8b
- sophosympatheia/midnight-rose-70b
- gryphe/mythomax-l2-13b
- nousresearch/nous-hermes-llama2-13b
- openchat/openchat-7b
- teknium/openhermes-2.5-mistral-7b
- microsoft/wizardlm-2-8x22b

View File

@ -1,7 +1,7 @@
model: jondurbin/airoboros-l2-70b
label:
zh_Hans: jondurbin/airoboros-l2-70b
en_US: jondurbin/airoboros-l2-70b
zh_Hans: Airoboros L2 70B
en_US: Airoboros L2 70B
model_type: llm
features:
- agent-thought

View File

@ -0,0 +1,41 @@
model: deepseek/deepseek-r1
label:
zh_Hans: DeepSeek R1
en_US: DeepSeek R1
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 64000
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.04'
output: '0.04'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: deepseek/deepseek_v3
label:
zh_Hans: DeepSeek V3
en_US: DeepSeek V3
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 64000
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0089'
output: '0.0089'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: cognitivecomputations/dolphin-mixtral-8x22b
label:
zh_Hans: cognitivecomputations/dolphin-mixtral-8x22b
en_US: cognitivecomputations/dolphin-mixtral-8x22b
zh_Hans: Dolphin Mixtral 8x22B
en_US: Dolphin Mixtral 8x22B
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: google/gemma-2-9b-it
label:
zh_Hans: google/gemma-2-9b-it
en_US: google/gemma-2-9b-it
zh_Hans: Gemma 2 9B
en_US: Gemma 2 9B
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: nousresearch/hermes-2-pro-llama-3-8b
label:
zh_Hans: nousresearch/hermes-2-pro-llama-3-8b
en_US: nousresearch/hermes-2-pro-llama-3-8b
zh_Hans: Hermes 2 Pro Llama 3 8B
en_US: Hermes 2 Pro Llama 3 8B
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: sao10k/l3-70b-euryale-v2.1
label:
zh_Hans: sao10k/l3-70b-euryale-v2.1
en_US: sao10k/l3-70b-euryale-v2.1
zh_Hans: "L3 70B Euryale V2.1\t"
en_US: "L3 70B Euryale V2.1\t"
model_type: llm
features:
- agent-thought

View File

@ -0,0 +1,41 @@
model: sao10k/l3-8b-lunaris
label:
zh_Hans: "Sao10k L3 8B Lunaris"
en_US: "Sao10k L3 8B Lunaris"
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0005'
output: '0.0005'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: sao10k/l31-70b-euryale-v2.2
label:
zh_Hans: L31 70B Euryale V2.2
en_US: L31 70B Euryale V2.2
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 16000
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0148'
output: '0.0148'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: meta-llama/llama-3-70b-instruct
label:
zh_Hans: meta-llama/llama-3-70b-instruct
en_US: meta-llama/llama-3-70b-instruct
zh_Hans: Llama 3 70B Instruct
en_US: Llama 3 70B Instruct
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: meta-llama/llama-3-8b-instruct
label:
zh_Hans: meta-llama/llama-3-8b-instruct
en_US: meta-llama/llama-3-8b-instruct
zh_Hans: Llama 3 8B Instruct
en_US: Llama 3 8B Instruct
model_type: llm
features:
- agent-thought
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.00063'
output: '0.00063'
input: '0.0004'
output: '0.0004'
unit: '0.0001'
currency: USD

View File

@ -1,13 +1,13 @@
model: meta-llama/llama-3.1-70b-instruct
label:
zh_Hans: meta-llama/llama-3.1-70b-instruct
en_US: meta-llama/llama-3.1-70b-instruct
zh_Hans: Llama 3.1 70B Instruct
en_US: Llama 3.1 70B Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 8192
context_size: 32768
parameter_rules:
- name: temperature
use_template: temperature
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.0055'
output: '0.0076'
input: '0.0034'
output: '0.0039'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: meta-llama/llama-3.1-8b-instruct-bf16
label:
zh_Hans: Llama 3.1 8B Instruct BF16
en_US: Llama 3.1 8B Instruct BF16
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 8192
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0006'
output: '0.0006'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: meta-llama/llama-3.1-8b-instruct-max
label:
zh_Hans: "Llama3.1 8B Instruct Max\t"
en_US: "Llama3.1 8B Instruct Max\t"
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 16384
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0005'
output: '0.0005'
unit: '0.0001'
currency: USD

View File

@ -1,13 +1,13 @@
model: meta-llama/llama-3.1-8b-instruct
label:
zh_Hans: meta-llama/llama-3.1-8b-instruct
en_US: meta-llama/llama-3.1-8b-instruct
zh_Hans: Llama 3.1 8B Instruct
en_US: Llama 3.1 8B Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 8192
context_size: 16384
parameter_rules:
- name: temperature
use_template: temperature
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.001'
output: '0.001'
input: '0.0005'
output: '0.0005'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: meta-llama/llama-3.2-11b-vision-instruct
label:
zh_Hans: "Llama 3.2 11B Vision Instruct\t"
en_US: "Llama 3.2 11B Vision Instruct\t"
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0006'
output: '0.0006'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: meta-llama/llama-3.2-1b-instruct
label:
zh_Hans: "Llama 3.2 1B Instruct\t"
en_US: "Llama 3.2 1B Instruct\t"
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 131000
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0002'
output: '0.0002'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: Nous-Hermes-2-Mixtral-8x7B-DPO
model: meta-llama/llama-3.2-3b-instruct
label:
zh_Hans: Nous-Hermes-2-Mixtral-8x7B-DPO
en_US: Nous-Hermes-2-Mixtral-8x7B-DPO
zh_Hans: Llama 3.2 3B Instruct
en_US: Llama 3.2 3B Instruct
model_type: llm
features:
- agent-thought
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.0027'
output: '0.0027'
input: '0.0003'
output: '0.0005'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: meta-llama/llama-3.3-70b-instruct
label:
zh_Hans: Llama 3.3 70B Instruct
en_US: Llama 3.3 70B Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 131072
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0039'
output: '0.0039'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: sophosympatheia/midnight-rose-70b
label:
zh_Hans: sophosympatheia/midnight-rose-70b
en_US: sophosympatheia/midnight-rose-70b
zh_Hans: Midnight Rose 70B
en_US: Midnight Rose 70B
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: mistralai/mistral-7b-instruct
label:
zh_Hans: mistralai/mistral-7b-instruct
en_US: mistralai/mistral-7b-instruct
zh_Hans: Mistral 7B Instruct
en_US: Mistral 7B Instruct
model_type: llm
features:
- agent-thought

View File

@ -0,0 +1,41 @@
model: mistralai/mistral-nemo
label:
zh_Hans: Mistral Nemo
en_US: Mistral Nemo
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 131072
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0017'
output: '0.0017'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: gryphe/mythomax-l2-13b
label:
zh_Hans: gryphe/mythomax-l2-13b
en_US: gryphe/mythomax-l2-13b
zh_Hans: Mythomax L2 13B
en_US: Mythomax L2 13B
model_type: llm
features:
- agent-thought
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.00119'
output: '0.00119'
input: '0.0009'
output: '0.0009'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: nousresearch/nous-hermes-llama2-13b
label:
zh_Hans: nousresearch/nous-hermes-llama2-13b
en_US: nousresearch/nous-hermes-llama2-13b
zh_Hans: Nous Hermes Llama2 13B
en_US: Nous Hermes Llama2 13B
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: lzlv_70b
model: openchat/openchat-7b
label:
zh_Hans: lzlv_70b
en_US: lzlv_70b
zh_Hans: OpenChat 7B
en_US: OpenChat 7B
model_type: llm
features:
- agent-thought
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.0058'
output: '0.0078'
input: '0.0006'
output: '0.0006'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: teknium/openhermes-2.5-mistral-7b
label:
zh_Hans: teknium/openhermes-2.5-mistral-7b
en_US: teknium/openhermes-2.5-mistral-7b
zh_Hans: Openhermes2.5 Mistral 7B
en_US: Openhermes2.5 Mistral 7B
model_type: llm
features:
- agent-thought

View File

@ -1,7 +1,7 @@
model: meta-llama/llama-3.1-405b-instruct
model: qwen/qwen-2-72b-instruct
label:
zh_Hans: meta-llama/llama-3.1-405b-instruct
en_US: meta-llama/llama-3.1-405b-instruct
zh_Hans: Qwen2 72B Instruct
en_US: Qwen2 72B Instruct
model_type: llm
features:
- agent-thought
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.03'
output: '0.05'
input: '0.0034'
output: '0.0039'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: qwen/qwen-2-7b-instruct
label:
zh_Hans: Qwen 2 7B Instruct
en_US: Qwen 2 7B Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.00054'
output: '0.00054'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: qwen/qwen-2-vl-72b-instruct
label:
zh_Hans: Qwen 2 VL 72B Instruct
en_US: Qwen 2 VL 72B Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0045'
output: '0.0045'
unit: '0.0001'
currency: USD

View File

@ -0,0 +1,41 @@
model: qwen/qwen-2.5-72b-instruct
label:
zh_Hans: Qwen 2.5 72B Instruct
en_US: Qwen 2.5 72B Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32000
parameter_rules:
- name: temperature
use_template: temperature
min: 0
max: 2
default: 1
- name: top_p
use_template: top_p
min: 0
max: 1
default: 1
- name: max_tokens
use_template: max_tokens
min: 1
max: 2048
default: 512
- name: frequency_penalty
use_template: frequency_penalty
min: -2
max: 2
default: 0
- name: presence_penalty
use_template: presence_penalty
min: -2
max: 2
default: 0
pricing:
input: '0.0038'
output: '0.004'
unit: '0.0001'
currency: USD

View File

@ -1,7 +1,7 @@
model: microsoft/wizardlm-2-8x22b
label:
zh_Hans: microsoft/wizardlm-2-8x22b
en_US: microsoft/wizardlm-2-8x22b
zh_Hans: Wizardlm 2 8x22B
en_US: Wizardlm 2 8x22B
model_type: llm
features:
- agent-thought
@ -35,7 +35,7 @@ parameter_rules:
max: 2
default: 0
pricing:
input: '0.0064'
output: '0.0064'
input: '0.0062'
output: '0.0062'
unit: '0.0001'
currency: USD

View File

@ -1,6 +1,6 @@
provider: novita
label:
en_US: novita.ai
en_US: Novita AI
description:
en_US: An LLM API that matches various application scenarios with high cost-effectiveness.
zh_Hans: 适配多种海外应用场景的高性价比 LLM API
@ -8,13 +8,13 @@ icon_small:
en_US: icon_s_en.svg
icon_large:
en_US: icon_l_en.svg
background: "#eadeff"
background: "#c7fce2"
help:
title:
en_US: Get your API key from novita.ai
zh_Hans: novita.ai 获取 API Key
en_US: Get your API key from Novita AI
zh_Hans: Novita AI 获取 API Key
url:
en_US: https://novita.ai/settings#key-management?utm_source=dify&utm_medium=ch&utm_campaign=api
en_US: https://novita.ai/settings/key-management?utm_source=dify&utm_medium=ch&utm_campaign=api
supported_model_types:
- llm
configurate_methods:

View File

@ -1,5 +1,6 @@
import json
import logging
import re
from collections.abc import Generator
from typing import Any, Optional, Union, cast
@ -340,9 +341,6 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
:param credentials: provider credentials
:return:
"""
# get predefined models
predefined_models = self.predefined_models()
predefined_models_map = {model.model: model for model in predefined_models}
# transform credentials to kwargs for model instance
credentials_kwargs = self._to_credential_kwargs(credentials)
@ -358,9 +356,10 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
base_model = model.id.split(":")[1]
base_model_schema = None
for predefined_model_name, predefined_model in predefined_models_map.items():
if predefined_model_name in base_model:
for predefined_model in self.predefined_models():
if predefined_model.model in base_model:
base_model_schema = predefined_model
break
if not base_model_schema:
continue
@ -621,11 +620,19 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
prompt_messages = self._clear_illegal_prompt_messages(model, prompt_messages)
# o1 compatibility
block_as_stream = False
if model.startswith("o1"):
if "max_tokens" in model_parameters:
model_parameters["max_completion_tokens"] = model_parameters["max_tokens"]
del model_parameters["max_tokens"]
if re.match(r"^o1(-\d{4}-\d{2}-\d{2})?$", model):
if stream:
block_as_stream = True
stream = False
if "stream_options" in extra_model_kwargs:
del extra_model_kwargs["stream_options"]
if "stop" in extra_model_kwargs:
del extra_model_kwargs["stop"]
@ -642,7 +649,45 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
if stream:
return self._handle_chat_generate_stream_response(model, credentials, response, prompt_messages, tools)
return self._handle_chat_generate_response(model, credentials, response, prompt_messages, tools)
block_result = self._handle_chat_generate_response(model, credentials, response, prompt_messages, tools)
if block_as_stream:
return self._handle_chat_block_as_stream_response(block_result, prompt_messages, stop)
return block_result
def _handle_chat_block_as_stream_response(
self,
block_result: LLMResult,
prompt_messages: list[PromptMessage],
stop: Optional[list[str]] = None,
) -> Generator[LLMResultChunk, None, None]:
"""
Handle llm chat response
:param model: model name
:param credentials: credentials
:param response: response
:param prompt_messages: prompt messages
:param tools: tools for tool calling
:return: llm response chunk generator
"""
text = block_result.message.content
text = cast(str, text)
if stop:
text = self.enforce_stop_tokens(text, stop)
yield LLMResultChunk(
model=block_result.model,
prompt_messages=prompt_messages,
system_fingerprint=block_result.system_fingerprint,
delta=LLMResultChunkDelta(
index=0,
message=block_result.message,
finish_reason="stop",
usage=block_result.usage,
),
)
def _handle_chat_generate_response(
self,
@ -1139,12 +1184,14 @@ class OpenAILargeLanguageModel(_CommonOpenAI, LargeLanguageModel):
base_model = model.split(":")[1]
# get model schema
models = self.predefined_models()
model_map = {model.model: model for model in models}
if base_model not in model_map:
raise ValueError(f"Base model {base_model} not found")
base_model_schema = None
for predefined_model in self.predefined_models():
if base_model == predefined_model.model:
base_model_schema = predefined_model
break
base_model_schema = model_map[base_model]
if not base_model_schema:
raise ValueError(f"Base model {base_model} not found")
base_model_schema_features = base_model_schema.features or []
base_model_schema_model_properties = base_model_schema.model_properties
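
Two notes on the o1 hunk in this file: the max_tokens to max_completion_tokens rename applies to every model whose name starts with "o1", while the stricter regex limits the block-as-stream fallback to bare "o1" or dated snapshots, so o1-mini and o1-preview keep real streaming. A quick check of both behaviors:

# Sketch of the o1 compatibility logic above.
import re

def needs_block_as_stream(model: str) -> bool:
    return re.match(r"^o1(-\d{4}-\d{2}-\d{2})?$", model) is not None

assert needs_block_as_stream("o1")
assert needs_block_as_stream("o1-2024-12-17")
assert not needs_block_as_stream("o1-mini")
assert not needs_block_as_stream("o1-preview")

# o1-family models take max_completion_tokens instead of max_tokens:
params = {"max_tokens": 1024}
params["max_completion_tokens"] = params.pop("max_tokens")
print(params)  # {'max_completion_tokens': 1024}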

View File

@ -7,6 +7,7 @@ features:
- vision
- tool-call
- stream-tool-call
- document
model_properties:
mode: chat
context_size: 200000

View File

@ -1,29 +1,13 @@
import json
import time
from decimal import Decimal
from typing import Optional
from urllib.parse import urljoin
import numpy as np
import requests
from core.entities.embedding_type import EmbeddingInputType
from core.model_runtime.entities.common_entities import I18nObject
from core.model_runtime.entities.model_entities import (
AIModelEntity,
FetchFrom,
ModelPropertyKey,
ModelType,
PriceConfig,
PriceType,
from core.model_runtime.entities.text_embedding_entities import TextEmbeddingResult
from core.model_runtime.model_providers.openai_api_compatible.text_embedding.text_embedding import (
OAICompatEmbeddingModel,
)
from core.model_runtime.entities.text_embedding_entities import EmbeddingUsage, TextEmbeddingResult
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.text_embedding_model import TextEmbeddingModel
from core.model_runtime.model_providers.openai_api_compatible._common import _CommonOaiApiCompat
class OAICompatEmbeddingModel(_CommonOaiApiCompat, TextEmbeddingModel):
class PerfXCloudEmbeddingModel(OAICompatEmbeddingModel):
"""
Model class for an OpenAI API-compatible text embedding model.
"""
@ -47,86 +31,10 @@ class OAICompatEmbeddingModel(_CommonOaiApiCompat, TextEmbeddingModel):
:return: embeddings result
"""
# Prepare headers and payload for the request
headers = {"Content-Type": "application/json"}
api_key = credentials.get("api_key")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
endpoint_url: Optional[str]
if "endpoint_url" not in credentials or credentials["endpoint_url"] == "":
endpoint_url = "https://cloud.perfxlab.cn/v1/"
else:
endpoint_url = credentials.get("endpoint_url")
assert endpoint_url is not None, "endpoint_url is required in credentials"
if not endpoint_url.endswith("/"):
endpoint_url += "/"
credentials["endpoint_url"] = "https://cloud.perfxlab.cn/v1/"
assert isinstance(endpoint_url, str)
endpoint_url = urljoin(endpoint_url, "embeddings")
extra_model_kwargs = {}
if user:
extra_model_kwargs["user"] = user
extra_model_kwargs["encoding_format"] = "float"
# get model properties
context_size = self._get_context_size(model, credentials)
max_chunks = self._get_max_chunks(model, credentials)
inputs = []
indices = []
used_tokens = 0
for i, text in enumerate(texts):
# Here token count is only an approximation based on the GPT2 tokenizer
# TODO: Optimize for better token estimation and chunking
num_tokens = self._get_num_tokens_by_gpt2(text)
if num_tokens >= context_size:
cutoff = int(np.floor(len(text) * (context_size / num_tokens)))
# if num tokens is larger than context length, only use the start
inputs.append(text[0:cutoff])
else:
inputs.append(text)
indices += [i]
batched_embeddings = []
_iter = range(0, len(inputs), max_chunks)
for i in _iter:
# Prepare the payload for the request
payload = {"input": inputs[i : i + max_chunks], "model": model, **extra_model_kwargs}
# Make the request to the OpenAI API
response = requests.post(endpoint_url, headers=headers, data=json.dumps(payload), timeout=(10, 300))
response.raise_for_status() # Raise an exception for HTTP errors
response_data = response.json()
# Extract embeddings and used tokens from the response
embeddings_batch = [data["embedding"] for data in response_data["data"]]
embedding_used_tokens = response_data["usage"]["total_tokens"]
used_tokens += embedding_used_tokens
batched_embeddings += embeddings_batch
# calc usage
usage = self._calc_response_usage(model=model, credentials=credentials, tokens=used_tokens)
return TextEmbeddingResult(embeddings=batched_embeddings, usage=usage, model=model)
def get_num_tokens(self, model: str, credentials: dict, texts: list[str]) -> int:
"""
Approximate number of tokens for given messages using GPT2 tokenizer
:param model: model name
:param credentials: model credentials
:param texts: texts to embed
:return:
"""
return sum(self._get_num_tokens_by_gpt2(text) for text in texts)
return OAICompatEmbeddingModel._invoke(self, model, credentials, texts, user, input_type)
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
@ -136,93 +44,7 @@ class OAICompatEmbeddingModel(_CommonOaiApiCompat, TextEmbeddingModel):
:param credentials: model credentials
:return:
"""
try:
headers = {"Content-Type": "application/json"}
if "endpoint_url" not in credentials or credentials["endpoint_url"] == "":
credentials["endpoint_url"] = "https://cloud.perfxlab.cn/v1/"
api_key = credentials.get("api_key")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
endpoint_url: Optional[str]
if "endpoint_url" not in credentials or credentials["endpoint_url"] == "":
endpoint_url = "https://cloud.perfxlab.cn/v1/"
else:
endpoint_url = credentials.get("endpoint_url")
assert endpoint_url is not None, "endpoint_url is required in credentials"
if not endpoint_url.endswith("/"):
endpoint_url += "/"
assert isinstance(endpoint_url, str)
endpoint_url = urljoin(endpoint_url, "embeddings")
payload = {"input": "ping", "model": model}
response = requests.post(url=endpoint_url, headers=headers, data=json.dumps(payload), timeout=(10, 300))
if response.status_code != 200:
raise CredentialsValidateFailedError(
f"Credentials validation failed with status code {response.status_code}"
)
try:
json_result = response.json()
except json.JSONDecodeError as e:
raise CredentialsValidateFailedError("Credentials validation failed: JSON decode error")
if "model" not in json_result:
raise CredentialsValidateFailedError("Credentials validation failed: invalid response")
except CredentialsValidateFailedError:
raise
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
def get_customizable_model_schema(self, model: str, credentials: dict) -> AIModelEntity:
"""
generate custom model entities from credentials
"""
entity = AIModelEntity(
model=model,
label=I18nObject(en_US=model),
model_type=ModelType.TEXT_EMBEDDING,
fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
model_properties={
ModelPropertyKey.CONTEXT_SIZE: int(credentials.get("context_size", 512)),
ModelPropertyKey.MAX_CHUNKS: 1,
},
parameter_rules=[],
pricing=PriceConfig(
input=Decimal(credentials.get("input_price", 0)),
unit=Decimal(credentials.get("unit", 0)),
currency=credentials.get("currency", "USD"),
),
)
return entity
def _calc_response_usage(self, model: str, credentials: dict, tokens: int) -> EmbeddingUsage:
"""
Calculate response usage
:param model: model name
:param credentials: model credentials
:param tokens: input tokens
:return: usage
"""
# get input price info
input_price_info = self.get_price(
model=model, credentials=credentials, price_type=PriceType.INPUT, tokens=tokens
)
# transform usage
usage = EmbeddingUsage(
tokens=tokens,
total_tokens=tokens,
unit_price=input_price_info.unit_price,
price_unit=input_price_info.unit,
total_price=input_price_info.total_amount,
currency=input_price_info.currency,
latency=time.perf_counter() - self.started_at,
)
return usage
OAICompatEmbeddingModel.validate_credentials(self, model, credentials)
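
The rewrite above replaces a full copy of the OpenAI-compatible embedding client with a subclass that pins the PerfX Cloud default endpoint and delegates everything else. The diff calls the parent explicitly (OAICompatEmbeddingModel._invoke(self, ...)), which works, though super() is the more idiomatic spelling; a minimal sketch of the shape, with a stand-in base class and an arbitrary model name:

# Stand-in sketch of the inheritance refactor; the base class here is a
# placeholder, not the real OAICompatEmbeddingModel.
class OAICompatEmbeddingModel:
    def _invoke(self, model, credentials, texts, user=None, input_type=None):
        return f"embedding {len(texts)} text(s) via {credentials['endpoint_url']}"

class PerfXCloudEmbeddingModel(OAICompatEmbeddingModel):
    def _invoke(self, model, credentials, texts, user=None, input_type=None):
        if not credentials.get("endpoint_url"):
            credentials["endpoint_url"] = "https://cloud.perfxlab.cn/v1/"
        return super()._invoke(model, credentials, texts, user, input_type)

print(PerfXCloudEmbeddingModel()._invoke("bge-m3", {}, ["ping"]))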

View File

@ -7,6 +7,8 @@
- Qwen/Qwen2.5-Coder-7B-Instruct
- Qwen/Qwen2-VL-72B-Instruct
- Qwen/Qwen2-1.5B-Instruct
- Qwen/Qwen2.5-72B-Instruct-128K
- Vendor-A/Qwen/Qwen2.5-72B-Instruct
- Pro/Qwen/Qwen2-VL-7B-Instruct
- OpenGVLab/InternVL2-26B
- Pro/OpenGVLab/InternVL2-8B

View File

@ -1,9 +1,16 @@
import json
from collections.abc import Generator
from typing import Optional, Union
import requests
from core.model_runtime.entities.common_entities import I18nObject
from core.model_runtime.entities.llm_entities import LLMMode, LLMResult
from core.model_runtime.entities.message_entities import PromptMessage, PromptMessageTool
from core.model_runtime.entities.llm_entities import LLMMode, LLMResult, LLMResultChunk, LLMResultChunkDelta
from core.model_runtime.entities.message_entities import (
AssistantPromptMessage,
PromptMessage,
PromptMessageTool,
)
from core.model_runtime.entities.model_entities import (
AIModelEntity,
FetchFrom,
@ -29,9 +36,6 @@ class SiliconflowLargeLanguageModel(OAIAPICompatLargeLanguageModel):
user: Optional[str] = None,
) -> Union[LLMResult, Generator]:
self._add_custom_parameters(credentials)
# {"response_format": "json_object"} need convert to {"response_format": {"type": "json_object"}}
if "response_format" in model_parameters:
model_parameters["response_format"] = {"type": model_parameters.get("response_format")}
return super()._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream)
def validate_credentials(self, model: str, credentials: dict) -> None:
@ -92,3 +96,208 @@ class SiliconflowLargeLanguageModel(OAIAPICompatLargeLanguageModel):
),
],
)
def _handle_generate_stream_response(
self, model: str, credentials: dict, response: requests.Response, prompt_messages: list[PromptMessage]
) -> Generator:
"""
Handle llm stream response
:param model: model name
:param credentials: model credentials
:param response: streamed response
:param prompt_messages: prompt messages
:return: llm response chunk generator
"""
full_assistant_content = ""
chunk_index = 0
is_reasoning_started = False # Add flag to track reasoning state
def create_final_llm_result_chunk(
id: Optional[str], index: int, message: AssistantPromptMessage, finish_reason: str, usage: dict
) -> LLMResultChunk:
# calculate num tokens
prompt_tokens = usage and usage.get("prompt_tokens")
if prompt_tokens is None:
prompt_tokens = self._num_tokens_from_string(model, prompt_messages[0].content)
completion_tokens = usage and usage.get("completion_tokens")
if completion_tokens is None:
completion_tokens = self._num_tokens_from_string(model, full_assistant_content)
# transform usage
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)
return LLMResultChunk(
id=id,
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(index=index, message=message, finish_reason=finish_reason, usage=usage),
)
# delimiter for stream response; needs unicode_escape decoding
import codecs
delimiter = credentials.get("stream_mode_delimiter", "\n\n")
delimiter = codecs.decode(delimiter, "unicode_escape")
tools_calls: list[AssistantPromptMessage.ToolCall] = []
def increase_tool_call(new_tool_calls: list[AssistantPromptMessage.ToolCall]):
def get_tool_call(tool_call_id: str):
if not tool_call_id:
return tools_calls[-1]
tool_call = next((tool_call for tool_call in tools_calls if tool_call.id == tool_call_id), None)
if tool_call is None:
tool_call = AssistantPromptMessage.ToolCall(
id=tool_call_id,
type="function",
function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="", arguments=""),
)
tools_calls.append(tool_call)
return tool_call
for new_tool_call in new_tool_calls:
# get tool call
tool_call = get_tool_call(new_tool_call.function.name)
# update tool call
if new_tool_call.id:
tool_call.id = new_tool_call.id
if new_tool_call.type:
tool_call.type = new_tool_call.type
if new_tool_call.function.name:
tool_call.function.name = new_tool_call.function.name
if new_tool_call.function.arguments:
tool_call.function.arguments += new_tool_call.function.arguments
finish_reason = None # The default value of finish_reason is None
message_id, usage = None, None
for chunk in response.iter_lines(decode_unicode=True, delimiter=delimiter):
chunk = chunk.strip()
if chunk:
# ignore sse comments
if chunk.startswith(":"):
continue
decoded_chunk = chunk.strip().removeprefix("data:").lstrip()
if decoded_chunk == "[DONE]": # Some provider returns "data: [DONE]"
continue
try:
chunk_json: dict = json.loads(decoded_chunk)
# stream ended
except json.JSONDecodeError as e:
yield create_final_llm_result_chunk(
id=message_id,
index=chunk_index + 1,
message=AssistantPromptMessage(content=""),
finish_reason="Non-JSON encountered.",
usage=usage,
)
break
# handle the error here (see issue #11629)
if chunk_json.get("error") and chunk_json.get("choices") is None:
raise ValueError(chunk_json.get("error"))
if chunk_json:
if u := chunk_json.get("usage"):
usage = u
if not chunk_json or len(chunk_json["choices"]) == 0:
continue
choice = chunk_json["choices"][0]
finish_reason = chunk_json["choices"][0].get("finish_reason")
message_id = chunk_json.get("id")
chunk_index += 1
if "delta" in choice:
delta = choice["delta"]
delta_content = delta.get("content")
assistant_message_tool_calls = None
if "tool_calls" in delta and credentials.get("function_calling_type", "no_call") == "tool_call":
assistant_message_tool_calls = delta.get("tool_calls", None)
elif (
"function_call" in delta
and credentials.get("function_calling_type", "no_call") == "function_call"
):
assistant_message_tool_calls = [
{"id": "tool_call_id", "type": "function", "function": delta.get("function_call", {})}
]
# assistant_message_function_call = delta.delta.function_call
# extract tool calls from response
if assistant_message_tool_calls:
tool_calls = self._extract_response_tool_calls(assistant_message_tool_calls)
increase_tool_call(tool_calls)
if delta_content is None or delta_content == "":
continue
# Check for think tags
if "<think>" in delta_content:
is_reasoning_started = True
# Remove <think> tag and add markdown quote
delta_content = "> 💭 " + delta_content.replace("<think>", "")
elif "</think>" in delta_content:
# Remove </think> tag and add newlines to end quote block
delta_content = delta_content.replace("</think>", "") + "\n\n"
is_reasoning_started = False
elif is_reasoning_started:
# Add quote markers for content within thinking block
if "\n\n" in delta_content:
delta_content = delta_content.replace("\n\n", "\n> ")
elif "\n" in delta_content:
delta_content = delta_content.replace("\n", "\n> ")
# transform assistant message to prompt message
assistant_prompt_message = AssistantPromptMessage(
content=delta_content,
)
# reset tool calls
tool_calls = []
full_assistant_content += delta_content
elif "text" in choice:
choice_text = choice.get("text", "")
if choice_text == "":
continue
# transform assistant message to prompt message
assistant_prompt_message = AssistantPromptMessage(content=choice_text)
full_assistant_content += choice_text
else:
continue
yield LLMResultChunk(
id=message_id,
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=chunk_index,
message=assistant_prompt_message,
),
)
chunk_index += 1
if tools_calls:
yield LLMResultChunk(
id=message_id,
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=chunk_index,
message=AssistantPromptMessage(tool_calls=tools_calls, content=""),
),
)
yield create_final_llm_result_chunk(
id=message_id,
index=chunk_index,
message=AssistantPromptMessage(content=""),
finish_reason=finish_reason,
usage=usage,
)
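
Unlike the DeepSeek handler earlier in this diff, which keys off a separate reasoning_content field, this handler detects inline <think>/</think> tags in the content stream and rewrites the enclosed text into a 💭-prefixed blockquote. A self-contained sketch of that rewrite:

# Minimal sketch of the <think>-tag rewrite above, outside the SSE loop.
def render(chunks):
    out, in_think = "", False
    for text in chunks:
        if "<think>" in text:
            in_think = True
            text = "> 💭 " + text.replace("<think>", "")
        elif "</think>" in text:
            text = text.replace("</think>", "") + "\n\n"
            in_think = False
        elif in_think:
            if "\n\n" in text:
                text = text.replace("\n\n", "\n> ")
            elif "\n" in text:
                text = text.replace("\n", "\n> ")
        out += text
    return out

print(render(["<think>step", " one", "</think>", "answer"]))
# "> 💭 step one\n\nanswer"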

View File

@ -0,0 +1,51 @@
model: Qwen/Qwen2.5-72B-Instruct-128K
label:
en_US: Qwen/Qwen2.5-72B-Instruct-128K
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 131072
parameter_rules:
- name: temperature
use_template: temperature
- name: max_tokens
use_template: max_tokens
type: int
default: 512
min: 1
max: 4096
help:
zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。
en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter.
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: frequency_penalty
use_template: frequency_penalty
- name: response_format
label:
zh_Hans: 回复格式
en_US: Response Format
type: string
help:
zh_Hans: 指定模型必须输出的格式
en_US: Specifies the format that the model must output.
required: false
options:
- text
- json_object
pricing:
input: '4.13'
output: '4.13'
unit: '0.000001'
currency: RMB

View File

@ -0,0 +1,51 @@
model: Vendor-A/Qwen/Qwen2.5-72B-Instruct
label:
en_US: Vendor-A/Qwen/Qwen2.5-72B-Instruct
model_type: llm
features:
- agent-thought
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: temperature
use_template: temperature
- name: max_tokens
use_template: max_tokens
type: int
default: 512
min: 1
max: 4096
help:
zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。
en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter.
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: frequency_penalty
use_template: frequency_penalty
- name: response_format
label:
zh_Hans: 回复格式
en_US: Response Format
type: string
help:
zh_Hans: 指定模型必须输出的格式
en_US: Specifies the format that the model must output.
required: false
options:
- text
- json_object
pricing:
input: '1.00'
output: '1.00'
unit: '0.000001'
currency: RMB

View File

@ -15,7 +15,7 @@ parameter_rules:
type: int
default: 512
min: 1
max: 8192
max: 4096
help:
zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。
en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter.

Some files were not shown because too many files have changed in this diff.