Compare commits

..

52 Commits

Author SHA1 Message Date
3cc697832a feat: bump version to 0.3.32 (#1620) 2023-11-25 16:43:31 +08:00
bb98f5756a feat: add xinference rerank model (#1619) 2023-11-25 16:23:24 +08:00
e1d2203371 fix: provider chatglm tests error (#1618) 2023-11-25 16:04:36 +08:00
93467cb363 fix: dataset tool missing in n-to-1 retrieve mode (#1617) 2023-11-25 16:04:22 +08:00
ea526d0822 feat: chatglm3 support (#1616) 2023-11-25 15:37:07 +08:00
0e627c920f feat: xinference rerank model support (#1615) 2023-11-25 03:56:00 +08:00
ea35f1dce1 feat: bump version to 0.3.31-fix3 (#1606) 2023-11-22 19:40:52 +08:00
a5b80c9d1f Fix/multi thread parameter (#1604) 2023-11-22 18:31:29 +08:00
f704094a5f fix hybrid search when document is none (#1603)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-22 17:53:42 +08:00
1f58f15bff feat: optimize db connections in thread (#1601) 2023-11-22 16:55:59 +08:00
b930716745 fix weaviate hybrid search issue (#1600)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-22 16:41:20 +08:00
9587479b76 fix: chat token spent info style (#1597) 2023-11-22 15:22:50 +08:00
3c0fbf3a6a fix sql transaction error in statistic API (#1586) 2023-11-22 14:28:21 +08:00
caa330c91f feat: bump version to 0.3.31-fix2 (#1592) 2023-11-22 01:53:40 +08:00
4a55d5729d feat: add anthropic claude-2.1 support (#1591) 2023-11-22 01:46:19 +08:00
d6a6697891 fix: safari can not in (#1590) 2023-11-21 20:25:23 +08:00
778cfb37a2 feat: bump version to 0.3.31-fix1 (#1589) 2023-11-21 17:34:41 +08:00
ce85ee3aa6 Update docker-compose.yaml (#1587) 2023-11-21 17:33:35 +08:00
b23de4affc fix: chat on start bug (#1588) 2023-11-21 17:26:49 +08:00
d8a7e894aa fix: retrieval test page hide rerank model also hide retrieval config (#1585) 2023-11-21 16:07:47 +08:00
d5acfaa14e feat: bump version to 0.3.31 (#1584) 2023-11-21 15:50:18 +08:00
cc35d0645a Compatible model saving error (#1582)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-21 15:38:27 +08:00
c9368925a3 feat: add supported_model_types field and filter in provider list (#1581) 2023-11-21 15:06:47 +08:00
0d9ce1bab0 fix multi retrieval with resource score issue (#1578)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-21 13:47:55 +08:00
519fb90d5a fix: some text (#1579) 2023-11-21 13:46:51 +08:00
6768fd4d87 fix: some RAG retrieval bugs (#1577)
Co-authored-by: Joel <iamjoel007@gmail.com>
2023-11-21 13:46:07 +08:00
d0456d0f42 feat: configurable invite expiry time (#1573) 2023-11-21 11:50:06 +08:00
7cda3fe85b fix(api): patch Windows timezone set (#1575) 2023-11-21 11:49:07 +08:00
5b7071e4b0 Feat/sdk vision support (#1531)
Co-authored-by: Joel <iamjoel007@gmail.com>
2023-11-20 17:54:01 +08:00
ac3496e681 fix(web): Sidebar create new chat context (#1569) 2023-11-20 15:57:31 +08:00
657334a5fd feat: fetch stream compatibility enhance (#1551) 2023-11-20 15:30:32 +08:00
31195975f5 chore: retrieval docs links and enchance help doc translation (#1570) 2023-11-20 11:07:45 +08:00
6717bb2b72 fix the error message (#1564)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-19 15:12:55 +08:00
0e08526428 fix hybrid search reranking check (#1563)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-18 17:06:28 +08:00
888e8c6dac feat: add retriever rank fe (#1557)
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
2023-11-18 11:53:35 +08:00
e017eff5e4 Update README_CN.md 2023-11-18 00:43:01 +08:00
e4dd79bbb1 Feat/jp and es (#1562) 2023-11-18 00:25:26 +08:00
4588831bff Feat/add retriever rerank (#1560)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-17 22:13:37 +08:00
a4f37220a0 Add some interesting badges :) (#1558) 2023-11-17 16:31:54 +08:00
d654770732 feat: supports for new version of openllm (#1554) 2023-11-17 14:07:36 +08:00
19fc9e3466 fix: upload file not clickable in firefox (#1552) 2023-11-17 09:57:53 +08:00
d048557bfe update images (#1549) 2023-11-16 15:07:18 +08:00
5feea0382e update images (#1548) 2023-11-16 15:03:42 +08:00
18cf7f7ed0 feat: remove plugin page (#1544) 2023-11-16 11:56:25 +08:00
cfbfd59b8f fix: upload image (#1522) 2023-11-16 11:56:11 +08:00
d9336d9ae4 feat: add code of conduct (#1541) 2023-11-16 09:44:06 +08:00
3365c4da9e Doc/update readme patch 1 (#1538) 2023-11-15 20:52:53 +08:00
8f2bd7663d feat: optimize timezone of server (#1537) 2023-11-15 19:14:31 +08:00
8306b4373b doc: update readme (#1536) 2023-11-15 19:10:17 +08:00
149e959d09 new readme (#1528) 2023-11-15 17:26:04 +08:00
54a42d08d7 fix: conversation rename always auto generate (#1530) 2023-11-15 15:03:21 +08:00
481b083506 Update README.md (#1525) 2023-11-15 10:45:21 +08:00
217 changed files with 6411 additions and 1999 deletions

.github/CODE_OF_CONDUCT.md (new file)

@ -0,0 +1,43 @@
# Dify Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Language Policy
To facilitate clear and effective communication, all discussions, comments, documentation, and pull requests in this project should be conducted in English. This ensures that all contributors can participate and collaborate effectively.


@ -1,5 +1,5 @@
name: "🕷️ Bug report"
description: Report errors or unexpected behavior
description: Report errors or unexpected behavior [please use English :]
labels:
- bug
body:


@ -1,5 +1,5 @@
name: "📚 Documentation Issue"
description: Report issues in our documentation
description: Report issues in our documentation [please use English :]
labels:
- documentation
body:


@ -1,5 +1,5 @@
name: "⭐ Feature or enhancement request"
description: Propose something new.
description: Propose something new. [please use English :]
labels:
- enhancement
body:


@ -1,5 +1,5 @@
name: "🤝 Help Wanted"
description: "Request help from the community"
description: "Request help from the community" [please use English :]
labels:
- help-wanted
body:


@ -1,5 +1,5 @@
name: "🌐 Localization/Translation issue"
description: Report incorrect translations.
description: Report incorrect translations. [please use English :]
labels:
- translation
body:


@ -34,9 +34,7 @@ jobs:
            type=raw,value=latest,enable=${{ startsWith(github.ref, 'refs/tags/') }}
            type=ref,event=branch
            type=sha,enable=true,priority=100,prefix=,suffix=,format=long
            type=semver,pattern={{major}}.{{minor}}.{{patch}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}
            type=raw,value=${{ github.ref_name }},enable=${{ startsWith(github.ref, 'refs/tags/') }}
      - name: Build and push
        uses: docker/build-push-action@v4


@ -34,9 +34,7 @@ jobs:
            type=raw,value=latest,enable=${{ startsWith(github.ref, 'refs/tags/') }}
            type=ref,event=branch
            type=sha,enable=true,priority=100,prefix=,suffix=,format=long
            type=semver,pattern={{major}}.{{minor}}.{{patch}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}
            type=raw,value=${{ github.ref_name }},enable=${{ startsWith(github.ref, 'refs/tags/') }}
      - name: Build and push
        uses: docker/build-push-action@v4
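The effect of this tag-rule change (applied in both the api and web image workflows) is that tagged releases are published under the literal git tag name instead of expanded semver patterns. A rough illustration, assuming the `langgenius/dify-web` image and the `0.3.32` release from the commit list above:

```bash
# Previously, the semver patterns produced several tags per release (e.g. 0, 0.3, 0.3.32).
# After this change, a tagged release is published under the tag name itself:
docker pull langgenius/dify-web:0.3.32
```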


@ -20,11 +20,11 @@ jobs:
    steps:
      - uses: actions/stale@v5
        with:
          days-before-issue-stale: 30
          days-before-issue-stale: 15
          days-before-issue-close: 3
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          stale-issue-message: "Close due to it's no longer active, if you have any questions, you can reopen it."
          stale-pr-message: "Close due to it's no longer active, if you have any questions, you can reopen it."
          stale-issue-label: 'no-issue-activity'
          stale-pr-label: 'no-pr-activity'
          any-of-labels: 'duplicate,question,invalid,wontfix,no-issue-activity,no-pr-activity,enhancement'
          any-of-labels: 'duplicate,question,invalid,wontfix,no-issue-activity,no-pr-activity,enhancement,cant-reproduce,help-wanted'

README.md

@ -1,65 +1,67 @@
![](./images/describe-en.png)
[![](./images/describe.png)](https://dify.ai)
<p align="center">
<a href="./README.md">English</a> |
<a href="./README_CN.md">简体中文</a> |
<a href="./README_JA.md">日本語</a> |
<a href="./README_ES.md">Español</a>
<a href="./README_ES.md">Español</a> |
<a href="./README_KL.md">Klingon</a>
</p>
#### [Website](https://dify.ai) • [Docs](https://docs.dify.ai) • [Deployment Docs](https://docs.dify.ai/getting-started/install-self-hosted) • [FAQ](https://docs.dify.ai/getting-started/faq) • [Twitter](https://twitter.com/dify_ai) • [Discord](https://discord.gg/FngNHpbcY7)
<p align="center">
<a href="https://dify.ai" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/AI-Dify?logo=AI&logoColor=%20%23f5f5f5&label=Dify&labelColor=%20%23155EEF&color=%23EAECF0"></a>
<a href="https://discord.gg/FngNHpbcY7" target="_blank">
<img src="https://img.shields.io/discord/1082486657678311454?logo=discord"
alt="chat on Discord"></a>
<a href="https://twitter.com/intent/follow?screen_name=dify_ai" target="_blank">
<img src="https://img.shields.io/twitter/follow/dify_ai?style=social&logo=X"
alt="follow on Twitter"></a>
<a href="https://hub.docker.com/u/langgenius" target="_blank">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/langgenius/dify-web"></a>
</p>
**Dify** is an easy-to-use LLMOps platform designed to empower more people to create sustainable, AI-native applications. With visual orchestration for various application types, Dify offers out-of-the-box, ready-to-use applications that can also serve as Backend-as-a-Service APIs. Unify your development process with one API for plugins and datasets integration, and streamline your operations using a single interface for prompt engineering, visual analytics, and continuous improvement.
**Dify** is an LLM application development platform that has already seen over **100,000** applications built on Dify.AI. It integrates the concepts of Backend as a Service and LLMOps, covering the core tech stack required for building generative AI-native applications, including a built-in RAG engine. With Dify, **you can self-deploy capabilities similar to Assistants API and GPTs based on any LLMs.**
Applications created with Dify include:

- Out-of-the-box web sites supporting form mode and chat conversation mode
- A single API encompassing plugin capabilities, context enhancement, and more, saving you backend coding effort
- Visual data analysis, log review, and annotation for applications
https://github.com/langgenius/dify/assets/100913391/f6e658d5-31b3-4c16-a0af-9e191da4d0f6
## Highlighted Features
**1. LLMs support:** Choose capabilities based on different models when building your Dify AI apps. Dify is compatible with Langchain, meaning it will support various LLMs. Currently supported:
- [x] **OpenAI**: GPT4, GPT3.5-turbo, GPT3.5-turbo-16k, text-davinci-003
- [x] **Azure OpenAI Service**
- [x] **Anthropic**: Claude2, Claude-instant
- [x] **Replicate**
- [x] **Hugging Face Hub**
- [x] **ChatGLM**
- [x] **Llama2**
- [x] **MiniMax**
- [x] **Spark**
- [x] **Wenxin**
- [x] **Tongyi**
We provide the following free resources for registered Dify cloud users (sign up at [dify.ai](https://dify.ai)):
* 200 free OpenAI queries to build OpenAI-based apps
**2. Visual orchestration:** Build an AI app in minutes by writing and debugging prompts visually.
**3. Text embedding:** Fully automated text preprocessing embeds your data as context without complex concepts. Supports PDF, TXT, and syncing data from Notion, webpages, APIs.
**4. API-based:** Backend-as-a-service. Access web apps directly or integrate via APIs without complex backend setup.
**5. Plugins:** Dify "Smart Chat" now supports first-party plugins like web browsing, Google search, Wikipedia to enable online lookup, analyzing web content, and explaining the AI's reasoning process conversationally.
**6. Team workspaces:** Team members can join workspaces to collaboratively edit, manage, and use team AI apps.
**7. Data labeling and improvement:** Visually inspect AI app logs and improve data via labeling. Observe the AI's reasoning process to continuously enhance performance. (Coming soon)
## Use cases
* [Create an AI ChatBot with Business Data in Minutes.](https://docs.dify.ai/use-cases/create-an-ai-chatbot-with-business-data-in-minutes)
* [How to Build a Notion AI Assistant Based on Your Own Notes?](https://docs.dify.ai/use-cases/build-an-notion-ai-assistant)
* [Create a Midjourney Prompt Bot Without Code in Just a Few Minutes.](https://docs.dify.ai/use-cases/create-a-midjoureny-prompt-bot-with-dify)
![](./images/demo.png)
## Use Cloud Services
Visit [Dify.ai](https://dify.ai)
Using [Dify.AI Cloud](https://dify.ai) provides all the capabilities of the open-source version, and includes 200 complimentary GPT trial credits.
## Why Dify
Dify features model neutrality and is a complete, engineered tech stack compared to hardcoded development libraries like LangChain. Unlike OpenAI's Assistants API, Dify allows for full local deployment of services.
| Feature | Dify.AI | Assistants API | LangChain |
|---------|---------|----------------|-----------|
| **Programming Approach** | API-oriented | API-oriented | Python Code-oriented |
| **Ecosystem Strategy** | Open Source | Closed and Commercial | Open Source |
| **RAG Engine** | Supported | Supported | Not Supported |
| **Prompt IDE** | Included | Included | None |
| **Supported LLMs** | Rich Variety | Only GPT | Rich Variety |
| **Local Deployment** | Supported | Not Supported | Not Applicable |
## Features
![](./images/models.png)
**1. LLM Support**: Integration with OpenAI's GPT family of models, or the open-source Llama2 family models. In fact, Dify supports mainstream commercial models and open-source models (locally deployed or based on MaaS).
**2. Prompt IDE**: Visual orchestration of applications and services based on LLMs with your team.
**3. RAG Engine**: Includes various RAG capabilities based on full-text indexing or vector database embeddings, allowing direct upload of PDFs, TXTs, and other text formats.
**4. Agents**: A Function Calling based Agent framework that allows users to configure what they see is what they get. Dify includes basic plugin capabilities like Google Search.
**5. Continuous Operations**: Monitor and analyze application logs and performance, continuously improving Prompts, datasets, or models using production data.
## Before You Start
- [Website](https://dify.ai)
- [Docs](https://docs.dify.ai)
- [Deployment Docs](https://docs.dify.ai/getting-started/install-self-hosted)
- [FAQ](https://docs.dify.ai/getting-started/faq)
## Install the Community Edition
@ -88,88 +90,28 @@ You can go to https://github.com/BorisPolonsky/dify-helm for deployment informat
### Configuration
If you need to customize the configuration, please refer to the comments in our [docker-compose.yml](docker/docker-compose.yaml) file and manually set the environment configuration. After making the changes, please run 'docker-compose up -d' again.
If you need to customize the configuration, please refer to the comments in our [docker-compose.yml](docker/docker-compose.yaml) file and manually set the environment configuration. After making the changes, please run `docker-compose up -d` again. You can see the full list of environment variables in our [docs](https://docs.dify.ai/getting-started/install-self-hosted/environments).
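A minimal sketch of that workflow (the variable name here is illustrative; the linked docs list the real ones):

```bash
cd docker
# Edit docker-compose.yaml and adjust a value under a service's `environment:`
# block, e.g. SECRET_KEY (illustrative; see the environment-variable docs).
vim docker-compose.yaml
# Recreate the containers so the new configuration takes effect.
docker-compose up -d
```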
## Roadmap
Features under development:
- **Datasets**, supporting more datasets, e.g. syncing content from Notion or webpages
We will support more datasets, including text, webpages, and even Notion content. Users can build AI applications based on their own data sources.
- **Plugins**, introducing ChatGPT Plugin-standard plugins for applications, or using Dify-produced plugins
We will release plugins complying with ChatGPT standard, or Dify's own plugins to enable more capabilities in applications.
## Q&A
**Q: What can I do with Dify?**
A: Dify is a simple yet powerful LLM development and operations tool. You can use it to build commercial-grade applications and personal assistants. If you want to develop your own applications, Dify can save you backend work in integrating with OpenAI and offer visual operations capabilities, allowing you to continuously improve and train your GPT model.
**Q: How do I use Dify to "train" my own model?**
A: A valuable application consists of Prompt Engineering, context enhancement, and Fine-tuning. We've created a hybrid programming approach combining Prompts with programming languages (similar to a template engine), making it easy to accomplish long-text embedding or capturing subtitles from a user-input YouTube video - all of which will be submitted as context for LLMs to process. We place great emphasis on application operability, with data generated by users during App usage available for analysis, annotation, and continuous training. Without the right tools, these steps can be time-consuming.
**Q: What do I need to prepare if I want to create my own application?**
A: We assume you already have an OpenAI API Key; if not, please register for one. If you already have some content that can serve as training context, that's great!
**Q: What interface languages are available?**
A: English and Chinese are currently supported, and you can contribute language packs to us.
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## Contributing
## Community & Support
We welcome you to contribute to Dify to help make Dify better. We welcome contributions in various ways: submitting code, issues, new ideas, or sharing the interesting and useful AI applications you have created based on Dify. At the same time, we also welcome you to share Dify at different events, conferences, and social media.
We welcome you to contribute to Dify to help make Dify better in various ways: submitting code, issues, new ideas, or sharing the interesting and useful AI applications you have created based on Dify. At the same time, we also welcome you to share Dify at different events, conferences, and on social media.
### Submit a Pull Request
- [GitHub Issues](https://github.com/langgenius/dify/issues). Best for: bugs and errors you encounter using Dify.AI, see the [Contribution Guide](CONTRIBUTING.md).
- [Email Support](mailto:hello@dify.ai?subject=[GitHub]Questions%20About%20Dify). Best for: questions you have about using Dify.AI.
- [Discord](https://discord.gg/FngNHpbcY7). Best for: sharing your applications and hanging out with the community.
- [Twitter](https://twitter.com/dify_ai). Best for: sharing your applications and hanging out with the community.
- [Business License](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry). Best for: business inquiries of licensing Dify.AI for commercial use.
To ensure proper review, all code contributions, including from contributors with direct commit access, must be submitted as pull requests and approved by core developers before merging.
We welcome PRs from everyone! If you're willing to help out, you can learn more about how to contribute code to the project in the [Contribution Guide](CONTRIBUTING.md).
### Submit issues or ideas
You can submit your issues or ideas by adding issues to the Dify repository. If you encounter issues, please describe the steps you took to encounter the issue as much as possible so we can better discover it. If you have any new ideas for our product, we also welcome your feedback. Please share your insights as much as possible so we can get more feedback and further discussion in the community.
### Share your applications
We encourage all community members to share their AI applications built on Dify, which can be applied to different scenarios or different users. This will provide powerful inspiration for people who want to create AI capabilities! You can share your experience by [submitting an issue in the Dify-user-case repository](https://github.com/langgenius/dify-user-case/issues).
### Share Dify with others
We encourage community contributors to actively demonstrate different aspects of using Dify. You can talk or share any feature of using Dify at meetups and conferences, blogs or social media. We believe your unique sharing will be of great help to others! Mention @Dify.AI on Twitter and/or communicate on [Discord](https://discord.gg/FngNHpbcY7) so we can give pointers and tips and help you spread the word by promoting your content on the different Dify communication channels.
### Help others
You can also help people in need of help on Discord, GitHub issues or other social platforms, guide others to solve problems encountered during use and share usage experiences. This is also a great contribution! If you want to become a maintainer of the Dify community, please contact the official team via [Discord](https://discord.gg/FngNHpbcY7) or email us at support@dify.ai.
## Contact Us
If you have any questions, suggestions, or partnership inquiries, feel free to contact us through the following channels:
- Submit an Issue or PR on our GitHub Repo
- Join the discussion in our [Discord](https://discord.gg/FngNHpbcY7) Community
- Send an email to hello@dify.ai
We're eager to assist you and together create more fun and useful AI applications!
## Security
## Security Disclosure
To protect your privacy, please avoid posting security issues on GitHub. Instead, send your questions to security@dify.ai and we will provide you with a more detailed answer.
## Citation
This software uses the following open-source software:
- Chase, H. (2022). LangChain [Computer software]. https://github.com/hwchase17/langchain
For more information, please refer to the official website or license text of the respective software.
## License
This repository is available under the [Dify Open Source License](LICENSE).
This repository is available under the [Dify Open Source License](LICENSE), which is essentially Apache 2.0 with a few additional restrictions.


@ -1,60 +1,63 @@
![](./images/describe-cn.jpg)
[![](./images/describe.png)](https://dify.ai)
<p align="center">
<a href="./README.md">English</a> |
<a href="./README_CN.md">简体中文</a> |
<a href="./README_JA.md">日本語</a> |
<a href="./README_ES.md">Español</a>
<a href="./README_ES.md">Español</a> |
<a href="./README_KL.md">Klingon</a>
</p>
<p align="center">
<a href="https://dify.ai" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/AI-Dify?logo=AI&logoColor=%20%23f5f5f5&label=Dify&labelColor=%20%23155EEF&color=%23EAECF0"></a>
<a href="https://discord.gg/FngNHpbcY7" target="_blank">
<img src="https://img.shields.io/discord/1082486657678311454?logo=discord"
alt="chat on Discord"></a>
<a href="https://twitter.com/intent/follow?screen_name=dify_ai" target="_blank">
<img src="https://img.shields.io/twitter/follow/dify_ai?style=social&logo=X"
alt="follow on Twitter"></a>
<a href="https://hub.docker.com/u/langgenius" target="_blank">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/langgenius/dify-web"></a>
</p>
#### [官方网站](https://dify.ai) • [使用文档](https://docs.dify.ai/v/zh-hans) · [部署文档](https://docs.dify.ai/v/zh-hans/getting-started/install-self-hosted) · [FAQ](https://docs.dify.ai/v/zh-hans/getting-started/faq) • [Twitter](https://twitter.com/dify_ai) • [Discord](https://discord.gg/FngNHpbcY7)
Dify 是一个 LLM 应用开发平台,已经有超过 10 万个应用基于 Dify.AI 构建。它融合了 Backend as Service 和 LLMOps 的理念,涵盖了构建生成式 AI 原生应用所需的核心技术栈,包括一个内置 RAG 引擎。使用 Dify,你可以基于任何模型自部署类似 Assistants API 和 GPTs 的能力。
**Dify** 是一个易用的 LLMOps 平台,基于不同的大型语言模型能力,让更多人可以简易地创建可持续运营的原生 AI 应用。Dify 提供多种类型应用的可视化编排,应用可开箱即用,也能以“后端即服务”的 API 提供服务。
![](./images/demo.png)
通过 Dify 创建的应用包含了:
## 为什么选择 Dify
- 开箱即用的 Web 站点,支持表单模式和聊天对话模式
- 一套 API 即可包含插件、上下文增强等能力,替你省下了后端代码的编写工作
- 可视化的对应用进行数据分析,查阅日志或进行标注
Dify 具有模型中立性,相较 LangChain 等硬编码开发库,Dify 是一个完整的、工程化的技术栈;而相较于 OpenAI 的 Assistants API,你可以完全将服务部署在本地。
https://github.com/langgenius/dify/assets/100913391/f6e658d5-31b3-4c16-a0af-9e191da4d0f6
## 核心能力
1. **模型支持:** 你可以在 Dify 上选择基于不同模型的能力来开发你的 AI 应用。Dify 兼容 Langchain这意味着我们将逐步支持多种 LLMs ,目前支持的模型供应商:
- [x] **OpenAI**GPT4、GPT3.5-turbo、GPT3.5-turbo-16k、text-davinci-003
- [x] **Azure OpenAI Service**
- [x] **Anthropic**Claude2、Claude-instant
- [x] **Replicate**
- [x] **Hugging Face Hub**
- [x] **ChatGLM**
- [x] **Llama2**
- [x] **MiniMax**
- [x] **讯飞星火大模型**
- [x] **文心一言**
- [x] **通义千问**
| 功能 | Dify.AI | Assistants API | LangChain |
| --- | --- | --- | --- |
| 编程方式 | 面向 API | 面向 API | 面向 Python 代码 |
| 生态策略 | 开源 | 封闭且商用 | 开源 |
| RAG 引擎 | 支持 | 支持 | 不支持 |
| Prompt IDE | 包含 | 包含 | 没有 |
| 支持的 LLMs | 丰富 | 仅 GPT | 丰富 |
| 本地部署 | 支持 | 不支持 | 不适用 |
我们为所有注册云端版的用户免费提供以下资源(登录 [dify.ai](https://cloud.dify.ai) 即可使用):
* 200 次 OpenAI 模型的消息调用额度,用于创建基于 OpenAI 模型的 AI 应用
* 300 万 讯飞星火大模型 Token 的调用额度,用于创建基于讯飞星火大模型的 AI 应用
* 100 万 MiniMax Token 的调用额度,用于创建基于 MiniMax 模型的 AI 应用
2. **可视化编排 Prompt** 通过界面化编写 prompt 并调试,只需几分钟即可发布一个 AI 应用。
3. **文本 Embedding 处理(数据集)**:全自动完成文本预处理,使用你的数据作为上下文,无需理解晦涩的概念和技术处理。支持 PDF、txt 等文件格式,支持从 Notion、网页、API 同步数据。
4. **基于 API 开发:** 后端即服务。您可以直接访问网页应用,也可以接入 API 集成到您的应用中,无需关注复杂的后端架构和部署过程。
5. **插件能力:** Dify 「智聊」平台已支持网页浏览、Google 搜索、Wikipedia 查询等第一方插件,可在对话中实现联网搜索、分析网页内容、展示 AI 的推理过程。
6. **团队 Workspace** 团队成员可加入 Workspace 编辑、管理和使用团队内的 AI 应用。
7. **数据标注与改进:** 可视化查阅 AI 应用日志并对数据进行改进标注,观测 AI 的推理过程,不断提高其性能。(Coming soon)
-----------------------------
## Use cases
* [几分钟创建一个带有业务数据的官网 AI 智能客服](https://docs.dify.ai/v/zh-hans/use-cases/create-an-ai-chatbot-with-business-data-in-minutes)
* [构建一个 Notion AI 助手](https://docs.dify.ai/v/zh-hans/use-cases/build-an-notion-ai-assistant)
* [创建 Midjourney 提示词机器人](https://docs.dify.ai/v/zh-hans/use-cases/create-a-midjoureny-prompt-word-robot-with-zero-code)
## 特点
![](./images/models.png)
## 使用云服务
**1. LLM 支持**:与 OpenAI 的 GPT 系列模型集成,或者与开源的 Llama2 系列模型集成。事实上,Dify 支持主流的商业模型和开源模型(本地部署或基于 MaaS)。
访问 [Dify.ai](https://cloud.dify.ai) 使用云端版
**2. Prompt IDE**:和团队一起在 Dify 协作,通过可视化的 Prompt 和应用编排工具开发 AI 应用。 支持无缝切换多种大型语言模型
**3. RAG引擎**:包括各种基于全文索引或向量数据库嵌入的 RAG 能力,允许直接上传 PDF、TXT 等各种文本格式。
**4. Agent**:基于函数调用的 Agent 框架,允许用户自定义配置,所见即所得。Dify 提供了基本的插件能力,如谷歌搜索。
**5. 持续运营**:监控和分析应用日志和性能,使用生产数据持续改进 Prompt、数据集或模型。
## 在开始之前
- [网站](https://dify.ai)
- [文档](https://docs.dify.ai)
- [部署文档](https://docs.dify.ai/getting-started/install-self-hosted)
- [常见问题](https://docs.dify.ai/getting-started/faq)
## 安装社区版
@ -83,80 +86,29 @@ docker compose up -d
### 配置
需要自定义配置,请参考我们的 [docker-compose.yml](docker/docker-compose.yaml) 文件中的注释,并手动设置环境配置。修改完毕后,请再次执行 `docker-compose up -d`。
## Roadmap
我们正在开发中的功能:
- **数据集**支持更多的数据集通过网页、API 同步内容。用户可以根据自己的数据源构建 AI 应用程序。
- **插件**,我们将发布符合 ChatGPT 标准的插件,支持更多 Dify 自己的插件,支持用户自定义插件能力,以在应用程序中启用更多功能,例如以支持以目标为导向的分解推理任务。
## Q&A
**Q: 我能用 Dify 做什么?**
A: Dify 是一个简单且能力丰富的 LLM 开发和运营工具。你可以用它搭建商用级应用个人助理。如果你想自己开发应用Dify 也能为你省下接入 OpenAI 的后端工作,使用我们逐步提供的可视化运营能力,你可以持续的改进和训练你的 GPT 模型。
**Q: 如何使用 Dify “训练”自己的模型?**
A: 一个有价值的应用由 Prompt Engineering、上下文增强和 Fine-tune 三个环节组成。我们创造了一种 Prompt 结合编程语言的 Hybrid 编程方式(类似一个模版引擎),你可以轻松的完成长文本嵌入,或抓取用户输入的一个 Youtube 视频的字幕——这些都将作为上下文提交给 LLMs 进行计算。我们十分注重应用的可运营性,你的用户在使用 App 期间产生的数据,可进行分析、标记和持续训练。以上环节如果没有好的工具支持,可能会消耗你大量的时间。
**Q: 如果要创建一个自己的应用,我需要准备什么?**
A: 我们假定你已经有了 OpenAI 或 Claude 等模型的 API Key如果没有请去注册一个。如果你已经有了一些内容可以作为训练上下文就太好了。
**Q: 提供哪些界面语言?**
A: 支持英文、中文,你可以为我们贡献语言包并提供维护支持。
如果您需要自定义配置,请参考我们的 [docker-compose.yml](docker/docker-compose.yaml) 文件中的注释,并手动设置环境配置。更改后,请再次执行 `docker-compose up -d`。您可以在我们的[文档](https://docs.dify.ai/getting-started/install-self-hosted/environments)中查看所有环境变量的完整列表。
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## 贡献
## 社区与支持
我们欢迎为 Dify 做出贡献,帮助 Dify 变得更好。我们欢迎各种方式的贡献,提交代码、问题、新想法,或者分享基于 Dify 创建出的各种有趣有用的 AI 应用。同时,我们也欢迎在不同的活动、研讨会、社交媒体上分享 Dify。
我们欢迎为 Dify 做出贡献,以帮助改善 Dify。包括提交代码、问题、新想法,或分享基于 Dify 创建的有趣有用的 AI 应用程序。同时,我们也欢迎在不同的活动、会议和社交媒体上分享 Dify。
### 贡献代码
为了确保正确审查,所有代码贡献 - 包括来自具有直接提交更改权限的贡献者 - 都必须提交 PR 请求并在合并分支之前得到核心开发人员的批准。
- [GitHub Issues](https://github.com/langgenius/dify/issues)。👉:使用 Dify.AI 时遇到的错误和问题,请参阅[贡献指南](CONTRIBUTING.md)。
- [电子邮件支持](mailto:hello@dify.ai?subject=[GitHub]Questions%20About%20Dify)。👉:关于使用 Dify.AI 的问题
- [Discord](https://discord.gg/FngNHpbcY7)。👉:分享您的应用程序并与社区交流。
- [Twitter](https://twitter.com/dify_ai)。👉:分享您的应用程序并与社区交流。
- [商业许可](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry)。👉:有关商业用途许可 Dify.AI 的商业咨询。
- [微信]() 👉:扫描下方二维码,添加微信好友,备注 Dify,我们将邀请您加入 Dify 社区。
<img src="./images/wechat.png" alt="wechat" width="100"/>
我们欢迎所有人提交 PR如果您愿意提供帮助可以在 [贡献指南](CONTRIBUTING_CN.md) 中了解有关如何为项目做出代码贡献的更多信息。
### 提交问题或想法
你可以通过 Dify 代码仓库新增 issues 来提交你的问题或想法。如遇到问题,请尽可能描述你遇到问题的操作步骤,以便我们更好地发现它。如果你对我们的产品有任何新想法,也欢迎向我们反馈,请尽可能多地分享你的见解,以便我们在社区中获得更多反馈和进一步讨论。
### 分享你的应用
我们鼓励所有社区成员分享他们基于 Dify 创造出的 AI 应用,它们可以是应用于不同情景或不同用户,这将有助于为希望基于 AI 能力创造的人们提供强大灵感!你可以通过 [Dify-user-case 仓库项目提交 issue](https://github.com/langgenius/dify-user-case) 来分享你的应用案例。
### 向别人分享 Dify
我们鼓励社区贡献者们积极展示你使用 Dify 的不同角度。你可以通过线下研讨会、博客或社交媒体上谈论或分享你使用 Dify 的任意功能,相信你独特的使用分享会给别人带来非常大的帮助!如果你需要任何指导帮助,欢迎联系我们 support@dify.ai ,你也可以在 twitter @Dify.AI 或在 [Discord 社区](https://discord.gg/FngNHpbcY7)交流来帮助你传播信息。
### 帮助别人
你还可以在 Discord、GitHub issues或其他社交平台上帮助需要帮助的人指导别人解决使用过程中遇到的问题和分享使用经验。这也是个非常了不起的贡献如果你希望成为 Dify 社区的维护者,请通过[Discord 社区](https://discord.gg/FngNHpbcY7) 联系官方团队或邮件联系我们 support@dify.ai.
## 联系我们
如果您有任何问题、建议或合作意向,欢迎通过以下方式联系我们:
- 在我们的 [GitHub Repo](https://github.com/langgenius/dify) 上提交 Issue 或 PR
- 在我们的 [Discord 社区](https://discord.gg/FngNHpbcY7) 上加入讨论
- 发送邮件至 hello@dify.ai
## 安全
## 安全问题
为了保护您的隐私,请避免在 GitHub 上发布安全问题。发送问题至 security@dify.ai我们将为您做更细致的解答。
## Citation
本软件使用了以下开源软件:
- Chase, H. (2022). LangChain [Computer software]. https://github.com/hwchase17/langchain
更多信息,请参考相应软件的官方网站或许可证文本。
## License
本仓库遵循 [Dify Open Source License](LICENSE) 开源协议。


@ -1,4 +1,4 @@
![](./images/describe-en.png)
[![](./images/describe.png)](https://dify.ai)
<p align="center">
<a href="./README.md">English</a> |
<a href="./README_CN.md">简体中文</a> |
@ -6,36 +6,71 @@
<a href="./README_ES.md">Español</a>
</p>
[Sitio web](https://dify.ai) • [Documentación](https://docs.dify.ai) • [Twitter](https://twitter.com/dify_ai) • [Discord](https://discord.gg/FngNHpbcY7)
<p align="center">
<a href="https://dify.ai" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/AI-Dify?logo=AI&logoColor=%20%23f5f5f5&label=Dify&labelColor=%20%23155EEF&color=%23EAECF0"></a>
<a href="https://discord.gg/FngNHpbcY7" target="_blank">
<img src="https://img.shields.io/discord/1082486657678311454?logo=discord"
alt="chat on Discord"></a>
<a href="https://twitter.com/intent/follow?screen_name=dify_ai" target="_blank">
<img src="https://img.shields.io/twitter/follow/dify_ai?style=social&logo=X"
alt="follow on Twitter"></a>
<a href="https://hub.docker.com/u/langgenius" target="_blank">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/langgenius/dify-web"></a>
</p>
**Dify** es una plataforma LLMOps fácil de usar diseñada para capacitar a más personas para que creen aplicaciones sostenibles basadas en IA. Con orquestación visual para varios tipos de aplicaciones, Dify ofrece aplicaciones listas para usar que también pueden funcionar como APIs de Backend-as-a-Service. Unifica tu proceso de desarrollo con una API para la integración de complementos y conjuntos de datos, y agiliza tus operaciones utilizando una interfaz única para la ingeniería de indicaciones, análisis visual y mejora continua.
**Dify** es una plataforma de desarrollo de aplicaciones para modelos de lenguaje de gran tamaño (LLM) que ya ha visto la creación de más de **100,000** aplicaciones basadas en Dify.AI. Integra los conceptos de Backend como Servicio y LLMOps, cubriendo el conjunto de tecnologías esenciales requerido para construir aplicaciones nativas de inteligencia artificial generativa, incluyendo un motor RAG incorporado. Con Dify, **puedes auto-desplegar capacidades similares a las de Assistants API y GPTs basadas en cualquier LLM.**
Las aplicaciones creadas con Dify incluyen:
![](./images/demo.png)
- Sitios web listos para usar que admiten el modo de formulario y el modo de conversación por chat.
- Una API única que abarca capacidades de complementos, mejora de contexto y más, lo que te ahorra esfuerzo de programación en el backend.
- Análisis visual de datos, revisión de registros y anotación para aplicaciones.
## Utilizar Servicios en la Nube
Dify es compatible con Langchain, lo que significa que gradualmente admitiremos múltiples LLMs, actualmente compatibles con:
Usar [Dify.AI Cloud](https://dify.ai) proporciona todas las capacidades de la versión de código abierto, e incluye un complemento de 200 créditos de prueba para GPT.
- GPT 3 (text-davinci-003)
- GPT 3.5 Turbo (ChatGPT)
- GPT-4
## Por qué Dify
## Usar servicios en la nube
Dify se caracteriza por su neutralidad de modelo y es un conjunto tecnológico completo e ingenierizado, en comparación con las bibliotecas de desarrollo codificadas como LangChain. A diferencia de la API de Assistants de OpenAI, Dify permite el despliegue local completo de los servicios.
Visita [Dify.ai](https://dify.ai)
| Característica | Dify.AI | API de Assistants | LangChain |
|----------------|---------|------------------|-----------|
| **Enfoque de Programación** | Orientado a API | Orientado a API | Orientado a Código en Python |
| **Estrategia del Ecosistema** | Código Abierto | Cerrado y Comercial | Código Abierto |
| **Motor RAG** | Soportado | Soportado | No Soportado |
| **IDE de Prompts** | Incluido | Incluido | Ninguno |
| **LLMs Soportados** | Gran Variedad | Solo GPT | Gran Variedad |
| **Despliegue Local** | Soportado | No Soportado | No Aplicable |
## Características
![](./images/models.png)
**1. Soporte LLM**: Integración con la familia de modelos GPT de OpenAI, o los modelos de la familia Llama2 de código abierto. De hecho, Dify soporta modelos comerciales convencionales y modelos de código abierto (desplegados localmente o basados en MaaS).
**2. IDE de Prompts**: Orquestación visual de aplicaciones y servicios basados en LLMs con tu equipo.
**3. Motor RAG**: Incluye varias capacidades RAG basadas en indexación de texto completo o incrustaciones de base de datos vectoriales, permitiendo la carga directa de PDFs, TXTs y otros formatos de texto.
**4. Agentes**: Un marco de Agentes basado en Llamadas de Función que permite a los usuarios configurar lo que ven es lo que obtienen. Dify incluye capacidades básicas de plugins como la Búsqueda de Google.
**5. Operaciones Continuas**: Monitorear y analizar registros de aplicaciones y rendimiento, mejorando continuamente Prompts, conjuntos de datos o modelos usando datos de producción.
## Antes de Empezar
- [Sitio web](https://dify.ai)
- [Documentación](https://docs.dify.ai)
- [Documentación de Implementación](https://docs.dify.ai/getting-started/install-self-hosted)
- [Preguntas Frecuentes](https://docs.dify.ai/getting-started/faq)
## Instalar la Edición Comunitaria
### Requisitos del sistema
### Requisitos del Sistema
Antes de instalar Dify, asegúrate de que tu máquina cumple con los siguientes requisitos mínimos del sistema:
Antes de instalar Dify, asegúrate de que tu máquina cumpla con los siguientes requisitos mínimos del sistema:
- CPU >= 2 Core
- CPU >= 2 núcleos
- RAM >= 4GB
### Inicio rápido
### Inicio Rápido
La forma más sencilla de iniciar el servidor de Dify es ejecutar nuestro archivo [docker-compose.yml](docker/docker-compose.yaml). Antes de ejecutar el comando de instalación, asegúrate de que [Docker](https://docs.docker.com/get-docker/) y [Docker Compose](https://docs.docker.com/compose/install/) estén instalados en tu máquina:
@ -44,80 +79,34 @@ cd docker
docker compose up -d
```
Después de ejecutarlo, puedes acceder al panel de control de Dify en tu navegador desde [http://localhost/install](http://localhost/install) y comenzar el proceso de instalación de inicialización.
Después de ejecutarlo, puedes acceder al panel de control de Dify en tu navegador en [http://localhost/install](http://localhost/install) y comenzar el proceso de instalación de inicialización.
### Helm Chart
### Gráfico Helm
Un gran agradecimiento a @BorisPolonsky por proporcionarnos una versión de [Helm Chart](https://helm.sh/), que permite desplegar Dify en Kubernetes.
Puede ir a https://github.com/BorisPolonsky/dify-helm para obtener información de despliegue.
Un gran agradecimiento a @BorisPolonsky por proporcionarnos una versión del [Gráfico Helm](https://helm.sh/), que permite implementar Dify en Kubernetes. Puedes visitar https://github.com/BorisPolonsky/dify-helm para obtener información sobre la implementación.
### Configuración
Si necesitas personalizar la configuración, consulta los comentarios en nuestro archivo [docker-compose.yml](docker/docker-compose.yaml) y configura manualmente la configuración del entorno. Después de realizar los cambios, ejecuta nuevamente 'docker-compose up -d'.
Si necesitas personalizar la configuración, consulta los comentarios en nuestro archivo [docker-compose.yml](docker/docker-compose.yaml) y configura manualmente la configuración del entorno. Después de realizar los cambios, ejecuta nuevamente `docker-compose up -d`. Puedes ver la lista completa de variables de entorno en nuestra [documentación](https://docs.dify.ai/getting-started/install-self-hosted/environments).
## Hoja de ruta
## Historial de Estrellas
Funciones en desarrollo:
[![Gráfico de Historial de Estrellas](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
- **Conjuntos de datos**, admitiendo más conjuntos de datos, por ejemplo, sincronización de contenido desde Notion o páginas web.
Admitiremos más conjuntos de datos, incluidos texto, páginas web e incluso contenido de Notion. Los usuarios pueden construir aplicaciones de IA basadas en sus propias fuentes de datos
- **Complementos**, introduciendo complementos estándar de ChatGPT para aplicaciones, o utilizando complementos producidos por Dify.
Lanzaremos complementos que cumplan con el estándar de ChatGPT, o nuestros propios complementos de Dify para habilitar más capacidades en las aplicaciones.
- **Modelos de código abierto**, por ejemplo, adoptar Llama como proveedor de modelos o para un ajuste adicional.
Trabajaremos con excelentes modelos de código abierto como Llama, proporcionándolos como opciones de modelos en nuestra plataforma o utilizándolos para un ajuste adicional.
## Comunidad y Soporte
## Preguntas y respuestas
Te damos la bienvenida a contribuir a Dify para ayudar a hacer que Dify sea mejor de diversas maneras, enviando código, informando problemas, proponiendo nuevas ideas o compartiendo las aplicaciones de inteligencia artificial interesantes y útiles que hayas creado basadas en Dify. Al mismo tiempo, también te invitamos a compartir Dify en diferentes eventos, conferencias y redes sociales.
**P: ¿Qué puedo hacer con Dify?**
- [Problemas en GitHub](https://github.com/langgenius/dify/issues). Lo mejor para: errores y problemas que encuentres al usar Dify.AI, consulta la [Guía de Contribución](CONTRIBUTING.md).
- [Soporte por Correo Electrónico](mailto:hello@dify.ai?subject=[GitHub]Preguntas%20sobre%20Dify). Lo mejor para: preguntas que tengas sobre el uso de Dify.AI.
- [Discord](https://discord.gg/FngNHpbcY7). Lo mejor para: compartir tus aplicaciones y socializar con la comunidad.
- [Twitter](https://twitter.com/dify_ai). Lo mejor para: compartir tus aplicaciones y socializar con la comunidad.
- [Licencia Comercial](mailto:business@dify.ai?subject=[GitHub]Consulta%20de%20Licencia%20Comercial). Lo mejor para: consultas comerciales sobre la licencia de Dify.AI para uso comercial.
R: Dify es una herramienta de desarrollo y operaciones de LLM, simple pero poderosa. Puedes usarla para construir aplicaciones de calidad comercial y asistentes personales. Si deseas desarrollar tus propias aplicaciones, LangDifyGenius puede ahorrarte trabajo en el backend al integrar con OpenAI y ofrecer capacidades de operaciones visuales, lo que te permite mejorar y entrenar continuamente tu modelo GPT.
**P: ¿Cómo uso Dify para "entrenar" mi propio modelo?**
R: Una aplicación valiosa consta de Ingeniería de indicaciones, mejora de contexto y ajuste fino. Hemos creado un enfoque de programación híbrida que combina las indicaciones con lenguajes de programación (similar a un motor de plantillas), lo que facilita la incorporación de texto largo o la captura de subtítulos de un video de YouTube ingresado por el usuario, todo lo cual se enviará como contexto para que los LLM lo procesen. Damos gran importancia a la operabilidad de la aplicación, con los datos generados por los usuarios durante el uso de la aplicación disponibles para análisis, anotación y entrenamiento continuo. Sin las herramientas adecuadas, estos pasos pueden llevar mucho tiempo.
**P: ¿Qué necesito preparar si quiero crear mi propia aplicación?**
R: Suponemos que ya tienes una clave de API de OpenAI; si no la tienes, por favor regístrate. ¡Si ya tienes contenido que pueda servir como contexto de entrenamiento, eso es genial!
**P: ¿Qué idiomas de interfaz están disponibles?**
R: Actualmente se admiten inglés y chino, y puedes contribuir con paquetes de idiomas.
## Historial de estrellas
[![Gráfico de historial de estrellas](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## Contáctanos
Si tienes alguna pregunta, sugerencia o consulta sobre asociación, no dudes en contactarnos a través de los siguientes canales:
- Presentar un problema o una solicitud de extracción en nuestro repositorio de GitHub.
- Únete a la discusión en nuestra comunidad de [Discord](https://discord.gg/FngNHpbcY7).
- Envía un correo electrónico a hello@dify.ai.
¡Estamos ansiosos por ayudarte y crear juntos aplicaciones de IA más divertidas y útiles!
## Contribuciones
Para garantizar una revisión adecuada, todas las contribuciones de código, incluidas las de los colaboradores con acceso directo a los compromisos, deben enviarse mediante solicitudes de extracción y ser aprobadas por el equipo principal de
desarrollo antes de fusionarse.
¡Agradecemos todas las solicitudes de extracción! Si deseas ayudar, consulta la [Guía de Contribución](CONTRIBUTING.md) para obtener más información sobre cómo comenzar.
## Seguridad
## Divulgación de Seguridad
Para proteger tu privacidad, evita publicar problemas de seguridad en GitHub. En su lugar, envía tus preguntas a security@dify.ai y te proporcionaremos una respuesta más detallada.
## Citación
Este software utiliza el siguiente software de código abierto:
- Chase, H. (2022). LangChain [Software de computadora]. https://github.com/hwchase17/langchain
Para obtener más información, consulta el sitio web oficial o el texto de la licencia del software correspondiente.
## Licencia
Este repositorio está disponible bajo la [Licencia de código abierto de Dify](LICENSE).


@ -1,122 +1,103 @@
![](./images/describe-en.png)
[![](./images/describe.png)](https://dify.ai)
<p align="center">
<a href="./README.md">English</a> |
<a href="./README_CN.md">简体中文</a> |
<a href="./README_JA.md">日本語</a> |
<a href="./README_ES.md">Español</a>
<a href="./README_ES.md">Español</a> |
<a href="./README_KL.md">Klingon</a>
</p>
[Web サイト](https://dify.ai) • [ドキュメント](https://docs.dify.ai) • [Twitter](https://twitter.com/dify_ai) • [Discord](https://discord.gg/FngNHpbcY7)
<p align="center">
<a href="https://dify.ai" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/AI-Dify?logo=AI&logoColor=%20%23f5f5f5&label=Dify&labelColor=%20%23155EEF&color=%23EAECF0"></a>
<a href="https://discord.gg/FngNHpbcY7" target="_blank">
<img src="https://img.shields.io/discord/1082486657678311454?logo=discord"
alt="chat on Discord"></a>
<a href="https://twitter.com/intent/follow?screen_name=dify_ai" target="_blank">
<img src="https://img.shields.io/twitter/follow/dify_ai?style=social&logo=X"
alt="follow on Twitter"></a>
<a href="https://hub.docker.com/u/langgenius" target="_blank">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/langgenius/dify-web"></a>
</p>
"Difyは、既にDify.AI上で10万以上のアプリケーションが構築されているLLMアプリケーション開発プラットフォームです。バックエンド・アズ・ア・サービスとLLMOpsの概念を統合し、組み込みのRAGエンジンを含む、生成AIネイティブアプリケーションを構築するためのコアテックスタックをカバーしています。Difyを使用すると、どのLLMに基づいても、Assistants APIやGPTのような機能を自己デプロイすることができます。"
**Dify** は、より多くの人々が持続可能な AI ネイティブアプリケーションを作成できるように設計された、使いやすい LLMOps プラットフォームです。様々なアプリケーションタイプに対応したビジュアルオーケストレーションにより Dify は Backend-as-a-Service API としても機能する、すぐに使えるアプリケーションを提供します。プラグインやデータセットを統合するための1つの API で開発プロセスを統一し、プロンプトエンジニアリング、ビジュアル分析、継続的な改善のための1つのインターフェイスを使って業務を合理化します。
Difyで作成したアプリケーションは以下の通りです:
- フォームモードとチャット会話モードをサポートする、すぐに使える Web サイト
- プラグイン機能、コンテキストの強化などを網羅する単一の API により、バックエンドのコーディングの手間を省きます
- アプリケーションの視覚的なデータ分析、ログレビュー、アノテーションが可能です
Dify は LangChain と互換性があり、複数の LLM を徐々にサポートします:
- GPT 3 (text-davinci-003)
- GPT 3.5 Turbo(ChatGPT)
- GPT-4
Please note that translating complex technical terms can sometimes result in slight variations in meaning due to differences in language nuances.
## クラウドサービスの利用
[Dify.ai](https://dify.ai) をご覧ください
[Dify.AI Cloud](https://dify.ai) を使用すると、オープンソース版の全機能を利用でき、さらに200GPTのトライアルクレジットが無料で提供されます。
## Community Edition のインストール
## Difyの利点
Difyはモデルニュートラルであり、LangChainのようなハードコードされた開発ライブラリと比較して、完全にエンジニアリングされた技術スタックを特徴としています。OpenAIのAssistants APIとは異なり、Difyではサービスの完全なローカルデプロイメントが可能です。
| 機能 | Dify.AI | Assistants API | LangChain |
|---------|---------|----------------|-----------|
| **プログラミングアプローチ** | API指向 | API指向 | Pythonコード指向 |
| **エコシステム戦略** | オープンソース | 閉鎖的かつ商業的 | オープンソース |
| **RAGエンジン** | サポート済み | サポート済み | 非サポート |
| **プロンプトIDE** | 含まれる | 含まれる | なし |
| **サポートされるLLMs** | 豊富な種類 | GPTのみ | 豊富な種類 |
| **ローカルデプロイメント** | サポート済み | 非サポート | 該当なし |
## 開始する前に
- [Website](https://dify.ai)
- [Docs](https://docs.dify.ai)
- [Deployment Docs](https://docs.dify.ai/getting-started/install-self-hosted)
- [FAQ](https://docs.dify.ai/getting-started/faq)
## コミュニティエディションのインストール
### システム要件
Dify をインストールする前に、お使いのマシンが以下の最低システム要件を満たしていることを確認してください:
Difyをインストールする前に、以下の最低限のシステム要件を満たしていることを確認してください
- CPU >= 1 Core
- CPU >= 2コア
- RAM >= 4GB
### クイックスタート
Dify サーバーを起動する最も簡単な方法は、[docker-compose.yml](docker/docker-compose.yaml) ファイルを実行することです。インストールコマンドを実行する前に、[Docker](https://docs.docker.com/get-docker/) と [Docker Compose](https://docs.docker.com/compose/install/) がお使いのマシンにインストールされていることを確認してください:
Difyサーバーを始める最も簡単な方法は、[docker-compose.yml](docker/docker-compose.yaml) ファイルを実行することです。インストールコマンドを実行する前に、マシンに [Docker](https://docs.docker.com/get-docker/) と [Docker Compose](https://docs.docker.com/compose/install/) がインストールされていることを確認してください
```bash
cd docker
docker compose up -d
```
実行後、ブラウザで [http://localhost/install](http://localhost/install) にアクセスし、初期化インストール作業を開始することができます。
実行後、ブラウザで [http://localhost/install](http://localhost/install) にアクセスし、初期化インストールプロセスを開始できます。
### Helm Chart
@BorisPolonsky に大感謝します。彼は Dify を Kubernetes 上にデプロイするための [Helm Chart](https://helm.sh/) バージョンを提供してくれました
@BorisPolonskyによる[Helm Chart](https://helm.sh/)バージョンの提供に大変感謝しています。これにより、DifyをKubernetes上にデプロイすることができます。
デプロイ情報については、https://github.com/BorisPolonsky/dify-helm をご覧ください。
### 構成
### 設定
カスタマイズが必要な場合は、[docker-compose.yml](docker/docker-compose.yaml) ファイルのコメントを参照し、手動で環境設定をお願いします。変更後、再度 'docker-compose up -d' を実行してください。
## ロードマップ
開発中の機能:
- **データセット**, Notionやウェブページからのコンテンツ同期など、より多くのデータセットをサポートします
テキスト、ウェブページ、さらには Notion コンテンツなど、より多くのデータセットをサポートする予定です。ユーザーは、自分のデータソースをもとに AI アプリケーションを構築することができます。
- **プラグイン**, アプリケーションに ChatGPT プラグイン標準のプラグインを導入する、または Dify 制作のプラグインを利用する
今後、ChatGPT 規格に準拠したプラグインや、ディファイ独自のプラグインを公開し、より多くの機能をアプリケーションで実現できるようにします。
- **オープンソースモデル**, 例えばモデルプロバイダーとして Llama を採用したり、さらにファインチューニングを行う
Llama のような優れたオープンソースモデルを、私たちのプラットフォームのモデルオプションとして提供したり、さらなる微調整のために使用したりすることで、協力していきます。
設定をカスタマイズする必要がある場合は、[docker-compose.yml](docker/docker-compose.yaml) ファイルのコメントを参照し、環境設定を手動で行ってください。変更を行った後は、もう一度 `docker-compose up -d` を実行してください。環境変数の完全なリストは、[ドキュメント](https://docs.dify.ai/getting-started/install-self-hosted/environments)で確認できます。
## Q&A
**Q: Dify で何ができるのか?**
A: Dify はシンプルでパワフルな LLM 開発・運用ツールです。商用グレードのアプリケーション、パーソナルアシスタントを構築するために使用することができます。独自のアプリケーションを開発したい場合、LangDifyGenius は OpenAI と統合する際のバックエンド作業を省き、視覚的な操作機能を提供し、GPT モデルを継続的に改善・訓練することが可能です。
**Q: Dify を使って、自分のモデルを「トレーニング」するにはどうすればいいのでしょうか?**
A: プロンプトエンジニアリング、コンテキスト拡張、ファインチューニングからなる価値あるアプリケーションです。プロンプトとプログラミング言語を組み合わせたハイブリッドプログラミングアプローチ(テンプレートエンジンのようなもの)で、長文の埋め込みやユーザー入力の YouTube 動画からの字幕取り込みなどを簡単に実現し、これらはすべて LLM が処理するコンテキストとして提出される予定です。また、アプリケーションの操作性を重視し、ユーザーがアプリケーションを使用する際に生成したデータを分析、アノテーション、継続的なトレーニングに利用できるようにしました。適切なツールがなければ、これらのステップに時間がかかることがあります。
**Q: 自分でアプリケーションを作りたい場合、何を準備すればよいですか?**
A: すでに OpenAI API Key をお持ちだと思いますが、お持ちでない場合はご登録ください。もし、すでにトレーニングのコンテキストとなるコンテンツをお持ちでしたら、それは素晴らしいことです!
**Q: インターフェイスにどの言語が使えますか?**
A: 現在、英語と中国語に対応しており、言語パックを寄贈することも可能です。
## Star ヒストリー
## スターヒストリー
[![Star History Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## お問合せ
## コミュニティとサポート
ご質問、ご提案、パートナーシップに関するお問い合わせは、以下のチャンネルからお気軽にご連絡ください:
Difyに貢献していただき、コードの提出、問題の報告、新しいアイデアの提供、またはDifyを基に作成した興味深く有用なAIアプリケーションの共有により、Difyをより良いものにするお手伝いを歓迎します。同時に、さまざまなイベント、会議、ソーシャルメディアでDifyを共有することも歓迎します。
- GitHub Repo で Issue や PR を提出する
- [Discord](https://discord.gg/FngNHpbcY7) コミュニティで議論に参加する
- hello@dify.ai にメールを送信します
私たちは、皆様のお手伝いをさせていただき、より楽しく、より便利な AI アプリケーションを一緒に作っていきたいと思っています!
## コントリビュート
適切なレビューを行うため、コミットへの直接アクセスが可能なコントリビュータを含むすべてのコードコントリビュータは、プルリクエストで提出し、マージされる前にコア開発チームによって承認される必要があります。
私たちはすべてのプルリクエストを歓迎します!協力したい方は、[コントリビューションガイド](CONTRIBUTING.md) をチェックしてみてください。
- [GitHub Issues](https://github.com/langgenius/dify/issues)。最適な使用法Dify.AIの使用中に遭遇するバグやエラー、[貢献ガイド](CONTRIBUTING.md)を参照。
- [Email サポート](mailto:hello@dify.ai?subject=[GitHub]Questions%20About%20Dify)。最適な使用法Dify.AIの使用に関する質問
- [Discord](https://discord.gg/FngNHpbcY7)。最適な使用法:アプリケーションの共有とコミュニティとの交流。
- [Twitter](https://twitter.com/dify_ai)。最適な使用法:アプリケーションの共有とコミュニティとの交流。
- [ビジネスライセンス](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry)。最適な使用法Dify.AIを商業利用するためのビジネス関連の問い合わせ。
## セキュリティ
プライバシー保護のため、GitHub へのセキュリティ問題の投稿は避けてください。代わりに、あなたの質問を security@dify.ai に送ってください。より詳細な回答を提供します。
## 引用
本ソフトウェアは、以下のオープンソースソフトウェアを使用しています:
- Chase, H. (2022). LangChain [Computer software]. https://github.com/hwchase17/langchain
詳しくは、各ソフトウェアの公式サイトまたはライセンス文をご参照ください。
## ライセンス
このリポジトリは、[Dify Open Source License](LICENSE) のもとで利用できます。

README_KL.md (new file)

@ -0,0 +1,114 @@
[![](./images/describe.png)](https://dify.ai)
<p align="center">
<a href="./README.md">English</a> |
<a href="./README_CN.md">简体中文</a> |
<a href="./README_JA.md">日本語</a> |
<a href="./README_ES.md">Español</a> |
<a href="./README_KL.md">Klingon</a>
</p>
<p align="center">
<a href="https://dify.ai" target="_blank">
<img alt="Static Badge" src="https://img.shields.io/badge/AI-Dify?logo=AI&logoColor=%20%23f5f5f5&label=Dify&labelColor=%20%23155EEF&color=%23EAECF0"></a>
<a href="https://discord.gg/FngNHpbcY7" target="_blank">
<img src="https://img.shields.io/discord/1082486657678311454?logo=discord"
alt="chat on Discord"></a>
<a href="https://twitter.com/intent/follow?screen_name=dify_ai" target="_blank">
<img src="https://img.shields.io/twitter/follow/dify_ai?style=social&logo=X"
alt="follow on Twitter"></a>
<a href="https://hub.docker.com/u/langgenius" target="_blank">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/langgenius/dify-web"></a>
</p>
**Dify** Hoch LLM qorwI' pIqoDvam pagh laHta' je **100,000** pIqoDvamvam Dify.AI De'wI'. Dify leghpu' Backend chu' a Service teH LLMOps vItlhutlh, generative AI-native pIqoD teq wa'vam, vIyoD Built-in RAG engine. Dify, **'ej chenmoHmoH Hoch 'oHna' Assistant API 'ej GPTmey HoStaHbogh LLMmey.**
![](./images/demo.png)
## ngIl QaQ
[Dify.AI ngIl](https://dify.ai) pIm neHlaH 'ej ghaH. cha'logh wa' DIvI' 200 GPT trial credits.
## Dify WovmoH
Dify Daq rIn neutrality 'ej Hoch, LangChain tInHar HubwI'. maH Daqbe'law' Qawqar, OpenAI's Assistant API Daq local neH deployment.
| Qo'logh | Dify.AI | Assistants API | LangChain |
|---------|---------|----------------|-----------|
| **qet QaS** | API-oriented | API-oriented | Python Code-oriented |
| **Ecosystem Strategy** | Open Source | Closed and Commercial | Open Source |
| **RAG Engine** | Ha'qu' | Ha'qu' | ghoS Ha'qu' |
| **Prompt IDE** | jaH Include | jaH Include | qeylIS qaq |
| **qet LLMmey** | bo'Degh Hoch | GPTmey tIn | bo'Degh Hoch |
| **local deployment** | Ha'qu' | tInHa'qu' | tInHa'qu' ghogh |
## ruch
![](./images/models.png)
**1. LLM tIq**: OpenAI's GPT Hur nISmoHvam neH vIngeH, wa' Llama2 Hur nISmoHvam. Heghlu'lu'pu' Dify mIw 'oH choH qay'be'.Daq commercial Hurmey 'ej Open Source Hurmey (maqtaHvIS pagh locally neH neH deployment HoSvam).
**2. Prompt IDE**: cha'logh wa' LLMmey Hoch janlu'pu' 'ej lughpu' choH qay'be'.
**3. RAG Engine**: RAG vaD tIqpu' lo'taH indexing qor neH vector database wa' embeddings wIj, PDFs, TXTs, 'ej ghojmoHmoH HIq qorlIj je upload.
**4. jenSuvpu'**: jenbe' SuDqang naQ moDwu' jenSuvpu' porgh cha'logh choHvam. Dify Google Search Hur vItlhutlh plugin choH.
**5. QaS muDHa'wI'**: cha'logh wa' pIq mI' logs 'ej quv yIn, vItlhutlh tIq 'e'wIj lo'taHmoHmoH Prompts, vItlhutlh, Hurmey ghaH production data jatlh.
## Do'wI' qabmey lo'taH
- [Website](https://dify.ai)
- [Docs](https://docs.dify.ai)
- [lo'taHmoH Docs](https://docs.dify.ai/getting-started/install-self-hosted)
- [FAQ](https://docs.dify.ai/getting-started/faq)
## Community Edition tu' yo'
### System Qab
Dify yo' yo' qaqmeH SuS chenmoH 'oH qech!
- CPU >= 2 Cores
- RAM >= 4GB
### Quick Start
Dify server luHoHtaHlu' vIngeH lo'laHbe'chugh vIyoD [docker-compose.yml](docker/docker-compose.yaml) QorwI'ghach. toH yItlhutlh chenmoH luH!chugh 'ay' vaj vIneHmeH, 'ej [Docker](https://docs.docker.com/get-docker/) 'ej [Docker Compose](https://docs.docker.com/compose/install/) vaj 'oH 'e' vIneHmeH:
```bash
cd docker
docker compose up -d
```
luHoHtaHmeH HoHtaHvIS, Dify dashboard vIneHmeH vIngeH lI'wI' [http://localhost/install](http://localhost/install) 'ej 'oH initialization 'e' vIneHmeH.
### Helm Chart
@BorisPolonsky Dify wIq tIq ['ay'var (Helm Chart)](https://helm.sh/) version Hur yIn chu' Dify luHoHchu'. Heghlu'lu' vIneHmeH [https://github.com/BorisPolonsky/dify-helm](https://github.com/BorisPolonsky/dify-helm) 'ej vaj QaS deployment information.
### veS config
chenmoHDI' config lo'taH ghaH, vItlhutlh HIq wIgharghbe'lu'pu'. toH lo'taHvIS pagh vay' vIneHmeH, 'ej `docker-compose up -d` wa'DIch. tIqmoHmeH list full wa' lo'taHvo'lu'pu' ghaH [docs](https://docs.dify.ai/getting-started/install-self-hosted/environments).
## tIng qem
[![tIng qem Hur Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)
## choHmoH 'ej vItlhutlh
Dify choHmoH je mIw Dify puqloD, Dify ghaHta'bogh vItlhutlh, HurDI' code, ghItlh, ghItlh qo'lu'pu'pu' qej. tIqmeH, Hurmey je, Dify Hur tIqDI' woDDaj, DuD QangmeH 'ej HInobDaq vItlhutlh HImej Dify'e'.
- [GitHub vItlhutlh](https://github.com/langgenius/dify/issues). Hurmey: bugs 'ej errors Dify.AI tIqmeH. yImej [Contribution Guide](CONTRIBUTING.md).
- [Email QaH](mailto:hello@dify.ai?subject=[GitHub]Questions%20About%20Dify). Hurmey: questions vItlhutlh Dify.AI chaw'.
- [Discord](https://discord.gg/FngNHpbcY7). Hurmey: jIpuv 'ej jImej mIw Dify vItlhutlh.
- [Twitter](https://twitter.com/dify_ai). Hurmey: jIpuv 'ej jImej mIw Dify vItlhutlh.
- [Business License](mailto:business@dify.ai?subject=[GitHub]Business%20License%20Inquiry). Hurmey: qurgh vItlhutlh Hurmey Dify.AI tIqbe'law'.
## bIQDaqmey bom
taghlI' vIngeH'a'? pong security 'oH posting GitHub. yItlhutlh, toH security@dify.ai 'ej vIngeH'a'.
## License
ghItlh puqloD chenmoH [Dify vItlhutlh Hur](LICENSE), ghaH nIvbogh Apache 2.0.


@ -10,6 +10,7 @@ if not os.environ.get("DEBUG") or os.environ.get("DEBUG").lower() != 'true':
import grpc.experimental.gevent
grpc.experimental.gevent.init_gevent()
import time
import logging
import json
import threading
@ -36,6 +37,13 @@ from libs.passport import PassportService
import warnings
warnings.simplefilter("ignore", ResourceWarning)
# fix windows platform
if os.name == "nt":
os.system('tzutil /s "UTC"')
else:
os.environ['TZ'] = 'UTC'
time.tzset()
class DifyApp(Flask):
pass
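Note on the hunk above: `time.tzset()` exists only on Unix, so Windows needs the `tzutil` system utility instead. A self-contained sketch of the same cross-platform logic:

```python
import os
import time

def force_utc() -> None:
    """Pin the process to UTC in a cross-platform way."""
    if os.name == "nt":
        # Windows has no time.tzset(); switch the system timezone via tzutil.
        os.system('tzutil /s "UTC"')
    else:
        os.environ["TZ"] = "UTC"
        time.tzset()  # re-read TZ so later time.* calls observe UTC
```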

View File

@ -8,6 +8,8 @@ import time
import uuid
import click
import qdrant_client
from qdrant_client.http.models import TextIndexParams, TextIndexType, TokenizerType
from tqdm import tqdm
from flask import current_app, Flask
from langchain.embeddings import OpenAIEmbeddings
@ -484,6 +486,38 @@ def normalization_collections():
click.echo(click.style('Congratulations! restore {} dataset indexes.'.format(len(normalization_count)), fg='green'))
@click.command('add-qdrant-full-text-index', help='add qdrant full text index')
def add_qdrant_full_text_index():
click.echo(click.style('Start add full text index.', fg='green'))
binds = db.session.query(DatasetCollectionBinding).all()
if binds and current_app.config['VECTOR_STORE'] == 'qdrant':
qdrant_url = current_app.config['QDRANT_URL']
qdrant_api_key = current_app.config['QDRANT_API_KEY']
client = qdrant_client.QdrantClient(
qdrant_url,
api_key=qdrant_api_key, # For Qdrant Cloud, None for local instance
)
for bind in binds:
try:
text_index_params = TextIndexParams(
type=TextIndexType.TEXT,
tokenizer=TokenizerType.MULTILINGUAL,
min_token_len=2,
max_token_len=20,
lowercase=True
)
client.create_payload_index(bind.collection_name, 'page_content',
field_schema=text_index_params)
except Exception as e:
click.echo(
click.style('Create full text index error: {} {}'.format(e.__class__.__name__, str(e)),
fg='red'))
click.echo(
click.style(
'Congratulations! add collection {} full text index successful.'.format(bind.collection_name),
fg='green'))
def deal_dataset_vector(flask_app: Flask, dataset: Dataset, normalization_count: list):
with flask_app.app_context():
try:
@ -647,10 +681,10 @@ def update_app_model_configs(batch_size):
pbar.update(len(data_batch))
@click.command('migrate_default_input_to_dataset_query_variable')
@click.option("--batch-size", default=500, help="Number of records to migrate in each batch.")
def migrate_default_input_to_dataset_query_variable(batch_size):
click.secho("Starting...", fg='green')
total_records = db.session.query(AppModelConfig) \
@ -658,13 +692,13 @@ def migrate_default_input_to_dataset_query_variable(batch_size):
.filter(App.mode == 'completion') \
.filter(AppModelConfig.dataset_query_variable == None) \
.count()
if total_records == 0:
click.secho("No data to migrate.", fg='green')
return
num_batches = (total_records + batch_size - 1) // batch_size
with tqdm(total=total_records, desc="Migrating Data") as pbar:
for i in range(num_batches):
offset = i * batch_size
@ -697,14 +731,14 @@ def migrate_default_input_to_dataset_query_variable(batch_size):
for form in user_input_form:
paragraph = form.get('paragraph')
if paragraph \
and paragraph.get('variable') == 'query':
data.dataset_query_variable = 'query'
break
and paragraph.get('variable') == 'query':
data.dataset_query_variable = 'query'
break
if paragraph \
and paragraph.get('variable') == 'default_input':
data.dataset_query_variable = 'default_input'
break
and paragraph.get('variable') == 'default_input':
data.dataset_query_variable = 'default_input'
break
db.session.commit()
@ -712,7 +746,7 @@ def migrate_default_input_to_dataset_query_variable(batch_size):
click.secho(f"Error while migrating data: {e}, app_id: {data.app_id}, app_model_config_id: {data.id}",
fg='red')
continue
click.secho(f"Successfully migrated batch {i + 1}/{num_batches}.", fg='green')
pbar.update(len(data_batch))
@ -731,3 +765,4 @@ def register_commands(app):
app.cli.add_command(update_app_model_configs)
app.cli.add_command(normalization_collections)
app.cli.add_command(migrate_default_input_to_dataset_query_variable)
app.cli.add_command(add_qdrant_full_text_index)
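A standalone sketch of what the new `add-qdrant-full-text-index` command performs per bound collection; the URL and collection name are placeholders:

```python
import qdrant_client
from qdrant_client.http.models import TextIndexParams, TextIndexType, TokenizerType

client = qdrant_client.QdrantClient(
    "http://localhost:6333",  # placeholder QDRANT_URL
    api_key=None,             # set for Qdrant Cloud, None for a local instance
)

client.create_payload_index(
    collection_name="example_collection",  # placeholder collection name
    field_name="page_content",
    field_schema=TextIndexParams(
        type=TextIndexType.TEXT,
        tokenizer=TokenizerType.MULTILINGUAL,  # enables multilingual full-text matching
        min_token_len=2,
        max_token_len=20,
        lowercase=True,
    ),
)
```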

View File

@ -60,7 +60,8 @@ DEFAULTS = {
'UPLOAD_FILE_BATCH_LIMIT': 5,
'UPLOAD_IMAGE_FILE_SIZE_LIMIT': 10,
'OUTPUT_MODERATION_BUFFER_SIZE': 300,
'MULTIMODAL_SEND_IMAGE_FORMAT': 'base64'
'MULTIMODAL_SEND_IMAGE_FORMAT': 'base64',
'INVITE_EXPIRY_HOURS': 72
}
@ -90,7 +91,7 @@ class Config:
# ------------------------
# General Configurations.
# ------------------------
self.CURRENT_VERSION = "0.3.30"
self.CURRENT_VERSION = "0.3.32"
self.COMMIT_SHA = get_env('COMMIT_SHA')
self.EDITION = "SELF_HOSTED"
self.DEPLOY_ENV = get_env('DEPLOY_ENV')
@ -218,6 +219,11 @@ class Config:
self.MAIL_TYPE = get_env('MAIL_TYPE')
self.MAIL_DEFAULT_SEND_FROM = get_env('MAIL_DEFAULT_SEND_FROM')
self.RESEND_API_KEY = get_env('RESEND_API_KEY')
# ------------------------
# Workspace Configurations.
# ------------------------
self.INVITE_EXPIRY_HOURS = int(get_env('INVITE_EXPIRY_HOURS'))
# ------------------------
# Sentry Configurations.

View File

@ -62,16 +62,15 @@ class DailyConversationStatistic(Resource):
sql_query += ' GROUP BY date order by date'
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'conversation_count': i.conversation_count
})
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({
'date': str(i.date),
'conversation_count': i.conversation_count
})
return jsonify({
'data': response_data
@ -124,16 +123,15 @@ class DailyTerminalsStatistic(Resource):
sql_query += ' GROUP BY date order by date'
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'terminal_count': i.terminal_count
})
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({
'date': str(i.date),
'terminal_count': i.terminal_count
})
return jsonify({
'data': response_data
@ -187,18 +185,17 @@ class DailyTokenCostStatistic(Resource):
sql_query += ' GROUP BY date order by date'
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'token_count': i.token_count,
'total_price': i.total_price,
'currency': 'USD'
})
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({
'date': str(i.date),
'token_count': i.token_count,
'total_price': i.total_price,
'currency': 'USD'
})
return jsonify({
'data': response_data
@ -256,16 +253,15 @@ LEFT JOIN conversations c on c.id=subquery.conversation_id
GROUP BY date
ORDER BY date"""
response_data = []
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'interactions': float(i.interactions.quantize(Decimal('0.01')))
})
for i in rs:
response_data.append({
'date': str(i.date),
'interactions': float(i.interactions.quantize(Decimal('0.01')))
})
return jsonify({
'data': response_data
@ -320,20 +316,19 @@ class UserSatisfactionRateStatistic(Resource):
sql_query += ' GROUP BY date order by date'
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'rate': round((i.feedback_count * 1000 / i.message_count) if i.message_count > 0 else 0, 2),
})
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({
'date': str(i.date),
'rate': round((i.feedback_count * 1000 / i.message_count) if i.message_count > 0 else 0, 2),
})
return jsonify({
'data': response_data
})
'data': response_data
})
class AverageResponseTimeStatistic(Resource):
@ -383,16 +378,15 @@ class AverageResponseTimeStatistic(Resource):
sql_query += ' GROUP BY date order by date'
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'latency': round(i.latency * 1000, 4)
})
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({
'date': str(i.date),
'latency': round(i.latency * 1000, 4)
})
return jsonify({
'data': response_data
@ -447,16 +441,15 @@ WHERE app_id = :app_id'''
sql_query += ' GROUP BY date order by date'
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
response_data = []
for i in rs:
response_data.append({
'date': str(i.date),
'tps': round(i.tokens_per_second, 4)
})
with db.engine.begin() as conn:
rs = conn.execute(db.text(sql_query), arg_dict)
for i in rs:
response_data.append({
'date': str(i.date),
'tps': round(i.tokens_per_second, 4)
})
return jsonify({
'data': response_data

View File

@ -170,6 +170,7 @@ class DatasetApi(Resource):
help='Invalid indexing technique.')
parser.add_argument('permission', type=str, location='json', choices=(
'only_me', 'all_team_members'), help='Invalid permission.')
parser.add_argument('retrieval_model', type=dict, location='json', help='Invalid retrieval model.')
args = parser.parse_args()
# The role of the current user in the ta table must be admin or owner
@ -401,6 +402,7 @@ class DatasetApiKeyApi(Resource):
class DatasetApiDeleteApi(Resource):
resource_type = 'dataset'
@setup_required
@login_required
@account_initialization_required
@ -436,6 +438,50 @@ class DatasetApiBaseUrlApi(Resource):
}
class DatasetRetrievalSettingApi(Resource):
@setup_required
@login_required
@account_initialization_required
def get(self):
vector_type = current_app.config['VECTOR_STORE']
if vector_type == 'milvus':
return {
'retrieval_method': [
'semantic_search'
]
}
elif vector_type == 'qdrant' or vector_type == 'weaviate':
return {
'retrieval_method': [
'semantic_search', 'full_text_search', 'hybrid_search'
]
}
else:
raise ValueError("Unsupported vector db type.")
class DatasetRetrievalSettingMockApi(Resource):
@setup_required
@login_required
@account_initialization_required
def get(self, vector_type):
if vector_type == 'milvus':
return {
'retrieval_method': [
'semantic_search'
]
}
elif vector_type == 'qdrant' or vector_type == 'weaviate':
return {
'retrieval_method': [
'semantic_search', 'full_text_search', 'hybrid_search'
]
}
else:
raise ValueError("Unsupported vector db type.")
api.add_resource(DatasetListApi, '/datasets')
api.add_resource(DatasetApi, '/datasets/<uuid:dataset_id>')
api.add_resource(DatasetQueryApi, '/datasets/<uuid:dataset_id>/queries')
@ -445,3 +491,5 @@ api.add_resource(DatasetIndexingStatusApi, '/datasets/<uuid:dataset_id>/indexing
api.add_resource(DatasetApiKeyApi, '/datasets/api-keys')
api.add_resource(DatasetApiDeleteApi, '/datasets/api-keys/<uuid:api_key_id>')
api.add_resource(DatasetApiBaseUrlApi, '/datasets/api-base-info')
api.add_resource(DatasetRetrievalSettingApi, '/datasets/retrieval-setting')
api.add_resource(DatasetRetrievalSettingMockApi, '/datasets/retrieval-setting/<string:vector_type>')
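A hypothetical call against the new retrieval-setting endpoint, assuming a local deployment where the console API is mounted at `/console/api` on port 5001 and the request carries a valid console session:

```python
import requests

# Placeholder base URL; authentication headers/cookies are omitted here.
resp = requests.get("http://localhost:5001/console/api/datasets/retrieval-setting")
print(resp.json())
# With a qdrant or weaviate vector store, the handler returns:
# {"retrieval_method": ["semantic_search", "full_text_search", "hybrid_search"]}
```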

View File

@ -221,6 +221,8 @@ class DatasetDocumentListApi(Resource):
parser.add_argument('doc_form', type=str, default='text_model', required=False, nullable=False, location='json')
parser.add_argument('doc_language', type=str, default='English', required=False, nullable=False,
location='json')
parser.add_argument('retrieval_model', type=dict, required=False, nullable=False,
location='json')
args = parser.parse_args()
if not dataset.indexing_technique and not args['indexing_technique']:
@ -263,6 +265,8 @@ class DatasetInitApi(Resource):
parser.add_argument('doc_form', type=str, default='text_model', required=False, nullable=False, location='json')
parser.add_argument('doc_language', type=str, default='English', required=False, nullable=False,
location='json')
parser.add_argument('retrieval_model', type=dict, required=False, nullable=False,
location='json')
args = parser.parse_args()
if args['indexing_technique'] == 'high_quality':
try:

View File

@ -42,19 +42,18 @@ class HitTestingApi(Resource):
parser = reqparse.RequestParser()
parser.add_argument('query', type=str, location='json')
parser.add_argument('retrieval_model', type=dict, required=False, location='json')
args = parser.parse_args()
query = args['query']
if not query or len(query) > 250:
raise ValueError('Query is required and cannot exceed 250 characters')
HitTestingService.hit_testing_args_check(args)
try:
response = HitTestingService.retrieve(
dataset=dataset,
query=query,
query=args['query'],
account=current_user,
limit=10,
retrieval_model=args['retrieval_model'],
limit=10
)
return {"query": response['query'], 'records': marshal(response['records'], hit_testing_record_fields)}
@ -68,7 +67,7 @@ class HitTestingApi(Resource):
raise ProviderModelCurrentlyNotSupportError()
except LLMBadRequestError:
raise ProviderNotInitializeError(
f"No Embedding Model available. Please configure a valid provider "
f"No Embedding Model or Reranking Model available. Please configure a valid provider "
f"in the Settings -> Model Provider.")
except ValueError as e:
raise ValueError(str(e))

View File

@ -21,8 +21,12 @@ class ModelProviderListApi(Resource):
def get(self):
tenant_id = current_user.current_tenant_id
parser = reqparse.RequestParser()
parser.add_argument('model_type', type=str, required=False, nullable=True, location='args')
args = parser.parse_args()
provider_service = ProviderService()
provider_list = provider_service.get_provider_list(tenant_id)
provider_list = provider_service.get_provider_list(tenant_id=tenant_id, model_type=args.get('model_type'))
return provider_list
@ -111,7 +115,7 @@ class ModelProviderModelValidateApi(Resource):
parser = reqparse.RequestParser()
parser.add_argument('model_name', type=str, required=True, nullable=False, location='json')
parser.add_argument('model_type', type=str, required=True, nullable=False,
choices=['text-generation', 'embeddings', 'speech2text'], location='json')
choices=['text-generation', 'embeddings', 'speech2text', 'reranking'], location='json')
parser.add_argument('config', type=dict, required=True, nullable=False, location='json')
args = parser.parse_args()
@ -151,7 +155,7 @@ class ModelProviderModelUpdateApi(Resource):
parser = reqparse.RequestParser()
parser.add_argument('model_name', type=str, required=True, nullable=False, location='json')
parser.add_argument('model_type', type=str, required=True, nullable=False,
choices=['text-generation', 'embeddings', 'speech2text'], location='json')
choices=['text-generation', 'embeddings', 'speech2text', 'reranking'], location='json')
parser.add_argument('config', type=dict, required=True, nullable=False, location='json')
args = parser.parse_args()
@ -180,7 +184,7 @@ class ModelProviderModelUpdateApi(Resource):
parser = reqparse.RequestParser()
parser.add_argument('model_name', type=str, required=True, nullable=False, location='args')
parser.add_argument('model_type', type=str, required=True, nullable=False,
choices=['text-generation', 'embeddings', 'speech2text'], location='args')
choices=['text-generation', 'embeddings', 'speech2text', 'reranking'], location='args')
args = parser.parse_args()
provider_service = ProviderService()

View File

@ -1,3 +1,5 @@
import logging
from flask_login import current_user
from libs.login import login_required
from flask_restful import Resource, reqparse
@ -19,7 +21,7 @@ class DefaultModelApi(Resource):
def get(self):
parser = reqparse.RequestParser()
parser.add_argument('model_type', type=str, required=True, nullable=False,
choices=['text-generation', 'embeddings', 'speech2text'], location='args')
choices=['text-generation', 'embeddings', 'speech2text', 'reranking'], location='args')
args = parser.parse_args()
tenant_id = current_user.current_tenant_id
@ -71,19 +73,21 @@ class DefaultModelApi(Resource):
@account_initialization_required
def post(self):
parser = reqparse.RequestParser()
parser.add_argument('model_name', type=str, required=True, nullable=False, location='json')
parser.add_argument('model_type', type=str, required=True, nullable=False,
choices=['text-generation', 'embeddings', 'speech2text'], location='json')
parser.add_argument('provider_name', type=str, required=True, nullable=False, location='json')
parser.add_argument('model_settings', type=list, required=True, nullable=False, location='json')
args = parser.parse_args()
provider_service = ProviderService()
provider_service.update_default_model_of_model_type(
tenant_id=current_user.current_tenant_id,
model_type=args['model_type'],
provider_name=args['provider_name'],
model_name=args['model_name']
)
model_settings = args['model_settings']
for model_setting in model_settings:
try:
provider_service.update_default_model_of_model_type(
tenant_id=current_user.current_tenant_id,
model_type=model_setting['model_type'],
provider_name=model_setting['provider_name'],
model_name=model_setting['model_name']
)
except Exception:
logging.warning(f"{model_setting['model_type']} save error")
return {'result': 'success'}

View File

@ -67,7 +67,7 @@ class ConversationRenameApi(AppApiResource):
parser = reqparse.RequestParser()
parser.add_argument('name', type=str, required=False, location='json')
parser.add_argument('user', type=str, location='json')
parser.add_argument('auto_generate', type=bool, required=False, default='False', location='json')
parser.add_argument('auto_generate', type=bool, required=False, default=False, location='json')
args = parser.parse_args()
if end_user is None and args['user'] is not None:

View File

@ -26,6 +26,9 @@ class FileApi(AppApiResource):
if 'file' not in request.files:
raise NoFileUploadedError()
if not file.mimetype:
raise UnsupportedFileTypeError()
if len(request.files) > 1:
raise TooManyFilesError()

View File

@ -36,6 +36,8 @@ class DocumentAddByTextApi(DatasetApiResource):
location='json')
parser.add_argument('indexing_technique', type=str, choices=Dataset.INDEXING_TECHNIQUE_LIST, nullable=False,
location='json')
parser.add_argument('retrieval_model', type=dict, required=False, nullable=False,
location='json')
args = parser.parse_args()
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)
@ -95,6 +97,8 @@ class DocumentUpdateByTextApi(DatasetApiResource):
parser.add_argument('doc_form', type=str, default='text_model', required=False, nullable=False, location='json')
parser.add_argument('doc_language', type=str, default='English', required=False, nullable=False,
location='json')
parser.add_argument('retrieval_model', type=dict, required=False, nullable=False,
location='json')
args = parser.parse_args()
dataset_id = str(dataset_id)
tenant_id = str(tenant_id)

View File

@ -14,7 +14,6 @@ from pydantic import root_validator
from core.model_providers.models.entity.message import to_prompt_messages
from core.model_providers.models.llm.base import BaseLLM
from core.third_party.langchain.llms.fake import FakeLLM
from core.tool.dataset_retriever_tool import DatasetRetrieverTool
class MultiDatasetRouterAgent(OpenAIFunctionsAgent):
@ -60,7 +59,6 @@ class MultiDatasetRouterAgent(OpenAIFunctionsAgent):
return AgentFinish(return_values={"output": ''}, log='')
elif len(self.tools) == 1:
tool = next(iter(self.tools))
tool = cast(DatasetRetrieverTool, tool)
rst = tool.run(tool_input={'query': kwargs['input']})
# output = ''
# rst_json = json.loads(rst)

View File

@ -0,0 +1,158 @@
import json
from typing import Tuple, List, Any, Union, Sequence, Optional, cast
from langchain.agents import OpenAIFunctionsAgent, BaseSingleActionAgent
from langchain.agents.openai_functions_agent.base import _format_intermediate_steps, _parse_ai_message
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.manager import Callbacks
from langchain.prompts.chat import BaseMessagePromptTemplate
from langchain.schema import AgentAction, AgentFinish, SystemMessage, Generation, LLMResult, AIMessage
from langchain.schema.language_model import BaseLanguageModel
from langchain.tools import BaseTool
from pydantic import root_validator
from core.model_providers.models.entity.message import to_prompt_messages
from core.model_providers.models.llm.base import BaseLLM
from core.third_party.langchain.llms.fake import FakeLLM
from core.tool.dataset_retriever_tool import DatasetRetrieverTool
class MultiDatasetRouterAgent(OpenAIFunctionsAgent):
"""
A Multi Dataset Retrieval Agent driven by a Router.
"""
model_instance: BaseLLM
class Config:
"""Configuration for this pydantic object."""
arbitrary_types_allowed = True
@root_validator
def validate_llm(cls, values: dict) -> dict:
return values
def should_use_agent(self, query: str):
"""
return should use agent
:param query:
:return:
"""
return True
def plan(
self,
intermediate_steps: List[Tuple[AgentAction, str]],
callbacks: Callbacks = None,
**kwargs: Any,
) -> Union[AgentAction, AgentFinish]:
"""Given input, decided what to do.
Args:
intermediate_steps: Steps the LLM has taken to date, along with observations
**kwargs: User inputs.
Returns:
Action specifying what tool to use.
"""
if len(self.tools) == 0:
return AgentFinish(return_values={"output": ''}, log='')
elif len(self.tools) == 1:
tool = next(iter(self.tools))
tool = cast(DatasetRetrieverTool, tool)
rst = tool.run(tool_input={'query': kwargs['input']})
# output = ''
# rst_json = json.loads(rst)
# for item in rst_json:
# output += f'{item["content"]}\n'
return AgentFinish(return_values={"output": rst}, log=rst)
if intermediate_steps:
_, observation = intermediate_steps[-1]
return AgentFinish(return_values={"output": observation}, log=observation)
try:
agent_decision = self.real_plan(intermediate_steps, callbacks, **kwargs)
if isinstance(agent_decision, AgentAction):
tool_inputs = agent_decision.tool_input
if isinstance(tool_inputs, dict) and 'query' in tool_inputs and 'chat_history' not in kwargs:
tool_inputs['query'] = kwargs['input']
agent_decision.tool_input = tool_inputs
else:
agent_decision.return_values['output'] = ''
return agent_decision
except Exception as e:
new_exception = self.model_instance.handle_exceptions(e)
raise new_exception
def real_plan(
self,
intermediate_steps: List[Tuple[AgentAction, str]],
callbacks: Callbacks = None,
**kwargs: Any,
) -> Union[AgentAction, AgentFinish]:
"""Given input, decided what to do.
Args:
intermediate_steps: Steps the LLM has taken to date, along with observations
**kwargs: User inputs.
Returns:
Action specifying what tool to use.
"""
agent_scratchpad = _format_intermediate_steps(intermediate_steps)
selected_inputs = {
k: kwargs[k] for k in self.prompt.input_variables if k != "agent_scratchpad"
}
full_inputs = dict(**selected_inputs, agent_scratchpad=agent_scratchpad)
prompt = self.prompt.format_prompt(**full_inputs)
messages = prompt.to_messages()
prompt_messages = to_prompt_messages(messages)
result = self.model_instance.run(
messages=prompt_messages,
functions=self.functions,
)
ai_message = AIMessage(
content=result.content,
additional_kwargs={
'function_call': result.function_call
}
)
agent_decision = _parse_ai_message(ai_message)
return agent_decision
async def aplan(
self,
intermediate_steps: List[Tuple[AgentAction, str]],
callbacks: Callbacks = None,
**kwargs: Any,
) -> Union[AgentAction, AgentFinish]:
raise NotImplementedError()
@classmethod
def from_llm_and_tools(
cls,
model_instance: BaseLLM,
tools: Sequence[BaseTool],
callback_manager: Optional[BaseCallbackManager] = None,
extra_prompt_messages: Optional[List[BaseMessagePromptTemplate]] = None,
system_message: Optional[SystemMessage] = SystemMessage(
content="You are a helpful AI assistant."
),
**kwargs: Any,
) -> BaseSingleActionAgent:
prompt = cls.create_prompt(
extra_prompt_messages=extra_prompt_messages,
system_message=system_message,
)
return cls(
model_instance=model_instance,
llm=FakeLLM(response=''),
prompt=prompt,
tools=tools,
callback_manager=callback_manager,
**kwargs,
)

View File

@ -89,7 +89,6 @@ class StructuredMultiDatasetRouterAgent(StructuredChatAgent):
return AgentFinish(return_values={"output": ''}, log='')
elif len(self.dataset_tools) == 1:
tool = next(iter(self.dataset_tools))
tool = cast(DatasetRetrieverTool, tool)
rst = tool.run(tool_input={'query': kwargs['input']})
return AgentFinish(return_values={"output": rst}, log=rst)

View File

@ -18,6 +18,7 @@ from langchain.agents import AgentExecutor as LCAgentExecutor
from core.helper import moderation
from core.model_providers.error import LLMError
from core.model_providers.models.llm.base import BaseLLM
from core.tool.dataset_multi_retriever_tool import DatasetMultiRetrieverTool
from core.tool.dataset_retriever_tool import DatasetRetrieverTool
@ -78,7 +79,7 @@ class AgentExecutor:
verbose=True
)
elif self.configuration.strategy == PlanningStrategy.ROUTER:
self.configuration.tools = [t for t in self.configuration.tools if isinstance(t, DatasetRetrieverTool)]
self.configuration.tools = [t for t in self.configuration.tools if isinstance(t, DatasetRetrieverTool) or isinstance(t, DatasetMultiRetrieverTool)]
agent = MultiDatasetRouterAgent.from_llm_and_tools(
model_instance=self.configuration.model_instance,
tools=self.configuration.tools,
@ -86,7 +87,7 @@ class AgentExecutor:
verbose=True
)
elif self.configuration.strategy == PlanningStrategy.REACT_ROUTER:
self.configuration.tools = [t for t in self.configuration.tools if isinstance(t, DatasetRetrieverTool)]
self.configuration.tools = [t for t in self.configuration.tools if isinstance(t, DatasetRetrieverTool) or isinstance(t, DatasetMultiRetrieverTool)]
agent = StructuredMultiDatasetRouterAgent.from_llm_and_tools(
model_instance=self.configuration.model_instance,
tools=self.configuration.tools,

View File

@ -10,8 +10,7 @@ from models.dataset import DocumentSegment
class DatasetIndexToolCallbackHandler:
"""Callback handler for dataset tool."""
def __init__(self, dataset_id: str, conversation_message_task: ConversationMessageTask) -> None:
self.dataset_id = dataset_id
def __init__(self, conversation_message_task: ConversationMessageTask) -> None:
self.conversation_message_task = conversation_message_task
def on_tool_end(self, documents: List[Document]) -> None:
@ -21,7 +20,6 @@ class DatasetIndexToolCallbackHandler:
# add hit count to document segment
db.session.query(DocumentSegment).filter(
DocumentSegment.dataset_id == self.dataset_id,
DocumentSegment.index_node_id == doc_id
).update(
{DocumentSegment.hit_count: DocumentSegment.hit_count + 1},

View File

@ -127,6 +127,7 @@ class Completion:
memory=memory,
rest_tokens=rest_tokens_for_context_and_memory,
chain_callback=chain_callback,
tenant_id=app.tenant_id,
retriever_from=retriever_from
)

View File

@ -3,7 +3,7 @@ from pathlib import Path
from typing import List, Union, Optional
import requests
from langchain.document_loaders import TextLoader, Docx2txtLoader
from langchain.document_loaders import TextLoader, Docx2txtLoader, UnstructuredFileLoader, UnstructuredAPIFileLoader
from langchain.schema import Document
from core.data_loader.loader.csv_loader import CSVLoader
@ -20,13 +20,13 @@ USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTM
class FileExtractor:
@classmethod
def load(cls, upload_file: UploadFile, return_text: bool = False) -> Union[List[Document] | str]:
def load(cls, upload_file: UploadFile, return_text: bool = False, is_automatic: bool = False) -> Union[List[Document] | str]:
with tempfile.TemporaryDirectory() as temp_dir:
suffix = Path(upload_file.key).suffix
file_path = f"{temp_dir}/{next(tempfile._get_candidate_names())}{suffix}"
storage.download(upload_file.key, file_path)
return cls.load_from_file(file_path, return_text, upload_file)
return cls.load_from_file(file_path, return_text, upload_file, is_automatic)
@classmethod
def load_from_url(cls, url: str, return_text: bool = False) -> Union[List[Document] | str]:
@ -44,24 +44,34 @@ class FileExtractor:
@classmethod
def load_from_file(cls, file_path: str, return_text: bool = False,
upload_file: Optional[UploadFile] = None) -> Union[List[Document] | str]:
upload_file: Optional[UploadFile] = None,
is_automatic: bool = False) -> Union[List[Document] | str]:
input_file = Path(file_path)
delimiter = '\n'
file_extension = input_file.suffix.lower()
if file_extension == '.xlsx':
loader = ExcelLoader(file_path)
elif file_extension == '.pdf':
loader = PdfLoader(file_path, upload_file=upload_file)
elif file_extension in ['.md', '.markdown']:
loader = MarkdownLoader(file_path, autodetect_encoding=True)
elif file_extension in ['.htm', '.html']:
loader = HTMLLoader(file_path)
elif file_extension == '.docx':
loader = Docx2txtLoader(file_path)
elif file_extension == '.csv':
loader = CSVLoader(file_path, autodetect_encoding=True)
if is_automatic:
loader = UnstructuredFileLoader(
file_path, strategy="hi_res", mode="elements"
)
# loader = UnstructuredAPIFileLoader(
# file_path=filenames[0],
# api_key="FAKE_API_KEY",
# )
else:
# txt
loader = TextLoader(file_path, autodetect_encoding=True)
if file_extension == '.xlsx':
loader = ExcelLoader(file_path)
elif file_extension == '.pdf':
loader = PdfLoader(file_path, upload_file=upload_file)
elif file_extension in ['.md', '.markdown']:
loader = MarkdownLoader(file_path, autodetect_encoding=True)
elif file_extension in ['.htm', '.html']:
loader = HTMLLoader(file_path)
elif file_extension == '.docx':
loader = Docx2txtLoader(file_path)
elif file_extension == '.csv':
loader = CSVLoader(file_path, autodetect_encoding=True)
else:
# txt
loader = TextLoader(file_path, autodetect_encoding=True)
return delimiter.join([document.page_content for document in loader.load()]) if return_text else loader.load()
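A sketch of the new `is_automatic` switch, assuming the class lives at `core.data_loader.file_extractor` and using a placeholder file path:

```python
from core.data_loader.file_extractor import FileExtractor

# Automatic mode: element-based extraction via unstructured's "hi_res" strategy.
docs = FileExtractor.load_from_file("/tmp/report.pdf", is_automatic=True)

# Default mode: dispatch on the file extension (.xlsx, .pdf, .md, .docx, .csv, ...).
text = FileExtractor.load_from_file("/tmp/report.pdf", return_text=True)
```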

View File

@ -40,6 +40,13 @@ class BaseVectorIndex(BaseIndex):
def _get_vector_store_class(self) -> type:
raise NotImplementedError
@abstractmethod
def search_by_full_text_index(
self, query: str,
**kwargs: Any
) -> List[Document]:
raise NotImplementedError
def search(
self, query: str,
**kwargs: Any
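One way a caller might use the new hook, sketched under the assumption that `index` is an already-constructed `BaseVectorIndex` subclass and that the semantic path accepts a `search_type` keyword (both assumptions, not shown in this diff):

```python
from typing import List
from langchain.schema import Document

def full_text_with_fallback(index, query: str, top_k: int = 2) -> List[Document]:
    """Try BM25-style full-text retrieval first; fall back to semantic search
    when the backing store doesn't support it (e.g. milvus returns [])."""
    docs = index.search_by_full_text_index(query, top_k=top_k)
    return docs or index.search(query, search_type="similarity")  # assumed kwarg
```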

View File

@ -1,16 +1,14 @@
from typing import Optional, cast
from typing import cast, Any, List
from langchain.embeddings.base import Embeddings
from langchain.schema import Document, BaseRetriever
from langchain.vectorstores import VectorStore, milvus
from langchain.schema import Document
from langchain.vectorstores import VectorStore
from pydantic import BaseModel, root_validator
from core.index.base import BaseIndex
from core.index.vector_index.base import BaseVectorIndex
from core.vector_store.milvus_vector_store import MilvusVectorStore
from core.vector_store.weaviate_vector_store import WeaviateVectorStore
from extensions.ext_database import db
from models.dataset import Dataset, DatasetCollectionBinding
from models.dataset import Dataset
class MilvusConfig(BaseModel):
@ -74,7 +72,7 @@ class MilvusVectorIndex(BaseVectorIndex):
index_params = {
'metric_type': 'IP',
'index_type': "HNSW",
'params': {"M": 8, "efConstruction": 64}
'params': {"M": 8, "efConstruction": 64}
}
self._vector_store = MilvusVectorStore.from_documents(
texts,
@ -152,3 +150,7 @@ class MilvusVectorIndex(BaseVectorIndex):
),
],
))
def search_by_full_text_index(self, query: str, **kwargs: Any) -> List[Document]:
# milvus/zilliz doesn't support bm25 search
return []

View File

@ -191,3 +191,21 @@ class QdrantVectorIndex(BaseVectorIndex):
return True
return False
def search_by_full_text_index(self, query: str, **kwargs: Any) -> List[Document]:
vector_store = self._get_vector_store()
vector_store = cast(self._get_vector_store_class(), vector_store)
from qdrant_client.http import models
return vector_store.similarity_search_by_bm25(models.Filter(
must=[
models.FieldCondition(
key="group_id",
match=models.MatchValue(value=self.dataset.id),
),
models.FieldCondition(
key="page_content",
match=models.MatchText(text=query),
)
],
), kwargs.get('top_k', 2))

View File

@ -1,4 +1,4 @@
from typing import Optional, cast
from typing import Optional, cast, Any, List
import requests
import weaviate
@ -26,6 +26,7 @@ class WeaviateConfig(BaseModel):
class WeaviateVectorIndex(BaseVectorIndex):
def __init__(self, dataset: Dataset, config: WeaviateConfig, embeddings: Embeddings):
super().__init__(dataset, embeddings)
self._client = self._init_client(config)
@ -110,7 +111,7 @@ class WeaviateVectorIndex(BaseVectorIndex):
if self._vector_store:
return self._vector_store
attributes = ['doc_id', 'dataset_id', 'document_id']
attributes = ['doc_id', 'dataset_id', 'document_id', 'doc_hash']
if self._is_origin():
attributes = ['doc_id']
@ -148,3 +149,9 @@ class WeaviateVectorIndex(BaseVectorIndex):
return True
return False
def search_by_full_text_index(self, query: str, **kwargs: Any) -> List[Document]:
vector_store = self._get_vector_store()
vector_store = cast(self._get_vector_store_class(), vector_store)
return vector_store.similarity_search_by_bm25(query, kwargs.get('top_k', 2), **kwargs)

View File

@ -49,14 +49,14 @@ class IndexingRunner:
if not dataset:
raise ValueError("no dataset found")
# load file
text_docs = self._load_data(dataset_document)
# get the process rule
processing_rule = db.session.query(DatasetProcessRule). \
filter(DatasetProcessRule.id == dataset_document.dataset_process_rule_id). \
first()
# load file
text_docs = self._load_data(dataset_document)
# get splitter
splitter = self._get_splitter(processing_rule)
@ -380,7 +380,7 @@ class IndexingRunner:
"preview": preview_texts
}
def _load_data(self, dataset_document: DatasetDocument) -> List[Document]:
def _load_data(self, dataset_document: DatasetDocument, automatic: bool = False) -> List[Document]:
# load file
if dataset_document.data_source_type not in ["upload_file", "notion_import"]:
return []
@ -396,7 +396,7 @@ class IndexingRunner:
one_or_none()
if file_detail:
text_docs = FileExtractor.load(file_detail)
text_docs = FileExtractor.load(file_detail, is_automatic=False)
elif dataset_document.data_source_type == 'notion_import':
loader = NotionLoader.from_document(dataset_document)
text_docs = loader.load()

View File

@ -9,6 +9,7 @@ from core.model_providers.models.embedding.base import BaseEmbedding
from core.model_providers.models.entity.model_params import ModelKwargs, ModelType
from core.model_providers.models.llm.base import BaseLLM
from core.model_providers.models.moderation.base import BaseModeration
from core.model_providers.models.reranking.base import BaseReranking
from core.model_providers.models.speech2text.base import BaseSpeech2Text
from extensions.ext_database import db
from models.provider import TenantDefaultModel
@ -140,6 +141,44 @@ class ModelFactory:
name=model_name
)
@classmethod
def get_reranking_model(cls,
tenant_id: str,
model_provider_name: Optional[str] = None,
model_name: Optional[str] = None) -> Optional[BaseReranking]:
"""
get reranking model.
:param tenant_id: a string representing the ID of the tenant.
:param model_provider_name:
:param model_name:
:return:
"""
if (model_provider_name is None or len(model_provider_name) == 0) and (model_name is None or len(model_name) == 0):
default_model = cls.get_default_model(tenant_id, ModelType.RERANKING)
if not default_model:
raise LLMBadRequestError(f"Default model is not available. "
f"Please configure a Default Reranking Model "
f"in the Settings -> Model Provider.")
model_provider_name = default_model.provider_name
model_name = default_model.model_name
# get model provider
model_provider = ModelProviderFactory.get_preferred_model_provider(tenant_id, model_provider_name)
if not model_provider:
raise ProviderTokenNotInitError(f"Model {model_name} provider credentials is not initialized.")
# init reranking model
model_class = model_provider.get_model_class(model_type=ModelType.RERANKING)
return model_class(
model_provider=model_provider,
name=model_name
)
@classmethod
def get_speech2text_model(cls,
tenant_id: str,
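A sketch of resolving and invoking a reranking model through the new factory method; the tenant id, document metadata, and model choice are placeholders:

```python
from langchain.schema import Document
from core.model_providers.model_factory import ModelFactory

docs = [Document(
    page_content="Refunds are processed within 14 days.",
    metadata={"doc_id": "1", "doc_hash": "h1", "document_id": "d1", "dataset_id": "ds1"},
)]

# Omitting both provider and model name falls back to the tenant's default reranking model.
reranking_model = ModelFactory.get_reranking_model(
    tenant_id="tenant-uuid",
    model_provider_name="cohere",
    model_name="rerank-multilingual-v2.0",
)
if reranking_model:
    ranked = reranking_model.rerank(
        query="refund policy", documents=docs, score_threshold=None, top_k=3,
    )
```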

View File

@ -72,6 +72,9 @@ class ModelProviderFactory:
elif provider_name == 'localai':
from core.model_providers.providers.localai_provider import LocalAIProvider
return LocalAIProvider
elif provider_name == 'cohere':
from core.model_providers.providers.cohere_provider import CohereProvider
return CohereProvider
else:
raise NotImplementedError

View File

@ -17,7 +17,7 @@ class ModelType(enum.Enum):
IMAGE = 'image'
VIDEO = 'video'
MODERATION = 'moderation'
RERANKING = 'reranking'
@staticmethod
def value_of(value):
for member in ModelType:

View File

@ -1,27 +1,45 @@
import decimal
import logging
from typing import List, Optional, Any
import openai
from langchain.callbacks.manager import Callbacks
from langchain.llms import ChatGLM
from langchain.schema import LLMResult
from langchain.schema import LLMResult, get_buffer_string
from core.model_providers.error import LLMBadRequestError
from core.model_providers.error import LLMBadRequestError, LLMRateLimitError, LLMAuthorizationError, \
LLMAPIUnavailableError, LLMAPIConnectionError
from core.model_providers.models.llm.base import BaseLLM
from core.model_providers.models.entity.message import PromptMessage, MessageType
from core.model_providers.models.entity.model_params import ModelMode, ModelKwargs
from core.third_party.langchain.llms.chat_open_ai import EnhanceChatOpenAI
class ChatGLMModel(BaseLLM):
model_mode: ModelMode = ModelMode.COMPLETION
model_mode: ModelMode = ModelMode.CHAT
def _init_client(self) -> Any:
provider_model_kwargs = self._to_model_kwargs_input(self.model_rules, self.model_kwargs)
return ChatGLM(
extra_model_kwargs = {
'top_p': provider_model_kwargs.get('top_p')
}
if provider_model_kwargs.get('max_length') is not None:
extra_model_kwargs['max_length'] = provider_model_kwargs.get('max_length')
client = EnhanceChatOpenAI(
model_name=self.name,
temperature=provider_model_kwargs.get('temperature'),
max_tokens=provider_model_kwargs.get('max_tokens'),
model_kwargs=extra_model_kwargs,
streaming=self.streaming,
callbacks=self.callbacks,
endpoint_url=self.credentials.get('api_base'),
**provider_model_kwargs
request_timeout=60,
openai_api_key="1",
openai_api_base=self.credentials['api_base'] + '/v1'
)
return client
def _run(self, messages: List[PromptMessage],
stop: Optional[List[str]] = None,
callbacks: Callbacks = None,
@ -45,19 +63,40 @@ class ChatGLMModel(BaseLLM):
:return:
"""
prompts = self._get_prompt_from_messages(messages)
return max(self._client.get_num_tokens(prompts), 0)
return max(sum([self._client.get_num_tokens(get_buffer_string([m])) for m in prompts]) - len(prompts), 0)
def get_currency(self):
return 'RMB'
def _set_model_kwargs(self, model_kwargs: ModelKwargs):
provider_model_kwargs = self._to_model_kwargs_input(self.model_rules, model_kwargs)
for k, v in provider_model_kwargs.items():
if hasattr(self.client, k):
setattr(self.client, k, v)
extra_model_kwargs = {
'top_p': provider_model_kwargs.get('top_p')
}
self.client.temperature = provider_model_kwargs.get('temperature')
self.client.max_tokens = provider_model_kwargs.get('max_tokens')
self.client.model_kwargs = extra_model_kwargs
def handle_exceptions(self, ex: Exception) -> Exception:
if isinstance(ex, ValueError):
return LLMBadRequestError(f"ChatGLM: {str(ex)}")
if isinstance(ex, openai.error.InvalidRequestError):
logging.warning("Invalid request to ChatGLM API.")
return LLMBadRequestError(str(ex))
elif isinstance(ex, openai.error.APIConnectionError):
logging.warning("Failed to connect to ChatGLM API.")
return LLMAPIConnectionError(ex.__class__.__name__ + ":" + str(ex))
elif isinstance(ex, (openai.error.APIError, openai.error.ServiceUnavailableError, openai.error.Timeout)):
logging.warning("ChatGLM service unavailable.")
return LLMAPIUnavailableError(ex.__class__.__name__ + ":" + str(ex))
elif isinstance(ex, openai.error.RateLimitError):
return LLMRateLimitError(str(ex))
elif isinstance(ex, openai.error.AuthenticationError):
return LLMAuthorizationError(str(ex))
elif isinstance(ex, openai.error.OpenAIError):
return LLMBadRequestError(ex.__class__.__name__ + ":" + str(ex))
else:
return ex
@classmethod
def support_streaming(cls):
return True
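Since ChatGLM3 serves an OpenAI-compatible chat API, the model class above now drives it through `EnhanceChatOpenAI` pointed at `<api_base>/v1`. A sketch with a placeholder endpoint:

```python
from core.third_party.langchain.llms.chat_open_ai import EnhanceChatOpenAI

client = EnhanceChatOpenAI(
    model_name="chatglm3-6b",
    temperature=0.7,
    max_tokens=256,
    model_kwargs={"top_p": 0.9},
    streaming=False,
    request_timeout=60,
    openai_api_key="1",                          # dummy key; a local server ignores it
    openai_api_base="http://localhost:8000/v1",  # placeholder api_base + '/v1'
)
```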

View File

@ -0,0 +1,36 @@
from abc import abstractmethod
from typing import Any, Optional, List
from langchain.schema import Document
from core.model_providers.models.base import BaseProviderModel
from core.model_providers.models.entity.model_params import ModelType
from core.model_providers.providers.base import BaseModelProvider
import logging
logger = logging.getLogger(__name__)
class BaseReranking(BaseProviderModel):
name: str
type: ModelType = ModelType.RERANKING
def __init__(self, model_provider: BaseModelProvider, client: Any, name: str):
super().__init__(model_provider, client)
self.name = name
@property
def base_model_name(self) -> str:
"""
get base model name
:return: str
"""
return self.name
@abstractmethod
def rerank(self, query: str, documents: List[Document], score_threshold: Optional[float], top_k: Optional[int]) -> Optional[List[Document]]:
raise NotImplementedError
@abstractmethod
def handle_exceptions(self, ex: Exception) -> Exception:
raise NotImplementedError

View File

@ -0,0 +1,73 @@
import logging
from typing import Optional, List
import cohere
import openai
from langchain.schema import Document
from core.model_providers.error import LLMBadRequestError, LLMAPIConnectionError, LLMAPIUnavailableError, \
LLMRateLimitError, LLMAuthorizationError
from core.model_providers.models.reranking.base import BaseReranking
from core.model_providers.providers.base import BaseModelProvider
class CohereReranking(BaseReranking):
def __init__(self, model_provider: BaseModelProvider, name: str):
self.credentials = model_provider.get_model_credentials(
model_name=name,
model_type=self.type
)
client = cohere.Client(self.credentials.get('api_key'))
super().__init__(model_provider, client, name)
def rerank(self, query: str, documents: List[Document], score_threshold: Optional[float], top_k: Optional[int]) -> Optional[List[Document]]:
docs = []
doc_id = []
for document in documents:
if document.metadata['doc_id'] not in doc_id:
doc_id.append(document.metadata['doc_id'])
docs.append(document.page_content)
results = self.client.rerank(query=query, documents=docs, model=self.name, top_n=top_k)
rerank_documents = []
for idx, result in enumerate(results):
# format document
rerank_document = Document(
page_content=result.document['text'],
metadata={
"doc_id": documents[result.index].metadata['doc_id'],
"doc_hash": documents[result.index].metadata['doc_hash'],
"document_id": documents[result.index].metadata['document_id'],
"dataset_id": documents[result.index].metadata['dataset_id'],
'score': result.relevance_score
}
)
# score threshold check
if score_threshold is not None:
if result.relevance_score >= score_threshold:
rerank_documents.append(rerank_document)
else:
rerank_documents.append(rerank_document)
return rerank_documents
def handle_exceptions(self, ex: Exception) -> Exception:
if isinstance(ex, openai.error.InvalidRequestError):
logging.warning("Invalid request to OpenAI API.")
return LLMBadRequestError(str(ex))
elif isinstance(ex, openai.error.APIConnectionError):
logging.warning("Failed to connect to OpenAI API.")
return LLMAPIConnectionError(ex.__class__.__name__ + ":" + str(ex))
elif isinstance(ex, (openai.error.APIError, openai.error.ServiceUnavailableError, openai.error.Timeout)):
logging.warning("OpenAI service unavailable.")
return LLMAPIUnavailableError(ex.__class__.__name__ + ":" + str(ex))
elif isinstance(ex, openai.error.RateLimitError):
return LLMRateLimitError(str(ex))
elif isinstance(ex, openai.error.AuthenticationError):
return LLMAuthorizationError(str(ex))
elif isinstance(ex, openai.error.OpenAIError):
return LLMBadRequestError(ex.__class__.__name__ + ":" + str(ex))
else:
return ex

View File

@ -0,0 +1,58 @@
import logging
from typing import Optional, List
from langchain.schema import Document
from xinference_client.client.restful.restful_client import Client
from core.model_providers.error import LLMBadRequestError
from core.model_providers.models.reranking.base import BaseReranking
from core.model_providers.providers.base import BaseModelProvider
class XinferenceReranking(BaseReranking):
def __init__(self, model_provider: BaseModelProvider, name: str):
self.credentials = model_provider.get_model_credentials(
model_name=name,
model_type=self.type
)
client = Client(self.credentials['server_url'])
super().__init__(model_provider, client, name)
def rerank(self, query: str, documents: List[Document], score_threshold: Optional[float], top_k: Optional[int]) -> Optional[List[Document]]:
docs = []
doc_id = []
for document in documents:
if document.metadata['doc_id'] not in doc_id:
doc_id.append(document.metadata['doc_id'])
docs.append(document.page_content)
model = self.client.get_model(self.credentials['model_uid'])
response = model.rerank(query=query, documents=docs, top_n=top_k)
rerank_documents = []
for idx, result in enumerate(response['results']):
# format document
index = result['index']
rerank_document = Document(
page_content=result['document'],
metadata={
"doc_id": documents[index].metadata['doc_id'],
"doc_hash": documents[index].metadata['doc_hash'],
"document_id": documents[index].metadata['document_id'],
"dataset_id": documents[index].metadata['dataset_id'],
'score': result['relevance_score']
}
)
# score threshold check
if score_threshold is not None:
if result['relevance_score'] >= score_threshold:
rerank_documents.append(rerank_document)
else:
rerank_documents.append(rerank_document)
return rerank_documents
def handle_exceptions(self, ex: Exception) -> Exception:
return LLMBadRequestError(f"Xinference rerank: {str(ex)}")

View File

@ -32,9 +32,12 @@ class AnthropicProvider(BaseModelProvider):
if model_type == ModelType.TEXT_GENERATION:
return [
{
'id': 'claude-instant-1',
'name': 'claude-instant-1',
'id': 'claude-2.1',
'name': 'claude-2.1',
'mode': ModelMode.CHAT.value,
'features': [
ModelFeature.AGENT_THOUGHT.value
]
},
{
'id': 'claude-2',
@ -44,6 +47,11 @@ class AnthropicProvider(BaseModelProvider):
ModelFeature.AGENT_THOUGHT.value
]
},
{
'id': 'claude-instant-1',
'name': 'claude-instant-1',
'mode': ModelMode.CHAT.value,
},
]
else:
return []
@ -73,12 +81,18 @@ class AnthropicProvider(BaseModelProvider):
:param model_type:
:return:
"""
model_max_tokens = {
'claude-instant-1': 100000,
'claude-2': 100000,
'claude-2.1': 200000,
}
return ModelKwargsRules(
temperature=KwargRule[float](min=0, max=1, default=1, precision=2),
top_p=KwargRule[float](min=0, max=1, default=0.7, precision=2),
presence_penalty=KwargRule[float](enabled=False),
frequency_penalty=KwargRule[float](enabled=False),
max_tokens=KwargRule[int](alias="max_tokens_to_sample", min=10, max=100000, default=256, precision=0),
max_tokens=KwargRule[int](alias="max_tokens_to_sample", min=10, max=model_max_tokens.get(model_name, 100000), default=256, precision=0),
)
@classmethod

View File

@ -2,6 +2,7 @@ import json
from json import JSONDecodeError
from typing import Type
import requests
from langchain.llms import ChatGLM
from core.helper import encrypter
@ -25,21 +26,26 @@ class ChatGLMProvider(BaseModelProvider):
if model_type == ModelType.TEXT_GENERATION:
return [
{
'id': 'chatglm2-6b',
'name': 'ChatGLM2-6B',
'mode': ModelMode.COMPLETION.value,
'id': 'chatglm3-6b',
'name': 'ChatGLM3-6B',
'mode': ModelMode.CHAT.value,
},
{
'id': 'chatglm-6b',
'name': 'ChatGLM-6B',
'mode': ModelMode.COMPLETION.value,
'id': 'chatglm3-6b-32k',
'name': 'ChatGLM3-6B-32K',
'mode': ModelMode.CHAT.value,
},
{
'id': 'chatglm2-6b',
'name': 'ChatGLM2-6B',
'mode': ModelMode.CHAT.value,
}
]
else:
return []
def _get_text_generation_model_mode(self, model_name) -> str:
return ModelMode.COMPLETION.value
return ModelMode.CHAT.value
def get_model_class(self, model_type: ModelType) -> Type[BaseProviderModel]:
"""
@ -64,16 +70,19 @@ class ChatGLMProvider(BaseModelProvider):
:return:
"""
model_max_tokens = {
'chatglm-6b': 2000,
'chatglm2-6b': 32000,
'chatglm3-6b-32k': 32000,
'chatglm3-6b': 8000,
'chatglm2-6b': 8000,
}
max_tokens_alias = 'max_length' if model_name == 'chatglm2-6b' else 'max_tokens'
return ModelKwargsRules(
temperature=KwargRule[float](min=0, max=2, default=1, precision=2),
top_p=KwargRule[float](min=0, max=1, default=0.7, precision=2),
presence_penalty=KwargRule[float](enabled=False),
frequency_penalty=KwargRule[float](enabled=False),
max_tokens=KwargRule[int](alias='max_token', min=10, max=model_max_tokens.get(model_name), default=2048, precision=0),
max_tokens=KwargRule[int](alias=max_tokens_alias, min=10, max=model_max_tokens.get(model_name), default=2048, precision=0),
)
@classmethod
@ -85,16 +94,10 @@ class ChatGLMProvider(BaseModelProvider):
raise CredentialsValidateFailedError('ChatGLM Endpoint URL must be provided.')
try:
credential_kwargs = {
'endpoint_url': credentials['api_base']
}
response = requests.get(f"{credentials['api_base']}/v1/models", timeout=5)
llm = ChatGLM(
max_token=10,
**credential_kwargs
)
llm("ping")
if response.status_code != 200:
raise Exception('ChatGLM Endpoint URL is invalid.')
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))

View File

@ -0,0 +1,152 @@
import json
from json import JSONDecodeError
from typing import Type
from langchain.schema import HumanMessage
from core.helper import encrypter
from core.model_providers.models.base import BaseProviderModel
from core.model_providers.models.entity.model_params import ModelKwargsRules, KwargRule, ModelType, ModelMode
from core.model_providers.models.reranking.cohere_reranking import CohereReranking
from core.model_providers.providers.base import BaseModelProvider, CredentialsValidateFailedError
from models.provider import ProviderType
class CohereProvider(BaseModelProvider):
@property
def provider_name(self):
"""
Returns the name of a provider.
"""
return 'cohere'
def _get_text_generation_model_mode(self, model_name) -> str:
return ModelMode.CHAT.value
def _get_fixed_model_list(self, model_type: ModelType) -> list[dict]:
if model_type == ModelType.RERANKING:
return [
{
'id': 'rerank-english-v2.0',
'name': 'rerank-english-v2.0'
},
{
'id': 'rerank-multilingual-v2.0',
'name': 'rerank-multilingual-v2.0'
}
]
else:
return []
def get_model_class(self, model_type: ModelType) -> Type[BaseProviderModel]:
"""
Returns the model class.
:param model_type:
:return:
"""
if model_type == ModelType.RERANKING:
model_class = CohereReranking
else:
raise NotImplementedError
return model_class
def get_model_parameter_rules(self, model_name: str, model_type: ModelType) -> ModelKwargsRules:
"""
get model parameter rules.
:param model_name:
:param model_type:
:return:
"""
return ModelKwargsRules(
temperature=KwargRule[float](min=0, max=1, default=0.3, precision=2),
top_p=KwargRule[float](min=0, max=0.99, default=0.85, precision=2),
presence_penalty=KwargRule[float](enabled=False),
frequency_penalty=KwargRule[float](enabled=False),
max_tokens=KwargRule[int](enabled=False),
)
@classmethod
def is_provider_credentials_valid_or_raise(cls, credentials: dict):
"""
Validates the given credentials.
"""
if 'api_key' not in credentials:
raise CredentialsValidateFailedError('Cohere api_key must be provided.')
try:
credential_kwargs = {
'api_key': credentials['api_key'],
}
# todo validate
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
@classmethod
def encrypt_provider_credentials(cls, tenant_id: str, credentials: dict) -> dict:
credentials['api_key'] = encrypter.encrypt_token(tenant_id, credentials['api_key'])
return credentials
def get_provider_credentials(self, obfuscated: bool = False) -> dict:
if self.provider.provider_type == ProviderType.CUSTOM.value:
try:
credentials = json.loads(self.provider.encrypted_config)
except JSONDecodeError:
credentials = {
'api_key': None,
}
if credentials['api_key']:
credentials['api_key'] = encrypter.decrypt_token(
self.provider.tenant_id,
credentials['api_key']
)
if obfuscated:
credentials['api_key'] = encrypter.obfuscated_token(credentials['api_key'])
return credentials
else:
return {}
def should_deduct_quota(self):
return True
@classmethod
def is_model_credentials_valid_or_raise(cls, model_name: str, model_type: ModelType, credentials: dict):
"""
check model credentials valid.
:param model_name:
:param model_type:
:param credentials:
"""
return
@classmethod
def encrypt_model_credentials(cls, tenant_id: str, model_name: str, model_type: ModelType,
credentials: dict) -> dict:
"""
encrypt model credentials for save.
:param tenant_id:
:param model_name:
:param model_type:
:param credentials:
:return:
"""
return {}
def get_model_credentials(self, model_name: str, model_type: ModelType, obfuscated: bool = False) -> dict:
"""
get credentials for llm use.
:param model_name:
:param model_type:
:param obfuscated:
:return:
"""
return self.get_provider_credentials(obfuscated)

View File

@ -2,11 +2,13 @@ import json
from typing import Type
import requests
from xinference_client.client.restful.restful_client import Client
from core.helper import encrypter
from core.model_providers.models.embedding.xinference_embedding import XinferenceEmbedding
from core.model_providers.models.entity.model_params import KwargRule, ModelKwargsRules, ModelType, ModelMode
from core.model_providers.models.llm.xinference_model import XinferenceModel
from core.model_providers.models.reranking.xinference_reranking import XinferenceReranking
from core.model_providers.providers.base import BaseModelProvider, CredentialsValidateFailedError
from core.model_providers.models.base import BaseProviderModel
@ -40,6 +42,8 @@ class XinferenceProvider(BaseModelProvider):
model_class = XinferenceModel
elif model_type == ModelType.EMBEDDINGS:
model_class = XinferenceEmbedding
elif model_type == ModelType.RERANKING:
model_class = XinferenceReranking
else:
raise NotImplementedError
@ -113,6 +117,10 @@ class XinferenceProvider(BaseModelProvider):
)
embedding.embed_query("ping")
elif model_type == ModelType.RERANKING:
rerank_client = Client(credential_kwargs['server_url'])
model = rerank_client.get_model(credential_kwargs['model_uid'])
model.rerank(query="ping", documents=["ping", "pong"], top_n=2)
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
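The rerank credential check above boils down to a single round trip; the same "ping" can be reproduced with the Xinference client directly (server URL and model UID are placeholders):

```python
from xinference_client.client.restful.restful_client import Client

client = Client("http://localhost:9997")           # placeholder server_url
model = client.get_model("my-rerank-model-uid")    # placeholder model_uid
print(model.rerank(query="ping", documents=["ping", "pong"], top_n=2))
```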

View File

@ -13,5 +13,6 @@
"huggingface_hub",
"xinference",
"openllm",
"localai"
"localai",
"cohere"
]

View File

@ -12,6 +12,9 @@
"quota_limit": 0
},
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation"
],
"price_config": {
"claude-instant-1": {
"prompt": "1.63",
@ -20,8 +23,14 @@
"currency": "USD"
},
"claude-2": {
"prompt": "11.02",
"completion": "32.68",
"prompt": "8.00",
"completion": "24.00",
"unit": "0.000001",
"currency": "USD"
},
"claude-2.1": {
"prompt": "8.00",
"completion": "24.00",
"unit": "0.000001",
"currency": "USD"
}

View File

@ -4,6 +4,10 @@
],
"system_config": null,
"model_flexibility": "configurable",
"supported_model_types": [
"text-generation",
"embeddings"
],
"price_config":{
"gpt-4": {
"prompt": "0.03",

View File

@ -4,6 +4,9 @@
],
"system_config": null,
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation"
],
"price_config": {
"baichuan2-53b": {
"prompt": "0.01",

View File

@ -3,5 +3,8 @@
"custom"
],
"system_config": null,
"model_flexibility": "fixed"
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation"
]
}

View File

@ -0,0 +1,10 @@
{
"support_provider_types": [
"custom"
],
"system_config": null,
"model_flexibility": "fixed",
"supported_model_types": [
"reranking"
]
}

View File

@ -3,5 +3,9 @@
"custom"
],
"system_config": null,
"model_flexibility": "configurable"
"model_flexibility": "configurable",
"supported_model_types": [
"text-generation",
"embeddings"
]
}

View File

@ -3,5 +3,9 @@
"custom"
],
"system_config": null,
"model_flexibility": "configurable"
"model_flexibility": "configurable",
"supported_model_types": [
"text-generation",
"embeddings"
]
}

View File

@ -10,6 +10,10 @@
"quota_unit": "tokens"
},
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation",
"embeddings"
],
"price_config": {
"abab5.5-chat": {
"prompt": "0.015",

View File

@ -11,6 +11,12 @@
"quota_limit": 200
},
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation",
"embeddings",
"speech2text",
"moderation"
],
"price_config": {
"gpt-4": {
"prompt": "0.03",

View File

@ -3,5 +3,9 @@
"custom"
],
"system_config": null,
"model_flexibility": "configurable"
"model_flexibility": "configurable",
"supported_model_types": [
"text-generation",
"embeddings"
]
}

View File

@ -3,5 +3,9 @@
"custom"
],
"system_config": null,
"model_flexibility": "configurable"
"model_flexibility": "configurable",
"supported_model_types": [
"text-generation",
"embeddings"
]
}

View File

@ -10,6 +10,9 @@
"quota_unit": "tokens"
},
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation"
],
"price_config": {
"spark": {
"prompt": "0.18",

View File

@ -4,6 +4,9 @@
],
"system_config": null,
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation"
],
"price_config": {
"qwen-turbo": {
"prompt": "0.012",

View File

@ -4,6 +4,9 @@
],
"system_config": null,
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation"
],
"price_config": {
"ernie-bot-4": {
"prompt": "0",

View File

@ -3,5 +3,10 @@
"custom"
],
"system_config": null,
"model_flexibility": "configurable"
"model_flexibility": "configurable",
"supported_model_types": [
"text-generation",
"embeddings",
"reranking"
]
}

View File

@ -10,6 +10,10 @@
"quota_unit": "tokens"
},
"model_flexibility": "fixed",
"supported_model_types": [
"text-generation",
"embeddings"
],
"price_config": {
"chatglm_turbo": {
"prompt": "0.005",

View File

@ -1,11 +1,17 @@
from typing import Optional
import json
import threading
from typing import Optional, List
from flask import Flask
from langchain import WikipediaAPIWrapper
from langchain.callbacks.manager import Callbacks
from langchain.memory.chat_memory import BaseChatMemory
from langchain.tools import BaseTool, Tool, WikipediaQueryRun
from pydantic import BaseModel, Field
from core.agent.agent.multi_dataset_router_agent import MultiDatasetRouterAgent
from core.agent.agent.output_parser.structured_chat import StructuredChatOutputParser
from core.agent.agent.structed_multi_dataset_router_agent import StructuredMultiDatasetRouterAgent
from core.agent.agent_executor import AgentExecutor, PlanningStrategy, AgentConfiguration
from core.callback_handler.agent_loop_gather_callback_handler import AgentLoopGatherCallbackHandler
from core.callback_handler.dataset_tool_callback_handler import DatasetToolCallbackHandler
@ -17,6 +23,7 @@ from core.model_providers.model_factory import ModelFactory
from core.model_providers.models.entity.model_params import ModelKwargs, ModelMode
from core.model_providers.models.llm.base import BaseLLM
from core.tool.current_datetime_tool import DatetimeTool
from core.tool.dataset_multi_retriever_tool import DatasetMultiRetrieverTool
from core.tool.dataset_retriever_tool import DatasetRetrieverTool
from core.tool.provider.serpapi_provider import SerpAPIToolProvider
from core.tool.serpapi_wrapper import OptimizedSerpAPIWrapper, OptimizedSerpAPIInput
@ -25,6 +32,16 @@ from extensions.ext_database import db
from models.dataset import Dataset, DatasetProcessRule
from models.model import AppModelConfig
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
class OrchestratorRuleParser:
"""Parse the orchestrator rule to entities."""
@ -34,7 +51,7 @@ class OrchestratorRuleParser:
self.app_model_config = app_model_config
def to_agent_executor(self, conversation_message_task: ConversationMessageTask, memory: Optional[BaseChatMemory],
rest_tokens: int, chain_callback: MainChainGatherCallbackHandler,
rest_tokens: int, chain_callback: MainChainGatherCallbackHandler, tenant_id: str,
retriever_from: str = 'dev') -> Optional[AgentExecutor]:
if not self.app_model_config.agent_mode_dict:
return None
@ -101,7 +118,8 @@ class OrchestratorRuleParser:
rest_tokens=rest_tokens,
return_resource=return_resource,
retriever_from=retriever_from,
dataset_configs=dataset_configs
dataset_configs=dataset_configs,
tenant_id=tenant_id
)
if len(tools) == 0:
@ -123,7 +141,7 @@ class OrchestratorRuleParser:
return chain
def to_tools(self, tool_configs: list, callbacks: Callbacks = None, **kwargs) -> list[BaseTool]:
"""
Convert app agent tool configs to tools
@ -132,6 +150,7 @@ class OrchestratorRuleParser:
:return:
"""
tools = []
dataset_tools = []
for tool_config in tool_configs:
tool_type = list(tool_config.keys())[0]
tool_val = list(tool_config.values())[0]
@ -140,7 +159,7 @@ class OrchestratorRuleParser:
tool = None
if tool_type == "dataset":
tool = self.to_dataset_retriever_tool(tool_config=tool_val, **kwargs)
dataset_tools.append(tool_config)
elif tool_type == "web_reader":
tool = self.to_web_reader_tool(tool_config=tool_val, **kwargs)
elif tool_type == "google_search":
@ -156,57 +175,81 @@ class OrchestratorRuleParser:
else:
tool.callbacks = callbacks
tools.append(tool)
# format dataset tool
if len(dataset_tools) > 0:
dataset_retriever_tools = self.to_dataset_retriever_tool(tool_configs=dataset_tools, **kwargs)
if dataset_retriever_tools:
tools.extend(dataset_retriever_tools)
return tools
def to_dataset_retriever_tool(self, tool_config: dict, conversation_message_task: ConversationMessageTask,
dataset_configs: dict, rest_tokens: int,
def to_dataset_retriever_tool(self, tool_configs: List, conversation_message_task: ConversationMessageTask,
return_resource: bool = False, retriever_from: str = 'dev',
**kwargs) \
-> Optional[BaseTool]:
-> Optional[List[BaseTool]]:
"""
A dataset tool is a tool that can be used to retrieve information from a dataset
:param rest_tokens:
:param tool_config:
:param dataset_configs:
:param tool_configs:
:param conversation_message_task:
:param return_resource:
:param retriever_from:
:return:
"""
# get dataset from dataset id
dataset = db.session.query(Dataset).filter(
Dataset.tenant_id == self.tenant_id,
Dataset.id == tool_config.get("id")
).first()
dataset_configs = kwargs['dataset_configs']
retrieval_model = dataset_configs.get('retrieval_model', 'single')
tools = []
dataset_ids = []
tenant_id = None
for tool_config in tool_configs:
# get dataset from dataset id
dataset = db.session.query(Dataset).filter(
Dataset.tenant_id == self.tenant_id,
Dataset.id == tool_config.get('dataset').get("id")
).first()
if not dataset:
return None
if not dataset:
continue
if dataset and (dataset.available_document_count == 0 or dataset.available_segment_count == 0):
return None
if dataset and (dataset.available_document_count == 0 or dataset.available_segment_count == 0):
continue
dataset_ids.append(dataset.id)
if retrieval_model == 'single':
retrieval_model_config = dataset.retrieval_model if dataset.retrieval_model else default_retrieval_model
top_k = retrieval_model_config['top_k']
top_k = dataset_configs.get("top_k", 2)
# dynamically adjust top_k when the remaining token number is not enough to support top_k
# top_k = self._dynamic_calc_retrieve_k(dataset=dataset, top_k=top_k, rest_tokens=rest_tokens)
# dynamically adjust top_k when the remaining token number is not enough to support top_k
top_k = self._dynamic_calc_retrieve_k(dataset=dataset, top_k=top_k, rest_tokens=rest_tokens)
score_threshold = None
score_threshold_enable = retrieval_model_config.get("score_threshold_enable")
if score_threshold_enable:
score_threshold = retrieval_model_config.get("score_threshold")
score_threshold = None
score_threshold_config = dataset_configs.get("score_threshold")
if score_threshold_config and score_threshold_config.get("enable"):
score_threshold = score_threshold_config.get("value")
tool = DatasetRetrieverTool.from_dataset(
dataset=dataset,
top_k=top_k,
score_threshold=score_threshold,
callbacks=[DatasetToolCallbackHandler(conversation_message_task)],
conversation_message_task=conversation_message_task,
return_resource=return_resource,
retriever_from=retriever_from
)
tools.append(tool)
if retrieval_model == 'multiple':
tool = DatasetMultiRetrieverTool.from_dataset(
dataset_ids=dataset_ids,
tenant_id=kwargs['tenant_id'],
top_k=dataset_configs.get('top_k', 2),
score_threshold=dataset_configs.get('score_threshold', 0.5) if dataset_configs.get('score_threshold_enable', False) else None,
callbacks=[DatasetToolCallbackHandler(conversation_message_task)],
conversation_message_task=conversation_message_task,
return_resource=return_resource,
retriever_from=retriever_from,
reranking_provider_name=dataset_configs.get('reranking_model').get('reranking_provider_name'),
reranking_model_name=dataset_configs.get('reranking_model').get('reranking_model_name')
)
tools.append(tool)
tool = DatasetRetrieverTool.from_dataset(
dataset=dataset,
top_k=top_k,
score_threshold=score_threshold,
callbacks=[DatasetToolCallbackHandler(conversation_message_task)],
conversation_message_task=conversation_message_task,
return_resource=return_resource,
retriever_from=retriever_from
)
return tool
return tools
def to_web_reader_tool(self, tool_config: dict, agent_model_instance: BaseLLM, **kwargs) -> Optional[BaseTool]:
"""

View File

@ -1,7 +1,7 @@
from typing import Dict
from httpx import Limits
from langchain.chat_models import ChatAnthropic
from langchain.schema import ChatMessage, BaseMessage, HumanMessage, AIMessage, SystemMessage
from langchain.utils import get_from_dict_or_env, check_package_version
from pydantic import root_validator
@ -29,8 +29,7 @@ class AnthropicLLM(ChatAnthropic):
base_url=values["anthropic_api_url"],
api_key=values["anthropic_api_key"],
timeout=values["default_request_timeout"],
max_retries=0,
connection_pool_limits=Limits(max_connections=200, max_keepalive_connections=100),
max_retries=0
)
values["async_client"] = anthropic.AsyncAnthropic(
base_url=values["anthropic_api_url"],
@ -46,3 +45,16 @@ class AnthropicLLM(ChatAnthropic):
"Please it install it with `pip install anthropic`."
)
return values
def _convert_one_message_to_text(self, message: BaseMessage) -> str:
if isinstance(message, ChatMessage):
message_text = f"\n\n{message.role.capitalize()}: {message.content}"
elif isinstance(message, HumanMessage):
message_text = f"{self.HUMAN_PROMPT} {message.content}"
elif isinstance(message, AIMessage):
message_text = f"{self.AI_PROMPT} {message.content}"
elif isinstance(message, SystemMessage):
message_text = f"{message.content}"
else:
raise ValueError(f"Got unknown type {message}")
return message_text

View File

@ -51,7 +51,8 @@ class OpenLLM(LLM):
) -> str:
params = {
"prompt": prompt,
"llm_config": self.llm_kwargs
"llm_config": self.llm_kwargs,
"stop": stop,
}
headers = {"Content-Type": "application/json"}
@ -65,11 +66,11 @@ class OpenLLM(LLM):
raise ValueError(f"OpenLLM HTTP {response.status_code} error: {response.text}")
json_response = response.json()
completion = json_response["responses"][0]
completion = json_response["outputs"][0]['text']
# strip the echoed prompt prefix (str.lstrip would strip characters, not the prefix)
if completion.startswith(prompt):
completion = completion[len(prompt):]
if stop is not None:
completion = enforce_stop_tokens(completion, stop)
# if stop is not None:
# completion = enforce_stop_tokens(completion, stop)
return completion

View File

@ -0,0 +1,232 @@
import json
import threading
from typing import Type, Optional, List
from flask import current_app, Flask
from langchain.tools import BaseTool
from pydantic import Field, BaseModel
from core.callback_handler.index_tool_callback_handler import DatasetIndexToolCallbackHandler
from core.conversation_message_task import ConversationMessageTask
from core.embedding.cached_embedding import CacheEmbedding
from core.index.keyword_table_index.keyword_table_index import KeywordTableIndex, KeywordTableConfig
from core.model_providers.error import LLMBadRequestError, ProviderTokenNotInitError
from core.model_providers.model_factory import ModelFactory
from extensions.ext_database import db
from models.dataset import Dataset, DocumentSegment, Document
from services.retrieval_service import RetrievalService
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
class DatasetMultiRetrieverToolInput(BaseModel):
query: str = Field(..., description="dataset multi retriever and rerank")
class DatasetMultiRetrieverTool(BaseTool):
"""Tool for querying multi dataset."""
name: str = "dataset-"
args_schema: Type[BaseModel] = DatasetMultiRetrieverToolInput
description: str = "dataset multi retriever and rerank. "
tenant_id: str
dataset_ids: List[str]
top_k: int = 2
score_threshold: Optional[float] = None
reranking_provider_name: str
reranking_model_name: str
conversation_message_task: ConversationMessageTask
return_resource: bool
retriever_from: str
@classmethod
def from_dataset(cls, dataset_ids: List[str], tenant_id: str, **kwargs):
return cls(
name=f'dataset-{tenant_id}',
tenant_id=tenant_id,
dataset_ids=dataset_ids,
**kwargs
)
def _run(self, query: str) -> str:
threads = []
all_documents = []
for dataset_id in self.dataset_ids:
retrieval_thread = threading.Thread(target=self._retriever, kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': dataset_id,
'query': query,
'all_documents': all_documents
})
threads.append(retrieval_thread)
retrieval_thread.start()
for thread in threads:
thread.join()
# rerank the documents retrieved from all datasets
rerank = ModelFactory.get_reranking_model(
tenant_id=self.tenant_id,
model_provider_name=self.reranking_provider_name,
model_name=self.reranking_model_name
)
all_documents = rerank.rerank(query, all_documents, self.score_threshold, self.top_k)
hit_callback = DatasetIndexToolCallbackHandler(self.conversation_message_task)
hit_callback.on_tool_end(all_documents)
document_score_list = {}
for item in all_documents:
document_score_list[item.metadata['doc_id']] = item.metadata['score']
document_context_list = []
index_node_ids = [document.metadata['doc_id'] for document in all_documents]
segments = DocumentSegment.query.filter(
DocumentSegment.completed_at.isnot(None),
DocumentSegment.status == 'completed',
DocumentSegment.enabled == True,
DocumentSegment.index_node_id.in_(index_node_ids)
).all()
if segments:
index_node_id_to_position = {id: position for position, id in enumerate(index_node_ids)}
sorted_segments = sorted(segments,
key=lambda segment: index_node_id_to_position.get(segment.index_node_id,
float('inf')))
for segment in sorted_segments:
if segment.answer:
document_context_list.append(f'question:{segment.content} answer:{segment.answer}')
else:
document_context_list.append(segment.content)
if self.return_resource:
context_list = []
resource_number = 1
for segment in sorted_segments:
dataset = Dataset.query.filter_by(
id=segment.dataset_id
).first()
document = Document.query.filter(Document.id == segment.document_id,
Document.enabled == True,
Document.archived == False,
).first()
if dataset and document:
source = {
'position': resource_number,
'dataset_id': dataset.id,
'dataset_name': dataset.name,
'document_id': document.id,
'document_name': document.name,
'data_source_type': document.data_source_type,
'segment_id': segment.id,
'retriever_from': self.retriever_from,
'score': document_score_list.get(segment.index_node_id, None)
}
if self.retriever_from == 'dev':
source['hit_count'] = segment.hit_count
source['word_count'] = segment.word_count
source['segment_position'] = segment.position
source['index_node_hash'] = segment.index_node_hash
if segment.answer:
source['content'] = f'question:{segment.content} \nanswer:{segment.answer}'
else:
source['content'] = segment.content
context_list.append(source)
resource_number += 1
hit_callback.return_retriever_resource_info(context_list)
return str("\n".join(document_context_list))
async def _arun(self, tool_input: str) -> str:
raise NotImplementedError()
def _retriever(self, flask_app: Flask, dataset_id: str, query: str, all_documents: List):
with flask_app.app_context():
dataset = db.session.query(Dataset).filter(
Dataset.tenant_id == self.tenant_id,
Dataset.id == dataset_id
).first()
if not dataset:
return []
# get the retrieval model; fall back to the default if none is set
retrieval_model = dataset.retrieval_model if dataset.retrieval_model else default_retrieval_model
if dataset.indexing_technique == "economy":
# use keyword table query
kw_table_index = KeywordTableIndex(
dataset=dataset,
config=KeywordTableConfig(
max_keywords_per_chunk=5
)
)
documents = kw_table_index.search(query, search_kwargs={'k': self.top_k})
if documents:
all_documents.extend(documents)
else:
try:
embedding_model = ModelFactory.get_embedding_model(
tenant_id=dataset.tenant_id,
model_provider_name=dataset.embedding_model_provider,
model_name=dataset.embedding_model
)
except LLMBadRequestError:
return []
except ProviderTokenNotInitError:
return []
embeddings = CacheEmbedding(embedding_model)
documents = []
threads = []
if self.top_k > 0:
# semantic search retrieval (also used for hybrid search)
if retrieval_model['search_method'] == 'semantic_search' or retrieval_model[
'search_method'] == 'hybrid_search':
embedding_thread = threading.Thread(target=RetrievalService.embedding_search, kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': str(dataset.id),
'query': query,
'top_k': self.top_k,
'score_threshold': self.score_threshold,
'reranking_model': None,
'all_documents': documents,
'search_method': 'hybrid_search',
'embeddings': embeddings
})
threads.append(embedding_thread)
embedding_thread.start()
# full-text search retrieval (also used for hybrid search)
if retrieval_model['search_method'] == 'full_text_search' or retrieval_model[
'search_method'] == 'hybrid_search':
full_text_index_thread = threading.Thread(target=RetrievalService.full_text_index_search,
kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': str(dataset.id),
'query': query,
'search_method': 'hybrid_search',
'embeddings': embeddings,
'score_threshold': retrieval_model[
'score_threshold'] if retrieval_model[
'score_threshold_enable'] else None,
'top_k': self.top_k,
'reranking_model': retrieval_model[
'reranking_model'] if retrieval_model[
'reranking_enable'] else None,
'all_documents': documents
})
threads.append(full_text_index_thread)
full_text_index_thread.start()
for thread in threads:
thread.join()
all_documents.extend(documents)

View File

@ -1,5 +1,6 @@
import json
from typing import Type, Optional
import threading
from typing import Type, Optional, List
from flask import current_app
from langchain.tools import BaseTool
@ -14,6 +15,18 @@ from core.model_providers.error import LLMBadRequestError, ProviderTokenNotInitE
from core.model_providers.model_factory import ModelFactory
from extensions.ext_database import db
from models.dataset import Dataset, DocumentSegment, Document
from services.retrieval_service import RetrievalService
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
class DatasetRetrieverToolInput(BaseModel):
@ -56,7 +69,9 @@ class DatasetRetrieverTool(BaseTool):
).first()
if not dataset:
return f'[{self.name} failed to find dataset with id {self.dataset_id}.]'
return ''
# get the retrieval model; fall back to the default if none is set
retrieval_model = dataset.retrieval_model if dataset.retrieval_model else default_retrieval_model
if dataset.indexing_technique == "economy":
# use keyword table query
@ -83,28 +98,62 @@ class DatasetRetrieverTool(BaseTool):
return ''
embeddings = CacheEmbedding(embedding_model)
vector_index = VectorIndex(
dataset=dataset,
config=current_app.config,
embeddings=embeddings
)
documents = []
threads = []
if self.top_k > 0:
documents = vector_index.search(
query,
search_type='similarity_score_threshold',
search_kwargs={
'k': self.top_k,
'score_threshold': self.score_threshold,
'filter': {
'group_id': [dataset.id]
}
}
)
# semantic search retrieval (also used for hybrid search)
if retrieval_model['search_method'] == 'semantic_search' or retrieval_model['search_method'] == 'hybrid_search':
embedding_thread = threading.Thread(target=RetrievalService.embedding_search, kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': str(dataset.id),
'query': query,
'top_k': self.top_k,
'score_threshold': retrieval_model['score_threshold'] if retrieval_model[
'score_threshold_enable'] else None,
'reranking_model': retrieval_model['reranking_model'] if retrieval_model[
'reranking_enable'] else None,
'all_documents': documents,
'search_method': retrieval_model['search_method'],
'embeddings': embeddings
})
threads.append(embedding_thread)
embedding_thread.start()
# full-text search retrieval (also used for hybrid search)
if retrieval_model['search_method'] == 'full_text_search' or retrieval_model['search_method'] == 'hybrid_search':
full_text_index_thread = threading.Thread(target=RetrievalService.full_text_index_search, kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': str(dataset.id),
'query': query,
'search_method': retrieval_model['search_method'],
'embeddings': embeddings,
'score_threshold': retrieval_model['score_threshold'] if retrieval_model[
'score_threshold_enable'] else None,
'top_k': self.top_k,
'reranking_model': retrieval_model['reranking_model'] if retrieval_model[
'reranking_enable'] else None,
'all_documents': documents
})
threads.append(full_text_index_thread)
full_text_index_thread.start()
for thread in threads:
thread.join()
# hybrid search: rerank after all documents have been searched
if retrieval_model['search_method'] == 'hybrid_search':
hybrid_rerank = ModelFactory.get_reranking_model(
tenant_id=dataset.tenant_id,
model_provider_name=retrieval_model['reranking_model']['reranking_provider_name'],
model_name=retrieval_model['reranking_model']['reranking_model_name']
)
documents = hybrid_rerank.rerank(query, documents,
retrieval_model['score_threshold'] if retrieval_model['score_threshold_enable'] else None,
self.top_k)
else:
documents = []
hit_callback = DatasetIndexToolCallbackHandler(dataset.id, self.conversation_message_task)
hit_callback = DatasetIndexToolCallbackHandler(self.conversation_message_task)
hit_callback.on_tool_end(documents)
document_score_list = {}
if dataset.indexing_technique != "economy":
@ -147,10 +196,10 @@ class DatasetRetrieverTool(BaseTool):
'document_name': document.name,
'data_source_type': document.data_source_type,
'segment_id': segment.id,
'retriever_from': self.retriever_from
'retriever_from': self.retriever_from,
'score': document_score_list.get(segment.index_node_id, None)
}
if dataset.indexing_technique != "economy":
source['score'] = document_score_list.get(segment.index_node_id)
if self.retriever_from == 'dev':
source['hit_count'] = segment.hit_count
source['word_count'] = segment.word_count

View File

@ -1,4 +1,4 @@
from core.index.vector_index.milvus import Milvus
from core.vector_store.vector.milvus import Milvus
class MilvusVectorStore(Milvus):

View File

@ -4,7 +4,7 @@ from langchain.schema import Document
from qdrant_client.http.models import Filter, PointIdsList, FilterSelector
from qdrant_client.local.qdrant_local import QdrantLocal
from core.index.vector_index.qdrant import Qdrant
from core.vector_store.vector.qdrant import Qdrant
class QdrantVectorStore(Qdrant):
@ -73,3 +73,4 @@ class QdrantVectorStore(Qdrant):
if isinstance(self.client, QdrantLocal):
self.client = cast(QdrantLocal, self.client)
self.client._load()

View File

@ -28,7 +28,7 @@ from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from langchain.vectorstores import VectorStore
from langchain.vectorstores.utils import maximal_marginal_relevance
from qdrant_client.http.models import PayloadSchemaType
from qdrant_client.http.models import PayloadSchemaType, FilterSelector, TextIndexParams, TokenizerType, TextIndexType
if TYPE_CHECKING:
from qdrant_client import grpc # noqa
@ -189,14 +189,25 @@ class Qdrant(VectorStore):
texts, metadatas, ids, batch_size
):
self.client.upsert(
collection_name=self.collection_name, points=points, **kwargs
collection_name=self.collection_name, points=points
)
added_ids.extend(batch_ids)
# if this is a new collection, create a payload index on group_id
if self.is_new_collection:
# create payload index
self.client.create_payload_index(self.collection_name, self.group_payload_key,
field_schema=PayloadSchemaType.KEYWORD,
field_type=PayloadSchemaType.KEYWORD)
# create full-text index
text_index_params = TextIndexParams(
type=TextIndexType.TEXT,
tokenizer=TokenizerType.MULTILINGUAL,
min_token_len=2,
max_token_len=20,
lowercase=True
)
self.client.create_payload_index(self.collection_name, self.content_payload_key,
field_schema=text_index_params)
return added_ids
@sync_call_fallback
@ -600,7 +611,7 @@ class Qdrant(VectorStore):
limit=k,
offset=offset,
with_payload=True,
with_vectors=True, # Langchain does not expect vectors to be returned
with_vectors=True,
score_threshold=score_threshold,
consistency=consistency,
**kwargs,
@ -615,6 +626,39 @@ class Qdrant(VectorStore):
for result in results
]
def similarity_search_by_bm25(
self,
filter: Optional[MetadataFilter] = None,
k: int = 4
) -> List[Document]:
"""Return docs most similar by bm25.
Args:
embedding: Embedding vector to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
filter: Filter by metadata. Defaults to None.
search_params: Additional search params
Returns:
List of documents most similar to the query text and distance for each.
"""
response = self.client.scroll(
collection_name=self.collection_name,
scroll_filter=filter,
limit=k,
with_payload=True,
with_vectors=True
)
results = response[0]
documents = []
for result in results:
if result:
documents.append(self._document_from_scored_point(
result, self.content_payload_key, self.metadata_payload_key
))
return documents
@sync_call_fallback
async def asimilarity_search_with_score_by_vector(
self,

View File

@ -0,0 +1,506 @@
"""Wrapper around weaviate vector database."""
from __future__ import annotations
import datetime
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple, Type
from uuid import uuid4
import numpy as np
from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env
from langchain.vectorstores.base import VectorStore
from langchain.vectorstores.utils import maximal_marginal_relevance
def _default_schema(index_name: str) -> Dict:
return {
"class": index_name,
"properties": [
{
"name": "text",
"dataType": ["text"],
}
],
}
def _create_weaviate_client(**kwargs: Any) -> Any:
client = kwargs.get("client")
if client is not None:
return client
weaviate_url = get_from_dict_or_env(kwargs, "weaviate_url", "WEAVIATE_URL")
try:
# the weaviate api key param should not be mandatory
weaviate_api_key = get_from_dict_or_env(
kwargs, "weaviate_api_key", "WEAVIATE_API_KEY", None
)
except ValueError:
weaviate_api_key = None
try:
import weaviate
except ImportError:
raise ValueError(
"Could not import weaviate python package. "
"Please install it with `pip install weaviate-client`"
)
auth = (
weaviate.auth.AuthApiKey(api_key=weaviate_api_key)
if weaviate_api_key is not None
else None
)
client = weaviate.Client(weaviate_url, auth_client_secret=auth)
return client
def _default_score_normalizer(val: float) -> float:
return 1 - val
def _json_serializable(value: Any) -> Any:
if isinstance(value, datetime.datetime):
return value.isoformat()
return value
class Weaviate(VectorStore):
"""Wrapper around Weaviate vector database.
To use, you should have the ``weaviate-client`` python package installed.
Example:
.. code-block:: python
import weaviate
from langchain.vectorstores import Weaviate
client = weaviate.Client(url=os.environ["WEAVIATE_URL"], ...)
weaviate = Weaviate(client, index_name, text_key)
"""
def __init__(
self,
client: Any,
index_name: str,
text_key: str,
embedding: Optional[Embeddings] = None,
attributes: Optional[List[str]] = None,
relevance_score_fn: Optional[
Callable[[float], float]
] = _default_score_normalizer,
by_text: bool = True,
):
"""Initialize with Weaviate client."""
try:
import weaviate
except ImportError:
raise ValueError(
"Could not import weaviate python package. "
"Please install it with `pip install weaviate-client`."
)
if not isinstance(client, weaviate.Client):
raise ValueError(
f"client should be an instance of weaviate.Client, got {type(client)}"
)
self._client = client
self._index_name = index_name
self._embedding = embedding
self._text_key = text_key
self._query_attrs = [self._text_key]
self.relevance_score_fn = relevance_score_fn
self._by_text = by_text
if attributes is not None:
self._query_attrs.extend(attributes)
@property
def embeddings(self) -> Optional[Embeddings]:
return self._embedding
def _select_relevance_score_fn(self) -> Callable[[float], float]:
return (
self.relevance_score_fn
if self.relevance_score_fn
else _default_score_normalizer
)
def add_texts(
self,
texts: Iterable[str],
metadatas: Optional[List[dict]] = None,
**kwargs: Any,
) -> List[str]:
"""Upload texts with metadata (properties) to Weaviate."""
from weaviate.util import get_valid_uuid
ids = []
embeddings: Optional[List[List[float]]] = None
if self._embedding:
if not isinstance(texts, list):
texts = list(texts)
embeddings = self._embedding.embed_documents(texts)
with self._client.batch as batch:
for i, text in enumerate(texts):
data_properties = {self._text_key: text}
if metadatas is not None:
for key, val in metadatas[i].items():
data_properties[key] = _json_serializable(val)
# Allow for ids (consistent w/ other methods)
# or uuids (backwards compatible with the existing arg)
# If the UUID of one of the objects already exists
# then the existing object will be replaced by the new object.
_id = get_valid_uuid(uuid4())
if "uuids" in kwargs:
_id = kwargs["uuids"][i]
elif "ids" in kwargs:
_id = kwargs["ids"][i]
batch.add_data_object(
data_object=data_properties,
class_name=self._index_name,
uuid=_id,
vector=embeddings[i] if embeddings else None,
)
ids.append(_id)
return ids
def similarity_search(
self, query: str, k: int = 4, **kwargs: Any
) -> List[Document]:
"""Return docs most similar to query.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
Returns:
List of Documents most similar to the query.
"""
if self._by_text:
return self.similarity_search_by_text(query, k, **kwargs)
else:
if self._embedding is None:
raise ValueError(
"_embedding cannot be None for similarity_search when "
"_by_text=False"
)
embedding = self._embedding.embed_query(query)
return self.similarity_search_by_vector(embedding, k, **kwargs)
def similarity_search_by_text(
self, query: str, k: int = 4, **kwargs: Any
) -> List[Document]:
"""Return docs most similar to query.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
Returns:
List of Documents most similar to the query.
"""
content: Dict[str, Any] = {"concepts": [query]}
if kwargs.get("search_distance"):
content["certainty"] = kwargs.get("search_distance")
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
if kwargs.get("additional"):
query_obj = query_obj.with_additional(kwargs.get("additional"))
result = query_obj.with_near_text(content).with_limit(k).do()
if "errors" in result:
raise ValueError(f"Error during query: {result['errors']}")
docs = []
for res in result["data"]["Get"][self._index_name]:
text = res.pop(self._text_key)
docs.append(Document(page_content=text, metadata=res))
return docs
def similarity_search_by_bm25(
self, query: str, k: int = 4, **kwargs: Any
) -> List[Document]:
"""Return docs using BM25F.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
Returns:
List of Documents most similar to the query.
"""
content: Dict[str, Any] = {"concepts": [query]}
if kwargs.get("search_distance"):
content["certainty"] = kwargs.get("search_distance")
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
if kwargs.get("additional"):
query_obj = query_obj.with_additional(kwargs.get("additional"))
properties = ['text', 'dataset_id', 'doc_hash', 'doc_id', 'document_id']
result = query_obj.with_bm25(query=query, properties=properties).with_limit(k).do()
if "errors" in result:
raise ValueError(f"Error during query: {result['errors']}")
docs = []
for res in result["data"]["Get"][self._index_name]:
text = res.pop(self._text_key)
docs.append(Document(page_content=text, metadata=res))
return docs
def similarity_search_by_vector(
self, embedding: List[float], k: int = 4, **kwargs: Any
) -> List[Document]:
"""Look up similar documents by embedding vector in Weaviate."""
vector = {"vector": embedding}
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
if kwargs.get("additional"):
query_obj = query_obj.with_additional(kwargs.get("additional"))
result = query_obj.with_near_vector(vector).with_limit(k).do()
if "errors" in result:
raise ValueError(f"Error during query: {result['errors']}")
docs = []
for res in result["data"]["Get"][self._index_name]:
text = res.pop(self._text_key)
docs.append(Document(page_content=text, metadata=res))
return docs
def max_marginal_relevance_search(
self,
query: str,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
**kwargs: Any,
) -> List[Document]:
"""Return docs selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity
among selected documents.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
fetch_k: Number of Documents to fetch to pass to MMR algorithm.
lambda_mult: Number between 0 and 1 that determines the degree
of diversity among the results with 0 corresponding
to maximum diversity and 1 to minimum diversity.
Defaults to 0.5.
Returns:
List of Documents selected by maximal marginal relevance.
"""
if self._embedding is not None:
embedding = self._embedding.embed_query(query)
else:
raise ValueError(
"max_marginal_relevance_search requires a suitable Embeddings object"
)
return self.max_marginal_relevance_search_by_vector(
embedding, k=k, fetch_k=fetch_k, lambda_mult=lambda_mult, **kwargs
)
def max_marginal_relevance_search_by_vector(
self,
embedding: List[float],
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
**kwargs: Any,
) -> List[Document]:
"""Return docs selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity
among selected documents.
Args:
embedding: Embedding to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
fetch_k: Number of Documents to fetch to pass to MMR algorithm.
lambda_mult: Number between 0 and 1 that determines the degree
of diversity among the results with 0 corresponding
to maximum diversity and 1 to minimum diversity.
Defaults to 0.5.
Returns:
List of Documents selected by maximal marginal relevance.
"""
vector = {"vector": embedding}
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
results = (
query_obj.with_additional("vector")
.with_near_vector(vector)
.with_limit(fetch_k)
.do()
)
payload = results["data"]["Get"][self._index_name]
embeddings = [result["_additional"]["vector"] for result in payload]
mmr_selected = maximal_marginal_relevance(
np.array(embedding), embeddings, k=k, lambda_mult=lambda_mult
)
docs = []
for idx in mmr_selected:
text = payload[idx].pop(self._text_key)
payload[idx].pop("_additional")
meta = payload[idx]
docs.append(Document(page_content=text, metadata=meta))
return docs
def similarity_search_with_score(
self, query: str, k: int = 4, **kwargs: Any
) -> List[Tuple[Document, float]]:
"""
Return list of documents most similar to the query
text and cosine distance in float for each.
Lower score represents more similarity.
"""
if self._embedding is None:
raise ValueError(
"_embedding cannot be None for similarity_search_with_score"
)
content: Dict[str, Any] = {"concepts": [query]}
if kwargs.get("search_distance"):
content["certainty"] = kwargs.get("search_distance")
query_obj = self._client.query.get(self._index_name, self._query_attrs)
embedded_query = self._embedding.embed_query(query)
if not self._by_text:
vector = {"vector": embedded_query}
result = (
query_obj.with_near_vector(vector)
.with_limit(k)
.with_additional(["vector", "distance"])
.do()
)
else:
result = (
query_obj.with_near_text(content)
.with_limit(k)
.with_additional(["vector", "distance"])
.do()
)
if "errors" in result:
raise ValueError(f"Error during query: {result['errors']}")
docs_and_scores = []
for res in result["data"]["Get"][self._index_name]:
text = res.pop(self._text_key)
score = res["_additional"]["distance"]
docs_and_scores.append((Document(page_content=text, metadata=res), score))
return docs_and_scores
@classmethod
def from_texts(
cls: Type[Weaviate],
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
**kwargs: Any,
) -> Weaviate:
"""Construct Weaviate wrapper from raw documents.
This is a user-friendly interface that:
1. Embeds documents.
2. Creates a new index for the embeddings in the Weaviate instance.
3. Adds the documents to the newly created Weaviate index.
This is intended to be a quick way to get started.
Example:
.. code-block:: python
from langchain.vectorstores.weaviate import Weaviate
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
weaviate = Weaviate.from_texts(
texts,
embeddings,
weaviate_url="http://localhost:8080"
)
"""
client = _create_weaviate_client(**kwargs)
from weaviate.util import get_valid_uuid
index_name = kwargs.get("index_name", f"LangChain_{uuid4().hex}")
embeddings = embedding.embed_documents(texts) if embedding else None
text_key = "text"
schema = _default_schema(index_name)
attributes = list(metadatas[0].keys()) if metadatas else None
# check whether the index already exists
if not client.schema.contains(schema):
client.schema.create_class(schema)
with client.batch as batch:
for i, text in enumerate(texts):
data_properties = {
text_key: text,
}
if metadatas is not None:
for key in metadatas[i].keys():
data_properties[key] = metadatas[i][key]
# If the UUID of one of the objects already exists
# then the existing object will be replaced by the new object.
if "uuids" in kwargs:
_id = kwargs["uuids"][i]
else:
_id = get_valid_uuid(uuid4())
# if an embedding strategy is not provided, we let
# weaviate create the embedding. Note that this will only
# work if weaviate has been installed with a vectorizer module
# like text2vec-contextionary for example
params = {
"uuid": _id,
"data_object": data_properties,
"class_name": index_name,
}
if embeddings is not None:
params["vector"] = embeddings[i]
batch.add_data_object(**params)
batch.flush()
relevance_score_fn = kwargs.get("relevance_score_fn")
by_text: bool = kwargs.get("by_text", False)
return cls(
client,
index_name,
text_key,
embedding=embedding,
attributes=attributes,
relevance_score_fn=relevance_score_fn,
by_text=by_text,
)
def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> None:
"""Delete by vector IDs.
Args:
ids: List of ids to delete.
"""
if ids is None:
raise ValueError("No ids provided to delete.")
# TODO: Check if this can be done in bulk
for id in ids:
self._client.data_object.delete(uuid=id)
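A usage sketch for the new BM25 path, assuming a reachable Weaviate instance and an existing index already populated with the properties the method queries; the URL and index name are hypothetical:

import weaviate

client = weaviate.Client("http://localhost:8080")   # hypothetical local instance
store = Weaviate(client, index_name="LangChain_demo",  # hypothetical index
                 text_key="text", attributes=["dataset_id", "doc_id"])
docs = store.similarity_search_by_bm25("hybrid retrieval", k=4)
for doc in docs:
    print(doc.page_content)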

View File

@ -1,4 +1,4 @@
from langchain.vectorstores import Weaviate
from core.vector_store.vector.weaviate import Weaviate
class WeaviateVectorStore(Weaviate):

View File

@ -12,6 +12,21 @@ dataset_fields = {
'created_at': TimestampField,
}
reranking_model_fields = {
'reranking_provider_name': fields.String,
'reranking_model_name': fields.String
}
dataset_retrieval_model_fields = {
'search_method': fields.String,
'reranking_enable': fields.Boolean,
'reranking_model': fields.Nested(reranking_model_fields),
'top_k': fields.Integer,
'score_threshold_enable': fields.Boolean,
'score_threshold': fields.Float
}
dataset_detail_fields = {
'id': fields.String,
'name': fields.String,
@ -29,7 +44,8 @@ dataset_detail_fields = {
'updated_at': TimestampField,
'embedding_model': fields.String,
'embedding_model_provider': fields.String,
'embedding_available': fields.Boolean
'embedding_available': fields.Boolean,
'retrieval_model_dict': fields.Nested(dataset_retrieval_model_fields)
}
dataset_query_detail_fields = {
@ -41,3 +57,5 @@ dataset_query_detail_fields = {
"created_by": fields.String,
"created_at": TimestampField
}

View File

@ -0,0 +1,43 @@
"""add-dataset-retrival-model
Revision ID: fca025d3b60f
Revises: b3a09c049e8e
Create Date: 2023-11-03 13:08:23.246396
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
# revision identifiers, used by Alembic.
revision = 'fca025d3b60f'
down_revision = '8fe468ba0ca5'
branch_labels = None
depends_on = None
def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.drop_table('sessions')
with op.batch_alter_table('datasets', schema=None) as batch_op:
batch_op.add_column(sa.Column('retrieval_model', postgresql.JSONB(astext_type=sa.Text()), nullable=True))
batch_op.create_index('retrieval_model_idx', ['retrieval_model'], unique=False, postgresql_using='gin')
# ### end Alembic commands ###
def downgrade():
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table('datasets', schema=None) as batch_op:
batch_op.drop_index('retrieval_model_idx', postgresql_using='gin')
batch_op.drop_column('retrieval_model')
op.create_table('sessions',
sa.Column('id', sa.INTEGER(), autoincrement=True, nullable=False),
sa.Column('session_id', sa.VARCHAR(length=255), autoincrement=False, nullable=True),
sa.Column('data', postgresql.BYTEA(), autoincrement=False, nullable=True),
sa.Column('expiry', postgresql.TIMESTAMP(), autoincrement=False, nullable=True),
sa.PrimaryKeyConstraint('id', name='sessions_pkey'),
sa.UniqueConstraint('session_id', name='sessions_session_id_key')
)
# ### end Alembic commands ###
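The GIN index created above is what keeps containment queries on the new JSONB column cheap. A sketch using SQLAlchemy's JSONB containment operator, which compiles to @> and can use that index:

from extensions.ext_database import db
from models.dataset import Dataset

# find datasets configured for hybrid search
hybrid = db.session.query(Dataset).filter(
    Dataset.retrieval_model.contains({"search_method": "hybrid_search"})
).all()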

View File

@ -3,7 +3,7 @@ import pickle
from json import JSONDecodeError
from sqlalchemy import func
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.dialects.postgresql import UUID, JSONB
from extensions.ext_database import db
from models.account import Account
@ -15,6 +15,7 @@ class Dataset(db.Model):
__table_args__ = (
db.PrimaryKeyConstraint('id', name='dataset_pkey'),
db.Index('dataset_tenant_idx', 'tenant_id'),
db.Index('retrieval_model_idx', "retrieval_model", postgresql_using='gin')
)
INDEXING_TECHNIQUE_LIST = ['high_quality', 'economy']
@ -39,7 +40,7 @@ class Dataset(db.Model):
embedding_model = db.Column(db.String(255), nullable=True)
embedding_model_provider = db.Column(db.String(255), nullable=True)
collection_binding_id = db.Column(UUID, nullable=True)
retrieval_model = db.Column(JSONB, nullable=True)
@property
def dataset_keyword_table(self):
@ -93,6 +94,20 @@ class Dataset(db.Model):
return Document.query.with_entities(func.coalesce(func.sum(Document.word_count))) \
.filter(Document.dataset_id == self.id).scalar()
@property
def retrieval_model_dict(self):
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
return self.retrieval_model if self.retrieval_model else default_retrieval_model
class DatasetProcessRule(db.Model):
__tablename__ = 'dataset_process_rules'
@ -120,7 +135,7 @@ class DatasetProcessRule(db.Model):
],
'segmentation': {
'delimiter': '\n',
'max_tokens': 1000
'max_tokens': 512
}
}
@ -462,4 +477,3 @@ class DatasetCollectionBinding(db.Model):
model_name = db.Column(db.String(40), nullable=False)
collection_name = db.Column(db.String(64), nullable=False)
created_at = db.Column(db.DateTime, nullable=False, server_default=db.text('CURRENT_TIMESTAMP(0)'))

View File

@ -160,7 +160,13 @@ class AppModelConfig(db.Model):
@property
def dataset_configs_dict(self) -> dict:
return json.loads(self.dataset_configs) if self.dataset_configs else {"top_k": 2, "score_threshold": {"enable": False}}
if self.dataset_configs:
dataset_configs = json.loads(self.dataset_configs)
if 'retrieval_model' not in dataset_configs:
return {'retrieval_model': 'single'}
else:
return dataset_configs
return {'retrieval_model': 'single'}
@property
def file_upload_dict(self) -> dict:

View File

@ -23,7 +23,6 @@ boto3==1.28.17
tenacity==8.2.2
cachetools~=5.3.0
weaviate-client~=3.21.0
qdrant_client~=1.1.6
mailchimp-transactional~=1.0.50
scikit-learn==1.2.2
sentry-sdk[flask]~=1.21.1
@ -36,7 +35,7 @@ docx2txt==0.8
pypdfium2==4.16.0
resend~=0.5.1
pyjwt~=2.6.0
anthropic~=0.3.4
anthropic~=0.7.2
newspaper3k==0.2.8
google-api-python-client==2.90.0
wikipedia==1.4.0
@ -49,8 +48,10 @@ huggingface_hub~=0.16.4
transformers~=4.31.0
stripe~=5.5.0
pandas==1.5.3
xinference-client~=0.5.4
xinference-client~=0.6.4
safetensors==0.3.2
zhipuai==1.0.7
werkzeug==2.3.7
pymilvus==2.3.0
pymilvus==2.3.0
qdrant-client==1.6.4
cohere~=4.32

View File

@ -489,9 +489,10 @@ class RegisterService:
'email': account.email,
'workspace_id': tenant.id,
}
expiry_hours = current_app.config['INVITE_EXPIRY_HOURS']
redis_client.setex(
cls._get_invitation_token_key(token),
3600,
expiry_hours * 60 * 60,
json.dumps(invitation_data)
)
return token
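The hard-coded 3600-second TTL thus becomes configurable; for instance, with a hypothetical INVITE_EXPIRY_HOURS of 72:

expiry_hours = 72                      # hypothetical configured value
ttl_seconds = expiry_hours * 60 * 60   # 259200 seconds, vs. the previous fixed 3600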

View File

@ -470,7 +470,16 @@ class AppModelConfigService:
# dataset_configs
if 'dataset_configs' not in config or not config["dataset_configs"]:
config["dataset_configs"] = {"top_k": 2, "score_threshold": {"enable": False}}
config["dataset_configs"] = {'retrieval_model': 'single'}
if not isinstance(config["dataset_configs"], dict):
raise ValueError("dataset_configs must be of object type")
if config["dataset_configs"]['retrieval_model'] == 'multiple':
if not config["dataset_configs"]['reranking_model']:
raise ValueError("reranking_model has not been set")
if not isinstance(config["dataset_configs"]['reranking_model'], dict):
raise ValueError("reranking_model must be of object type")
if not isinstance(config["dataset_configs"], dict):
raise ValueError("dataset_configs must be of object type")

View File

@ -232,7 +232,7 @@ class CompletionService:
logging.exception("Unknown Error in completion")
PubHandler.pub_error(user, generate_task_id, e)
finally:
db.session.commit()
db.session.remove()
@classmethod
def countdown_and_close(cls, flask_app: Flask, worker_thread, pubsub, detached_user,
@ -242,22 +242,25 @@ class CompletionService:
def close_pubsub():
with flask_app.app_context():
user = db.session.merge(detached_user)
try:
user = db.session.merge(detached_user)
sleep_iterations = 0
while sleep_iterations < timeout and worker_thread.is_alive():
if sleep_iterations > 0 and sleep_iterations % 10 == 0:
PubHandler.ping(user, generate_task_id)
sleep_iterations = 0
while sleep_iterations < timeout and worker_thread.is_alive():
if sleep_iterations > 0 and sleep_iterations % 10 == 0:
PubHandler.ping(user, generate_task_id)
time.sleep(1)
sleep_iterations += 1
time.sleep(1)
sleep_iterations += 1
if worker_thread.is_alive():
PubHandler.stop(user, generate_task_id)
try:
pubsub.close()
except Exception:
pass
if worker_thread.is_alive():
PubHandler.stop(user, generate_task_id)
try:
pubsub.close()
except Exception:
pass
finally:
db.session.remove()
countdown_thread = threading.Thread(target=close_pubsub)
countdown_thread.start()
@ -394,7 +397,7 @@ class CompletionService:
logging.exception(e)
raise
finally:
db.session.commit()
db.session.remove()
try:
pubsub.unsubscribe(generate_channel)
@ -436,7 +439,7 @@ class CompletionService:
logging.exception(e)
raise
finally:
db.session.commit()
db.session.remove()
try:
pubsub.unsubscribe(generate_channel)
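This change consistently swaps commit() for remove() in worker-thread cleanup: with Flask-SQLAlchemy's scoped session, remove() disposes the thread-local session and returns its connection to the pool instead of pinning one per thread. A minimal sketch of the pattern, assuming the same db object:

import threading
from flask import Flask, current_app
from extensions.ext_database import db

def worker(flask_app: Flask):
    with flask_app.app_context():
        try:
            pass  # ORM work on the thread-local session goes here
        finally:
            db.session.remove()  # release the connection instead of holding it

thread = threading.Thread(target=worker, args=(current_app._get_current_object(),))
thread.start()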

View File

@ -173,6 +173,9 @@ class DatasetService:
filtered_data['updated_by'] = user.id
filtered_data['updated_at'] = datetime.datetime.now()
# update Retrieval model
filtered_data['retrieval_model'] = data['retrieval_model']
dataset.query.filter_by(id=dataset_id).update(filtered_data)
db.session.commit()
@ -473,7 +476,19 @@ class DocumentService:
embedding_model.name
)
dataset.collection_binding_id = dataset_collection_binding.id
if not dataset.retrieval_model:
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
dataset.retrieval_model = document_data.get('retrieval_model') if document_data.get('retrieval_model') else default_retrieval_model
documents = []
batch = time.strftime('%Y%m%d%H%M%S') + str(random.randint(100000, 999999))
@ -733,6 +748,7 @@ class DocumentService:
raise ValueError(f"All your documents have overed limit {tenant_document_count}.")
embedding_model = None
dataset_collection_binding_id = None
retrieval_model = None
if document_data['indexing_technique'] == 'high_quality':
embedding_model = ModelFactory.get_embedding_model(
tenant_id=tenant_id
@ -742,6 +758,20 @@ class DocumentService:
embedding_model.name
)
dataset_collection_binding_id = dataset_collection_binding.id
if 'retrieval_model' in document_data and document_data['retrieval_model']:
retrieval_model = document_data['retrieval_model']
else:
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
retrieval_model = default_retrieval_model
# save dataset
dataset = Dataset(
tenant_id=tenant_id,
@ -751,7 +781,8 @@ class DocumentService:
created_by=account.id,
embedding_model=embedding_model.name if embedding_model else None,
embedding_model_provider=embedding_model.model_provider.provider_name if embedding_model else None,
collection_binding_id=dataset_collection_binding_id
collection_binding_id=dataset_collection_binding_id,
retrieval_model=retrieval_model
)
db.session.add(dataset)
@ -768,7 +799,7 @@ class DocumentService:
return dataset, documents, batch
@classmethod
def document_create_args_validate(cls, args: dict):
if 'original_document_id' not in args or not args['original_document_id']:
DocumentService.data_source_args_validate(args)
DocumentService.process_rule_args_validate(args)

View File

@ -1,4 +1,6 @@
import json
import logging
import threading
import time
from typing import List
@ -9,16 +11,26 @@ from langchain.schema import Document
from sklearn.manifold import TSNE
from core.embedding.cached_embedding import CacheEmbedding
from core.index.vector_index.vector_index import VectorIndex
from core.model_providers.model_factory import ModelFactory
from extensions.ext_database import db
from models.account import Account
from models.dataset import Dataset, DocumentSegment, DatasetQuery
from services.retrieval_service import RetrievalService
default_retrieval_model = {
'search_method': 'semantic_search',
'reranking_enable': False,
'reranking_model': {
'reranking_provider_name': '',
'reranking_model_name': ''
},
'top_k': 2,
'score_threshold_enable': False
}
class HitTestingService:
@classmethod
def retrieve(cls, dataset: Dataset, query: str, account: Account, limit: int = 10) -> dict:
def retrieve(cls, dataset: Dataset, query: str, account: Account, retrieval_model: dict, limit: int = 10) -> dict:
if dataset.available_document_count == 0 or dataset.available_segment_count == 0:
return {
"query": {
@ -28,31 +40,68 @@ class HitTestingService:
"records": []
}
start = time.perf_counter()
# get the retrieval model; fall back to the default if none is set
if not retrieval_model:
retrieval_model = dataset.retrieval_model if dataset.retrieval_model else default_retrieval_model
# get embedding model
embedding_model = ModelFactory.get_embedding_model(
tenant_id=dataset.tenant_id,
model_provider_name=dataset.embedding_model_provider,
model_name=dataset.embedding_model
)
embeddings = CacheEmbedding(embedding_model)
vector_index = VectorIndex(
dataset=dataset,
config=current_app.config,
embeddings=embeddings
)
all_documents = []
threads = []
# semantic search retrieval (also used for hybrid search)
if retrieval_model['search_method'] == 'semantic_search' or retrieval_model['search_method'] == 'hybrid_search':
embedding_thread = threading.Thread(target=RetrievalService.embedding_search, kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': str(dataset.id),
'query': query,
'top_k': retrieval_model['top_k'],
'score_threshold': retrieval_model['score_threshold'] if retrieval_model['score_threshold_enable'] else None,
'reranking_model': retrieval_model['reranking_model'] if retrieval_model['reranking_enable'] else None,
'all_documents': all_documents,
'search_method': retrieval_model['search_method'],
'embeddings': embeddings
})
threads.append(embedding_thread)
embedding_thread.start()
# full-text search retrieval (also used for hybrid search)
if retrieval_model['search_method'] == 'full_text_search' or retrieval_model['search_method'] == 'hybrid_search':
full_text_index_thread = threading.Thread(target=RetrievalService.full_text_index_search, kwargs={
'flask_app': current_app._get_current_object(),
'dataset_id': str(dataset.id),
'query': query,
'search_method': retrieval_model['search_method'],
'embeddings': embeddings,
'score_threshold': retrieval_model['score_threshold'] if retrieval_model['score_threshold_enable'] else None,
'top_k': retrieval_model['top_k'],
'reranking_model': retrieval_model['reranking_model'] if retrieval_model['reranking_enable'] else None,
'all_documents': all_documents
})
threads.append(full_text_index_thread)
full_text_index_thread.start()
for thread in threads:
thread.join()
if retrieval_model['search_method'] == 'hybrid_search':
hybrid_rerank = ModelFactory.get_reranking_model(
tenant_id=dataset.tenant_id,
model_provider_name=retrieval_model['reranking_model']['reranking_provider_name'],
model_name=retrieval_model['reranking_model']['reranking_model_name']
)
all_documents = hybrid_rerank.rerank(query, all_documents,
retrieval_model['score_threshold'] if retrieval_model['score_threshold_enable'] else None,
retrieval_model['top_k'])
start = time.perf_counter()
documents = vector_index.search(
query,
search_type='similarity_score_threshold',
search_kwargs={
'k': 10,
'filter': {
'group_id': [dataset.id]
}
}
)
end = time.perf_counter()
logging.debug(f"Hit testing retrieve in {end - start:0.4f} seconds")
@ -67,7 +116,7 @@ class HitTestingService:
db.session.add(dataset_query)
db.session.commit()
return cls.compact_retrieve_response(dataset, embeddings, query, documents)
return cls.compact_retrieve_response(dataset, embeddings, query, all_documents)
@classmethod
def compact_retrieve_response(cls, dataset: Dataset, embeddings: Embeddings, query: str, documents: List[Document]):
@ -99,7 +148,7 @@ class HitTestingService:
record = {
"segment": segment,
"score": document.metadata['score'],
"score": document.metadata.get('score', None),
"tsne_position": tsne_position_data[i]
}
@ -136,3 +185,11 @@ class HitTestingService:
tsne_position_data.append({'x': float(data_tsne[i][0]), 'y': float(data_tsne[i][1])})
return tsne_position_data
@classmethod
def hit_testing_args_check(cls, args):
query = args['query']
if not query or len(query) > 250:
raise ValueError('Query is required and cannot exceed 250 characters')

View File

@ -17,11 +17,12 @@ from models.provider import Provider, ProviderModel, TenantPreferredModelProvide
class ProviderService:
def get_provider_list(self, tenant_id: str):
def get_provider_list(self, tenant_id: str, model_type: Optional[str] = None) -> list:
"""
get provider list of tenant.
:param tenant_id:
:param tenant_id: workspace id
:param model_type: filter by model type
:return:
"""
# get rules for all providers
@ -79,6 +80,9 @@ class ProviderService:
providers_list = {}
for model_provider_name, model_provider_rule in model_provider_rules.items():
if model_type and model_type not in model_provider_rule.get('supported_model_types', []):
continue
# get preferred provider type
preferred_model_provider = provider_name_to_preferred_provider_type_dict.get(model_provider_name)
preferred_provider_type = ModelProviderFactory.get_preferred_type_by_preferred_model_provider(
@ -90,6 +94,7 @@ class ProviderService:
provider_config_dict = {
"preferred_provider_type": preferred_provider_type,
"model_flexibility": model_provider_rule['model_flexibility'],
"supported_model_types": model_provider_rule.get("supported_model_types", []),
}
provider_parameter_dict = {}
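With the new filter, callers can scope the provider list to a capability; a sketch with a hypothetical tenant id:

service = ProviderService()
# only providers whose rules declare 'reranking' (here: cohere, xinference) are returned
rerank_providers = service.get_provider_list(
    tenant_id='tenant-uuid',   # hypothetical
    model_type='reranking'
)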

View File

@@ -0,0 +1,95 @@
from typing import Optional

from flask import current_app, Flask
from langchain.embeddings.base import Embeddings

from core.index.vector_index.vector_index import VectorIndex
from core.model_providers.model_factory import ModelFactory
from extensions.ext_database import db
from models.dataset import Dataset

default_retrieval_model = {
    'search_method': 'semantic_search',
    'reranking_enable': False,
    'reranking_model': {
        'reranking_provider_name': '',
        'reranking_model_name': ''
    },
    'top_k': 2,
    'score_threshold_enable': False
}


class RetrievalService:
    @classmethod
    def embedding_search(cls, flask_app: Flask, dataset_id: str, query: str,
                         top_k: int, score_threshold: Optional[float], reranking_model: Optional[dict],
                         all_documents: list, search_method: str, embeddings: Embeddings):
        with flask_app.app_context():
            dataset = db.session.query(Dataset).filter(
                Dataset.id == dataset_id
            ).first()

            vector_index = VectorIndex(
                dataset=dataset,
                config=current_app.config,
                embeddings=embeddings
            )

            documents = vector_index.search(
                query,
                search_type='similarity_score_threshold',
                search_kwargs={
                    'k': top_k,
                    'score_threshold': score_threshold,
                    'filter': {
                        'group_id': [dataset.id]
                    }
                }
            )

            if documents:
                if reranking_model and search_method == 'semantic_search':
                    rerank = ModelFactory.get_reranking_model(
                        tenant_id=dataset.tenant_id,
                        model_provider_name=reranking_model['reranking_provider_name'],
                        model_name=reranking_model['reranking_model_name']
                    )
                    all_documents.extend(rerank.rerank(query, documents, score_threshold, len(documents)))
                else:
                    all_documents.extend(documents)

    @classmethod
    def full_text_index_search(cls, flask_app: Flask, dataset_id: str, query: str,
                               top_k: int, score_threshold: Optional[float], reranking_model: Optional[dict],
                               all_documents: list, search_method: str, embeddings: Embeddings):
        with flask_app.app_context():
            dataset = db.session.query(Dataset).filter(
                Dataset.id == dataset_id
            ).first()

            vector_index = VectorIndex(
                dataset=dataset,
                config=current_app.config,
                embeddings=embeddings
            )

            documents = vector_index.search_by_full_text_index(
                query,
                search_type='similarity_score_threshold',
                top_k=top_k
            )

            if documents:
                if reranking_model and search_method == 'full_text_search':
                    rerank = ModelFactory.get_reranking_model(
                        tenant_id=dataset.tenant_id,
                        model_provider_name=reranking_model['reranking_provider_name'],
                        model_name=reranking_model['reranking_model_name']
                    )
                    all_documents.extend(rerank.rerank(query, documents, score_threshold, len(documents)))
                else:
                    all_documents.extend(documents)
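
Both methods write into a shared all_documents list and take a Flask app handle, which points at threaded fan-out for hybrid search. A hedged usage sketch follows; the caller wiring is illustrative and assumes prepared dataset, query, and embeddings objects, not the exact upstream service code:

import threading

from flask import current_app

all_documents = []
threads = []
for target in (RetrievalService.embedding_search, RetrievalService.full_text_index_search):
    t = threading.Thread(target=target, kwargs={
        'flask_app': current_app._get_current_object(),  # real app object, usable across threads
        'dataset_id': dataset.id,
        'query': query,
        'top_k': 2,
        'score_threshold': None,
        'reranking_model': None,
        'all_documents': all_documents,  # shared result list each thread extends
        'search_method': 'hybrid_search',
        'embeddings': embeddings,
    })
    threads.append(t)
    t.start()

for t in threads:
    t.join()
# all_documents now holds both semantic and full-text hits for downstream reranking.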

View File

@@ -50,4 +50,7 @@ XINFERENCE_MODEL_UID=
OPENLLM_SERVER_URL=

# LocalAI Credentials
LOCALAI_SERVER_URL=

# Cohere Credentials
COHERE_API_KEY=
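
The integration tests further down read this key straight from the environment; a hypothetical two-line sanity check (not part of the repo):

import os
api_key = os.environ['COHERE_API_KEY']  # raises KeyError if the variable above is unset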

View File

@@ -0,0 +1,61 @@
import json
import os
from unittest.mock import patch

from langchain.schema import Document

from core.model_providers.models.reranking.cohere_reranking import CohereReranking
from core.model_providers.providers.cohere_provider import CohereProvider
from models.provider import Provider, ProviderType


def get_mock_provider(valid_api_key):
    return Provider(
        id='provider_id',
        tenant_id='tenant_id',
        provider_name='cohere',
        provider_type=ProviderType.CUSTOM.value,
        encrypted_config=json.dumps({'api_key': valid_api_key}),
        is_valid=True,
    )


def get_mock_model():
    valid_api_key = os.environ['COHERE_API_KEY']
    provider = CohereProvider(provider=get_mock_provider(valid_api_key))
    return CohereReranking(
        model_provider=provider,
        name='rerank-english-v2.0'
    )


def decrypt_side_effect(tenant_id, encrypted_api_key):
    return encrypted_api_key


@patch('core.helper.encrypter.decrypt_token', side_effect=decrypt_side_effect)
def test_run(mock_decrypt):
    model = get_mock_model()

    docs = []
    docs.append(Document(
        page_content='bye',
        metadata={
            "doc_id": 'a',
            "doc_hash": 'doc_hash',
            "document_id": 'document_id',
            "dataset_id": 'dataset_id',
        }
    ))
    docs.append(Document(
        page_content='hello',
        metadata={
            "doc_id": 'b',
            "doc_hash": 'doc_hash',
            "document_id": 'document_id',
            "dataset_id": 'dataset_id',
        }
    ))
    rst = model.rerank('hello', docs, None, 2)
    assert rst[0].page_content == 'hello'
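
For reference, the call the test exercises, inferred from usage here rather than from documented API:

# rerank(query: str, documents: list[Document], score_threshold: Optional[float], top_n: int) -> list[Document]
# With score_threshold=None nothing is filtered out, so rerank returns the top_n
# documents ordered by relevance; the query-matching 'hello' is expected first.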

View File

@@ -0,0 +1,78 @@
import json
import os
from unittest.mock import patch, MagicMock

from langchain.schema import Document

from core.model_providers.models.entity.model_params import ModelType
from core.model_providers.models.reranking.xinference_reranking import XinferenceReranking
from core.model_providers.providers.xinference_provider import XinferenceProvider
from models.provider import Provider, ProviderType, ProviderModel


def get_mock_provider(valid_server_url, valid_model_uid):
    return Provider(
        id='provider_id',
        tenant_id='tenant_id',
        provider_name='xinference',
        provider_type=ProviderType.CUSTOM.value,
        encrypted_config=json.dumps({'server_url': valid_server_url, 'model_uid': valid_model_uid}),
        is_valid=True,
    )


def get_mock_model(mocker):
    valid_server_url = os.environ['XINFERENCE_SERVER_URL']
    valid_model_uid = os.environ['XINFERENCE_MODEL_UID']
    model_name = 'bge-reranker-base'
    provider = XinferenceProvider(provider=get_mock_provider(valid_server_url, valid_model_uid))

    mock_query = MagicMock()
    mock_query.filter.return_value.first.return_value = ProviderModel(
        provider_name='xinference',
        model_name=model_name,
        model_type=ModelType.RERANKING.value,
        encrypted_config=json.dumps({
            'server_url': valid_server_url,
            'model_uid': valid_model_uid
        }),
        is_valid=True,
    )
    mocker.patch('extensions.ext_database.db.session.query', return_value=mock_query)

    return XinferenceReranking(
        model_provider=provider,
        name=model_name
    )


def decrypt_side_effect(tenant_id, encrypted_api_key):
    return encrypted_api_key


@patch('core.helper.encrypter.decrypt_token', side_effect=decrypt_side_effect)
def test_run(mock_decrypt, mocker):
    model = get_mock_model(mocker)

    docs = []
    docs.append(Document(
        page_content='bye',
        metadata={
            "doc_id": 'a',
            "doc_hash": 'doc_hash',
            "document_id": 'document_id',
            "dataset_id": 'dataset_id',
        }
    ))
    docs.append(Document(
        page_content='hello',
        metadata={
            "doc_id": 'b',
            "doc_hash": 'doc_hash',
            "document_id": 'document_id',
            "dataset_id": 'dataset_id',
        }
    ))
    rst = model.rerank('hello', docs, None, 2)
    assert rst[0].page_content == 'hello'
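
Unlike the Cohere test, this one also stubs the database lookup that resolves the model's ProviderModel row; `mocker` is the pytest-mock fixture. A note on the patched chain:

# The MagicMock is wired so that db.session.query(...).filter(...).first()
# returns the fake ProviderModel above, whatever arguments are passed.
# Plain unittest.mock would express roughly the same thing as, e.g.:
#   with patch('extensions.ext_database.db.session.query', return_value=mock_query):
#       model = XinferenceReranking(model_provider=provider, name='bge-reranker-base')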

View File

@@ -31,12 +31,12 @@ def mock_chat_generate_invalid(messages: List[BaseMessage],
                               run_manager: Optional[CallbackManagerForLLMRun] = None,
                               **kwargs: Any):
    raise anthropic.APIStatusError('Invalid credentials',
-                                  request=httpx._models.Request(
-                                      method='POST',
-                                      url='https://api.anthropic.com/v1/completions',
-                                  ),
+                                  response=httpx._models.Response(
+                                      status_code=401,
+                                      request=httpx._models.Request(
+                                          method='POST',
+                                          url='https://api.anthropic.com/v1/completions',
+                                      )
+                                  ),
+                                  body=None
                                   )
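
The reshuffle above presumably tracks an anthropic SDK signature change: newer releases construct APIStatusError from a Response (which carries its Request) plus a body, instead of a bare Request. A standalone sketch of the new construction using httpx's public classes (the test itself keeps the private httpx._models path):

# Hedged sketch; assumes the newer APIStatusError(message, *, response, body) signature.
import anthropic
import httpx

request = httpx.Request('POST', 'https://api.anthropic.com/v1/completions')
error = anthropic.APIStatusError(
    'Invalid credentials',
    response=httpx.Response(status_code=401, request=request),
    body=None,
)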

View File

@@ -2,7 +2,9 @@ import pytest
from unittest.mock import patch
import json

import requests
from langchain.schema import LLMResult, Generation, AIMessage, ChatResult, ChatGeneration
+from requests import Response

from core.model_providers.providers.base import CredentialsValidateFailedError
from core.model_providers.providers.chatglm_provider import ChatGLMProvider

@@ -26,8 +28,11 @@ def decrypt_side_effect(tenant_id, encrypted_key):
def test_is_provider_credentials_valid_or_raise_valid(mocker):
    mocker.patch('langchain.llms.chatglm.ChatGLM._call',
                 return_value="abc")
+   mock_response = Response()
+   mock_response.status_code = 200
+   mock_response._content = json.dumps({'models': []}).encode('utf-8')
+   mocker.patch('requests.get',
+                return_value=mock_response)
    MODEL_PROVIDER_CLASS.is_provider_credentials_valid_or_raise(VALIDATE_CREDENTIAL)
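
A note on the Response stub above, since it leans on requests internals: .json() is backed by ._content, so setting just those two attributes is enough for a credential check that presumably calls requests.get(...).json(). A self-contained demonstration:

# requests.Response.json() reads self.content, which is served from ._content
# when it is set, so this hand-built stub round-trips cleanly:
import json
from requests import Response

mock_response = Response()
mock_response.status_code = 200
mock_response._content = json.dumps({'models': []}).encode('utf-8')
assert mock_response.json() == {'models': []}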

View File

@@ -30,7 +30,7 @@ services:
  # The Weaviate vector store.
  weaviate:
-   image: semitechnologies/weaviate:1.18.4
+   image: semitechnologies/weaviate:1.19.0
    restart: always
    volumes:
      # Mount the Weaviate data directory to the container.

@@ -63,4 +63,4 @@ services:
    # environment:
    #   QDRANT__API_KEY: 'difyai123456'
    # ports:
    #   - "6333:6333"

View File

@@ -2,7 +2,7 @@ version: '3.1'
services:
  # API service
  api:
-   image: langgenius/dify-api:0.3.30
+   image: langgenius/dify-api:0.3.32
    restart: always
    environment:
      # Startup mode, 'api' starts the API server.

@@ -128,7 +128,7 @@ services:
  # worker service
  # The Celery worker for processing the queue.
  worker:
-   image: langgenius/dify-api:0.3.30
+   image: langgenius/dify-api:0.3.32
    restart: always
    environment:
      # Startup mode, 'worker' starts the Celery worker for processing the queue.

@@ -196,7 +196,7 @@ services:
  # Frontend web application.
  web:
-   image: langgenius/dify-web:0.3.30
+   image: langgenius/dify-web:0.3.32
    restart: always
    environment:
      EDITION: SELF_HOSTED

@@ -253,7 +253,7 @@ services:
  # The Weaviate vector store.
  weaviate:
-   image: semitechnologies/weaviate:1.18.4
+   image: semitechnologies/weaviate:1.19.0
    restart: always
    volumes:
      # Mount the Weaviate data directory to the container.

@@ -280,7 +280,7 @@ services:
  # (if uncommented, you need to comment out the weaviate service above,
  # and set VECTOR_STORE to qdrant in the api & worker service.)
  # qdrant:
-  #   image: qdrant/qdrant:latest
+  #   image: langgenius/qdrant:latest
  #   restart: always
  #   volumes:
  #     - ./volumes/qdrant:/qdrant/storage

@@ -302,4 +302,4 @@ services:
    - api
    - web
    ports:
      - "80:80"

BIN
images/demo.png (new file, 790 KiB; binary not shown)

BIN
(image not named in this view, 1.2 MiB before; binary not shown)
Some files were not shown because too many files have changed in this diff.