mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-26 02:37:21 +08:00
### What problem does this PR solve?
This PR implements TTS and ASR for SiliconFlow and TTS for StepFun.
**The following functionalities are now supported:**
**SiliconFlow:**
- [x] Text To Speech
- [x] Audio To Text
- [x] Stream Audio To Text
**StepFun:**
- [x] Audio To Text
- [x] Stream Audio To Text
**Verified examples from the CLI:**
```plaintext
# SiliconFlow
RAGFlow(user)> tts with 'FunAudioLLM/CosyVoice2-0.5B@test@Siliconflow' text 'hello? show yourself' play format 'wav' param '{"voice": "fnlp/MOSS-TTSD-v0.5:alex"}'
SUCCESS
RAGFlow(user)> asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param ''
+----------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                 |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> stream asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param ''
+----------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                 |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
```
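As a rough illustration of how the `tts` command above might translate into an HTTP call, the sketch below builds (but does not send) a POST request against the `audio/speech` suffix from the provider config. The function name `build_tts_request`, the payload fields (`model`, `input`, `voice`, `response_format`), the bearer-token header, and the `YOUR_API_KEY` placeholder are assumptions modeled on OpenAI-style speech endpoints, not code from this PR. The `@test@Siliconflow` part of the CLI model argument is left out on the assumption that it selects a configured provider instance rather than being part of the model name.

```python
import json
import urllib.request

BASE_URL = "https://api.siliconflow.cn/v1"  # "url.default" in the provider config
TTS_SUFFIX = "audio/speech"                 # "url_suffix.tts" in the provider config


def build_tts_request(model, text, fmt, param_json, api_key):
    """Build a hypothetical TTS request object without sending it."""
    # The CLI passes extra options as a JSON string; an empty string means none.
    params = json.loads(param_json) if param_json else {}
    # Assumed payload shape (OpenAI-style); field names are not confirmed by the PR.
    payload = {"model": model, "input": text, "response_format": fmt}
    payload.update(params)
    return urllib.request.Request(
        url=f"{BASE_URL}/{TTS_SUFFIX}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_tts_request(
    "FunAudioLLM/CosyVoice2-0.5B",
    "hello? show yourself",
    "wav",
    '{"voice": "fnlp/MOSS-TTSD-v0.5:alex"}',
    api_key="YOUR_API_KEY",  # placeholder, not a real credential
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing the request to `urllib.request.urlopen` would perform the call; that step is omitted here so the sketch runs without network access or credentials.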
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
74 lines
1.3 KiB
JSON
{
  "name": "SiliconFlow",
  "url": {
    "default": "https://api.siliconflow.cn/v1"
  },
  "url_suffix": {
    "chat": "chat/completions",
    "models": "models",
    "embedding": "embeddings",
    "rerank": "rerank",
    "balance": "user/info",
    "tts": "audio/speech",
    "asr": "audio/transcriptions"
  },
  "models": [
    {
      "name": "qwen/qwen3-8b",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "qwen/qwen3.5-4b",
      "max_tokens": 262144,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "tencent/hunyuan-mt-7b",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "BAAI/bge-reranker-v2-m3",
      "max_tokens": 8192,
      "model_types": [
        "rerank"
      ]
    },
    {
      "name": "Qwen/Qwen3-Embedding-0.6B",
      "max_tokens": 8192,
      "model_types": [
        "embedding"
      ]
    },
    {
      "name": "fnlp/MOSS-TTSD-v0.5",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "FunAudioLLM/CosyVoice2-0.5B",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "FunAudioLLM/SenseVoiceSmall",
      "max_tokens": 8192,
      "model_types": [
        "asr"
      ]
    }
  ]
}
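To make the role of `url`, `url_suffix`, and `model_types` concrete, here is a minimal sketch of how a client could resolve an endpoint and list the models for a capability from a config shaped like the one above. The helper names `endpoint` and `models_of_type` are hypothetical, and the embedded config is a trimmed copy kept only so the example is self-contained; how RAGFlow itself consumes this file is not shown in this PR description.

```python
import json

# Trimmed copy of the provider config, limited to the speech-related entries.
CONFIG = json.loads("""
{
  "name": "SiliconFlow",
  "url": {"default": "https://api.siliconflow.cn/v1"},
  "url_suffix": {"tts": "audio/speech", "asr": "audio/transcriptions"},
  "models": [
    {"name": "fnlp/MOSS-TTSD-v0.5", "max_tokens": 8192, "model_types": ["tts"]},
    {"name": "FunAudioLLM/CosyVoice2-0.5B", "max_tokens": 8192, "model_types": ["tts"]},
    {"name": "FunAudioLLM/SenseVoiceSmall", "max_tokens": 8192, "model_types": ["asr"]}
  ]
}
""")


def endpoint(config, capability):
    """Join the default base URL with the suffix for one capability."""
    base = config["url"]["default"].rstrip("/")
    suffix = config["url_suffix"][capability].lstrip("/")
    return f"{base}/{suffix}"


def models_of_type(config, model_type):
    """Return the names of all models that declare the given type."""
    return [m["name"] for m in config["models"] if model_type in m["model_types"]]


print(endpoint(CONFIG, "asr"))        # https://api.siliconflow.cn/v1/audio/transcriptions
print(models_of_type(CONFIG, "tts"))  # ['fnlp/MOSS-TTSD-v0.5', 'FunAudioLLM/CosyVoice2-0.5B']
```

Stripping slashes on both sides before joining keeps the result stable whether or not the base URL in the config ends with `/`.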