mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-21 16:40:07 +08:00
### What problem does this PR solve?
This PR implements OCR for Baidu and Mistral, adds a PaddleOCR
provider, and implements ASR for Cohere.
**Verified examples from the CLI:**
```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke
Nando Metzger
Photogra
Anton Obukhov
Rodrigo Caye Daudt
netry and Remote Sensing,
Shengyu Huang
Konrad Schindler
ETH Zürich
<div style="text-align: c... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# Cohere
RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text |
+-----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```
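The model identifiers in the examples above each contain three `@`-separated fields (for example `mistral-ocr-2512@test@mistral`). The sketch below splits such a string, assuming, purely for illustration, that the fields are a model name, an instance label, and a provider. The field names `model`, `instance`, and `provider` and the helper `parse_model_ref` are hypothetical and are not taken from the RAGFlow code base.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical view of a 'model@instance@provider' identifier."""
    model: str
    instance: str
    provider: str


def parse_model_ref(ref: str) -> ModelRef:
    """Split an identifier such as 'mistral-ocr-2512@test@mistral'.

    Assumption: exactly three non-empty fields separated by '@'.
    """
    parts = ref.split("@")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'model@instance@provider', got {ref!r}")
    return ModelRef(*parts)


if __name__ == "__main__":
    for ref in (
        "mistral-ocr-2512@test@mistral",
        "paddleocr-vl-0.9b@test@baidu",
        "cohere-transcribe-03-2026@test@cohere",
    ):
        print(parse_model_ref(ref))
```

Rejecting malformed identifiers up front keeps a typo in the CLI argument from being passed on as a request to the wrong provider.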
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
116 lines
1.9 KiB
JSON
{
  "name": "StepFun",
  "url": {
    "default": "https://api.stepfun.ai/v1"
  },
  "url_suffix": {
    "chat": "chat/completions",
    "models": "models",
    "tts": "audio/speech"
  },
  "class": "step",
  "models": [
    {
      "name": "step-3.5-flash",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-3.5-flash-paid",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-2-16k",
      "max_tokens": 16384,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-256k",
      "max_tokens": 262144,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-128k",
      "max_tokens": 131072,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-32k",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-8k",
      "max_tokens": 8192,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-flash",
      "max_tokens": 8192,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1v-32k",
      "max_tokens": 32768,
      "model_types": [
        "chat",
        "vision"
      ]
    },
    {
      "name": "step-1v-8k",
      "max_tokens": 8192,
      "model_types": [
        "chat",
        "vision"
      ]
    },
    {
      "name": "step-1o-vision-32k",
      "max_tokens": 32768,
      "model_types": [
        "chat",
        "vision"
      ]
    },
    {
      "name": "step-tts-2",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "stepaudio-2.5-tts",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "step-tts-mini",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    }
  ]
}
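As a rough illustration of how a provider file with this shape could be consumed, the sketch below joins the base URL with a `url_suffix` entry and filters models by `model_types`. The helper names `endpoint_url` and `models_of_type` are hypothetical, the embedded config is a trimmed copy of the StepFun file above, and nothing here is taken from RAGFlow's actual loader.

```python
import json

# Trimmed copy of the StepFun provider config shown above (illustration only).
PROVIDER_JSON = """
{
  "name": "StepFun",
  "url": {"default": "https://api.stepfun.ai/v1"},
  "url_suffix": {
    "chat": "chat/completions",
    "models": "models",
    "tts": "audio/speech"
  },
  "class": "step",
  "models": [
    {"name": "step-3.5-flash", "max_tokens": 32768, "model_types": ["chat"]},
    {"name": "step-1v-32k", "max_tokens": 32768, "model_types": ["chat", "vision"]},
    {"name": "step-tts-2", "max_tokens": 8192, "model_types": ["tts"]}
  ]
}
"""


def endpoint_url(config: dict, kind: str, region: str = "default") -> str:
    """Join the base URL and the suffix for one endpoint kind (hypothetical helper).

    Plain string joining is used instead of urllib.parse.urljoin, which
    would drop the trailing 'v1' path segment of the base URL.
    """
    base = config["url"][region].rstrip("/")
    suffix = config["url_suffix"][kind].lstrip("/")
    return f"{base}/{suffix}"


def models_of_type(config: dict, model_type: str) -> list[str]:
    """Return names of models whose model_types include model_type."""
    return [
        m["name"]
        for m in config["models"]
        if model_type in m["model_types"]
    ]


config = json.loads(PROVIDER_JSON)

if __name__ == "__main__":
    print(endpoint_url(config, "chat"))
    print(models_of_type(config, "vision"))
```

Looking up a kind that has no `url_suffix` entry (this file lists only `chat`, `models`, and `tts`) raises `KeyError` in this sketch, which makes an unsupported endpoint visible immediately.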