ragflow/conf/models/stepfun.json
Haruko386 bf41d35729 Go: implement PaddleOCR provider and implement ASR for CoHere (#14954)
### What problem does this PR solve?

This PR implements OCR for Baidu and Mistral, adds a PaddleOCR
provider, and implements ASR for Cohere.

**Verified examples from the CLI:**

```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Bingxin Ke

Nando Metzger

Photogra

Anton Obukhov

Rodrigo Caye Daudt

netry and Remote Sensing,

Shengyu Huang

Konrad Schindler

ETH Zürich





<div style="text-align: c...  |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

# Cohere

RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------+
|  The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```
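
The model argument in the commands above appears to follow a `name@instance@provider` pattern (for example `paddleocr-vl-0.9b@test@baidu`). The following is a minimal Go sketch of splitting such an identifier, assuming that reading is correct. The type `modelSpec` and the function `splitModelSpec` are hypothetical names used for illustration and are not taken from the RAGFlow code base.

```go
package main

import (
	"fmt"
	"strings"
)

// modelSpec holds the three parts of an identifier such as
// "mistral-ocr-2512@test@mistral". The field names are assumptions
// based on the CLI examples, not RAGFlow's own types.
type modelSpec struct {
	Model    string
	Instance string
	Provider string
}

// splitModelSpec splits an identifier on "@" and requires exactly
// three non-empty parts. Hypothetical helper for illustration only.
func splitModelSpec(s string) (modelSpec, error) {
	parts := strings.Split(s, "@")
	if len(parts) != 3 {
		return modelSpec{}, fmt.Errorf("expected name@instance@provider, got %q", s)
	}
	for _, p := range parts {
		if p == "" {
			return modelSpec{}, fmt.Errorf("empty component in %q", s)
		}
	}
	return modelSpec{Model: parts[0], Instance: parts[1], Provider: parts[2]}, nil
}

func main() {
	spec, err := splitModelSpec("cohere-transcribe-03-2026@test@cohere")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("model=%s instance=%s provider=%s\n", spec.Model, spec.Instance, spec.Provider)
}
```

Rejecting anything other than three non-empty parts keeps the sketch strict; a real parser may well accept other forms.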

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-15 18:41:43 +08:00


{
  "name": "StepFun",
  "url": {
    "default": "https://api.stepfun.ai/v1"
  },
  "url_suffix": {
    "chat": "chat/completions",
    "models": "models",
    "tts": "audio/speech"
  },
  "class": "step",
  "models": [
    {
      "name": "step-3.5-flash",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-3.5-flash-paid",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-2-16k",
      "max_tokens": 16384,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-256k",
      "max_tokens": 262144,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-128k",
      "max_tokens": 131072,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-32k",
      "max_tokens": 32768,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-8k",
      "max_tokens": 8192,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1-flash",
      "max_tokens": 8192,
      "model_types": [
        "chat"
      ]
    },
    {
      "name": "step-1v-32k",
      "max_tokens": 32768,
      "model_types": [
        "chat",
        "vision"
      ]
    },
    {
      "name": "step-1v-8k",
      "max_tokens": 8192,
      "model_types": [
        "chat",
        "vision"
      ]
    },
    {
      "name": "step-1o-vision-32k",
      "max_tokens": 32768,
      "model_types": [
        "chat",
        "vision"
      ]
    },
    {
      "name": "step-tts-2",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "stepaudio-2.5-tts",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "step-tts-mini",
      "max_tokens": 8192,
      "model_types": [
        "tts"
      ]
    }
  ]
}
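
A config of this shape can be decoded with Go's standard `encoding/json` package. The following is a minimal sketch of how the `url` and `url_suffix` entries might be joined into a full endpoint and how `models` might be filtered by `model_types`. The struct and function names (`providerConfig`, `endpoint`, `modelsOfType`) are hypothetical and chosen for illustration; the way RAGFlow's own loader combines these fields is an assumption here, and the embedded sample is a trimmed copy of the file above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// modelEntry mirrors one element of the "models" array.
type modelEntry struct {
	Name       string   `json:"name"`
	MaxTokens  int      `json:"max_tokens"`
	ModelTypes []string `json:"model_types"`
}

// providerConfig mirrors the top-level object. Hypothetical type,
// not RAGFlow's own definition.
type providerConfig struct {
	Name      string            `json:"name"`
	URL       map[string]string `json:"url"`
	URLSuffix map[string]string `json:"url_suffix"`
	Class     string            `json:"class"`
	Models    []modelEntry      `json:"models"`
}

// parseProvider decodes a provider config from raw JSON.
func parseProvider(data []byte) (providerConfig, error) {
	var cfg providerConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

// endpoint joins the "default" base URL with the suffix registered
// for kind (for example "chat" or "tts"). Assumed joining rule.
func (c providerConfig) endpoint(kind string) (string, error) {
	base, ok := c.URL["default"]
	if !ok {
		return "", fmt.Errorf("provider %q has no default url", c.Name)
	}
	suffix, ok := c.URLSuffix[kind]
	if !ok {
		return "", fmt.Errorf("provider %q has no url_suffix for %q", c.Name, kind)
	}
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(suffix, "/"), nil
}

// modelsOfType returns the names of models whose model_types include t.
func (c providerConfig) modelsOfType(t string) []string {
	var names []string
	for _, m := range c.Models {
		for _, mt := range m.ModelTypes {
			if mt == t {
				names = append(names, m.Name)
				break
			}
		}
	}
	return names
}

// sample is a trimmed copy of stepfun.json with three of its models.
const sample = `{
  "name": "StepFun",
  "url": {"default": "https://api.stepfun.ai/v1"},
  "url_suffix": {"chat": "chat/completions", "models": "models", "tts": "audio/speech"},
  "class": "step",
  "models": [
    {"name": "step-1-8k", "max_tokens": 8192, "model_types": ["chat"]},
    {"name": "step-1v-8k", "max_tokens": 8192, "model_types": ["chat", "vision"]},
    {"name": "step-tts-mini", "max_tokens": 8192, "model_types": ["tts"]}
  ]
}`

func main() {
	cfg, err := parseProvider([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	chatURL, err := cfg.endpoint("chat")
	if err != nil {
		fmt.Println("endpoint error:", err)
		return
	}
	fmt.Println(chatURL)
	fmt.Println(cfg.modelsOfType("vision"))
}
```

Since this file lists no `asr` or `ocr` suffix, `endpoint` returns an error for those kinds in this sketch instead of guessing a path.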