mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-21 00:36:43 +08:00
## What

- Add Replicate as a chat provider backed by the documented predictions API
- Register Replicate in the Go model factory and provider config
- Support non-streaming chat through sync predictions, polling fallback, streaming through `urls.stream`, model listing, and connection checks

## Notes

- Uses `POST /v1/predictions` with Replicate model identifiers in `version`, which supports official and community model identifiers
- Maps RAGFlow messages into Replicate prompt-shaped inputs (`prompt`, optional `system_prompt`) and forwards common documented LLM inputs: `max_new_tokens`, `temperature`, `top_p`
- Preserves whitespace in SSE output chunks and emits RAGFlow `[DONE]` at stream completion

## Tests

- `go test -vet=off -run TestReplicate -count=1 ./internal/entity/models`
- `go test -vet=off -count=1 ./internal/entity/models`

Refs #14736
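The message mapping described in the notes can be illustrated with a minimal Go sketch. It is not the code from this change: the `Message` and `PredictionRequest` types, the `BuildPredictionRequest` helper, and the `role: content` flattening of non-system messages are all assumptions made for illustration. Only the field names `version`, `prompt`, `system_prompt`, `max_new_tokens`, `temperature`, and `top_p` come from the notes above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for a RAGFlow chat message.
type Message struct {
	Role    string
	Content string
}

// PredictionRequest is an assumed body shape for POST /v1/predictions:
// the model identifier goes in "version", model inputs go in "input".
type PredictionRequest struct {
	Version string                 `json:"version"`
	Input   map[string]interface{} `json:"input"`
}

// BuildPredictionRequest folds chat messages into prompt-shaped inputs.
// System messages are joined into "system_prompt"; every other message is
// flattened to "role: content" lines in "prompt" (an illustrative choice).
func BuildPredictionRequest(model string, msgs []Message, maxNewTokens int, temperature, topP float64) PredictionRequest {
	var system, prompt []string
	for _, m := range msgs {
		if m.Role == "system" {
			system = append(system, m.Content)
			continue
		}
		prompt = append(prompt, m.Role+": "+m.Content)
	}

	input := map[string]interface{}{
		"prompt":         strings.Join(prompt, "\n"),
		"max_new_tokens": maxNewTokens,
		"temperature":    temperature,
		"top_p":          topP,
	}
	// system_prompt is optional, so it is only set when present.
	if len(system) > 0 {
		input["system_prompt"] = strings.Join(system, "\n")
	}
	return PredictionRequest{Version: model, Input: input}
}

func main() {
	req := BuildPredictionRequest(
		"meta/meta-llama-3-8b-instruct",
		[]Message{
			{Role: "system", Content: "Be brief."},
			{Role: "user", Content: "What is RAG?"},
		},
		256, 0.7, 0.9,
	)
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Keeping `system_prompt` out of the map when there is no system message matches the "optional" wording in the notes and avoids sending an empty string to models that may not accept that input.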
28 lines
471 B
JSON
{
    "name": "Replicate",
    "url": {
        "default": "https://api.replicate.com"
    },
    "url_suffix": {
        "chat": "v1/predictions",
        "models": "v1/models"
    },
    "class": "replicate",
    "models": [
        {
            "name": "meta/meta-llama-3-70b-instruct",
            "max_tokens": 8192,
            "model_types": [
                "chat"
            ]
        },
        {
            "name": "meta/meta-llama-3-8b-instruct",
            "max_tokens": 8192,
            "model_types": [
                "chat"
            ]
        }
    ]
}
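As a rough illustration of how a config like this might be consumed, the Go sketch below decodes it and joins `url.default` with an entry from `url_suffix`. The `ProviderConfig` and `ModelEntry` types and the `LoadConfig` and `Endpoint` helpers are hypothetical names introduced here; the actual loader in the Go model factory may differ. The embedded JSON is a shortened copy of the file above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ModelEntry and ProviderConfig are assumed types that mirror the JSON keys.
type ModelEntry struct {
	Name       string   `json:"name"`
	MaxTokens  int      `json:"max_tokens"`
	ModelTypes []string `json:"model_types"`
}

type ProviderConfig struct {
	Name      string            `json:"name"`
	URL       map[string]string `json:"url"`
	URLSuffix map[string]string `json:"url_suffix"`
	Class     string            `json:"class"`
	Models    []ModelEntry      `json:"models"`
}

// configJSON is a shortened copy of the provider config shown above.
const configJSON = `{
    "name": "Replicate",
    "url": {"default": "https://api.replicate.com"},
    "url_suffix": {"chat": "v1/predictions", "models": "v1/models"},
    "class": "replicate",
    "models": [
        {"name": "meta/meta-llama-3-8b-instruct", "max_tokens": 8192, "model_types": ["chat"]}
    ]
}`

// LoadConfig decodes a provider config from raw JSON.
func LoadConfig(raw string) (ProviderConfig, error) {
	var cfg ProviderConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

// Endpoint joins the default base URL with the suffix for the given kind
// ("chat" or "models"), normalizing slashes at the boundary.
func Endpoint(cfg ProviderConfig, kind string) (string, error) {
	base, ok := cfg.URL["default"]
	if !ok {
		return "", fmt.Errorf("provider %q has no default url", cfg.Name)
	}
	suffix, ok := cfg.URLSuffix[kind]
	if !ok {
		return "", fmt.Errorf("provider %q has no url_suffix for %q", cfg.Name, kind)
	}
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(suffix, "/"), nil
}

func main() {
	cfg, err := LoadConfig(configJSON)
	if err != nil {
		panic(err)
	}
	for _, kind := range []string{"chat", "models"} {
		u, err := Endpoint(cfg, kind)
		if err != nil {
			panic(err)
		}
		fmt.Println(kind, "->", u)
	}
}
```

Trimming slashes on both sides keeps the join stable whether or not a base URL override ends with `/`, which matters because the suffixes in this file carry no leading slash.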