mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-08 08:07:21 +08:00
### What problem does this PR solve? The LocalAI Go driver landed in #14809 and Embed landed in #14811. `Rerank` was left as a stub that returns `"not implemented"`. This PR fills the gap. LocalAI exposes a public rerank endpoint at `<tenant-url>/v1/rerank` with a Cohere-shaped request and response (`{model, query, documents, top_n}` → `{results: [{index, relevance_score}]}`). The Python side has had `LocalAIRerank` in `rag/llm/rerank_model.py` for a long time. Until this PR, a tenant who wanted to use LocalAI for reranking in the Go layer got `"not implemented"`. ### What this PR includes - `conf/models/localai.json`: add `"rerank": "rerank"` under `url_suffix` so the driver can build the URL from config. This matches the `URLSuffix.Rerank` field already used by aliyun and siliconflow. - `internal/entity/models/localai.go`: replace the `Rerank` stub with a real implementation that POSTs to `/v1/rerank`. Adds local request/response types `localAIRerankRequest` and `localAIRerankResponse`. No factory change. No interface change. ### How the implementation works - Validate the model name and resolve the tenant-supplied base URL with the existing `resolveBaseURL` helper. - Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so the call has a clear deadline. Same pattern `ChatWithMessages`, `ListModels`, and `Embed` already use in this file. - Only set the `Authorization` header when a non-empty API key was supplied. LocalAI accepts an empty key by default, so this preserves the optional-auth contract. - Default `top_n` to `len(documents)` when `rerankConfig.TopN == 0`, matching the existing Aliyun and SiliconFlow rerank implementations. - Validate every `results[].index` against `len(documents)`. If the upstream returns an out-of-range index, fail clearly instead of silently writing past the slice. - An empty `documents` slice returns `&RerankResponse{}` with no HTTP call. - Non-200 responses propagate the upstream status line and body. ### Note on stacking This PR builds on #14809 (LocalAI driver) and #14811 (LocalAI Embed). Until both merge, this PR's diff on GitHub will include all three commits. After #14809 and #14811 land on `main`, GitHub will auto-reduce this PR to only the `Rerank` changes (one commit, ~99 line diff in `localai.go` plus 1 line in `localai.json`). ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the `go.mod` minimum). - The full method set on `LocalAIModel` still matches the `ModelDriver` interface. - Pattern parity with the existing Aliyun Rerank (`internal/entity/models/aliyun.go`) and SiliconFlow Rerank (`internal/entity/models/siliconflow.go`) implementations. Closes #14812 Depends on #14809, #14811 Tracking: #14736 Co-authored-by: Jin Hai <haijin.chn@gmail.com>
11 lines
177 B
JSON
11 lines
177 B
JSON
{
|
|
"name": "localai",
|
|
"url_suffix": {
|
|
"chat": "chat/completions",
|
|
"models": "models",
|
|
"embedding": "embeddings",
|
|
"rerank": "rerank"
|
|
},
|
|
"class": "local"
|
|
}
|