mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-26 10:47:21 +08:00
## Summary

- Replaces the `"no such method"` stub on `XinferenceModel.Embed` (`internal/entity/models/xinference.go`) with a real implementation against Xinference's OpenAI-compatible `/v1/embeddings` endpoint.
- Adds the `"embedding": "v1/embeddings"` URL suffix to `conf/models/xinference.json`.
- Mirrors the Python `XinferenceEmbed` class in `rag/llm/embedding_model.py:407` for payload shape (OpenAI-compatible `model + input` → `data[*].index + data[*].embedding`) and tolerates the same no-auth default Xinference deployments use. Authorization is only sent when a non-empty API key is configured, via the existing `setXinferenceAuth` helper.
- Reuses the existing `normalizeXinferenceBaseURL` + `baseURLForRegion` helpers so both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1` resolve to the same `/v1/embeddings` target without doubled `/v1`.
- Validates response indices: duplicate, missing, or out-of-range `data[*].index` values fail with a clear error rather than silently producing misaligned vectors.
- Returns `[]EmbeddingData` in original input order (placed by `Index`) so downstream callers can index positionally without re-sorting.
- Forwards `EmbeddingConfig.Dimension` as `dimensions` when `> 0`, matching the OpenAI cluster pattern.

Closes #14810

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
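
The index validation and positional ordering described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/entity/models/xinference.go`: the `embeddingItem` type and the `orderByIndex` function are hypothetical names chosen for this example, and the error wording is an assumption.

```go
package main

import "fmt"

// embeddingItem is a hypothetical stand-in for one element of the
// "data" array in an OpenAI-compatible embeddings response.
type embeddingItem struct {
	Index     int       `json:"index"`
	Embedding []float64 `json:"embedding"`
}

// orderByIndex (hypothetical helper) places each vector at the slot named
// by its Index and rejects out-of-range, duplicate, or missing indices, so
// the result lines up with the original input order.
func orderByIndex(items []embeddingItem, inputCount int) ([][]float64, error) {
	out := make([][]float64, inputCount)
	seen := make([]bool, inputCount)
	for _, it := range items {
		if it.Index < 0 || it.Index >= inputCount {
			return nil, fmt.Errorf("embedding index %d out of range [0, %d)", it.Index, inputCount)
		}
		if seen[it.Index] {
			return nil, fmt.Errorf("duplicate embedding index %d", it.Index)
		}
		seen[it.Index] = true
		out[it.Index] = it.Embedding
	}
	// Every input must have received exactly one vector.
	for i, ok := range seen {
		if !ok {
			return nil, fmt.Errorf("missing embedding for input %d", i)
		}
	}
	return out, nil
}
```

Writing into a pre-sized slice by index, rather than sorting the response, makes missing and duplicate entries detectable in a single pass and lets callers index the result positionally.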
11 lines
192 B
JSON
{
  "name": "xinference",
  "url_suffix": {
    "chat": "v1/chat/completions",
    "embedding": "v1/embeddings",
    "models": "v1/models",
    "rerank": "v1/rerank"
  },
  "class": "local"
}
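
To illustrate how a suffix such as `"embedding": "v1/embeddings"` might combine with a configured base URL without producing a doubled `/v1`, here is a small sketch. The `embeddingsURL` function is a hypothetical name invented for this example; the real `normalizeXinferenceBaseURL` and `baseURLForRegion` helpers may behave differently.

```go
package main

import "strings"

// embeddingsURL (hypothetical helper) trims whitespace, trailing slashes,
// and an optional trailing "/v1" from the base URL, then appends the
// configured suffix with exactly one separating slash.
func embeddingsURL(baseURL, suffix string) string {
	base := strings.TrimRight(strings.TrimSpace(baseURL), "/")
	base = strings.TrimSuffix(base, "/v1")
	return base + "/" + strings.TrimLeft(suffix, "/")
}
```

Under these assumptions, `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1` both resolve to `http://127.0.0.1:9997/v1/embeddings` when joined with the `v1/embeddings` suffix.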