ragflow/xinference.json at b2bf9155edf1bc5b4b42e1013401a7d828b2d6f7

youngkingdom/ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-05-26 10:47:21 +08:00

Renzo 394cd5d116 Go: implement Embed in Xinference driver (#14932)

## Summary

- Replaces the `"no such method"` stub on `XinferenceModel.Embed`
(`internal/entity/models/xinference.go`) with a real implementation
against Xinference's OpenAI-compatible `/v1/embeddings` endpoint.
- Adds the `"embedding": "v1/embeddings"` URL suffix to
`conf/models/xinference.json`.
- Mirrors the Python `XinferenceEmbed` class in
`rag/llm/embedding_model.py:407` for payload shape (OpenAI-compatible
`model + input` → `data[*].index + data[*].embedding`) and tolerates the
no-auth default that Xinference deployments use: Authorization is sent
only when a non-empty API key is configured, via the existing
`setXinferenceAuth` helper.
- Reuses the existing `normalizeXinferenceBaseURL` + `baseURLForRegion`
helpers so both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1`
resolve to the same `/v1/embeddings` target without a doubled `/v1`.
- Validates response indices: duplicate, missing, or out-of-range
`data[*].index` values fail with a clear error rather than silently
producing misaligned vectors.
- Returns `[]EmbeddingData` in original input order (placed by `Index`)
so downstream callers can index positionally without re-sorting.
- Forwards `EmbeddingConfig.Dimension` as `dimensions` when `> 0`,
matching the OpenAI cluster pattern.
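
The index validation and positional placement described above can be sketched roughly as follows. This is a minimal illustration, not the code from `internal/entity/models/xinference.go`: the `EmbeddingItem` type and the `orderByIndex` function are hypothetical names, and the error wording is assumed.

```go
package main

import "fmt"

// EmbeddingItem is a hypothetical stand-in for one entry of the
// OpenAI-compatible response's data array (data[*].index, data[*].embedding).
type EmbeddingItem struct {
	Index     int       `json:"index"`
	Embedding []float64 `json:"embedding"`
}

// orderByIndex (hypothetical helper) places each vector at the slot named by
// its Index, so the result lines up with the original input order. It fails
// on out-of-range, duplicate, or missing indices instead of returning
// misaligned vectors.
func orderByIndex(items []EmbeddingItem, inputCount int) ([][]float64, error) {
	out := make([][]float64, inputCount)
	seen := make([]bool, inputCount)
	for _, it := range items {
		if it.Index < 0 || it.Index >= inputCount {
			return nil, fmt.Errorf("embedding index %d out of range [0, %d)", it.Index, inputCount)
		}
		if seen[it.Index] {
			return nil, fmt.Errorf("duplicate embedding index %d", it.Index)
		}
		seen[it.Index] = true
		out[it.Index] = it.Embedding
	}
	// Any slot never written means the server skipped that input.
	for i, ok := range seen {
		if !ok {
			return nil, fmt.Errorf("missing embedding for input %d", i)
		}
	}
	return out, nil
}

func main() {
	// The response arrives out of order; placement by Index restores input order.
	items := []EmbeddingItem{
		{Index: 1, Embedding: []float64{0.3}},
		{Index: 0, Embedding: []float64{0.1}},
	}
	vecs, err := orderByIndex(items, 2)
	fmt.Println(vecs, err)
}
```

Checking the `seen` slice after the loop, rather than comparing lengths up front, lets each failure mode (out-of-range, duplicate, missing) report its own message.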

Closes #14810

Co-authored-by: Jin Hai <haijin.chn@gmail.com>

2026-05-21 11:47:30 +08:00

 {
   "name": "xinference",
   "url_suffix": {
     "chat": "v1/chat/completions",
     "embedding": "v1/embeddings",
     "models": "v1/models",
     "rerank": "v1/rerank"
   },
   "class": "local"
 }