Files
ragflow/conf/models/xinference.json
tmimmanuel 09a06f1b00 Go: implement provider: Xinference (#14938)
### What problem does this PR solve?

Closes #14808.

Adds a Go model driver for Xinference so self-hosted Xinference chat
models can be used through the Go provider layer instead of falling
through to the dummy driver. Xinference exposes an OpenAI-compatible API
under `/v1`; the driver accepts either a root endpoint such as
`http://127.0.0.1:9997` or an OpenAI-compatible endpoint such as
`http://127.0.0.1:9997/v1` and normalizes it before calling chat or
model-listing routes.

### What is changed?

- Add `internal/entity/models/xinference.go` implementing `ModelDriver`
for Xinference chat.
- Route provider name `xinference` in
`internal/entity/models/factory.go`.
- Add `conf/models/xinference.json` as a local provider config.
- Add focused unit tests in `internal/entity/models/xinference_test.go`.

Initial method coverage:

- `ChatWithMessages`: POST `/v1/chat/completions`.
- `ChatStreamlyWithSender`: SSE streaming from `/v1/chat/completions`.
- `ListModels`: GET `/v1/models`.
- `CheckConnection`: lightweight `ListModels` probe.
- Optional auth: send `Authorization: Bearer <api_key>` only when a
non-empty key is configured, matching Xinference no-auth and
auth-enabled deployments.
- `Balance`, `Embed`, `Rerank`, ASR, TTS, and OCR return `no such
method` for this initial chat-provider PR.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)

### Tests

- `go test -vet=off -run TestXinference -count=1
./internal/entity/models/...`
- `go test -vet=off -count=1 ./internal/entity/models/...`

### References

- Xinference docs:
https://inference.readthedocs.io/zh-cn/latest/index.html
- OpenAI-compatible chat usage:
https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html
- API key auth:
https://inference.readthedocs.io/zh-cn/latest/user_guide/auth_system.html

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-19 15:10:13 +08:00

9 lines
131 B
JSON

{
"name": "xinference",
"url_suffix": {
"chat": "v1/chat/completions",
"models": "v1/models"
},
"class": "local"
}