Files
ragflow/internal/entity/models/factory.go
tmimmanuel cb01529d8b Go: implement provider: Voyage AI (#14811)
### What problem does this PR solve?

Add a Go driver for Voyage AI (https://voyageai.com), one of the
unchecked providers on the umbrella tracking issue #14736. Voyage AI is
**embed + rerank only** — no chat, no streaming, no `/v1/models`
endpoint. It's the first provider in the Go layer of this shape.

Until this PR, a tenant who configured `voyage` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.

### What this PR includes

- New `internal/entity/models/voyage.go` with a `VoyageModel`
implementing the `ModelDriver` interface.
- New `conf/models/voyage.json` with 6 embedding models (`voyage-3.5`,
`voyage-3.5-lite`, `voyage-3-large`, `voyage-code-3`, `voyage-law-2`,
`voyage-finance-2`) and 2 rerank models (`rerank-2`, `rerank-2-lite`).
- `factory.go`: route `"voyage"` to `NewVoyageModel`.
- `internal/entity/models/voyage_test.go`: 19 unit tests.

### How the driver works

- **Embed**: `POST /v1/embeddings`. Response is OpenAI-shaped (`{data:
[{embedding, index, object, text}], model, usage}`). Driver reorders by
`index`, rejects duplicate / out-of-range / missing slots, and
short-circuits empty input without an HTTP call.
- **Rerank**: `POST /v1/rerank`. Voyage uses **`top_k`** as the request
param name (not `top_n` like Aliyun/SiliconFlow); the driver translates
`RerankConfig.TopN` → `top_k`. Response is Cohere-shaped (`{data:
[{relevance_score, index}], model}`), so the existing
`RerankResponse{Data: []RerankResult{Index, RelevanceScore}}` shape fits
cleanly.
- **`ListModels`**: returns a hardcoded list of `voyageKnownModels`.
Voyage does **not** expose `/v1/models` (probed live, returns 404), so
the driver synthesizes the list from the same set the config ships. New
upstream models are added by extending one slice.
- **`CheckConnection`**: pings a 1-input embed call against
`voyage-3.5`. Without `/v1/models`, this is the cheapest way to verify
the API key + network path before a tenant tries a real workload.
- **`ChatWithMessages` / `ChatStreamlyWithSender` / `Balance` /
`TranscribeAudio` / `AudioSpeech` / `OCRFile`**: all return `"no such
method"`. Voyage does not host any of these surfaces.

No interface change. No new dependencies.

### How was this tested?

**19 unit tests** in `internal/entity/models/voyage_test.go` — all pass
on go 1.25:

```
$ go test -vet=off -run TestVoyage -count=1 ./internal/entity/models/...
ok      ragflow/internal/entity/models  0.036s
```

Coverage: Name; Embed (happy path, reorder, empty-input, missing
key/model, duplicate index, out-of-range index, missing slot); Rerank
(happy path with `top_k` assertion, default-to-len-documents, empty
documents, out-of-range index); ListModels (static list, missing key);
CheckConnection (happy, 401); chat methods sentinels; Balance sentinel;
audio/OCR sentinels.

`go build ./internal/entity/models/...` exits 0.

**Live integration test** against `api.voyageai.com`:

```
=== RUN   TestVoyageLiveSmoke
    [OK] Name() = "voyage"
    [OK] ListModels (static): 8 models -> [voyage-3.5 voyage-3.5-lite voyage-3-large voyage-code-3 voyage-law-2 voyage-finance-2 rerank-2 rerank-2-lite]
    [OK] CheckConnection
    [OK] Embed vectors=3 dim=1024 indices=[0 1 2]
    [OK] Embed(empty) -> 0 vectors
    [OK] Rerank results=3 scores=[0.8125 0.59765625 0.39453125]
    [OK] ChatWithMessages returns voyage, no such method
    [OK] Balance returns voyage, no such method

VOYAGE LIVE SMOKE PASSED
--- PASS: TestVoyageLiveSmoke (0.81s)
```

What the live run proves on the wire:

- Auth (`Bearer <key>`) accepted by `api.voyageai.com`.
- Embed `voyage-3.5` on 3 inputs returns 3 vectors at dim 1024 with
`index` field preserved as `[0, 1, 2]` — the reorder-by-index code is
exercised on real data.
- Empty input short-circuits without an HTTP call (mock server would
have been hit if it did).
- Rerank `rerank-2` on 3 docs returns 3 real `relevance_score` floats
`[0.8125, 0.598, 0.395]`. The `top_k` translation works on the live
wire.
- All sentinel methods return the documented `"no such method"` strings.

### Note on PR history

This branch was previously named for LocalAI Embed work which is now
consolidated into PR #14813. The branch was reset to `upstream/main` and
rebuilt for Voyage. Diff against `main` is a clean +838 lines across 4
files.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Tracking: #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-14 09:46:54 +08:00

98 lines
3.1 KiB
Go

//
// Copyright 2026 The InfiniFlow Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
package models
import (
"strings"
)
// ModelFactory creates ModelDriver instances based on provider name
type ModelFactory struct {
}
// NewModelFactory creates a new ModelFactory
func NewModelFactory() *ModelFactory {
return &ModelFactory{}
}
// CreateModelDriver creates a ModelDriver for the given provider and model
func (f *ModelFactory) CreateModelDriver(providerName string, baseURL map[string]string, urlSuffix URLSuffix) (ModelDriver, error) {
providerLower := strings.ToLower(providerName)
switch providerLower {
case "zhipu-ai":
return NewZhipuAIModel(baseURL, urlSuffix), nil
case "deepseek":
return NewDeepSeekModel(baseURL, urlSuffix), nil
case "moonshot":
return NewMoonshotModel(baseURL, urlSuffix), nil
case "minimax":
return NewMinimaxModel(baseURL, urlSuffix), nil
case "gitee":
return NewGiteeModel(baseURL, urlSuffix), nil
case "siliconflow":
return NewSiliconflowModel(baseURL, urlSuffix), nil
case "google":
return NewGoogleModel(baseURL, urlSuffix), nil
case "aliyun":
return NewAliyunModel(baseURL, urlSuffix), nil
case "volcengine":
return NewVolcEngine(baseURL, urlSuffix), nil
case "vllm":
return NewVllmModel(baseURL, urlSuffix), nil
case "xai":
return NewXAIModel(baseURL, urlSuffix), nil
case "lmstudio":
return NewLmStudioModel(baseURL, urlSuffix), nil
case "ollama":
return NewOllamaModel(baseURL, urlSuffix), nil
case "openai":
return NewOpenAIModel(baseURL, urlSuffix), nil
case "nvidia":
return NewNvidiaModel(baseURL, urlSuffix), nil
case "openrouter":
return NewOpenRouterModel(baseURL, urlSuffix), nil
case "huggingface":
return NewHuggingFaceModel(baseURL, urlSuffix), nil
case "baidu":
return NewBaiduModel(baseURL, urlSuffix), nil
case "cohere":
return NewCoHereModel(baseURL, urlSuffix), nil
case "fishaudio":
return NewFishAudioModel(baseURL, urlSuffix), nil
case "mistral":
return NewMistralModel(baseURL, urlSuffix), nil
case "upstage":
return NewUpstageModel(baseURL, urlSuffix), nil
case "stepfun":
return NewStepFunModel(baseURL, urlSuffix), nil
case "baichuan":
return NewBaichuanModel(baseURL, urlSuffix), nil
case "jina":
return NewJinaModel(baseURL, urlSuffix), nil
case "localai":
return NewLocalAIModel(baseURL, urlSuffix), nil
case "longcat":
return NewLongCatModel(baseURL, urlSuffix), nil
case "novita":
return NewNovitaModel(baseURL, urlSuffix), nil
case "voyage":
return NewVoyageModel(baseURL, urlSuffix), nil
default:
return NewDummyModel(baseURL, urlSuffix), nil
}
}