Commit Graph

7 Commits

Author SHA1 Message Date
5d022d83e8 Go: implement provider: PaddleOCR_Local (#15158)
### What problem does this PR solve?

Go: implement provider: PaddleOCR_Local

**Verified from CLI**

```
RAGFlow(user)> ocr with 'PaddleOCR-VL@test@paddleocr_local' file './internal/test1.jpg'
+----------------------+
| text                 |
+----------------------+
| ## Parallel to these |
+----------------------+
```

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [X] Refactoring
2026-05-25 12:12:57 +08:00
3a5df08c76 Go: add file parse command (#14892)
### What problem does this PR solve?

```
RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。
——佚名                                                       |
+----------------------------------------------------------+

RAGFlow(user)> list 'test@gitee' tasks;
+---------+----------------------------------+
| status  | task_id                          |
+---------+----------------------------------+
| success | C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5 |
+---------+----------------------------------+
RAGFlow(user)> show 'test@gitee' task 'C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content                                                                                                                                                                                                                                                          | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # PDF 1: Purpose of RAGFlow  

RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine designed to turn raw documents into reliable context for large language models.Its purpose is to make it practical to build an Al assistant that can ans... | 1     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+

```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-15 12:29:52 +08:00
b18640d228 Go: fix OCR command (#14891)
### What problem does this PR solve?

RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。

——佚名                                                       |
+----------------------------------------------------------+

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 17:29:53 +08:00
bfb4a0eea2 Go: implement Encode (embeddings) in Gitee AI driver (#14698)
### What problem does this PR solve?

The Gitee AI Go driver in `internal/entity/models/gitee.go` shipped with
a stub `Encode` method that returned `gitee, no such method`, even
though `conf/models/gitee.json` already wires the `embedding` URL
suffix. The conf also listed no embedding models, so the picker had
nothing to select.

This blocked any tenant who wanted to use Gitee AI for chat, rerank
(already working, see #14656), and embeddings from a single provider.

This PR fills the gap, mirroring the just-merged Aliyun `Encode`
(#14647):

- `internal/entity/models/gitee.go`: replace the `Encode` stub with a
real implementation.
Validates inputs, resolves the region with a default fallback, POSTs the
standard OpenAI-compatible `{"model", "input": [...]}` body to
`BaseURL[region] + URLSuffix.Embedding`, parses `data[*].embedding`
indexed by `data[*].index` so output order matches input order, handles
both `float64` and `float32` element types, and uses a 30s per-call
context deadline matching the merged `Rerank`.
- `conf/models/gitee.json`: add `BAAI/bge-m3` so the embedding picker
has something to select.

No factory change. No interface change. No URL suffix change.

Verified with `go build`, `go vet`, and `gofmt -l` : all clean.

Closes #14697

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-11 11:56:46 +08:00
d8d49df35e Go: implement Rerank in Gitee AI driver (#14656)
### What problem does this PR solve?

The Gitee AI Go driver shipped with a stub \`Rerank\` method that
returned \`Rerank not implemented\`, even though
\`conf/models/gitee.json\` already wires the rerank URL suffix at
\`\"rerank\": \"rerank\"\`. The same config did not list any
rerank model, so the picker had nothing to select.

So a Gitee tenant could not use BAAI/bge-reranker-v2-m3 as a reranker
through the Go layer today, even though the
infrastructure was one config entry and one method body away.

### What this PR includes

- \`conf/models/gitee.json\`: add \`BAAI/bge-reranker-v2-m3\` to the
\`models\` array.
- \`internal/entity/models/gitee.go\`: replace the \`Rerank\` stub with
a real implementation. Adds two small local types
that match the OpenAI-compatible \`/rerank\` shape already used by the
SiliconFlow and ZhipuAI drivers.

No factory change. No interface change.

### How the driver works

- Validate \`apiConfig\` and the API key, validate the model name,
resolve the region with a default fallback, build the
  URL from \`BaseURL[region] + URLSuffix.Rerank\`.
- Use a per-call \`context.WithTimeout(30s)\` and
\`http.NewRequestWithContext\`, matching the pattern the
  recently merged Aliyun Encode and the OpenAI driver already use.
- Send \`{model, query, documents, top_n, return_documents:false}\` in
the body.
- Parse \`results[*].relevance_score\` and copy each score into the
output slice indexed by \`results[*].index\`, so the
output order matches the input order even if the API returns items in a
different order.
- Empty input returns \`[]float64{}\` with no HTTP call.
- An out-of-range result index returns a clear error rather than
silently skipping the entry.
- Non-200 responses propagate the upstream status line and body.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- \`go build ./internal/entity/models/...\` in a clean go 1.25 image
returns exit 0.
- The full method set on \`GiteeModel\` still matches the
\`ModelDriver\` interface.
- Pattern parity with the existing SiliconFlow Rerank and the recently
merged ZhipuAI Rerank (#14608).

Closes #14655
2026-05-08 13:08:22 +08:00
965717c4fb Go: add new provider: google (#14395)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-27 20:35:47 +08:00
1c244df90d Go: add gitee and siliconflow as model provider (#14336)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-24 20:59:30 +08:00