Doc Engine Implementation
RAGFlow Go document engine implementation, supporting Elasticsearch and Infinity storage engines.
Directory Structure
internal/engine/
├── engine.go # DocEngine interface definition
├── engine_factory.go # Factory function
├── global.go # Global engine instance management
├── elasticsearch/ # Elasticsearch implementation
│ ├── client.go # ES client initialization
│ ├── search.go # Search implementation
│ ├── index.go # Index operations
│ └── document.go # Document operations
└── infinity/ # Infinity implementation
  ├── client.go # Infinity client initialization (placeholder)
  ├── search.go # Search implementation (placeholder)
  ├── index.go # Table operations (placeholder)
  └── document.go # Document operations (placeholder)
Configuration
Using Elasticsearch
Add to conf/service_conf.yaml:
doc_engine:
  type: elasticsearch
  es:
    hosts: "http://localhost:9200"
    username: "elastic"
    password: "infini_rag_flow"
Using Infinity
doc_engine:
  type: infinity
  infinity:
    uri: "localhost:23817"
    postgres_port: 5432
    db_name: "default_db"
Note: Infinity implementation is a placeholder waiting for the official Infinity Go SDK. Only Elasticsearch is fully functional at this time.
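The configuration keys above could map onto Go structs roughly as follows. This is a minimal sketch, not the contents of config.go: the struct names, field names, and the Validate method are hypothetical, and it assumes the es and infinity sections nest under doc_engine. Only the EngineElasticsearch and EngineInfinity constant names come from the usage code in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// EngineType names the configured document engine.
type EngineType string

const (
	EngineElasticsearch EngineType = "elasticsearch"
	EngineInfinity      EngineType = "infinity"
)

// ESConfig mirrors the es section (hypothetical struct).
type ESConfig struct {
	Hosts    string `yaml:"hosts"`
	Username string `yaml:"username"`
	Password string `yaml:"password"`
}

// InfinityConfig mirrors the infinity section (hypothetical struct).
type InfinityConfig struct {
	URI          string `yaml:"uri"`
	PostgresPort int    `yaml:"postgres_port"`
	DBName       string `yaml:"db_name"`
}

// DocEngineConfig mirrors the doc_engine section (hypothetical struct).
type DocEngineConfig struct {
	Type     EngineType      `yaml:"type"`
	ES       *ESConfig       `yaml:"es"`
	Infinity *InfinityConfig `yaml:"infinity"`
}

// Validate checks that the section matching Type is present and usable.
func (c *DocEngineConfig) Validate() error {
	switch c.Type {
	case EngineElasticsearch:
		if c.ES == nil || c.ES.Hosts == "" {
			return errors.New("doc_engine.es.hosts is required for elasticsearch")
		}
	case EngineInfinity:
		if c.Infinity == nil || c.Infinity.URI == "" {
			return errors.New("doc_engine.infinity.uri is required for infinity")
		}
	default:
		return fmt.Errorf("unknown doc_engine.type %q", c.Type)
	}
	return nil
}

func main() {
	cfg := &DocEngineConfig{
		Type: EngineElasticsearch,
		ES:   &ESConfig{Hosts: "http://localhost:9200", Username: "elastic"},
	}
	fmt.Println(cfg.Validate() == nil)
}
```

Validating at load time means a missing section is reported before the engine is initialized rather than on the first request.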
Usage
1. Initialize Engine
The engine is automatically initialized on service startup (see cmd/server_main.go):
// Initialize doc engine
if err := engine.Init(&cfg.DocEngine); err != nil {
	log.Fatalf("Failed to initialize doc engine: %v", err)
}
defer engine.Close()
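For readers wondering what sits behind Init, Get, and Close, the following is one plausible shape for global instance management. It is a sketch under assumptions, not the contents of global.go: here Init takes a constructor function rather than the real configuration pointer, and the Engine interface and fakeEngine type are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Engine is a stand-in for DocEngine; only Close is needed in this sketch.
type Engine interface {
	Close() error
}

// fakeEngine is a hypothetical engine used for illustration.
type fakeEngine struct{ closed bool }

func (e *fakeEngine) Close() error { e.closed = true; return nil }

var (
	mu     sync.RWMutex
	global Engine
)

// Init stores the engine built by newEngine as the process-wide instance.
// It refuses to overwrite an existing instance.
func Init(newEngine func() (Engine, error)) error {
	mu.Lock()
	defer mu.Unlock()
	if global != nil {
		return errors.New("doc engine already initialized")
	}
	e, err := newEngine()
	if err != nil {
		return fmt.Errorf("create doc engine: %w", err)
	}
	global = e
	return nil
}

// Get returns the global engine, or nil if Init has not succeeded.
func Get() Engine {
	mu.RLock()
	defer mu.RUnlock()
	return global
}

// Close shuts down and clears the global engine; it is safe to call twice.
func Close() error {
	mu.Lock()
	defer mu.Unlock()
	if global == nil {
		return nil
	}
	err := global.Close()
	global = nil
	return err
}

func main() {
	if err := Init(func() (Engine, error) { return &fakeEngine{}, nil }); err != nil {
		panic(err)
	}
	defer Close()
	fmt.Println(Get() != nil)
}
```

The read-write mutex lets many request handlers call Get concurrently while Init and Close remain exclusive.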
2. Use in Service
In ChunkService:
type ChunkService struct {
	docEngine  engine.DocEngine
	engineType config.EngineType
}

func NewChunkService() *ChunkService {
	cfg := config.Get()
	return &ChunkService{
		docEngine:  engine.Get(),
		engineType: cfg.DocEngine.Type,
	}
}
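
// RetrievalTest runs a retrieval query against the configured engine.
func (s *ChunkService) RetrievalTest(req *RetrievalTestRequest) (*RetrievalTestResponse, error) {
	ctx := context.Background()

	switch s.engineType {
	case config.EngineElasticsearch:
		// Use Elasticsearch retrieval
		searchReq := &elasticsearch.SearchRequest{
			IndexNames: []string{"chunks"},
			Query:      elasticsearch.BuildMatchTextQuery([]string{"content"}, req.Question, "AUTO"),
			Size:       10,
		}
		result, err := s.docEngine.Search(ctx, searchReq)
		if err != nil {
			return nil, fmt.Errorf("search failed: %w", err)
		}
		// Search returns interface{}; check the assertion instead of risking a panic.
		esResp, ok := result.(*elasticsearch.SearchResponse)
		if !ok {
			return nil, fmt.Errorf("unexpected search result type %T", result)
		}
		// Process esResp and build the response...
	case config.EngineInfinity:
		// Infinity not implemented yet
		return nil, fmt.Errorf("infinity not yet implemented")
	default:
		return nil, fmt.Errorf("unsupported engine type: %v", s.engineType)
	}
}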
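Because Search returns interface{}, every caller repeats the same assertion. A small generic helper can centralize it. The sketch below is an illustration only: SearchAs, Searcher, esResponse, and stubEngine are hypothetical names and not part of the engine package. It requires Go 1.18 or later for generics.

```go
package main

import (
	"context"
	"fmt"
)

// Searcher is the part of DocEngine that this sketch needs.
type Searcher interface {
	Search(ctx context.Context, req interface{}) (interface{}, error)
}

// SearchAs runs a search and converts the untyped result to T, returning an
// error instead of panicking when the engine returns a different type.
func SearchAs[T any](ctx context.Context, s Searcher, req interface{}) (T, error) {
	var zero T
	raw, err := s.Search(ctx, req)
	if err != nil {
		return zero, fmt.Errorf("search: %w", err)
	}
	typed, ok := raw.(T)
	if !ok {
		return zero, fmt.Errorf("unexpected search result type %T", raw)
	}
	return typed, nil
}

// esResponse and stubEngine are hypothetical stand-ins for illustration.
type esResponse struct{ Total int }

type stubEngine struct{ result interface{} }

func (e stubEngine) Search(ctx context.Context, req interface{}) (interface{}, error) {
	return e.result, nil
}

// demo runs SearchAs against a stub that returns result.
func demo(result interface{}) (*esResponse, error) {
	return SearchAs[*esResponse](context.Background(), stubEngine{result: result}, nil)
}

func main() {
	resp, err := demo(&esResponse{Total: 3})
	fmt.Println(resp.Total, err)
}
```

A mismatch between the configured engine and the expected response type then surfaces as an ordinary error in the service layer.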
3. Direct Use of Global Engine
import "ragflow/internal/engine"
// Get engine instance
docEngine := engine.Get()
// Search
searchReq := &elasticsearch.SearchRequest{
	IndexNames: []string{"my_index"},
	Query:      elasticsearch.BuildTermQuery("status", "active"),
}
result, err := docEngine.Search(ctx, searchReq)
// Index operations
err = docEngine.CreateIndex(ctx, "my_index", mapping)
err = docEngine.DeleteIndex(ctx, "my_index")
exists, _ := docEngine.IndexExists(ctx, "my_index")
// Document operations
err = docEngine.IndexDocument(ctx, "my_index", "doc_id", docData)
bulkResp, _ := docEngine.BulkIndex(ctx, "my_index", docs)
doc, _ := docEngine.GetDocument(ctx, "my_index", "doc_id")
err = docEngine.DeleteDocument(ctx, "my_index", "doc_id")
API Documentation
DocEngine Interface
type DocEngine interface {
	// Search
	Search(ctx context.Context, req interface{}) (interface{}, error)

	// Index operations
	CreateIndex(ctx context.Context, indexName string, mapping interface{}) error
	DeleteIndex(ctx context.Context, indexName string) error
	IndexExists(ctx context.Context, indexName string) (bool, error)

	// Document operations
	IndexDocument(ctx context.Context, indexName, docID string, doc interface{}) error
	BulkIndex(ctx context.Context, indexName string, docs []interface{}) (interface{}, error)
	GetDocument(ctx context.Context, indexName, docID string) (interface{}, error)
	DeleteDocument(ctx context.Context, indexName, docID string) error

	// Health check
	Ping(ctx context.Context) error
	Close() error
}
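To show what satisfying this interface involves, here is a small in-memory implementation of the kind that could back unit tests. It is a sketch, not part of the repository: memEngine and its behavior (for example, position-based IDs in BulkIndex and an unsupported Search) are assumptions made for illustration, and it is not safe for concurrent use.

```go
package main

import (
	"context"
	"fmt"
)

// DocEngine repeats the interface above so the sketch compiles on its own.
type DocEngine interface {
	Search(ctx context.Context, req interface{}) (interface{}, error)
	CreateIndex(ctx context.Context, indexName string, mapping interface{}) error
	DeleteIndex(ctx context.Context, indexName string) error
	IndexExists(ctx context.Context, indexName string) (bool, error)
	IndexDocument(ctx context.Context, indexName, docID string, doc interface{}) error
	BulkIndex(ctx context.Context, indexName string, docs []interface{}) (interface{}, error)
	GetDocument(ctx context.Context, indexName, docID string) (interface{}, error)
	DeleteDocument(ctx context.Context, indexName, docID string) error
	Ping(ctx context.Context) error
	Close() error
}

// memEngine is a hypothetical in-memory engine (index -> doc ID -> doc).
type memEngine struct {
	indexes map[string]map[string]interface{}
}

var _ DocEngine = (*memEngine)(nil) // compile-time interface check

func newMemEngine() *memEngine {
	return &memEngine{indexes: map[string]map[string]interface{}{}}
}

func (m *memEngine) Search(context.Context, interface{}) (interface{}, error) {
	return nil, fmt.Errorf("search is not supported by the in-memory engine")
}

func (m *memEngine) CreateIndex(_ context.Context, name string, _ interface{}) error {
	if m.indexes[name] == nil {
		m.indexes[name] = map[string]interface{}{}
	}
	return nil
}

func (m *memEngine) DeleteIndex(_ context.Context, name string) error {
	delete(m.indexes, name)
	return nil
}

func (m *memEngine) IndexExists(_ context.Context, name string) (bool, error) {
	_, ok := m.indexes[name]
	return ok, nil
}

func (m *memEngine) IndexDocument(_ context.Context, name, id string, doc interface{}) error {
	idx, ok := m.indexes[name]
	if !ok {
		return fmt.Errorf("index %q not found", name)
	}
	idx[id] = doc
	return nil
}

// BulkIndex derives IDs from position; a real engine would read them from the docs.
func (m *memEngine) BulkIndex(ctx context.Context, name string, docs []interface{}) (interface{}, error) {
	for i, d := range docs {
		if err := m.IndexDocument(ctx, name, fmt.Sprintf("bulk-%d", i), d); err != nil {
			return nil, err
		}
	}
	return len(docs), nil
}

func (m *memEngine) GetDocument(_ context.Context, name, id string) (interface{}, error) {
	doc, ok := m.indexes[name][id]
	if !ok {
		return nil, fmt.Errorf("document %q not found in index %q", id, name)
	}
	return doc, nil
}

func (m *memEngine) DeleteDocument(_ context.Context, name, id string) error {
	delete(m.indexes[name], id)
	return nil
}

func (m *memEngine) Ping(context.Context) error { return nil }
func (m *memEngine) Close() error               { return nil }

// roundTrip stores one document and reads it back through the interface.
func roundTrip() (interface{}, error) {
	ctx := context.Background()
	var eng DocEngine = newMemEngine()
	_ = eng.CreateIndex(ctx, "my_index", nil)
	_ = eng.IndexDocument(ctx, "my_index", "doc_id", "hello")
	return eng.GetDocument(ctx, "my_index", "doc_id")
}

func main() { fmt.Println(roundTrip()) }
```

The `var _ DocEngine = (*memEngine)(nil)` line makes the compiler report any missing method at build time, which is useful when adding a new engine.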
Dependencies
Elasticsearch
github.com/elastic/go-elasticsearch/v8
Infinity
- Not available yet; waiting for the official Infinity Go SDK
Notes
- Type Conversion: The Search method returns interface{}, so callers must apply a type assertion that matches the configured engine type
- Model Definitions: Each engine defines its own request/response models in its own package
- Error Handling: Handle errors uniformly in the service layer and return user-friendly error messages
- Performance Optimization: For large volumes of documents, prefer BulkIndex over repeated IndexDocument calls
- Connection Management: The engine is closed by the deferred engine.Close() registered at startup; no further manual management is needed
- Infinity Status: The Infinity implementation is currently a placeholder. Only Elasticsearch is fully functional.
Extending with New Engines
To add a new document engine (e.g., Milvus, Qdrant):
- Create a new directory under internal/engine/, e.g., milvus/
- Implement four files: client.go, search.go, index.go, document.go
- Add corresponding creation logic in engine_factory.go
- Add configuration structure in config.go
- Update service layer code to support the new engine
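The creation-logic step could be organized as a registry, so that adding an engine does not require editing a central switch. This is one possible design, not necessarily how engine_factory.go is written: Register, New, Constructor, and milvusEngine are hypothetical names, and the configuration is simplified to a string map.

```go
package main

import (
	"fmt"
	"sort"
)

// Engine stands in for DocEngine; only Close is needed in this sketch.
type Engine interface {
	Close() error
}

// Constructor builds an engine from its raw configuration section.
type Constructor func(cfg map[string]string) (Engine, error)

// registry maps an engine type name to its constructor (hypothetical design).
var registry = map[string]Constructor{}

// Register adds a constructor; each engine package would call this once.
func Register(name string, c Constructor) {
	if _, dup := registry[name]; dup {
		panic("engine registered twice: " + name)
	}
	registry[name] = c
}

// New looks up the constructor for name and builds the engine.
func New(name string, cfg map[string]string) (Engine, error) {
	c, ok := registry[name]
	if !ok {
		known := make([]string, 0, len(registry))
		for k := range registry {
			known = append(known, k)
		}
		sort.Strings(known)
		return nil, fmt.Errorf("unknown engine type %q (registered: %v)", name, known)
	}
	return c(cfg)
}

// milvusEngine is a placeholder for a new engine implementation.
type milvusEngine struct{ addr string }

func (m *milvusEngine) Close() error { return nil }

func init() {
	Register("milvus", func(cfg map[string]string) (Engine, error) {
		return &milvusEngine{addr: cfg["uri"]}, nil
	})
}

func main() {
	eng, err := New("milvus", map[string]string{"uri": "localhost:19530"})
	fmt.Println(eng != nil, err)
}
```

With this layout, an unknown engine type produces an error that lists the registered names, which helps diagnose a typo in the configuration.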
Correspondence with Python Project
| Python Module | Go Module |
|---|---|
| common/doc_store/doc_store_base.py | internal/engine/engine.go |
| rag/utils/es_conn.py | internal/engine/elasticsearch/ |
| rag/utils/infinity_conn.py | internal/engine/infinity/ (placeholder) |
| common/settings.py | internal/config/config.go |
Current Status
- ✅ Elasticsearch: Fully implemented and functional
- ⏳ Infinity: Placeholder implementation, waiting for official Go SDK
- 📋 OceanBase: Not implemented (removed from requirements)