mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-20 16:26:42 +08:00
MySQL Data Migration Script
A flexible MySQL data migration tool for migrating data between tables with stage-based execution.
Overview
This script provides stage-based data migration between MySQL tables. Currently supports:
- tenant_model_provider
- tenant_model_instance
- tenant_model
Migration Stages
| Stage | Source Table | Target Table | Description |
|---|---|---|---|
| tenant_model_provider | tenant_llm | tenant_model_provider | Extracts distinct (tenant_id, llm_factory) pairs |
| tenant_model_instance | tenant_llm + tenant_model_provider | tenant_model_instance | Creates instances with distinct (tenant_id, llm_factory, api_key) |
| tenant_model | tenant_llm + tenant_model_provider + tenant_model_instance | tenant_model | Migrates model configurations (only status='0' records) |
Stage Dependencies
tenant_model_provider (no dependencies)
↓
tenant_model_instance (depends on tenant_model_provider)
↓
tenant_model (depends on tenant_model_provider and tenant_model_instance)
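The dependency chain above amounts to a topological ordering of the stages. The following is a minimal sketch of how a requested stage list could be sorted so that dependencies run first. The `STAGE_DEPENDENCIES` mapping and the `order_stages` helper are hypothetical names for illustration and are not taken from the script itself.

```python
from graphlib import TopologicalSorter

# Dependencies as listed above: stage -> stages that must run before it.
STAGE_DEPENDENCIES = {
    "tenant_model_provider": set(),
    "tenant_model_instance": {"tenant_model_provider"},
    "tenant_model": {"tenant_model_provider", "tenant_model_instance"},
}


def order_stages(requested):
    """Return the requested stages sorted so that dependencies come first."""
    unknown = set(requested) - set(STAGE_DEPENDENCIES)
    if unknown:
        raise ValueError(f"Unknown stages: {sorted(unknown)}")
    # static_order() yields every stage after all of its predecessors.
    full_order = list(TopologicalSorter(STAGE_DEPENDENCIES).static_order())
    return [stage for stage in full_order if stage in requested]
```

For example, `order_stages(["tenant_model", "tenant_model_provider"])` places `tenant_model_provider` before `tenant_model`, whatever order they were given in.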
Field Mapping Rules
tenant_model_provider
| Target Field | Source | Rule |
|---|---|---|
| id | - | Generated 32-character UUID1 hex string |
| provider_name | tenant_llm.llm_factory | Direct mapping |
| tenant_id | tenant_llm.tenant_id | Direct mapping |
- Deduplication: Groups by (tenant_id, llm_factory) and takes distinct pairs
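The mapping and deduplication rule for this stage can be sketched in plain Python. This is an illustration under stated assumptions and not the script's actual code: `build_provider_rows` and `existing_pairs` are hypothetical names, source rows are assumed to be dictionaries keyed by column name, and the skip on `existing_pairs` models the "No new data to migrate" behaviour listed under Common Messages.

```python
import uuid


def build_provider_rows(tenant_llm_rows, existing_pairs=frozenset()):
    """Build tenant_model_provider rows from tenant_llm rows.

    existing_pairs holds (tenant_id, provider_name) tuples that are already in
    the target table, so that a re-run does not insert duplicates.
    """
    seen = set(existing_pairs)
    providers = []
    for row in tenant_llm_rows:
        pair = (row["tenant_id"], row["llm_factory"])
        if pair in seen:
            continue  # duplicate pair, or already migrated
        seen.add(pair)
        providers.append({
            "id": uuid.uuid1().hex,  # 32-character hex string
            "provider_name": row["llm_factory"],
            "tenant_id": row["tenant_id"],
        })
    return providers
```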
tenant_model_instance
| Target Field | Source | Rule |
|---|---|---|
| id | - | Generated 32-character UUID1 hex string |
| instance_name | tenant_llm.llm_factory | Direct mapping |
| provider_id | tenant_model_provider.id | JOIN on tenant_id and provider_name=llm_factory |
| api_key | tenant_llm.api_key | Direct mapping |
| status | tenant_llm.status | Direct mapping |
- Deduplication: Groups by (tenant_id, llm_factory, api_key) and takes distinct records
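The JOIN and deduplication rules for this stage can be tried out against an in-memory database. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL. The table layouts are reduced to the columns named above, the sample rows are made up, and the query is an illustration, not the statement the script runs. It also assumes that all rows in a group share the same `status`, since the rules above do not say how a disagreement would be resolved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tenant_llm (tenant_id TEXT, llm_factory TEXT, llm_name TEXT,
                             api_key TEXT, status TEXT);
    CREATE TABLE tenant_model_provider (id TEXT, provider_name TEXT, tenant_id TEXT);
    -- Sample data: three models of one tenant, two of them sharing an API key.
    INSERT INTO tenant_llm VALUES
        ('t1', 'OpenAI', 'gpt-4o', 'key-a', '1'),
        ('t1', 'OpenAI', 'gpt-4o-mini', 'key-a', '1'),
        ('t1', 'OpenAI', 'text-embedding-3-small', 'key-b', '1');
    INSERT INTO tenant_model_provider VALUES ('p1', 'OpenAI', 't1');
""")

# One row per distinct (tenant_id, llm_factory, api_key); the provider id comes
# from the JOIN on tenant_id and provider_name = llm_factory.
INSTANCE_QUERY = """
    SELECT DISTINCT l.llm_factory AS instance_name,
                    p.id          AS provider_id,
                    l.api_key     AS api_key,
                    l.status      AS status
    FROM tenant_llm AS l
    JOIN tenant_model_provider AS p
      ON p.tenant_id = l.tenant_id AND p.provider_name = l.llm_factory
    ORDER BY l.api_key
"""
instance_rows = conn.execute(INSTANCE_QUERY).fetchall()
```

With this sample data the three source rows collapse into two instances, one per API key.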
tenant_model
| Target Field | Source | Rule |
|---|---|---|
| id | - | Generated 32-character UUID1 hex string |
| model_name | tenant_llm.llm_name | Direct mapping |
| provider_id | tenant_model_provider.id | JOIN on tenant_id and provider_name=llm_factory |
| instance_id | tenant_model_instance.id | JOIN on provider_id and api_key |
| model_type | tenant_llm.model_type | Direct mapping |
| status | tenant_llm.status | Direct mapping |
- Filter: Only migrates records where tenant_llm.status='0'
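The filter and the two lookups for this stage can be sketched with dictionaries standing in for the JOINs. The function name `build_model_rows` and the lookup dictionaries are hypothetical. Skipping rows that have no matching provider or instance is an assumption that mirrors an inner join; the rules above do not state how such rows are handled.

```python
import uuid


def build_model_rows(tenant_llm_rows, provider_ids, instance_ids):
    """Map tenant_llm rows to tenant_model rows.

    provider_ids: {(tenant_id, provider_name): provider_id}
    instance_ids: {(provider_id, api_key): instance_id}
    """
    models = []
    for row in tenant_llm_rows:
        if row["status"] != "0":
            continue  # filter: only status='0' records are migrated
        provider_id = provider_ids.get((row["tenant_id"], row["llm_factory"]))
        instance_id = instance_ids.get((provider_id, row["api_key"]))
        if provider_id is None or instance_id is None:
            continue  # assumption: rows without a matching parent are skipped
        models.append({
            "id": uuid.uuid1().hex,
            "model_name": row["llm_name"],
            "provider_id": provider_id,
            "instance_id": instance_id,
            "model_type": row["model_type"],
            "status": row["status"],
        })
    return models
```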
Usage
Command Line Arguments
python mysql_migration.py [OPTIONS]
| Option | Short | Description |
|---|---|---|
| --config | -c | Path to YAML config file (required) |
| --stages | -s | Comma-separated list of stages to run |
| --list-stages | -l | List available stages and exit |
| --execute | -e | Execute full migration (create tables and migrate data) |
| --create-table-only | - | Only create target tables, skip data migration |
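The options in the table map naturally onto `argparse`. The sketch below is a hypothetical reconstruction and not the script's actual parser; `build_parser`, `parse_cli` and `_split_stages` are illustrative names. Because the Examples section runs `--list-stages` without `--config`, this sketch assumes `--config` is only enforced when stages are actually run.

```python
import argparse


def _split_stages(value):
    """Turn 'a, b,c' into ['a', 'b', 'c']."""
    return [name.strip() for name in value.split(",") if name.strip()]


def build_parser():
    parser = argparse.ArgumentParser(prog="mysql_migration.py")
    parser.add_argument("-c", "--config", help="Path to YAML config file")
    parser.add_argument("-s", "--stages", type=_split_stages, default=[],
                        help="Comma-separated list of stages to run")
    parser.add_argument("-l", "--list-stages", action="store_true",
                        help="List available stages and exit")
    parser.add_argument("-e", "--execute", action="store_true",
                        help="Execute full migration")
    parser.add_argument("--create-table-only", action="store_true",
                        help="Only create target tables, skip data migration")
    return parser


def parse_cli(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --config may be omitted only together with --list-stages.
    if not args.list_stages and not args.config:
        parser.error("--config is required unless --list-stages is given")
    return args
```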
Execution Modes
The script has three mutually exclusive modes:
- Dry-Run Mode (default): Check only, no database writes

      python mysql_migration.py --stages tenant_model_provider --config config.yaml

- Create Table Only Mode: Create target tables without migrating data

      python mysql_migration.py --stages tenant_model_provider --config config.yaml --create-table-only

- Execute Mode: Create tables and migrate data

      python mysql_migration.py --stages tenant_model_provider --config config.yaml --execute
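The three modes can be modelled as an enumeration derived from the two flags. This is a minimal sketch with hypothetical names (`Mode`, `resolve_mode`). Raising an error when both flags are passed is an assumption; the description above says the modes are mutually exclusive but not how the script reacts to a conflict.

```python
from enum import Enum


class Mode(Enum):
    DRY_RUN = "dry-run"                      # check only, no database writes
    CREATE_TABLE_ONLY = "create-table-only"  # create tables, skip data
    EXECUTE = "execute"                      # create tables and migrate data


def resolve_mode(execute=False, create_table_only=False):
    """Derive the execution mode from the --execute / --create-table-only flags."""
    if execute and create_table_only:
        raise ValueError("--execute and --create-table-only are mutually exclusive")
    if execute:
        return Mode.EXECUTE
    if create_table_only:
        return Mode.CREATE_TABLE_ONLY
    return Mode.DRY_RUN  # default when neither flag is given
```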
Configuration File
Create a YAML configuration file with MySQL connection settings:
database:
  host: localhost
  port: 3306
  user: root
  password: your_password
  name: rag_flow
Alternative keys are also supported:
mysql:
  host: localhost
  port: 3306
  user: root
  password: your_password
  database: rag_flow
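Both layouts carry the same five settings, so they can be normalized into one shape before connecting. The sketch below assumes the YAML file has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The function name `extract_connection_settings` and the returned key names are hypothetical.

```python
def extract_connection_settings(config):
    """Normalize either config layout into a single settings dictionary."""
    section = config.get("database") or config.get("mysql")
    if not isinstance(section, dict):
        raise ValueError("Config needs a 'database' or 'mysql' section")
    # The schema name is stored under 'name' in one layout and 'database' in the other.
    schema = section.get("name") or section.get("database")
    if not schema:
        raise ValueError("Config is missing the database name")
    return {
        "host": section["host"],
        "port": int(section["port"]),
        "user": section["user"],
        "password": section["password"],
        "database": schema,
    }
```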
Examples
# List all available stages
python mysql_migration.py --list-stages
# Dry run single stage
python mysql_migration.py --stages tenant_model_provider --config /path/to/config.yaml
# Create tables only for multiple stages
python mysql_migration.py --stages tenant_model_provider,tenant_model_instance --config /path/to/config.yaml --create-table-only
# Execute full migration for all stages (in dependency order)
python mysql_migration.py --stages tenant_model_provider,tenant_model_instance,tenant_model --config /path/to/config.yaml --execute
Output Interpretation
Stage Execution Log
Each stage displays a header showing progress:
============================================================
Stage [1/3]: tenant_model_provider
============================================================
The stage then performs:
- Check phase: Verifies source/target tables exist and counts records to migrate
- Execute phase: Creates tables (if needed) and migrates data in batches
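Batched insertion in the execute phase can be sketched as follows. The helper names and the default batch size of 1000 are assumptions, and `insert_batch` stands in for whatever function writes one batch to MySQL. The log line follows the "Inserted batch X: Y records" message listed under Common Messages.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive slices of rows, each with at most batch_size items."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def migrate_in_batches(rows, insert_batch, batch_size=1000):
    """Insert rows batch by batch and return the number of rows written."""
    total = 0
    for number, batch in enumerate(iter_batches(rows, batch_size), start=1):
        insert_batch(batch)  # hypothetical callback that performs the INSERT
        total += len(batch)
        print(f"Inserted batch {number}: {len(batch)} records")
    return total
```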
Dry-Run Output
In dry-run mode, the script outputs what it would do without writing:
[DRY RUN] Would insert 150 records
instance_name=OpenAI, provider_id=abc123, api_key=***
... and 145 more records
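A preview of this shape could be produced by a small formatting helper. The sketch below is a guess at the logic and not the script's code: `format_dry_run` is a hypothetical name, and the sample size of 5 is inferred from the figures above (150 records, 145 not shown). It masks `api_key` so that secrets never reach the log.

```python
def format_dry_run(records, sample_size=5):
    """Return the dry-run preview lines for a list of row dictionaries."""
    lines = [f"[DRY RUN] Would insert {len(records)} records"]
    for record in records[:sample_size]:
        shown = dict(record)
        if "api_key" in shown:
            shown["api_key"] = "***"  # never print the real key
        lines.append(", ".join(f"{key}={value}" for key, value in shown.items()))
    remaining = len(records) - sample_size
    if remaining > 0:
        lines.append(f"... and {remaining} more records")
    return lines
```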
Migration Summary
After all stages complete, a summary is printed:
============================================================
Migration Summary
============================================================
Total Duration: 2.45s
Total Rows Processed: 350
Tables Operated: tenant_model_provider, tenant_model_instance
------------------------------------------------------------
Stage Details:
[tenant_model_provider] Tables: tenant_model_provider, Rows: 50, Duration: 0.82s
[tenant_model_instance] Tables: tenant_model_instance, Rows: 300, Duration: 1.63s
============================================================
Common Messages
| Message | Meaning |
|---|---|
| No new data to migrate | All records already exist in target table |
| [DRY RUN] Target table does not exist | Target table missing, use --execute or --create-table-only to create |
| Dependency table does not exist | Required table from previous stage missing |
| Inserted batch X: Y records | Successfully inserted batch of records |