Files
ragflow/tools/scripts
Lynn dbfb439239 Feat: migrate script (#13976)
### What problem does this PR solve?

Add stage for migrate tenant_llm data into table tenant_model_instance
and tenant_model.

### Type of change

- [x] Other (please describe): tool script


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Added two new migration stages to move tenant model and instance
records into new target tables, with dry-run, full-execute, and "create
table only" modes; migration skips already-migrated rows to avoid
duplicates.
* **Bug Fixes**
  * Cleaned up migration header logging for clearer output.
* **Documentation**
* Added usage guide describing stages, options, modes, config format,
examples, and expected logs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-09 11:03:39 +08:00
..
2026-04-09 11:03:39 +08:00

MySQL Data Migration Script

A flexible MySQL data migration tool for migrating data between tables with stage-based execution.

Overview

This script provides stage-based data migration between MySQL tables. Currently supports:

  • tenant_model_provider
  • tenant_model_instance
  • tenant_model

Migration Stages

Stage Source Table Target Table Description
tenant_model_provider tenant_llm tenant_model_provider Extracts distinct (tenant_id, llm_factory) pairs
tenant_model_instance tenant_llm + tenant_model_provider tenant_model_instance Creates instances with distinct (tenant_id, llm_factory, api_key)
tenant_model tenant_llm + tenant_model_provider + tenant_model_instance tenant_model Migrates model configurations (only status='0' records)

Stage Dependencies

tenant_model_provider (no dependencies)
        ↓
tenant_model_instance (depends on tenant_model_provider)
        ↓
tenant_model (depends on tenant_model_provider and tenant_model_instance)

Field Mapping Rules

tenant_model_provider

Target Field Source Rule
id - Random 32-character UUID1
provider_name tenant_llm.llm_factory Direct mapping
tenant_id tenant_llm.tenant_id Direct mapping
  • Deduplication: Groups by (tenant_id, llm_factory) and takes distinct pairs

tenant_model_instance

Target Field Source Rule
id - Random 32-character UUID1
instance_name tenant_llm.llm_factory Direct mapping
provider_id tenant_model_provider.id JOIN on tenant_id and provider_name=llm_factory
api_key tenant_llm.api_key Direct mapping
status tenant_llm.status Direct mapping
  • Deduplication: Groups by (tenant_id, llm_factory, api_key) and takes distinct records

tenant_model

Target Field Source Rule
id - Random 32-character UUID1
model_name tenant_llm.llm_name Direct mapping
provider_id tenant_model_provider.id JOIN on tenant_id and provider_name=llm_factory
instance_id tenant_model_instance.id JOIN on provider_id and api_key
model_type tenant_llm.model_type Direct mapping
status tenant_llm.status Direct mapping
  • Filter: Only migrates records where tenant_llm.status='0'

Usage

Command Line Arguments

python mysql_migration.py [OPTIONS]
Option Short Description
--config -c Path to YAML config file (required)
--stages -s Comma-separated list of stages to run
--list-stages -l List available stages and exit
--execute -e Execute full migration (create tables and migrate data)
--create-table-only - Only create target tables, skip data migration

Execution Modes

The script has three mutually exclusive modes:

  1. Dry-Run Mode (default): Check only, no database writes

    python mysql_migration.py --stages tenant_model_provider --config config.yaml
    
  2. Create Table Only Mode: Create target tables without migrating data

    python mysql_migration.py --stages tenant_model_provider --config config.yaml --create-table-only
    
  3. Execute Mode: Create tables and migrate data

    python mysql_migration.py --stages tenant_model_provider --config config.yaml --execute
    

Configuration File

Create a YAML configuration file with MySQL connection settings:

database:
  host: localhost
  port: 3306
  user: root
  password: your_password
  name: rag_flow

Alternative keys are also supported:

mysql:
  host: localhost
  port: 3306
  user: root
  password: your_password
  database: rag_flow

Examples

# List all available stages
python mysql_migration.py --list-stages

# Dry run single stage
python mysql_migration.py --stages tenant_model_provider --config /path/to/config.yaml

# Create tables only for multiple stages
python mysql_migration.py --stages tenant_model_provider,tenant_model_instance --config /path/to/config.yaml --create-table-only

# Execute full migration for all stages (in dependency order)
python mysql_migration.py --stages tenant_model_provider,tenant_model_instance,tenant_model --config /path/to/config.yaml --execute

Output Interpretation

Stage Execution Log

Each stage displays a header showing progress:

============================================================
Stage [1/3]: tenant_model_provider
============================================================

The stage then performs:

  1. Check phase: Verifies source/target tables exist and counts records to migrate
  2. Execute phase: Creates tables (if needed) and migrates data in batches

Dry-Run Output

In dry-run mode, the script outputs what it would do without writing:

[DRY RUN] Would insert 150 records
  instance_name=OpenAI, provider_id=abc123, api_key=***
  ... and 145 more records

Migration Summary

After all stages complete, a summary is printed:

============================================================
Migration Summary
============================================================
Total Duration: 2.45s
Total Rows Processed: 350
Tables Operated: tenant_model_provider, tenant_model_instance
------------------------------------------------------------
Stage Details:
  [tenant_model_provider] Tables: tenant_model_provider, Rows: 50, Duration: 0.82s
  [tenant_model_instance] Tables: tenant_model_instance, Rows: 300, Duration: 1.63s
============================================================

Common Messages

Message Meaning
No new data to migrate All records already exist in target table
[DRY RUN] Target table does not exist Target table missing, use --execute or --create-table-onlyto create
Dependency table does not exist Required table from previous stage missing
Inserted batch X: Y records Successfully inserted batch of records