Database Migration Development Guide
This guide is designed to help developers properly manage database structure changes in the OpenJiuWen Studio project. This project uses multiple database technology stacks, including relational databases (MySQL/SQLite, managed by Alembic) and vector databases (Chroma/Milvus, automatically managed at the application layer). Whether you need to add new tables, add fields, or modify existing database structures, please follow the operation specifications in this guide.
1. Core Concepts
This project uses two types of database systems, each adopting different migration strategies:
1.1 Relational Databases (MySQL/SQLite) - Alembic Managed
Alembic is SQLAlchemy's database migration tool. It allows us to define database changes by writing Python scripts (migration scripts), thereby implementing version control for database structures.
- Revision: Each change corresponds to a version file located in
backend/upgrade/{mysql,sqlite}/alembic_{agent,ops}/versions/. - Upgrade: Apply changes to upgrade the database to a new version.
- Downgrade: Rollback changes to restore the database to an old version.
- Stamp: Tag an existing database with a version label without executing actual database change operations.
2. Version Upgrade Notes
2.1 Version Upgrade Method Changes
This project adopted different database upgrade methods during different version stages:
0.1.1 → 0.1.2 / 0.1.2 → 0.1.3
- Upgrade Method: Auto DB Sync
- Description: These versions did not yet use Alembic for database version management. The code includes automatic database synchronization functionality that automatically synchronizes newly added table fields.
- Operation: No need to execute the Alembic upgrade commands in this document. Database changes are completed automatically when the application starts.
0.1.x → 0.1.4 (and later versions) (including 0.1.1/0.1.2/0.1.3→0.1.4 and later versions)
- Upgrade Method: Alembic Migration
- Description: Starting from version 0.1.4, the project officially adopted Alembic for database version management.
- Operation: Need to use
alembic stampcommand to tag andalembic upgradecommand to upgrade the database. - Specific Process: Please refer to the detailed instructions in sections 5.1.3 and 5.1.4 of this document.
2.2 Upgrade Constraints
⚠️ Important: This upgrade process only supports smooth upgrades of same-type database components.
Before upgrading relational databases, ensure:
- The database component types of new and old versions are completely consistent
- MySQL version upgrades: Must both be MySQL, cannot cross database types
- SQLite version upgrades: Must both be SQLite, cannot cross database types
- The database type (MySQL/SQLite) in the database connection configuration remains unchanged before and after upgrades
Vector Database Upgrade Notes:
- Type Consistency: The vector database types of new and old versions must remain consistent (Chroma → Chroma, Milvus → Milvus)
- Chroma and Milvus data formats are incompatible and cannot be directly migrated to each other
- Environment variable
INDEX_MANAGER_TYPEmust remain unchanged before and after upgrades
3. How It Works
The core of Alembic's operation is based on state comparison and version tracking:
-
Version Tracking (
alembic_versiontable):- Alembic creates a special table named
alembic_versionin your database. - This table stores only one row of data: the version number (Revision ID) of the current database state.
- Each time
upgradeordowngradeis executed, Alembic updates the version number in this table, thereby establishing the "current coordinate" of the database.
- Alembic creates a special table named
-
State Comparison (Autogenerate):
- When you run
alembic -n <db_name> revision --autogenerate, Alembic does two things:- Read Model: Load the SQLAlchemy models defined in your Python code (i.e., your expected database structure).
- Read Database: Connect to the actual database and read the current table structure.
- It compares the differences between the two (for example: the model has an
agefield, but the database does not). - Based on the differences, it automatically generates migration scripts containing
op.create_table,op.add_columnand other instructions.
- When you run
-
Chain Migration:
- Each migration script internally records
down_revision(previous version number). - This forms a linked list structure:
Base -> Rev1 -> Rev2 -> ... -> Head. - Alembic follows this chain and executes scripts in sequence, thereby safely migrating the database from any old version to the latest version.
- Each migration script internally records
4. Practical Scenario Guide
4.1 Common Commands Quick Reference
Please execute the following commands in the backend directory:
| Operation | Command | Description |
|---|---|---|
| Generate Migration Script | alembic -n <db_name> revision --autogenerate -m "description" |
Automatically detect model changes and generate scripts |
| Apply Migration (Upgrade) | alembic -n <db_name> upgrade head |
Upgrade database to the latest version |
| Rollback Migration (Downgrade) | alembic -n <db_name> downgrade -1 |
Rollback the most recent migration |
| View History | alembic -n <db_name> history |
View all migration version history |
| View Current Version | alembic -n <db_name> current |
View the current version of the database |
| Stamp Version | alembic -n <db_name> stamp <version> |
Tag an existing database with a version label (without executing changes) |
4.1.1 -n Parameter Description: Multi-Database Configuration
This project uses multiple independent Alembic configurations to manage different databases. The -n parameter is used to specify the database instance to operate on.
Database Instance List
| Database Instance Name | Database Type | Purpose | Configuration File Path |
|---|---|---|---|
| alembic_sqlite_agent | SQLite | Agent Database (agents, workflows, etc.) | backend/upgrade/sqlite/alembic_agent/ |
| alembic_sqlite_ops | SQLite | Ops Database (prompt management, etc.) | backend/upgrade/sqlite/alembic_ops/ |
| alembic_mysql_agent | MySQL | Agent Database (agents, workflows, etc.) | backend/upgrade/mysql/alembic_agent/ |
| alembic_mysql_ops | MySQL | Ops Database (prompt management, etc.) | backend/upgrade/mysql/alembic_ops/ |
Why Do We Need Multiple Database Instances?
This project separates data storage into two independent databases:
- Agent Database: Stores core business data such as agents, workflows, knowledge bases, execution records
- Ops Database: Stores operational management data such as prompt templates
While supporting both SQLite and MySQL database types, 4 independent Alembic configurations are required.
Command Examples Using -n Parameter
# ✅ Correct: Specify database instance
alembic -n alembic_sqlite_agent upgrade head
alembic -n alembic_mysql_ops upgrade head
# ❌ Incorrect: Database instance not specified
alembic upgrade head
# Error message: ERROR: ConfigurationError: Multiple Configurations found
Complete Command Examples
# SQLite Agent Database: Generate migration script
alembic -n alembic_sqlite_agent revision --autogenerate -m "feat: add user profile"
# SQLite Agent Database: Apply migration
alembic -n alembic_sqlite_agent upgrade head
# MySQL Ops Database: Generate migration script
alembic -n alembic_mysql_ops revision --autogenerate -m "feat: add prompt template"
# MySQL Ops Database: Apply migration
alembic -n alembic_mysql_ops upgrade head
# View current versions of all database instances
for db in alembic_sqlite_agent alembic_sqlite_ops alembic_mysql_agent alembic_mysql_ops; do
echo "=== $db ==="
alembic -n $db current
done
How to Choose the Correct Database Instance?
- Determine Database Type: Check environment variable
DB_TYPE(sqlite or mysql) - Determine Data Purpose:
- Agents, workflows, knowledge bases related → Agent Database
- Prompt templates related → Ops Database
- Combination Selection:
- SQLite + Agent →
alembic_sqlite_agent - MySQL + Ops →
alembic_mysql_ops
- SQLite + Agent →
Important Notes:
- Each database instance has independent
alembic_versiontable and migration history - Migration version numbers between different database instances are independent and cannot be compared
- During development, you usually only need to focus on the currently used database type (SQLite or MySQL)
4.2 Scenario 1: Add New Table or Field (Create/Add)
This is the simplest scenario. Alembic's automatic detection feature (--autogenerate) can usually handle it perfectly.
Steps:
- Modify Model Code: Add new classes or fields in SQLAlchemy model files (such as
backend/app/models/). - Generate Script:
alembic -n alembic_sqlite_agent revision --autogenerate -m "add_user_age_column" - Check Script: Open the generated
versions/xxxx_add_user_age_column.pyfile and confirm that theupgrade()function contains the correctop.create_tableorop.add_columninstructions. - Apply Migration:
alembic -n alembic_sqlite_agent upgrade head
4.3 Scenario 2: Drop Field or Table (Drop)
⚠️ Note: Alembic's automatic detection feature does not detect deletion operations by default (to prevent accidental data deletion). You must handle it manually or explicitly confirm.
Steps:
- Modify Model Code: Delete the corresponding class or field from the code.
- Generate Base Script:
At this point, the generated script may be empty or not contain drop statements.alembic -n alembic_sqlite_agent revision --autogenerate -m "drop_unused_column" - Manually Edit Script: Open the generated migration file and manually add deletion instructions in the
upgrade()function:def upgrade(): # Drop the 'age' field from the 'users' table op.drop_column('users', 'age') def downgrade(): # Restore the field when downgrading (remember to add the type) op.add_column('users', sa.Column('age', sa.Integer(), nullable=True)) - Apply Migration:
alembic -n alembic_sqlite_agent upgrade head
4.4 Scenario 3: Rename Field (Rename)
⚠️ Note: Alembic cannot automatically identify renaming. It usually recognizes it as "delete old field" + "add new field". This will cause data loss! You must manually use alter_column.
Steps:
- Modify Model Code: Change the field name from
old_nametonew_name. - Generate Base Script:
alembic -n alembic_sqlite_agent revision --autogenerate -m "rename_column" - Manually Edit Script: Open the generated migration file, delete the automatically generated
drop_columnandadd_columnstatements, and replace withalter_column:SQLite Special Note: SQLite has limited support fordef upgrade(): # Rename 'old_name' to 'new_name' in the 'users' table op.alter_column('users', 'old_name', new_column_name='new_name') def downgrade(): # Rollback operation op.alter_column('users', 'new_name', new_column_name='old_name')ALTER TABLE. If you encounter an error, you may need to use batch mode (Alembic default configuration usually has this enabled, but pay attention).
4.5 Scenario 4: Write Idempotent Migration Scripts
Since MySQL's DDL operations (such as CREATE TABLE, ADD COLUMN) do not support transaction rollback, if the migration script fails midway through execution, running upgrade again will cause errors because tables or columns already exist. Therefore, it is recommended to write idempotent migration scripts.
This project provides unified helper functions to check whether tables, columns, and indexes exist:
from openjiuwen_studio.core.database.migration_utils import table_exists, column_exists, index_exists
def upgrade() -> None:
# 1. Check before creating table
if not table_exists('my_new_table'):
op.create_table('my_new_table', ...)
# 2. Check before adding column
if not column_exists('users', 'new_column'):
op.add_column('users', sa.Column('new_column', sa.String(100)))
# 3. Check before creating index
if not index_exists('users', 'idx_user_email'):
op.create_index('idx_user_email', 'users', ['email'])
Notes:
- SQLite Support: These helper functions are also applicable to SQLite migrations.
- Batch Mode: When using
op.batch_alter_tablein SQLite, you can also use these checks within thewithblock. - Template Support: Newly generated migration scripts have imported these helper functions by default.
4.6 Scenario 5: Modify Field Type or Attributes (Alter)
For example, changing String(50) to String(100), or modifying nullable attributes.
Steps:
- Modify Model Code: Update field definitions.
- Generate Script:
alembic -n alembic_sqlite_agent revision --autogenerate -m "change_column_type" - Check Script: Alembic can usually detect type changes, but it is recommended to carefully verify whether the parameters in
op.alter_columnmeet expectations.def upgrade(): op.alter_column('users', 'username', existing_type=sa.VARCHAR(length=50), type_=sa.String(length=100), existing_nullable=False) - Apply Migration:
alembic -n alembic_sqlite_agent upgrade head
5. openJiuWen Studio Backend Database Migration Development Core Steps
Important Note: All backend developers must follow the following database migration development process when developing code.
Applicable Scenarios:
- ✅ Need to modify database structure (tables, fields, indexes, etc.)
- ✅ Need to add, modify, or delete data
- ✅ The connected database version is inconsistent with the latest database version in the alembic scripts in the code
- ❌ If the current database version is consistent with the code repository and there is no database change requirement, you can skip this step
⚠️ Special Attention: If the upgrade script involves operations related to the user_message table, please do not execute any changes. This table is managed by the memory module. Please contact the personnel responsible for the memory module to handle it.
5.1 How to Sync Database Changes in Development Branches
When doing collaborative development, keeping your local database synchronized with the code repository is very important. The following process helps you correctly update your local database after pulling code.
5.1.1 Standard Sync Process
Step 1: Pull Latest Code
git pull
Step 2: Check Database Version
Since main.py will try to automatically create tables (Base.metadata.create_all) when the application starts, if you start the application directly without synchronizing the database version, it may cause table structure conflicts or overwrites. Therefore, it is recommended to manually check the database version before starting the application.
# View the version recorded by the current database
alembic -n alembic_sqlite_agent current
# View the latest available version
alembic -n alembic_sqlite_agent heads
Step 3: Sync Database Version
Operate according to the check results:
- If the version is already the latest (
currentis consistent withheads): No operation needed, proceed to the next step. - If the version is behind (
currentis behindheads):alembic -n alembic_sqlite_agent upgrade head - If no current version is displayed but tables already exist (
currentis empty, but tables already exist in the database): This usually happens when there is an existing database but Alembic has not been initialized. Please first stamp the version (Stamp), then upgrade. Please refer to the section below "6.1.3 How to Handle Existing Databases That Have Not Used Alembic Migration" for operation
Step 4: Start Application
After confirming that the database version is synchronized, start the backend service:
cd backend
python main.py
At this point, the application startup log should show consistent versions:
✅ agent database: version is already up to date
Current version: 7883f1b07bc3
Latest version: 7883f1b07bc3
5.1.2 Pre-Commit Checklist
Before committing code, please ensure:
- Local database has been updated to the latest version
- No version mismatch warnings when starting the application
- All migration scripts have been correctly generated
- Tests passed, functionality works normally
5.1.3 How to Handle Existing Databases That Have Not Used Alembic Migration
Scenario Description: When your database already has data (for example, currently version v0.1.2), but has never used Alembic for version management, special handling is required.
Problem Analysis:
- The database already contains complete table structures and data
- But lacks the
alembic_versiontable to record version information - If you run
alembic revision --autogeneratedirectly, it will generate an empty migration script because it finds the database consistent with the model
Solution: Use the alembic stamp command to tag the database with a version label.
Operation Steps:
-
Confirm Current Database Version: Check which version your database corresponds to (such as v0.1.2)
-
Stamp Database Version:
# Tag the database with the corresponding alembic version alembic -n alembic_sqlite_agent stamp <version_number>⚠️ Do not run
upgrade! Thestampcommand only records the version number to thealembic_versiontable and does not execute any SQL operations -
Verify Stamp Result:
alembic -n alembic_sqlite_agent currentShould display the just-stamped version number
-
Subsequent Development: Afterward, you can normally use
alembic -n alembic_sqlite_agent revision --autogenerateto generate new migration scripts
Example:
# Assuming your database is version v0.1.2, the corresponding revision ID is f458c7fb17a5
alembic -n alembic_sqlite_agent stamp f458c7fb17a5
# Verify stamp success
alembic -n alembic_sqlite_agent current
# Output: f458c7fb17a5
# Now you can develop normally and generate new migration scripts
alembic -n alembic_sqlite_agent revision --autogenerate -m "feat: add new column"
5.1.4 Released Version Baseline Marking After 0.1.2
The following is a reference table of key version Revision IDs for each database type, for use when manually stamping versions (Stamp).
| Database Type | Service Component | v0.1.2 Revision ID | v0.1.3 Revision ID | v0.1.4 Revision ID |
|---|---|---|---|---|
| MySQL | Agent | 54351e123cf0 |
06a1f79bce8b |
072ac1293a02 |
| MySQL | Ops | 80f110f929fc |
13377a900fe2 |
13377a900fe2 |
| SQLite | Agent | f458c7fb17a5 |
031b34b4dd30 |
8f4846812221 |
| SQLite | Ops | b4f4c6589bc5 |
f6e49cd8c97d |
f6e49cd8c97d |
The baseline for database migration scripts in the current code repository is version v0.1.2. For users upgrading from v0.1.1 to v0.1.2, since there are no database structure changes, users need to manually stamp their database with a v0.1.2 label, otherwise running alembic -n <db_name> upgrade head directly will fail because it tries to create existing tables.
Solution: Use the alembic stamp command to stamp to a v0.1.2 baseline version number.
MySQL Users:
# Stamp agent database to v0.1.2 baseline version
alembic -n alembic_mysql_agent stamp 54351e123cf0
# Stamp ops database to v0.1.2 baseline version
alembic -n alembic_mysql_ops stamp 13377a900fe2
SQLite Users:
# Stamp agent database to v0.1.2 baseline version
alembic -n alembic_sqlite_agent stamp f458c7fb17a5
# Stamp ops database to v0.1.2 baseline version
alembic -n alembic_sqlite_ops stamp b4f4c6589bc5
5.1.5 Execute Incremental Upgrade
After stamping is complete, Alembic knows that the current database version is v0.1.2. Now running upgrade head will only execute all incremental migrations between the baseline version and the latest version.
5.2 Standard Migration Process
The standard migration process is divided into three steps:
Step 1: Modify Model Code
- Agent Library: Modify model files in the
backend/openjiuwen_studio/models/directory. - Ops Library: Modify the
backend/openjiuwen_studio/ops/modules/prompt/infra/repositories/orm_repo.pyfile.
Make the changes you need, such as adding a new field or a new table.
Step 2: Generate Migration Scripts
After saving the code, run the revision command in the backend directory. Alembic will compare your models with the current state of the database and automatically generate a migration script.
Note: If you modify the model structure, you need to generate migration scripts for both MySQL and SQLite. You can use the following two methods:
Method 1: Use Unified Generation Script (Recommended)
Use the generate_migration.py script to generate migration scripts for all databases simultaneously:
# Automatically generate migration scripts for all databases
python generate_migration.py --autogenerate -m "feat: add user profile table"
Advantages:
- One command generates migration scripts for all 4 databases
- Automatically injects correct DB_TYPE environment variables
- Avoids missing any database
For detailed usage, please refer to the GENERATE_MIGRATION_USAGE.md document.
Method 2: Generate Separately (For Specific Databases)
If you only need to generate migrations for a specific database, you can manually execute:
-
sqlite Agent Database:
alembic -n alembic_sqlite_agent revision --autogenerate -m "feat: add user profile table" -
sqlite Ops Database:
alembic -n alembic_sqlite_ops revision --autogenerate -m "feat: add ip_address to audit log" -
mysql Agent Database:
alembic -n alembic_mysql_agent revision --autogenerate -m "feat: add user profile table" -
mysql Ops Database:
alembic -n alembic_mysql_ops revision --autogenerate -m "feat: add ip_address to audit log"
Description Information Specification
Naming Convention: Description information must start with fix: or feat:
| Prefix | Meaning | Use Cases |
|---|---|---|
fix: |
Fix bugs or issues | Fix database structure defects, rollback erroneous changes, etc. |
feat: |
Add new features or characteristics | Add new tables, new fields, new indexes, etc. |
Examples:
# ✅ Recommended: Follow naming specification
python generate_migration.py --autogenerate -m "feat: add user profile table"
python generate_migration.py --autogenerate -m "fix: correct user email field type"
# ❌ Not Recommended: Missing prefix or incorrect format
python generate_migration.py --autogenerate -m "add user profile table"
python generate_migration.py --autogenerate -m "update"
Hint: The description information after the
-mparameter is crucial. It will become part of the version history and help the team understand the purpose of each change.
Step 3: Apply Migration
The generated script is just a plan. Run the upgrade head command to execute this plan and apply changes to the database.
-
sqlite Agent Database:
alembic -n alembic_sqlite_agent upgrade head -
sqlite Ops Database:
alembic -n alembic_sqlite_ops upgrade head -
mysql Agent Database:
alembic -n alembic_mysql_agent upgrade head -
mysql Ops Database:
alembic -n alembic_mysql_ops upgrade head
6. How to Collaborate as a Team
When multiple developers perform database structure changes simultaneously, conflicts may arise. Following the following process can effectively avoid and resolve these issues.
Golden Process
-
Pull Latest Code: Before starting any model modifications, first
git pullto get the latest code, including migration scripts that teammates may have already committed. -
Update Local Database:
- Start application to check version:
python main.py - View version detection log to confirm if update is needed
- update needed, run:
alembic -n <db_name> upgrade head - Ensure local database is updated to the latest version before proceeding with development
- Start application to check version:
-
Make Your Changes: Now, modify your model code on the latest database structure.
-
Generate Your Migration Scripts:
- Recommended: Use
python generate_migration.py --autogenerate -m "feat: xxx" - Or manual: Run
alembic -n <db_name> revision --autogenerate ... - Check that the generated migration scripts meet expectations
- Recommended: Use
-
Apply and Test Migration: Execute
alembic -n <db_name> upgrade headlocally to test if the migration succeeds. -
Commit Code: Commit your model code and newly generated migration scripts together.
Resolve Conflicts (Merge)
If you forget step 2, you may encounter migration branch conflicts.
Scenario: You and your teammate both created respective migration scripts B1 and B2 based on version A. After git pull, Alembic will detect two parallel "heads".
Solution:
- First, upgrade the database to the head of one branch, for example
alembic -n <db_name> upgrade B1. - Then, run
alembic -n <db_name> merge heads -m "Merge parallel migrations B1 and B2". - This will create a new merge migration file
Cthat merges the two branches together.Note: The original migration files
B1andB2will not be deleted. They still exist in theversionsdirectory. Themergeoperation only creates a new nodeCthat points to bothB1andB2in terms of dependencies, thereby reconnecting the two forked paths. - Finally, upgrade the database to this new merge head:
alembic -n <db_name> upgrade head.
Advanced: What if two branches modified the same field?
- Manual Review: Before running
merge, be sure to check the contents of B1 and B2. - Manual Fix:
- Recommended (Rebase Strategy):
If B1 and B2 conflict seriously (for example, one renamed a field, one deleted a field), the best approach is to abandon one of the migration scripts.
- Delete the migration script for the B2 branch.
- Rebase the B2 branch based on the latest main (including B1).
- Run
alembic revision --autogenerateagain to generate a new migration script based on B1.
- Alternative (Manual Edit):
If it's just simple attribute modifications, you can manually edit the
upgrade()function to ensure the logic order is correct.
- Recommended (Rebase Strategy):
If B1 and B2 conflict seriously (for example, one renamed a field, one deleted a field), the best approach is to abandon one of the migration scripts.
7. Best Practices and Notes
- Never Modify Database Directly: Strictly prohibit using tools such as Navicat, DBeaver to directly modify table structures. This will cause the database state to be inconsistent with the Alembic version history, leading to subsequent migration failures.
- Maintain Atomicity: Each migration should preferably contain only related changes. Do not mix "add new feature table" and "modify old table fields" operations in one migration script.
- Test Before Commit: Before committing code, be sure to execute
upgradeanddowngradetests locally to ensure that migration scripts can both upgrade forward and rollback backward. - Team Collaboration:
- After pulling others' code, execute
alembic -n <db_name> upgrade headimmediately. - If you encounter version conflicts (multiple heads), you need to manually merge version history (
alembic -n <db_name> merge) or regenerate migration scripts.
- After pulling others' code, execute
8. Reference Documentation
9. Frequently Asked Questions (FAQ)
9.1 Error when executing alembic command: file has no '[alembic]' section
Error Message:
ERROR: ConfigurationError: File has no '[alembic]' section which is absent from the ./backend/ini file
Problem Cause:
This error is caused by not specifying the -n parameter in the migration command. The current code repository uses multiple independent Alembic configurations to manage different databases, and you need to specify the specific database name through the -n parameter.
Solution:
Add the -n <db_name> parameter to all alembic commands to specify the specific database name.
Correct Command Examples:
# SQLite Agent Database
alembic -n alembic_sqlite_agent upgrade head
alembic -n alembic_sqlite_agent revision --autogenerate -m "feat: add new column"
alembic -n alembic_sqlite_agent current
# SQLite Ops Database
alembic -n alembic_sqlite_ops upgrade head
alembic -n alembic_sqlite_ops revision --autogenerate -m "feat: add new column"
# MySQL Agent Database
alembic -n alembic_mysql_agent upgrade head
alembic -n alembic_mysql_agent revision --autogenerate -m "feat: add new column"
# MySQL Ops Database
alembic -n alembic_mysql_ops upgrade head
alembic -n alembic_mysql_ops revision --autogenerate -m "feat: add new column"
Incorrect Command Examples:
# ❌ Error: Missing -n parameter
alembic upgrade head
alembic revision --autogenerate -m "feat: add new column"
How to Choose the Correct Database Name:
Please refer to the 4.1.1 -n Parameter Description: Multi-Database Configuration section in this document to understand how to choose the correct database instance name.
Available Database Names:
alembic_sqlite_agent- SQLite Agent Databasealembic_sqlite_ops- SQLite Ops Databasealembic_mysql_agent- MySQL Agent Databasealembic_mysql_ops- MySQL Ops Database