Handling Database Migrations with Alembic

Last updated March 28th, 2024

If you’re connecting to a database in your Python application, there’s a good chance you’re using SQLAlchemy to interact with the database since it's one of the most popular ORMs (Object Relational Mapper) in the Python ecosystem. Once you’re past the "hello world" stage of building your application you’ll probably want to implement database migrations into your codebase to:

  1. Version control your database schema
  2. Be able to upgrade and downgrade changes appropriately

Alembic is the tool for doing this with SQLAlchemy.

In this article, you’ll learn more about the high-level architecture of how Alembic works, how to add it to your project, and some common workflows you’ll encounter. Some nomenclature worth mentioning early on is Alembic calls each discrete migration file a revision, which you could think as synonymous with a Git commit, so in this article the terms "migration", "revision", and "commit" can be thought of as the same.

Contents

Architecture and Mental Model

Alembic works differently than most database migration tools that you may have used. Instead of migration revision files with incrementing numbers -- e.g., v0001_init.sql -- the files are stored in a "versions" directory and are identified by a hash string -- e.g., abcd1234_init.py -- that could resemble a Git commit hash. In fact, Alembic works closer to Git with the concept of branches and commits (revisions), so it’s entirely possible to have merge conflicts, which we’ll discuss later.

However, while Git stores the state entirely within the confines of the filesystem, inside a ".git" directory, Alembic stores the state within the filesystem in the alembic/versions/*.py files and within the database itself in the alembic_version table under the version_num column. This core difference of the migration state being decoupled between the filesystem ("versions" directory) and database (alembic_version table) is important to keep in mind to mentally grok Alembic and understand workflows that may come up like merge conflicts.

An Alembic revision file looks like this:

"""create account table

Revision ID: 1975ea83b712
Revises:
Create Date: 2011-11-08 11:40:27.089406

"""

# revision identifiers, used by Alembic.
revision = '1975ea83b712'
down_revision = None
branch_labels = None

from alembic import op
import sqlalchemy as sa

def upgrade():
    pass

def downgrade():
    pass

There’s some metadata in the comment block and variables like revision and down_revision are what Alembic uses to connect the migration revisions together into a graph data structure. The upgrade() and downgrade() functions are where you define your migration logic to apply or revert the changes.

Adding Alembic to Your Project

Assuming you’re already up and running in your Python application using SQLAlchemy, adding Alembic is as simple as:

$ pip install alembic

After installing, you’ll want to initialize Alembic which scaffolds out the required files and directories:

$ alembic init .

Common Workflows and Scenarios

After using Alembic in a production application, with multiple contributors, there’s some common workflows that are worth mentioning so you’re able to handle them when you encounter them as well:

  1. Autogenerating Migrations: Modifying a database schema model and automagically creating a migration (revision) file.
  2. Pulling/Applying Latest Migrations: After creating a new migration or doing a git pull, you’ll want to apply the latest migrations locally.
  3. Rolling a Migration Back: Rolling back an Alembic commit, just like you would with Git.
  4. Resolving Merge Conflicts: Dealing with Alembic merge conflicts, particularly when working on a feature branch.

Autogenerating Migrations

When adding features or squashing bugs, it’s common to need to touch a database model schema to perhaps add a column, change a datatype, or even update some relationship. After you’ve made these changes and saved the file, you’ll want to create a migration revision to actually apply them to the database. Thankfully, Alembic makes this easy:

$ alembic revision --autogenerate -m "thing done, in imperative tone (like a Git commit message)"

Note: Alembic can automatically detect many changes, but not all (notably changes to table or column names). For more, review What does Autogenerate Detect from the official docs.

After running this command you'll be left with a new alembic/versions/*.py file which contains the logic for your discrete migration upgrade() or downgrade(). As mentioned earlier, the "revision" is a discrete migration, and you could think of it as similar to a Git commit which are linked together in sequence by their hash strings. Once you're satisfied by the contents of your revision file, you can apply the changes locally...

Pulling/Applying Latest Migrations

After making changes locally or pulling the latest changes on a Git main branch, you’ll want to apply the changes too. Essentially, the Alembic equivalent of a git pull:

$ alembic upgrade head

Rolling a Migration Back

Just like with Git commits, you may want to roll back your most recent Alembic revision in order to throw it away or simply do it differently:

$ alembic downgrade -1

-1 is a shortcut to running downgrade() from your latest revision and updating the alembic_version table to the now-latest revision. Like Git, you can view the history of all the revisions (and their IDs) and revert to a specific revision:

$ alembic history

To get a specific revision ID, and then:

$ alembic downgrade 1975ea83b712

Note: to delete a specific migration revision, you'll need to do so manually -- e.g. delete, 1975ea83b712_create_account_table.py from the "alembic/versions" directory.

Resolving Merge Conflicts

The most common scenario you may find yourself in is when working on your own feature branch and you go to pull in the latest changes from main and you find yourself in an Alembic merge conflict. Unfortunately, there’s not a convenient one-liner, but here are the steps to take:

  1. In sequence, repeat until successful:
    1. Change to your feature branch
    2. Rollback your revisions (hopefully it’s just one)
    3. Change back to main or the problem branch
    4. Try to run an alembic upgrade head
  2. Once caught up with the latest changes, apply yours with an alembic merge heads
  3. Finally, you should now be able to successfully alembic upgrade head on your branch

The merge command makes a migration file that joins 2 (or more) "head" revisions (that can be viewed with alembic history) together, which was the side effect of multiple contributors creating a new revision that both point from the same previous revision. This basic diagram should help visualize the flow:

                            -- ae1027a6axcf -->
                           /                   \
<base> --> 1975ea83b712 -->                     --> <merge revision>
                           \                   /
                            -- 27c6a30d7c24 -->

This sequence would be nicely implemented as a shell script. Further, depending on the tooling your project uses, it would vastly improve the developer experience to add command-line shortcuts -- e.g., via make -- so you can perform the basic alembic ... commands without typing them fully out by hand. That’s an exercise left up to the reader.

Conclusion

If you're used to working with the Django ORM, then you're used to similarly generated migration files whose main difference is that they're generated in sequence -- e.g., 001_init.py -- which lets you linearly follow the migration history directly from your filesystem. Alembic models this closer to Git where migration revisions themselves are linked together to form the history, but it's not easily grokked from a mere ls command on the filesystem.

If you're coming from the Flask ecosystem, there's a Flask-Migrate package which uses Alembic under the hood to provide migrations for your Flask application exposed via the existing flask CLI pattern.

All said, Alembic is powerful tool (it has many more features than discussed here) that requires a slightly different mental model to use than many other conventional migration tools.

Featured Course

Test-Driven Development with Python, Flask, and Docker

In this course, you'll learn how to set up a development environment with Docker in order to build and deploy a microservice powered by Python and Flask. You'll also apply the practices of Test-Driven Development with pytest as you develop a RESTful API.

Featured Course

Test-Driven Development with Python, Flask, and Docker

In this course, you'll learn how to set up a development environment with Docker in order to build and deploy a microservice powered by Python and Flask. You'll also apply the practices of Test-Driven Development with pytest as you develop a RESTful API.