If you’re connecting to a database in your Python application, there’s a good chance you’re using SQLAlchemy to interact with the database since it's one of the most popular ORMs (Object Relational Mapper) in the Python ecosystem. Once you’re past the "hello world" stage of building your application you’ll probably want to implement database migrations into your codebase to:
- Version control your database schema
- Be able to upgrade and downgrade changes appropriately
Alembic is the tool for doing this with SQLAlchemy.
In this article, you’ll learn more about the high-level architecture of how Alembic works, how to add it to your project, and some common workflows you’ll encounter. Some nomenclature worth mentioning early on is Alembic calls each discrete migration file a revision, which you could think as synonymous with a Git commit, so in this article the terms "migration", "revision", and "commit" can be thought of as the same.
Contents
Architecture and Mental Model
Alembic works differently than most database migration tools that you may have used. Instead of migration revision files with incrementing numbers -- e.g., v0001_init.sql -- the files are stored in a "versions" directory and are identified by a hash string -- e.g., abcd1234_init.py -- that could resemble a Git commit hash. In fact, Alembic works closer to Git with the concept of branches and commits (revisions), so it’s entirely possible to have merge conflicts, which we’ll discuss later.
However, while Git stores the state entirely within the confines of the filesystem, inside a ".git" directory, Alembic stores the state within the filesystem in the alembic/versions/*.py files and within the database itself in the alembic_version
table under the version_num
column. This core difference of the migration state being decoupled between the filesystem ("versions" directory) and database (alembic_version
table) is important to keep in mind to mentally grok Alembic and understand workflows that may come up like merge conflicts.
An Alembic revision file looks like this:
"""create account table
Revision ID: 1975ea83b712
Revises:
Create Date: 2011-11-08 11:40:27.089406
"""
# revision identifiers, used by Alembic.
revision = '1975ea83b712'
down_revision = None
branch_labels = None
from alembic import op
import sqlalchemy as sa
def upgrade():
pass
def downgrade():
pass
There’s some metadata in the comment block and variables like revision
and down_revision
are what Alembic uses to connect the migration revisions together into a graph data structure. The upgrade()
and downgrade()
functions are where you define your migration logic to apply or revert the changes.
Adding Alembic to Your Project
Assuming you’re already up and running in your Python application using SQLAlchemy, adding Alembic is as simple as:
$ pip install alembic
After installing, you’ll want to initialize Alembic which scaffolds out the required files and directories:
$ alembic init .
Common Workflows and Scenarios
After using Alembic in a production application, with multiple contributors, there’s some common workflows that are worth mentioning so you’re able to handle them when you encounter them as well:
- Autogenerating Migrations: Modifying a database schema model and automagically creating a migration (revision) file.
- Pulling/Applying Latest Migrations: After creating a new migration or doing a
git pull
, you’ll want to apply the latest migrations locally. - Rolling a Migration Back: Rolling back an Alembic commit, just like you would with Git.
- Resolving Merge Conflicts: Dealing with Alembic merge conflicts, particularly when working on a feature branch.
Autogenerating Migrations
When adding features or squashing bugs, it’s common to need to touch a database model schema to perhaps add a column, change a datatype, or even update some relationship. After you’ve made these changes and saved the file, you’ll want to create a migration revision to actually apply them to the database. Thankfully, Alembic makes this easy:
$ alembic revision --autogenerate -m "thing done, in imperative tone (like a Git commit message)"
Note: Alembic can automatically detect many changes, but not all (notably changes to table or column names). For more, review What does Autogenerate Detect from the official docs.
After running this command you'll be left with a new alembic/versions/*.py file which contains the logic for your discrete migration upgrade()
or downgrade()
. As mentioned earlier, the "revision" is a discrete migration, and you could think of it as similar to a Git commit which are linked together in sequence by their hash strings. Once you're satisfied by the contents of your revision file, you can apply the changes locally...
Pulling/Applying Latest Migrations
After making changes locally or pulling the latest changes on a Git main
branch, you’ll want to apply the changes too. Essentially, the Alembic equivalent of a git pull
:
$ alembic upgrade head
Rolling a Migration Back
Just like with Git commits, you may want to roll back your most recent Alembic revision in order to throw it away or simply do it differently:
$ alembic downgrade -1
-1
is a shortcut to running downgrade()
from your latest revision and updating the alembic_version
table to the now-latest revision. Like Git, you can view the history of all the revisions (and their IDs) and revert to a specific revision:
$ alembic history
To get a specific revision ID, and then:
$ alembic downgrade 1975ea83b712
Note: to delete a specific migration revision, you'll need to do so manually -- e.g. delete,
1975ea83b712_create_account_table.py
from the "alembic/versions" directory.
Resolving Merge Conflicts
The most common scenario you may find yourself in is when working on your own feature branch and you go to pull in the latest changes from main
and you find yourself in an Alembic merge conflict. Unfortunately, there’s not a convenient one-liner, but here are the steps to take:
- In sequence, repeat until successful:
- Change to your feature branch
- Rollback your revisions (hopefully it’s just one)
- Change back to
main
or the problem branch - Try to run an
alembic upgrade head
- Once caught up with the latest changes, apply yours with an
alembic merge heads
- Finally, you should now be able to successfully
alembic upgrade head
on your branch
The merge
command makes a migration file that joins 2 (or more) "head" revisions (that can be viewed with alembic history
) together, which was the side effect of multiple contributors creating a new revision that both point from the same previous revision. This basic diagram should help visualize the flow:
-- ae1027a6axcf -->
/ \
<base> --> 1975ea83b712 --> --> <merge revision>
\ /
-- 27c6a30d7c24 -->
This sequence would be nicely implemented as a shell script. Further, depending on the tooling your project uses, it would vastly improve the developer experience to add command-line shortcuts -- e.g., via
make
-- so you can perform the basicalembic ...
commands without typing them fully out by hand. That’s an exercise left up to the reader.
Conclusion
If you're used to working with the Django ORM, then you're used to similarly generated migration files whose main difference is that they're generated in sequence -- e.g., 001_init.py -- which lets you linearly follow the migration history directly from your filesystem. Alembic models this closer to Git where migration revisions themselves are linked together to form the history, but it's not easily grokked from a mere ls
command on the filesystem.
If you're coming from the Flask ecosystem, there's a Flask-Migrate package which uses Alembic under the hood to provide migrations for your Flask application exposed via the existing flask
CLI pattern.
All said, Alembic is powerful tool (it has many more features than discussed here) that requires a slightly different mental model to use than many other conventional migration tools.