Non-backwards compatible SQL database migration

Sometimes you need/want to make more radical changes to your SQL database schema, such as renaming a column or moving data from one table to another.

Tools like Flyway and Liquibase has simplified making backwards compatible database migrations, such as adding columns/tables. However in order to make non-backwards compatible changes “online” (i.e. while the application is up and running) for a clustered environment (multiple application instances accessing the same database) requires a little more thought.

Basically you need to make this change in 5 steps, spread out across (at least) 4 releases (assuming a release means updating the database, either separately or during application boot using a migration tool, and updating application – in that order). I’ll use renaming a database column as an example.

  1. Database: Add the new column (i.e. do not rename or drop the previous one) to the database. You may copy existing data to new column also.
    Application: Start writing the data to both the old and the new column.
  2. Database: Copy existing data to new column also. Even if you did that in the previous step, you need to do it again, since the application may have written to the old column only between the time when the database migration was executed, and the time the application was updated. You could opt to only copy the data that was written during that time (i.e. where the old column is non-null but the new column is null.) This does not have to be a separate release, but could be the migration made as part of the next application release.
  3. Application: Start reading data from the new column instead of the old column. Note that you must not stop writing data to the old column yet, since as you update one application instance (i.e. one cluster node) at a time, there can be non-updated nodes reading the old column for data written by updated nodes.
  4. Application: Stop writing to the old column.
    Database: Note that we cannot drop the old column yet, since the non-updated nodes will still be writing to it.
  5. Database: Drop the old column.