A frequent event when working with a SQL database is adding a column. Ideally, you’d want to add this column before or after another one that makes sense rather than all the way at the end. MySQL makes this straightforward since you can use the AFTER keyword when adding a column to specify exactly where it should be added. PostgreSQL and Redshift make this difficult since all new columns are automatically added at the end.
Normally, this isn’t a problem in most cases since you just write a query to specify the desired column order but it makes doing a simple “SELECT *” more annoying and will break naive jobs that rely on a particular column order.
The accepted solution is to create a new table with the proper structure, migrate the data from the original table while using a default value for the new column, drop or rename the original table, and then rename the new table with the original name. It’s a lot easier to see this in code:
One issue with this approach is that if the table is very large, it will take a long time to migrate the data from the original to the new table. In this case a possible approach is to do it piecemeal - if you know you only need recent data you can migrate a subset of the data first to get the table ready, do the rename, and then migrate the rest. In code again:
If you know for a fact that it’s not important to have the old data immediately available you can opt to rename the original table, create the new table, and only then migrate the data. This allows queries that need to write data to this table to complete immediately. The risk is that the legacy data will only be available after the migration completes. The last code block:
The first scenario handles having columns in the right order but the other two can be useful on MySQL as well when tables are large and performance is critical. For high load databases, transactions can be used to make sure that the renames are done atomically - this will avoid an intermittent query writing to a non existent database. It’s surprising how a task as simple as adding a column can evolve into a large problem with a variety of solutions when running into a variety of constraints.