This is the first article in a series about using Microsoft’s data access technology, Entity Framework (EF) in an application that to be ‘always on’, in my case an ASP.NET MVC e-commerce web site, yet is still being developed so there is a need to update the database schema, i.e. the tables, views, constraints etc. that define how the data is held in the database.
This is a series, and in this first article I deal with the issue of updating database schema in a robust and testable way. The list of articles will be:
- Part 1 – Applying updates to a live, production database (this article).
- Part 2 – Keeping EF’s database model in line with the actual database schema.
- Part 3 – How to take a ASP.NET MVC web site offline to allow data migrations to be added.
- Part 4 – Release of EfSchemaCompare to Nuget.
- Part 5 – Documentation for EfSchemaCompare (GitHub project Readme.md file)
However before detail my update method I will start by describing the general problems around a database update.
Why production database updates are so challenging
Databases contain date, and that data is often very valuable to someone. In the e-commerce site I am working on the database holds customer orders which they have paid for and they expect them to be delivered. If we want the business to succeed that data better not be lost or corrupted.
The problem is that as the business grows we are likely to need to change the type/format of data we hold in the database, known as the database schema. Simple changes like adding a column to hold say a comment is normally easy to add, as the new table will work with the existing software. However more complex additions of say a new table which links to existing data, i.e. in SQL terms it has a foreign key, are a lot more difficult to do.
Updates also become especially challenging if you want the existing web site to stay ‘live’ while you change the database, i.e. the old software still works with the new database while you are updating its schema and data.
In my case the application is an e-commerce web site using ASP.NET MVC with EF and is running on Microsoft Azure. In the diagram below we see a typical ‘live’ deployment via Azure’s staging slot.
The point about this arrangement is that the staging slot holds the new version of the software, which is accessing the same database as the live site. When the new version checks out the DevOps person swaps the staging-live site. The users should see no interruption to their use of the site.
The challenge is if the new software needs a change to the database schema the database must be updated while the existing version of the software is still running. Also the database must support both the old and the new software until the swap is done. That is quite a challenge.
Note: You can read more about the challenges of deploying and updating databases, and the Simple-Talk web site has a whole section on Database Lifecycle Management.
My requirements for the database deployment process
After a lot of research I came up with a list of requirements a system that deployed a database. I decided the process must be:
- Testable: I can test any database change before running it on the Production database.
- Automated: I can automate the whole process so that I can’t get it wrong.
- Trackable: Each database should have a log of what has been done to its schema.
- Atomic: The update process must either be completed successful or entirely rolled-back.
- Recoverable: Each update should automatically make a backup in case the worst happens.
- EF-compatible: The database changes have to work with Entity Framework.
- Comprehensive: There must be Full SQL access; I needed to change tables, views, stored procedures, constraints etc.
Why I decided EF’s standard migration feature was not sufficient
I am very familiar with EF’ database migration feature and I have used it in the early stages of developing an application. However, it’s this knowledge that made me decide it wasn’t sufficient for this current project. The primary reason is because it is not very testable, i.e. it is hard to run the update on a test database. A secondary reason is that I had come across problems when applying migrations to a Production system – the update runs as part of the application start-up and feedback of errors is very limited. There is a way to run EF migrations from a command line, which does improve the process, but the problem of testability still remains.
All in all I just didn’t feel comfortable with using EF’s own database migration feature, so I set out to find a better way. After a lot of searching I found a package called DbUp.
Note: I talk more about how EF matches its view of the database with the actual schema of the database in the second article.
Database Migrations the DbUp way
DbUp is an open-source, .NET-based tool for running scripts on a database. DbUp is elegantly simple, and uses an “Apply These Scripts…” approach, i.e. you write a list of scripts, SQL or C# based, with a name that sets the order, e.g. Script001, Script002 etc. You can call DbUp via a Console App or directly from PowerShell. It uses a special table in the database to see which scripts have been run and runs any scripts that are new.
Here is an example of using DbUp from my application:
var upgrader = DeployChanges.To .SqlDatabase(dbConnectionString) .WithScriptsAndCodeEmbeddedInAssembly(Assembly.GetExecutingAssembly()) .WithTransaction() .LogToConsole() .Build(); var result = upgrader.PerformUpgrade(); if (result.Successful) etc.…
Looks simple, but there are lots of little things in DbUp that shows someone has really thought the process through. I have to admit I found the documentation a little obscure in places, but that is because I was most likely thinking in an EF way. Once I got me head around it I started to appreciate the quality of DbUp.
A snippet from one of my test T-SQL scripts looks like this:
create table $schema$.[DataTop]( [DataTopId] [int] IDENTITY(1,1) PRIMARY KEY, [MyString] [nvarchar](25) NULL, [DataSingletonId] [int] NULL, ) go create table $schema$.[DataChild]( [DataChildId] [int] IDENTITY(1,1)PRIMARY KEY, [MyInt] [int] NOT NULL, [MyString] [varchar](max) NULL, [DataTopId] [int] NOT NULL ) Go ALTER TABLE $schema$.[DataChild] ADD CONSTRAINT [FK_dbo.DataChild_dbo.DataTop_DataTopId] FOREIGN KEY([DataTopId]) REFERENCES $schema$.[DataTop] ([DataTopId]) ON DELETE CASCADE GO
A couple of pointers for people that are used to EF Migrations:
- DbUp only does forward changes, i.e. there is no DOWN script like in EF migrations. The philosophy of DbUp is that at each stage you are transitioning the database to the new state. So going back is simply another transition.
Considering that I have only used an EF DOWN script about once in four years then I don’t think this is a problem.
- There is no ‘Run Seed Every Time’ approach like EF. When you think about it you run one seed at the start why do you need to run it again? In DbUp if you want to change the data in the seed you just have a new script to update or add to the original seed.
Note: DbUp does have a way of running a script every time if you what to, see DbUp NullJournal.
One down side of using SQL scripts to apply database changes is that, unless the project has a DBA (Database Administrator), the developer will need to learn T-SQL. While I am not a DBA I have written quite a bit of T-SQL in my time. In fact I personally prefer T-SQL for configuring a database as I find using EF’s fluent interface can be quite difficult in some situations.
In this first article I have described the issue I have with EF’s own database migration code and shown and alternative method of applying updates to the schema of a SQL database using DbUp. However, on its own this will not work, as EF will not know you have changed things.
In the second article I will look at a method I have created to make sure that the view of the database schema that EF has does not diverge from the actual database schema.
“The primary reason is because it is not very testable”
I do not understand what you mean by this. You mention EF Migrations is not testable and when you explain about DbUp, you fail to explain how that is more testable than EF Migrations. Please could you elaborate?
What is your criteria for something to be considered “testable”?
“A secondary reason is that I had come across problems when applying migrations to a Production system – the update runs as part of the application start-up and feedback of errors is very limited.”
Microsoft has this knack for making the “out of the box” way very R.A.D.
I would recommend the “migrate.exe” approach on your migration assemblies (separate from your assemblies used to host the business application). Although I would admit that not having the “script” operation on migrate.exe is a bit disappointing but then you don’t have the Seed support, you would need to insert your seed data in the migration itself (how else would you do it?).
“DbUp is elegantly simple, and uses an “Apply These Scripts…” approach, i.e. you write a list of scripts, SQL or C# based, with a name that sets the order, e.g. Script001, Script002 etc. You can call DbUp via a Console App or directly from PowerShell. It uses a special table in the database to see which scripts have been run and runs any scripts that are new.”
I fail to see how this is different from EF Migrations except that you now have to manually craft the SQL migration scripts and that you cannot “DbDown” (so to speak).
It sounds like you find EF Migrations works well for you, which is great. I had had some problems with EF Migrations in the past so when I was building a 24/7 e-commerce site I wanted to find a more robust way to handle migrations. Let me give you my view on your points.
In the past I used EF Migrations all the time. I had a web application which was in development and when I applied a migration locally it worked, but when I applied it, via EF Migrations to a test hosted application it failed (this was before Migrate.exe was around). The failure was because the data on the hosted test site was different from my local test site, and EF spotted that the migration would lose data. The problem was it “failed silently”, i.e. you get no feedback.
Also my research shows that some database changes can be very complex, say splitting an existing table into two tables, and EF Migrations can’t handle that. But I can write scripts to do that, which is where DbUp wins.
Now, Migrate.exe is a good step forward and does help on testability, but I cannot easily include that in my Production deployment Unit Tests. I create a DbUp application I can run via Powershell, but because its code I can also run it from my Unit Tests. I find I run these tests either by hand when I know there is a migration, or as part of a production deployment. I am using this technique on a project I am contracting on right now and its working well.
DbUp does take more work to write the SQL and it isn’t for everyone, but I find it gives me much more control. I can add Computed Columns, Store Procs etc. with ease. Also migrations that need data manipulation are possible in DbUp. DbUp also allows me to wipe and recreate a database very quickly – DbUp is sub-second on applying 10 scripts creating 25+ tables and some seed data, whereas EF Migrations would take tens of seconds. This means I can wipe/recreate database really quickly, which is a boon for testing. It also means I don’t really need a “down” feature as I can simply correct the script with another script, or if in development where lots of ‘big’ changes happen, then I can wipe/recreate the database. (I never used EF Migrations Down feature).
Finally I built a package called EfSchemaCompare.EF6 which I find very useful. This checks the database against EF’s internal model of what the database is like and lets me know if they differ. This is a standard part of my Unit Tests. This means I can build my migration script, apply it to a test database, and then check that it matches what EF would have built if I have use migrations.
So, I’m happy that you find EF Migrations works for you. I decided that there were situations where EF Migrations could cause me issues and this was my solution. Yes it is more work, but I was paranoid about migrations on a 24/7 e-commerce site where I could lose a customer order.
Thanks for elaborating a bit. I do agree that EF Migrations might be lacking some essential things that can give it that nicely needed “oomf” and I have to admit that in the past, I had to resort to writing methods that would be used inside the migration classes to write out custom SQL scripts when the migration was executed.
You still have that “control” as you describe, just with some effort. I would love if they had the script output operation in Migrate.exe instead of just form Visual Studio. I had to write my own version of “Migrate.exe” that would essentially take the internal components and use it to output to SQL with all the migrations (from all the different projects that contained Migrations) merged into a wonderful chronologically ordered SQL script. Took some effort and time but it was definitely possible.
However, I think the most important thing is that one uses a Migration-strategy (like EF Migrations and DbUp) which is far more effective than just hand-coding a migration script at the end of each release.
I recall before the days of EF Migrations (well before I knew about it, LOL!) that I could spend days writing complex SQL scripts to sort out the schemas and data for the next release. When I came into contact with EF Migrations, I just loved it and never looked back.
I like the sound of your Migrate.exe – is that available as a NuGet package?
Alos you might like to look at this article by Max Vasilyev. I found his insights useful.
Currently working on EF Core, which has some impressive improvements, and some things missing or not quite working yet. EF Core is going to be really good though.