This is the first article in a series about using Microsoft’s data access technology, Entity Framework (EF), in an application that needs to be ‘always on’. In my case that is an ASP.NET MVC e-commerce web site which is still being developed, so there is a need to update the database schema, i.e. the tables, views, constraints etc. that define how the data is held in the database.
In this first article I deal with the issue of updating a database schema in a robust and testable way. The list of articles will be:
- Part 1 – Applying updates to a live, production database (this article).
- Part 2 – Keeping EF’s database model in line with the actual database schema.
- Part 3 – How to take an ASP.NET MVC web site offline to allow data migrations to be added.
- Part 4 – Release of EfSchemaCompare to NuGet.
- Part 5 – Documentation for EfSchemaCompare (GitHub project Readme.md file).
However, before I detail my update method I will start by describing the general problems around a database update.
Databases contain data, and that data is often very valuable to someone. In the e-commerce site I am working on, the database holds customer orders which customers have paid for and expect to be delivered. If we want the business to succeed, that data had better not be lost or corrupted.
The problem is that as the business grows we are likely to need to change the type/format of data we hold in the database, known as the database schema. A simple change, like adding a column to hold, say, a comment, is normally easy to make, as the changed table will still work with the existing software. However, more complex additions, say a new table which links to existing data, i.e. in SQL terms it has a foreign key, are a lot more difficult to do.
Updates also become especially challenging if you want the existing web site to stay ‘live’ while you change the database, i.e. the old software still works with the new database while you are updating its schema and data.
The point about having a staging slot alongside the live site is that the staging slot holds the new version of the software, which accesses the same database as the live site. When the new version checks out, the DevOps person swaps the staging and live sites. The users should see no interruption to their use of the site.
The challenge is that if the new software needs a change to the database schema, the database must be updated while the existing version of the software is still running. Also, the database must support both the old and the new software until the swap is done. That is quite a challenge.
After a lot of research I came up with a list of requirements for a system that deploys database updates. I decided the process must be:
- Testable: I can test any database change before running it on the Production database.
- Automated: I can automate the whole process so that I can’t get it wrong.
- Trackable: Each database should have a log of what has been done to its schema.
- Atomic: The update process must either complete successfully or be entirely rolled back.
- Recoverable: Each update should automatically make a backup in case the worst happens.
- EF-compatible: The database changes have to work with Entity Framework.
- Comprehensive: There must be full SQL access; I needed to change tables, views, stored procedures, constraints etc.
I am very familiar with EF’s database migration feature and I have used it in the early stages of developing an application. However, it’s this knowledge that made me decide it wasn’t sufficient for this current project. The primary reason is that it is not very testable, i.e. it is hard to run the update on a test database. A secondary reason is that I had come across problems when applying migrations to a Production system: the update runs as part of the application start-up and feedback of errors is very limited. There is a way to run EF migrations from a command line, which does improve the process, but the problem of testability still remains.
All in all I just didn’t feel comfortable with using EF’s own database migration feature, so I set out to find a better way. After a lot of searching I found a package called DbUp.
Note: I talk more about how EF matches its view of the database with the actual schema of the database in the second article.
DbUp is an open-source, .NET-based tool for running scripts on a database. DbUp is elegantly simple, and uses an “Apply These Scripts…” approach, i.e. you write a list of scripts, SQL or C# based, with a name that sets the order, e.g. Script001, Script002 etc. You can call DbUp via a Console App or directly from PowerShell. It uses a special table in the database to see which scripts have been run and runs any scripts that are new.
Here is an example of using DbUp from my application:
    var upgrader = DeployChanges.To
        .SqlDatabase(dbConnectionString)
        .WithScriptsAndCodeEmbeddedInAssembly(Assembly.GetExecutingAssembly())
        .WithTransaction()
        .LogToConsole()
        .Build();

    var result = upgrader.PerformUpgrade();

    if (result.Successful)
    etc.…

It looks simple, but there are lots of little things in DbUp that show someone has really thought the process through. I have to admit I found the documentation a little obscure in places, but that is most likely because I was thinking in an EF way. Once I got my head around it I started to appreciate the quality of DbUp.
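To make the "special table" idea more concrete, here is a minimal sketch of a console app that lists the scripts DbUp considers new before running them, and then reports failure through the exit code. It is not the code from my application: the class name, the use of the first command-line argument as the connection string, and the exit codes are assumptions made for illustration. It also uses `WithScriptsEmbeddedInAssembly` (SQL scripts only) rather than the combined scripts-and-code variant shown above.

```csharp
using System;
using System.Reflection;
using DbUp;

public static class Program
{
    public static int Main(string[] args)
    {
        // Assumption: the connection string is passed as the first argument.
        if (args.Length < 1)
        {
            Console.Error.WriteLine("Usage: <app> <connection string>");
            return 2;
        }
        var dbConnectionString = args[0];

        var upgrader = DeployChanges.To
            .SqlDatabase(dbConnectionString)
            .WithScriptsEmbeddedInAssembly(Assembly.GetExecutingAssembly())
            .WithTransaction()
            .LogToConsole()
            .Build();

        // Scripts that are embedded in this assembly but are not yet
        // recorded in DbUp's journal table in the target database.
        foreach (var script in upgrader.GetScriptsToExecute())
        {
            Console.WriteLine("Pending: " + script.Name);
        }

        var result = upgrader.PerformUpgrade();
        if (!result.Successful)
        {
            // result.Error holds the exception that stopped the upgrade.
            Console.Error.WriteLine(result.Error);
            return 1;
        }

        return 0;
    }
}
```

Returning a non-zero exit code on failure is one way to let a build or deployment script stop before the new software is swapped in.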
A snippet from one of my test T-SQL scripts looks like this:
    create table $schema$.[DataTop](
        [DataTopId] [int] IDENTITY(1,1) PRIMARY KEY,
        [MyString] [nvarchar](25) NULL,
        [DataSingletonId] [int] NULL,
    )
    go

    create table $schema$.[DataChild](
        [DataChildId] [int] IDENTITY(1,1) PRIMARY KEY,
        [MyInt] [int] NOT NULL,
        [MyString] [varchar](max) NULL,
        [DataTopId] [int] NOT NULL
    )
    Go

    ALTER TABLE $schema$.[DataChild]
        ADD CONSTRAINT [FK_dbo.DataChild_dbo.DataTop_DataTopId]
        FOREIGN KEY([DataTopId])
        REFERENCES $schema$.[DataTop] ([DataTopId])
        ON DELETE CASCADE
    GO
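The `$schema$` token in the script above is replaced by DbUp before the script is run. The following is a minimal sketch of one way to supply it, assuming the schema name is given through the `SqlDatabase(connectionString, schema)` overload and that `dbo` is the schema wanted. Both are assumptions for illustration, as the example code earlier does not show how the schema was set, and the class and method names are hypothetical.

```csharp
using System.Reflection;
using DbUp;

public static class SchemaExample
{
    // Hypothetical helper: runs the embedded scripts against the given
    // database, with "$schema$" in each script resolved to the dbo schema.
    public static bool UpgradeWithSchema(string dbConnectionString)
    {
        var upgrader = DeployChanges.To
            // Assumption: "dbo" is the target schema. As far as I understand,
            // DbUp also places its own journal table in this schema.
            .SqlDatabase(dbConnectionString, "dbo")
            .WithScriptsEmbeddedInAssembly(Assembly.GetExecutingAssembly())
            .WithTransaction()
            .LogToConsole()
            .Build();

        return upgrader.PerformUpgrade().Successful;
    }
}
```

Keeping the schema name out of the scripts means the same scripts can, in principle, be run against a test database that uses a different schema.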
A couple of pointers for people that are used to EF Migrations:
- DbUp only does forward changes, i.e. there is no DOWN script like in EF migrations. The philosophy of DbUp is that at each stage you are transitioning the database to the new state. So going back is simply another transition.
Considering that I have only used an EF DOWN script about once in four years, I don’t think this is a problem.
- There is no ‘Run Seed Every Time’ approach like EF. When you think about it, you run one seed at the start, so why would you need to run it again? In DbUp, if you want to change the data in the seed you just add a new script to update or add to the original seed.
Note: DbUp does have a way of running a script every time if you want to, see DbUp NullJournal.
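As a rough illustration of the NullJournal approach, here is a minimal sketch. It assumes the run-every-time scripts are embedded in the assembly under a folder called `RunAlways`, which is a naming convention invented for this example, as are the class and method names. Because `NullJournal` records nothing, DbUp treats every matching script as new on each run, so such scripts need to be safe to run repeatedly.

```csharp
using System.Reflection;
using DbUp;
using DbUp.Helpers;

public static class RunAlwaysExample
{
    // Hypothetical helper: runs every embedded script whose resource name
    // contains ".RunAlways." each time it is called.
    public static bool RunAlwaysScripts(string dbConnectionString)
    {
        var upgrader = DeployChanges.To
            .SqlDatabase(dbConnectionString)
            .WithScriptsEmbeddedInAssembly(
                Assembly.GetExecutingAssembly(),
                name => name.Contains(".RunAlways."))
            // NullJournal does not store which scripts have been executed,
            // so nothing is ever skipped as "already run".
            .JournalTo(new NullJournal())
            .LogToConsole()
            .Build();

        return upgrader.PerformUpgrade().Successful;
    }
}
```

This would normally be a second upgrader, run after the main one, so that the ordinary scripts are still tracked in the journal table.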
One downside of using SQL scripts to apply database changes is that, unless the project has a DBA (Database Administrator), the developer will need to learn T-SQL. While I am not a DBA, I have written quite a bit of T-SQL in my time. In fact I personally prefer T-SQL for configuring a database, as I find using EF’s fluent interface can be quite difficult in some situations.
In this first article I have described the issues I have with EF’s own database migration code and shown an alternative method of applying updates to the schema of a SQL database using DbUp. However, on its own this will not work, as EF will not know you have changed things.
In the second article I will look at a method I have created to make sure that the view of the database schema that EF has does not diverge from the actual database schema.