Building a robust CQRS database with EF Core and Cosmos DB

Last Updated: November 4, 2019 | Created: February 23, 2019

Back in 2017 I wrote about a system I built using EF Core 2.0 which used Command and Query Responsibility Segregation (CQRS) pattern that combines a SQL Server write database a RavenDB NoSQL read database. The title of the article is “EF Core – Combining SQL and NoSQL databases for better performance” and it showed this combination gave excellent read-side performance.

Ever since then I have been eagerly waiting for Entity Framework Core’s (EF Core) support of Cosmos DB NoSQL database, which is in preview in EF Core 2.2. RavenDB worked well but having a NoSQL database that EF Core can use natively makes it a whole lot easier to implement, as you will see in this article.

I’m writing this using the early EF Core 2.2.0-preview3 release of the Cosmos database provider. This preview works but is slow, so I won’t be focusing on performance (I’ll write a new article on that when the proper Cosmos DB provider is out in EF Core 3). What I will focus on is providing a robust implementation which ensures that the two databases are kept in step.

UPDATE!

There is a new article called “An in-depth study of Cosmos DB and the EF Core 3.0 database provider” that looks at what happened when I updated the EF Core to 3.0 with the first non-preview Cosmos DB database provider.

The original CQRS design had a problem if the NoSQL RavenDB update failed: at that point the two databases were out of sync. That was always nagging me, and Roque L Lucero P called me out on this issue on the original article (see this set of comments on that topic). I decided to wait until EF Core support of Cosmos DB was out (that has taken longer than originally thought) and fix this problem when I did the Cosmos DB rewrite, which I have now done.

NOTE: This article comes with a repo containing all the code and a fully functional example – see https://github.com/JonPSmith/EfCoreSqlAndCosmos. You can run the code need SQL Server (localdb is fine) and the Cosmos DB emulator. It will auto-seed the database on startup. This repo has been updated to EF Core 3.0 and the old EF Core 2.2 is in a seperate brnach called NetCore2_2Version.

TR; DR; – summary

  • For an introduction to CQRS pattern read this excellent article on the Microsoft’s site.
  • I implement a two-database CQRS pattern, with the write database being a relational (SQL Server) database and the read database being a NoSQL database (Cosmos DB).
  • This type of CQRS scheme should improve the performance of read-heavy applications, but it is not useful to write-heavy applications (because the writes take longer).
  • My implementation uses EF Core 2.2 with the preview Cosmos database provider.
  • I implement the update of the NoSQL inside EF Core’s SaveChanges methods. This means a developer cannot forget to update the read-side database because its done for them by the code inside SaveChanges.
  • I use a SQL transaction to make sure that both the SQL and the NoSQL database updates are both done together. This means the two databases with always be in step.
  • This design of CQRS pattern is suitable for adding to a system later in its development to fix specific performance issues.
  • There is an example application on GitHub available to go with this article.
  • There is a new article called “An in-depth study of Cosmos DB and the EF Core 3.0 database provider” which covers the update to EF Core 3.0

Setting the scene – using caching to improve read performance

You can skip this section if you already understand caching and the CQRS pattern.

In many applications the database accesses can be a bottleneck on performance, i.e. the speed and scalability of the application. When using EF Core there are lots of things you can do to improve the performance of database accesses, and its also really easy to do things that give you terrible performance.

There are only two hard things in Computer Science: cache invalidation and naming things. – Phil Karlton.But what do you do when even the best SQL database queries are deemed “too slow”? One typical approach is to add caching to your application, which holds a copy of some data in a form that can be accessed quicker than the original source. This is very useful, but making sure your cached version is always up to date is very hard (the Phil Karton quote comes from an article on CQRS written by Mateusz Stasch)

Caching can occur in lots of places, but in this article I cover caching at the database level. At the database level the caching is normally done by building “ready to display” versions for data. This works well where the read requires data from multiple tables and/or time-consuming calculations. In my book “Entity Framework Core in Action” I use a book selling application (think super-simple Amazon) as an example because it contains some complex calculations (form the list of authors, calculate the average review stars etc.). You can see a live version of the book app at http://efcoreinaction.com/.

You can do this yourself by building a pre-calculated version of the book list display (I did it in section 13.4 of my book), but its hard work and requires lots of concurrency handling to ensure that the pre-calculated version is always updated property. But the CQRS pattern makes this easier to do because it splits the write and read operations. That makes it simpler to catch the writes and deliver the reads. See the figure taken from Microsoft’s CQRS article  (See this link for authors and Creative Commons Attribution Licence for this figure).

A further step I have taken is to have two databases – one for write and one for reads. In my case the read data store is a Azure Cosmos DB – according to Microsoft an “highly responsive, low latency, high availability, scalable” NoSQL database. The figure below gives you a high-level view of what I am going to describe.

The rest of the article describes how to build a two-database CQRS database pattern using EF Core with its new support for the Cosmos DB NoSQL database. The design also includes one way to handle the “cache invalidation” problem inherent in having the same data in two forms, hence the “robust” word in the title.

Describing the software structure

With any performance tuning you need to be clear what you are trying to achieve. In this case my tests show it gets slow as I add lots of books but buying a book (i.e. creating a customer order) is quick enough. I therefore decide to minimise the amount of development work and only apply the CQRS approach for the book list, leaving the book buying process as it was. This gives me a software structure where I have one Data Layer, but it has two part: SQL Server (orange) and Cosmos DB (purple).

I am really pleased to see that I can add a CQRS implementation only where I really need it. This has two big benefits:

  • I only need to add the complexity of CQRS where it’s needed.
  • I can add a CQRS system to an existing application to just performance tune specific areas.

Most applications I see have lots of database accesses and many of them are admin-type accesses which are needed, but their performance isn’t that important. This means I only really want to add CQRS where it’s needed, because adding CQRS is more work and more testing. I want to be smart as to where I spend my time writing code and many database accesses don’t need the performance improvements (and complexities!) that CQRS provides.

But the best part is by implementing my CQRS system inside EF Core’s SaveChanges methods I know that any existing database changes HAS to go though me code. That means if I’m adding my CQRS system to an existing project I know I can catch all the updates so my NoSQL (cache) values will be up to date.

Making my update robust – use a transaction

As well as using the new Cosmos DB database provider in EF Core I also want to fix the problem I had in my first version of this CQRS, two-database pattern. In the previous design the two database could get out of step if the write to the NoSQL database failed. If that happens then you have a real problem: the book information you are showing to your users is incorrect. That could cause problems with your customers, especially if the price on the book list is lower that the price at checkout time!

There are lots of ways to handle this problem, but I used a feature available to me because I am using a SQL database as my primary database – a SQL transaction (see previous diagram). This is fairly easy to do, but it does have some down (and up) sides. The main one is that the write of data is slower because SaveChanges only returns when both writes have finished. But there is an up side to this: it solves what is known as the “eventually consistent” problem where you do an update but when the app returns the data on your screen hasn’t updated yet.

Jimmy Bogard has an excellent series called “Life Beyond Distributed Transactions: An Apostate’s Implementation” in the 8th article in the series he talks about using a transaction in a SQL database to ensure the second update is done before exiting. Jimmy is very clear that too many people ignore these errors – as he says in his tweet “hope is not a strategy”!

Jimmy’s approach is easy to understand, but if I used his approach I would have to find and replace every update path with some special code. In EF Core I can fix that by moving the code inside the SaveChanges methods, which means all the checks and updates are done whenever I create, update or delete anything that would change the book list display. That way I, or any of my colleagues, can’t forget to do the NoSQL update.

Let’s get into the code!

The whole process is contained in the SaveChanges (sync and async) methods. Below is the sync SaveChanges code.

public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    if (_bookUpdater == null)
        //if no bookUpdater then run as normal
        return base.SaveChanges(acceptAllChangesOnSuccess);

    try
    {
        var thereAreChanges = _bookUpdater
            .FindBookChangesToProjectToNoSql(this);
        //This stops ChangeTracker being called twice
        ChangeTracker.AutoDetectChangesEnabled = false;
        if (!thereAreChanges)
            return base.SaveChanges(acceptAllChangesOnSuccess);
        return _bookUpdater
            .CallBaseSaveChangesAndNoSqlWriteInTransaction(
               this,
               () = base.SaveChanges(acceptAllChangesOnSuccess));
    }
    finally
    {
        ChangeTracker.AutoDetectChangesEnabled = true;
    }
}

There are lots of lines, many of them are to make the code run efficiently, but in the code there are two methods that manage the book list update.

  • FindBookChangesToProjectToNoSql (Lines 8 and 9). This uses EF Core’ ChangeTracker to find changes to entity classes that will affect the book list display.
  • CallBaseSaveChangesAndNoSqlWriteInTransaction (lines ?? to ??). This is only called if NoSQL writes are needed and it handles the secure update of both the SQL database and the Cosmos database.

Now we will look at the two parts – finding the changes and then saving the changes securely.

Finding the book changes

Finding the book changes is complex, but this article is really about making a robust CQRS system with Cosmos DB, so I’m only going to skip over this code and just give you a diagram of how the entity’s State is used to decide what changes should be applied to the NoSQL database.

The diagram starts with the book list that the NoSQL database is going to provide with the list of entity’s that effect that book list. The table at the bottom shows how I use the entity’s State to decide what changes need to be applied to the NoSQL database.

The basic idea is the Book’s State takes precedent, with changes in the associated relationships only causing an update to the book list. There are some subtle items, especially around soft delete, which you can see in the BookChangeInfo class.

NOTE: The actual code to work out the updates needed is quite complex, but you can see it in the accompanying example repo by starting in the SQL DbContext’s SaveChanges and follow the code. I cover how I decoded the entity State in section 14.2.4 of my book “Entity Framework Core in Action”.

The end of all this is there a series of book list changes that must be applied to the NoSQL database to make it match what the data in the SQL database. The trick is to make sure that anything will make the two databases out of step from each other,  which I cover next.

Updating the databases in a secure manner

To make sure my SQL and NoSQL databases always in step I apply the both database updates inside a SQL transaction. That means if either of the updates, SQL or NOSQL, fail then neither are applied. The code below shows how I do that.

private int RunSqlTransactionWithNoSqlWrite(
    DbContext sqlContext, callBaseSaveChanges)
{
    if (sqlContext.Database.CurrentTransaction != null)
        throw new InvalidOperationException(
            "You can't use the NoSqlBookUpdater if you are using transactions.");

    var applier = new ApplyChangeToNoSql(sqlContext, _noSqlContext);
    using (var transaction = sqlContext.Database.BeginTransaction())
    {
        var result = callBaseSaveChanges(); //Save the SQL changes
        applier.UpdateNoSql(_bookChanges);  //apply changes to NoSql database
        _noSqlContext.SaveChanges();        //And Save to NoSql database
        transaction.Commit();
        return result;
    }
}
  • Use this to sort out formatting

Using a SQL transaction is a nice way to implement this, but you must apply the NoSQL database update at the end of the transaction. That’s because Cosmos DB database provider does not support transactions (most NoSQL don’t support transactions) which means the NoSQL write cannot be rolled back (i.e. undone). That means you can’t use this approach inside another transaction, as you could do something after the NoSQL update that errored (hence the check on line 4).

Here is the sync UdateNoSql method (there is a similar Async version). I use AutoMapper’s  ProjectTo method to create the book list version needed for the display.

public bool UpdateNoSql(IImmutableList<BookChangeInfo> booksToUpdate)
{
    if (_noSqlContext == null || !booksToUpdate.Any()) return false;

    foreach (var bookToUpdate in booksToUpdate)
    {
        switch (bookToUpdate.State)
        {
            case EntityState.Deleted:
            {
                var noSqlBook = _noSqlContext.Find<BookListNoSql>
                    (bookToUpdate.BookId);
                _noSqlContext.Remove(noSqlBook);
                break;
            }
            case EntityState.Modified:
            {
                var noSqlBook = _noSqlContext.Find<BookListNoSql>
                     (bookToUpdate.BookId);
                var update = _sqlContext.Set<Book>().ProjectTo<BookListNoSql>(SqlToNoSqlMapper)
                    .Single(x =&amp;gt; x.BookId == bookToUpdate.BookId);
                SqlToNoSqlMapper.CreateMapper().Map(update, noSqlBook);
                break;
            }
            case EntityState.Added:
                var newBook = _sqlContext.Set<Book>()                    .ProjectTo<BookListNoSql>(SqlToNoSqlMapper)
                    .Single(x => x.BookId == bookToUpdate.BookId);
                _noSqlContext.Add(newBook);
                break;
            default:
                throw new ArgumentOutOfRangeException();
        }
    }

    return true;
}
  • Use this to sort out formatting

Alternative ways of make the update robust

I think using a transaction is a simple way to ensure both databases are in step, but as

I have been thinking about this and here are some ideas for you think about. Note: all of these approaches suffer with the the “eventually consistent” problem I mentioned before, i.e. the system will return to the user before the data they were looking at has been updated.

1. Send a message inside a transaction.

In this case you would fire-and-forget a message to another system via a reliable queue, like RabbitMQ or Azure Service Bus. Then it is the job of the system that gets that message to make sure the NoSQL update is repeated until it works. The SaveChanges should return more quickly because queuing the message will be quick.

This is what Jimmy Bogard does in his relational article in his series called Life Beyond Distributed Transactions: An Apostate’s Implementation. Do have a read.

2. Run a background task to fix any failures.

If you added a LastUpdated DateTime to all the SQL entities, and a similar LastUpdated in the NoSQL cached version, then you have a method to find mismatches. This means you could looks for changes since it last run and check the SQL and NoSQL version have matching LastUpdated values (Cosmos DB has a “_ts” Unix-styled timestamp that may be useful).

Either you run the method every x mins (simple, but not that good) or you catch a NoSQL error and run the method looking for updates equal to LastUpdated time of the SQL update.

3. (Advanced) Use the ChangeTracker.StateChanged Event.

There is a really nice, but not much talked about, feature called ChangeTracker.StateChanged event that happens after the SaveChange has completed. This gives you a nice solution that only kicks in for a specific update error.

Basically you could kick off an timer for every NoSQL write, which is cancelled by the NoSQL ChangeTracker.StateChanged event that occurs on a successful write (status changes to Unchanged if successful). If the timer timed out, then you know the NoSQL update failed and you could take remedial action to fix it.

This if advanced stuff needing a ConcurrentDictionary to track each write. I have thought about it, but not implemented it yet. However if a client wanted me to add a CQRS pattern with a quick writes to their application, then this is most likely what I would build.

Limitations in Cosmos support in 2.2 preview

This application was built with EF Core 2.2 and the Microsoft.EntityFrameworkCore.Cosmos package 2.2.0-preview3-35497 . This is a very early version of Cosmos DB support in EF Core with some limitations on this application. They are

  • This Cosmos DB preview is very slow! (like hundreds of milliseconds). Version 3 will use a new Cosmos DB SDK, which will be faster.
  • I would have liked the Cosmos id value to be the same as the BookId GUID string.

You can track what is happening to EF Core support for Cosmos here. I will most likely update the application when Version 3 is out and write an article on its performance.

UPDATE: See new article “An in-depth study of Cosmos DB and the EF Core 3.0 database provider” and the updated repo for the version using EF Core 3.0 and the first, non-preview Cosmos DB database provider.

Conclusion

I am very happy to see EF Core supporting NoSQL databases alongside relational databases. It gives me more flexibility in using the right database based on business needs. Also EF Core has the depth and flexibility for me to implement quite complex state management of my database writes, which I needed to implement my CQRS two-database design.

Personally, I like the CQRS two-database design because it too allows me flexibility – I can add it only to the queries that need performance tuning and its also fairly easy to add retrospectively to an application that uses EF Core and relational (SQL) databases. Most performance tuning is done late, and my design fits in with that.

The next stage is to see what performance gains I can get with EF Core version 3. In my original version of CQRS with RavenDB I got very good performance indeed. I’ll let you know how that goes when EF Core 3 is out!

Happy coding.