A library to run your business logic when using Entity Framework Core

Last Updated: February 9, 2018 | Created: January 19, 2018

This article describes a library to help write and run business logic in a .NET Core application where you are using Entity Framework Core (EF Core) library for database accesses. This follows on from my article “Architecture of Business Layer working with Entity Framework (Core and v6) – revisited”, where I describe my pattern for building business logic and showed something called a BizRunner. This article details a library called EfCore.GenericBizRunner that provides a much more comprehensive solution to the BizRunner described in that first article.

The EfCore.GenericBizRunner library is available as a NuGet package, and its source, with examples, is available on GitHub at https://github.com/JonPSmith/EfCore.GenericBizRunner. The project is an open-source (MIT licence).

The aims of this article

  • To tell you why the GenericBizRunner library/framework is useful.
  • To show you where you might use the GenericBizRunner library in your application.
  • To show you how to use the GenericBizRunner library via two examples in an ASP.NET Core application.
  • To provide you with links to the GitHub repo GenericBizRunner , which contains the code of the library, an example an ASP.NET Core web application you can run locally to try the business logic code out, and its README file and the documentation in the project’s Wiki.

NOTE: I refer to the GenericBizRunner package as a “library”, because you can download it and install it. However, according to Neal Ford et al, technically the GenericBizRunner package is a framework because it calls your code, while a library is called by your code. I therefore use the term “framework” in contexts where that term is helpful.

I’m going assume you are familiar with .NET, EF Core and ASP.NET Core, and this article only worth reading if you are using Entity Framework Core (EF Core) library for database accesses in your business logic.

NOTE: This is a long article covering all the library’s capabilities, so if you just want to see an example of what it can do then I suggest to go straight away to Example 1. Or you can look at the Quick Start guide in the GitHub Wiki.

Setting the scene

According to Eric Evan’s, of Domain-Drive Design fame, “The heart of software is its ability to solve domain(business)-related problems for its users”. He also goes on to say “When the domain(business problem) is complex, this is a difficult task, calling for the concentrated effort of talented and skilled people”. I totally agree and I take business logic very seriously, which is why I have built a framework to help manage and run my business logic.

Over the years I have built some quite complex business logic and I have learnt that writing business logic is hard work. During those years I have progressively improved my business logic pattern to its current form, which I describe in the article “Architecture of Business Layer working with Entity Framework (Core and v6) – revisited” and in chapter 4 of my book “Entity Framework Core in Action(use my discount code fccefcoresmith to get 37% off my book).

This article covers the next stage, where I have taken the common parts of my pattern and made them into a library. The diagram below describes a layered architecture, with the EfCore.GenericBizRunner framework acting as a  Command, Mediator and Adapter software pattern.

If you aren’t familiar with that sort of layered architecture I suggest you read the article “Architecture of Business Layer working with Entity Framework (Core and v6) – revisited” where I describe how and why I split up my code in this way.

What does the library/framework do?

As you can see by the diagram above, I try to isolate the business logic from the other layers. The EfCore.GenericBizRunner framework helps this by providing much of the code you need to isolate the business logic, yet easily use it in your presentation layer. The key elements of the framework are:

  • Better Dependency Injection: It allows the business logic to be used as a service that can be injected into a ASP.NET Core action using dependency injection (DI).
  • Templates: It provides a templates for the business logic which include error handling and the definition on the particular in/out, sync/async etc. options the business logic needs.
  • Anti-corruption layer: It provides isolation of the business logic from the presentation, via mapping DTOs (data transfer objects, also known and ViewModels in ASP.NET).
  • Controls database commits: It manages the call to EF Core’s SaveChanges method, with optional validation, so that the business logic doesn’t have to handle the database or any errors it produces.

The diagram below shows the GenericBizRunner framework being used to run the business logic for handling a customer’s e-commerce order, which I describe in Example 1. The GenericBizRunner framework acts as a ServiceLayer (see previous diagram) to isolate, adapt and control the business logic. (Click figure to see a bigger version).

I should also say that the EfCore.GenericBizRunner framework is a rewrite of a private library I built for EF6.x back in 2015. The EF6.x framework was private because it was quite complex and hard to explain, but I have used it in several projects. My experience from using the original framework has helped me to create a better GenericBizRunner library, and I have been able to take advantage of the improvements in ASP.NET Core and EF Core. The EfCore.GenericBizRunner framework is still complex inside, but is easier to use that the original EF6.x framework.

There are many options in the library so I am going to explain what it does with two example pieces of business logic that are implemented in an ASP.NET Core application in the GitHub repo.

Example 1: Basic call of business logic

In my first example I want to show you all the steps in building your business logic and then calling it. To help me with this I am going to look at the business logic for placing an order in an e-commerce site that sells books. Order processing is typically quite complex, with requests to external systems. My example business logic isn’t that complicated as I am only adding the order to the database, but shows the how the GenericBizRunner works. Here is a screenshot of the checkout page.

When the Purchase button is pressed then the ASP.NET Core action PlaceOrder is called.  This action method obtains a BizRunner service instance, linked to my PlaceOrderAction business class, via the injection of a the GenericBizRunner library’s IActionService interface. Let’s look at the ASP.NET Core Action called PlaceOrder.

public IActionResult PlaceOrder(PlaceOrderInDto dto, 
    [FromServices]IActionService<IPlaceOrderAction> service)
    if (!ModelState.IsValid)
        //model errors so return to checkout page, showing the basket
        return View("Index", FormCheckoutDtoFromCookie(HttpContext));

    //This runs my business logic using the service injected into the Action's service parameter
    var order = service.RunBizAction<Order>(dto);

    if (!service.Status.HasErrors)
        //If successful I need to clear the line items in the basket cookie
        //Then I show the order to confirm that it was placed successfully
        return RedirectToAction("ConfirmOrder", "Orders", 
            new { order.OrderId , Message = "Your order is confirmed" });

    //Otherwise errors, so I need to redisplay the basket from the cookie
    var checkoutDto = FormCheckoutDtoFromCookie(HttpContext);      
    //This copies the errors to the ModelState
    service.Status.CopyErrorsToModelState(ModelState, checkoutDto);
    return View("Index", checkoutDto);

Some of the key lines are:

  • Line 4: The [FromService] attribute tells ASP.NET Core to use dependency injection to resolve the interface I have given. This creates what I call a BizRunner, with the business class instance also injected into that BizRunner via the IPlaceOrderAction interface generic part.
  • Line 13. This is where I use the BizRunner service to execute the business logic. The order processing business logic takes in a class called PlaceOrderDto (defined in the BizLogic layer), and outputs an Order class, which is the database entity class, i.e. it holds the information written to the database.
  • Line 15: The BizRunner feeds back any errors that the business logic raises, or any validation errors found when writing to the database. You can see me checking whether I have errors or not, and taking the appropriate actions.
  • Line 27: You may notice the extension method called CopyErrorsToModelState, which I wrote and can be found here. This takes the errors returned by the business logic, which is as a collection of ValidationResults, and copies these errors into the ModelState so ASP.NET can display this back to the user, hopefully against the actual field that caused the problem. If you were using a WebAPI you would need a method to return these errors to the calling application.

What business logic templates does it have?

The GenericBizRunner library provides a template for your business logic. There are twelve combinations of business logic formats, each represented by an interface:

Interface In Out Async? Write to Db
IGenericAction<TIn, TOut> Yes Yes No No
IGenericActionAsync<TIn, TOut> Yes Yes Yes No
IGenericActionWriteDb<TIn, TOut> Yes Yes No Yes
IGenericActionWriteDbAsync<TIn, TOut> Yes Yes Yes Yes
IGenericActionInOnly<TIn> Yes No No No
IGenericActionInOnlyAsync<TIn> Yes No Yes No
IGenericActionInOnlyWriteDb<TIn> Yes No No Yes
IGenericActionInOnlyWriteDbAsync<TIn> Yes No Yes Yes
IGenericActionOutOnly<TOut> No Yes No No
IGenericActionOutOnlyAsync<TOut> No Yes Yes No
IGenericActionOutOnlyWriteDb<TOut> No Yes No Yes
IGenericActionOutOnlyWriteDbAsync<TOut> No Yes Yes Yes

As you can see, the GenericBizAction library provides interfaces that handle all the different types of business logic you could write. The library decodes what the business logic needs and checks that the data/call you use matches its interface.

The business logic is expected to implement the GenericBizAction’s IBizActionStatus interface which the business logic can use to report errors and other status items to the BizRunner. The GenericBizAction library provides an abstract class called BizActionStatus, which the business logic can inherit to provide that functionality. Here is an example of an In-and-Out, synchronous method that doesn’t write to the database, with all the key component parts pointed out (The business logic class shown is one I use in my unit testing, so the code inside the method is very simple).

The code for the order processing business logic is much more complicated and you can find it in the GutHub repro under PlaceOrderAction.cs . Also, if you clone the GitHub project and run the application locally the ASP.NET Core web app allows you to try out the order example just described and the next example below.

Example 2: Using the anti-corruption layer

In my second example I want to show you how the anti-corruption layer part of the GenericBizRunner framework works. The term “anti-corruption layer” is a Domain-Driven Design concept, and is normally about related to separating bounded contexts (see this article on the DDD use of this terms). I am not using my anti-corruption layer between bounded contexts, but to stop the Presentation Layer concepts/data from having any impact on my business logic.

To show the anti-corruption layer in action I have built a piece of business logic  that allows the delivery date of an order to be changed. The business logic implements certain business rules to apply to any change, and ultimately it would also need to access an external site, such as a courier service, to check if the delivery date is possible.

Before I describe the code, with its use of the anti-corruption layer feature, let me first show you a screenshot of the ChangeDelivery page that the user would see.

To make this work properly the Presentation layer need to show a dropdownlist, which isn’t something that the business logic should be even thinking about – its a Presentation layer problem. This is where the GenericBizRunner library provides a couple of classes that provide the anti-corruption layer feature.

These anti-corruption layer classes are abstract classes that you can inherit, which then allows you to deal with the presentation layer item (known as the presentation-focused DTO). The abstract class makes you define the business logic’s associated class, known as the business-focused DTOs. You will see this in action later in this example, but lets look at stages in the “Change Delivery” example.

This screenshot shows the date that an order is going to be delivered on being changed from 16/1/2018 (see second line) to 17/1/2018. The stages in this are:

  1. Show the ChangeDeliver input page – the ChangeDelivery HTTP GET request
  2. Get the user’s input and run the business logic – ChangeDelivery HTTP POST request
  3. Copy the presentation-side DTO to the business DTO and run business logic
  4. Show a confirmation page if successful, or errors if failed.

Let’s look at the code for each of these sections, and see where and how the anti-corruption feature works.

1. Showing the ChangeDelivery input page

To produce the page shown in the screenshot above I need to read the current order and then display a list of possible dates in a dropdown. I do this by adding some properties for dropdown list, etc. to the a DTO class called WebChangeDeliverDto. But how do these properties get filled in? This is where the abstract class in the GenericBizRunner library called GenericActionsToBizDto comes in.

The class GenericActionsToBizDto has number of features, but the one I want to start with is the SetupSecondaryData method. This is called by the BizRunner to set up the presentation properties. Here is the code from my WebChangeDeliverDto class.

public class WebChangeDeliveryDto : 
    GenericActionToBizDto<BizChangeDeliverDto, WebChangeDeliveryDto>
    public int OrderId { get; set; }

    public string UserId { get; set; }

    public DateTime NewDeliveryDate { get; set; }

    //Presentation layer items

    // … various properties removed to focus on the SelectList

    public SelectList PossibleDeliveryDates { get; private set; }

    protected override void SetupSecondaryData(DbContext db, IBizActionStatus status)
        var order = db.Set<Order>()
            .Include(x => x.LineItems).ThenInclude(x => x.ChosenBook)
            .SingleOrDefault(x => x.OrderId == OrderId);

        if (order == null)
            status.AddError("Sorry, I could not find the order you asked for.");

        // … other code removed to focus on creating SelectList

        PossibleDeliveryDates = new SelectList(
        var selected = PossibleDeliveryDates
            .FirstOrDefault(x => x.Text == NewDeliveryDate.ToString("d"));
        if (selected != null)
            selected.Selected = true;
    //... other code left out

The part I want you to see is the SetupSecondaryData method, at the bottom of the class. This is executed by two BizRunner methods:

  • GetDto<T>: This creates the DTO class, optionally sets some properties (see next code example below) and the runs the the SetupSecondaryData
  • ResetDto: If you have errors and want to re-display the page, you need to run the SetupSecondaryData method again, which the BizRunner’s ResetDto method does.

The idea is that the SetupSecondaryData method sets up any properties in the DTO which are needed to build the page shown to the user.

The code below displays the ChangeDelivery page. You can see it using the BizRunner’s GetDto<T> method to set up the presentation data.

public IActionResult ChangeDelivery(int id, 
    [FromServices]IActionService<IChangeDeliverAction> service)
    var dto = service.GetDto<WebChangeDeliveryDto>(x =>
        x.OrderId = id;
        x.UserId = GetUserId(HttpContext);
    service.Status.CopyErrorsToModelState(ModelState, dto); 
    return View(dto);

Some of the key lines are 4 to 8, where the service.GetDto<WebChangeDeliveryDto> method is called. This will:

  1. Create a new instance of the WebChangeDeliveryDto
  2. Executes the action I have provided in the parameter, which sets properties in the new WebChangeDeliveryDto
  3. Then it runs the SetupSecondaryData method in the WebChangeDeliveryDto class, which I described earlier.

NOTE: The code I wrote in the the SetupSecondaryData method reports an error if the order was missing. I did this to show you how something like that would work. I needed to call the CopyErrorsToModelState extension method to copy any errors found to ASP.NET Core’s ModelState.

2. Get the user’s input and run the business logic

Once the user has selected the new delivery date and pressed the Change button the ASP.NET Core’s POST Action method is called. This does certain checks, but the primary thing we are interested here is the running of the ChangeDeliverAction business class, which you can see in line 17 of the code below.

public IActionResult ChangeDelivery(WebChangeDeliveryDto dto,
    [FromServices]IActionService<IChangeDeliverAction> service)
    if (!ModelState.IsValid)
        //I have to reset the DTO, which will call SetupSecondaryData
        //to set up the values needed for the display
        //return to the same view to show the errors
        return View(dto);

    //This runs my business logic using the service injected in the
    //Action's service parameter

    //… rest of action left out. I deal with that in section 3.

3. Copy the presentation-side DTO to the business DTO and run business logic

One of my rules is that the Business Logic data classes should be isolated from other layers in the application. This is a separation of concerns principle, and helps me to focus just on the business issue I am trying to solve. To help me with this the GenericBizRunner library provides the abstract GenericActionToBizDto class for input and the abstract GenericActionFromBizDto class for output. These provide the anti-corruption layer between the business logic code and the presentation code.

In this “Change Delivery” example I needed a dropdownlist of potential delivery dates, which is a presentation-focused item so I produced a presentation-focused input class called WebChangeDeliveryDto (which you have seen already when I was talking about the SetupSecondaryData method). This class inherits from the abstract GenericActionToBizDto class, and right at the top you need to define which business logic DTO this class is linked to, in this case the BizChangeDeliveryDto, as you can see this in the class definition below.

public class WebChangeDeliveryDto :
     GenericActionToBizDto< BizChangeDeliverDto, WebChangeDeliveryDto>
 … other code left out

The abstract class uses this definition to set up a mapping, via the AutoMapper library, between a presentation-layer focused DTO and the business-focused DTO. So, in this “Change Delivery” example, when the BizRunner is asked to run some business logic it notices that the input isn’t the BizChangeDeliverDto class that the business logic needs, but is a class that inherits from GenericActionToBizDto class. It then uses the CopyToBizData method inside the GenericActionToBizDto class to copy only the properties that the business logic’s DTO needs from the input class.

The default mapping that AutoMapper uses is “copy similar named properties”, which in this case only copies the first three properties to the business input class called BizChangeDeliveryDto. This is shown in the figure below.

This simple copy-by-name works 80% of the time, but the abstract classes have plenty of options to allow you to tweak the copy – see the article Using AutoMapper: Creating Mappings.

4. Show a confirmation page if successful, or errors if failed

Finally, I want to cover what happens when the business logic returns. This depends on what happened with the business logic. Let me show you the second half for the ChangeDeliver POST action.

public IActionResult ChangeDelivery(WebChangeDeliveryDto dto,
    [FromServices]IActionService<IChangeDeliverAction> service)
    //… first part of the action removed for clarity    


    if (!service.Status.HasErrors)
        //We copy the message from the business logic to show 
        return RedirectToAction("ConfirmOrder", "Orders", 
            new { dto.OrderId, message = service.Status.Message });

    //Otherwise errors, so I need to redisplay the page to the user
    service.Status.CopyErrorsToModelState(ModelState, dto);
    //I reset the DTO, which will call SetupSecondaryData i set up the display props
    return View(dto); //redisplay the page, with the errors

Success path

If the business logic, and the write to the database, were successful then the service.Status.HasErrors property is false. In that case I want to show a confirmation page. In this example I include the service.Status.Message property, which contains a string defined by my business logic – in this example it says “Your new delivery date is 17/01/18”.

The Message property is one way to return confirmation messages when the business logic finishes successfully. If there were no errors it carries a success message, otherwise it has a message saying there were n errors. Note: there are several default messages for success and failure cases – see this documentation on the service.Status property.

Failure path

If there are errors, you normally what to redisplay the GET page, with a list of errors. You have seen the CopyErrorsToModelState extension method in the first example –  it copies the Errors into the ASP.NET ModelState. So that the errors will be displayed when the view is shown.

This “Change Delivery” example also needs to re-run the SetupSecondaryData method, so that the properties in the View are set up. I do that using the BizRunner’s ResetDto method (see line 19 in the code above).

Other features not covered in this article

This article is already very long, so I have left out several parts. Here is a list of features, with links to the documentation in the project’s Wiki.

Future features

At this point in time I have not added the feature to allow you to run a chain of business logic inside a transaction. I have already written the code in the old EF6.x version, but there is a planned change to transaction handling in EF Core version 2.1.0 (see this EF Core issue). I’m also not sure whether want to use transactions – please vote on this issue.

Note: You can see an example of how running multiple business logic in a transaction works in the article I wrote called “Architecture of Business Layer – Calling multiple business methods in one HTTP request”. I wrote this article after writing the transactional part of the original EF6.x GenericBizRunner.


Well done if you get to the end, as it’s a big article! The GenericBizRunner has many options to make it versatile but that does make the documentation quite long (this was one reason I didn’t make the original EF6.x GenericBizRunner library open-source – too much documentation needed!). Hopefully this article, and the runnable example application in the EfCore.GenericBizRunner GitHub repo, will make it easier for you to understand the GenericBizRunner framework.

The core concept is that writing business logic is difficult so isolating the business logic code from other layers allows the developer to concentrate on the business/domain problem they are trying to solve (see chapter 4 of my book for more on this concept). The GenericBizRunner helps me in five main ways:

  • DI-friendly. It is designed to make accessing business logic very simple, via specific interfaces and dependency injection.
  • Templates: It’s set of interfaces provides a common format for your business logic patterns, including help on error reporting and status feature.
  • Controls database writes: It provides a harness with can run any business logic that implements one of the GenericBizRunner’s interfaces, with optional writing to an EF Core database.
  • Anti-corruption layer feature: It allows the business logic to be isolated from the presentation layer through a copy system that extracts the data the business logic needs from a presentation-focused class.

This new EF Core version of the GenericBizRunner framework has several improvements from using the original EF6.x version I built some years ago. The experience of using that first framework showed me that the GenericBizRunner library was useful, so I put the time into building the EF Core version, and finally biting the bullet on writing the documentation. You can find out more about the EF Core library via the EfCore.GenericBizRunner GitHub repo and its Wiki.

Other useful articles/books

Happy coding!

Using in-memory databases for unit testing EF Core applications

Last Updated: February 9, 2018 | Created: December 16, 2017

While writing the book Entity Framework Core in Action I wrote over 600 unit tests, which taught me a lot about unit testing EF Core applications. So, when I wrote the chapter on unit testing, which was the last in the book,  I combined what I had learn into a library called EfCore.TestSupport.

Note: EfCore.TestSupport is an open-source library (MIT licence) with a NuGet package available. Documentation is available in the project’s Wiki.

In this article I want to look at using in-memory database to simulate the database when you are unit testing. I also look at how to properly test your ASP.NET Core database update code, which has to handle what is called the disconnected state (explained later), with an in-memory database.

Your options for in-memory databases

There are several ways to simulate the database when unit testing, but one of the simplest ways is to create an in-memory database. They are easy to create and are quicker to create than a real database, but they do have a few limitations, which I will describe at the end. First I want to show you how they work.

The most obvious in-memory database is EF Core’s InMemory database provider, which was written specifically for unit testing. But it turns out the better option is using the SQLite database provider, with an in-memory database. Why is that?

Its because SQLite is a true relational database while EF Core’s InMemory database provider isn’t a real relational database. This means that EF Core’s InMemory database won’t pick up on data that breaks referential integrity, such as foreign keys that don’t match the appropriate primary key. About the only good thing about EF Core’s InMemory is that its slightly faster on setup than SQLite (about 40%), but both are very fast anyway, so the extra checking makes SQLite my choice.

Note My EfCore.TestSupport library has support for both InMemory and SQLite in-memory, so you can choose. But in this article I am only going to describe the SQLite in-memory.

How to configure an in-memory SQLite database?

My EfCore.TestSupport library provides a pre-configured static method that will set up the options for creating a SQLite database that is in-memory. The code below contains a simplified version of that method

public static DbContextOptions CreateOptions()
    where T : DbContext
    //This creates the SQLite connection string to in-memory database
    var connectionStringBuilder = new SQLiteConnectionStringBuilder
            { DataSource = ":memory:" };
    var connectionString = connectionStringBuilder.ToString();

    //This creates a SQLiteConnection with that string
    var connection = new SQLiteConnection(connectionString);

    //The connection MUST be opened here

    //Now we have the EF Core commands to create SQLite options
    var builder = new DbContextOptionsBuilder();

    return builder.Options;

Note: The EfCore.TestSupport version has a few extra checks and options, but I left them out to focus on the in-memory bit.

How to use an in-memory database in a unit test

Having shown you the CreateOptions method now I want to show you how you can use that to create an in-memory version of the database that EF a simple unit test that uses the CreateOptions method to create an in-memory database.

Note: I am using the xUnit unit test package with fluent asserts.

public void TestSQLiteOk()
    //Here create the options using SQLite CreateOptions
    var options = SQLiteInMemory         
    //I create an instance of my DbContext
    using (var context = new MyContext(options))
        //You have to create the database

        context.Books.First().Title = "New Title";

        context.Books.First().Title.ShouldEqual("New Title");

Using the pattern shown in the code just shown you can test all manner of EF Core code. But let me point out some parts of the code

  1. Lines 6-7: This is using my EfCore.TestSupport methods to set up the options to use a in-memory SQLite database.
  2. Line 9: This is where you create an instance of your application’s DbContext.
  3. Line 12: The context.Database.EnsureCreated(), is very important. This creates the in-memory database using the current entity classes and EF Core configuration of your application. In this case it will return an empty database.
  4. Line 13: You often need to fill the database (often called seeding the database) with appropriate data for a test. It turns out seeding the database can be quite complex in a big application. I tend to spend time creating helper methods, like the SeedDatabaseFourBooks method in this example, to set up specific database states.
  5. Lines 16-17: This is where you put the code you want to test. I showed a very simple example – yours is likely to call methods from you own code.
  6. Line 20: Here is the assert to check the method did what you expected it to do.

Using SQLite in-memory database for testing disconnected state updates

If you are using EF Core in an application such as ASP.NET Core, then you will have to test update methods that have to work across what is called the disconnected state. The disconnected state happens when an operation, normally an update, is split into two separate parts.

The figure below shows an example of the disconnected state where a user updates the publication date of a book. The first stage on the left is where the data is presented to the user to change. The second stage happens when the user presses the UPDATE button, which submits a new HTTP POST request to the web application which uses the information sent back to do the the update.

The important point about the disconnected state is you have two, separate instances of the application’s DbContext. And if you don’t test things in the same way you might miss some errors.

Both the InMemeory and SQLite in-memory databases can handle this problem. In SQLite’s case the database is held in the connection, so, as long as you use the same options you can create new instances of the DbContext that maps to the same database.

Let me give you an example of a bug that would only be picked up by having two instances of the application’s DbContext. In this case I want to add a new review to a book. A Book entity has has zero-to-many Review‘s and a user have add a new review, with a star rating – see figure below.

So, the standard way to add a new Review in a disconnected state is to load the Book entity with all its Reviews, add the review to the collection and then call SaveChanges.

Now, if I get that wrong, and I don’t load the current Reviews I can get a number of possible errors, but only if I replicate the disconnected state properly.

Let me show you the correct way to test this disconnected state, which shows the error.

public void CorrectTestOfDisconnectedState()
    var options = SQLiteInMemory

    //In my first DbContext I make seed the database
    int bookId;
    using (var context = new MyContext(options))
        bookId = context.Books.Last().BookId;

    //In the second DbContext I test my disconnected update code
    using (var context = new MyContext(options))
        var book = context.Single(x => x.BookId == bookId);
        book.Reviews.Add( new Review{NumStars = 5});

        //… rest of unit test left out, as has errored

Note: the correct code for line 22 would be
        context.Single(x => x.BookId == bookId).Include(x => x.Reviews);

The point of all that is, if I had used one DbContext and seeded the database and ran the test of the Review update code, then it would have passed. That is because EF Core would have already been tracking the Reviews collections as from the SeedDatabaseFourBooks method, so the Reviews collection would already be present.

Note: In some of my unit tests I need to use three instances of the DbContext to correctly test something. One for seeding the database, one for running the test, and a final one to verify that the database has been written correctly.

Limitations of using SQLite to unit test EF Core

It’s likely that the actual database you use isn’t going to be SQLite. In that case you need to understand how SQLite differs from your actual database, especially if you use a feature that SQLite doesn’t support. The table below gives you the major areas that SQLite doesn’t support, or its support is so different that it will cause a problem.

These are all fairly advanced features, so if you don’t recognise them then you aren’t using them, so its fine to use SQLite for unit testing.

SQL feature SQLite Support?
Different schemas NONE
SQL Sequences NONE
SQL Computed columns Different
SQL user defined functions (UDFs) Different
SQL fragment default value Different

Note: There are other differences such as it doesn’t have the same types as other databases, for instance it stores strings in a TEXT type, while SQL Server has 8-bit varchar(nnn) and 16-bit nvarchar(nnn) types. These don’t normally cause a problem, but be aware in case some odd unit test result comes up.

So, if you use any of the features in the table then you can’t use SQLite in-memory database (nor InMemory – it has even more limitations than SQLite). If you can’t use SQLite you have to use the same database type that you are using for the application in your unit tests.

My EfCore.TestSupport library has help for other database types, with extra helpers for SQL Server databases, but that is most likely needs a new article to cover that.


This article shows you how you can simulate your database quickly and easily by using an in-memory database. I have shown you that SQLite is the best choice for an in-memory database, because it is a true relational database and will catch errors such as incorrect primary or foreign keys.

I also showed you how you can produce different instances of the application’s DbContext that all access the same in-memory database. This is useful for checking code that is used in applications such as ASP.NET Core, where you have a disconnected state.

There are some limitations to SQLite’s implementation of the SQL language, which might rule it out in your application. But for any ‘standard’ EF Core application using SQLite in-memory databases should be fine.

I hope this article, and my EfCore.TestSupport library, which is available as a NuGet package, with documentation of all its feature in the project’s Wiki, will help you write unit tests quicker.

Note: The chapter 15 of my book Entity Framework Core in Action is on unit testing EF Core application and goes into the whole topic in much more detail (its 32 pages long!). Manning Publications have offered the coupon to get 37% off the book – use fccefcoresmith (Christmas 2017 – watch out for 50% off offers too).

Happy coding.

EF Core: taking full control of the database schema

Last Updated: December 7, 2017 | Created: November 23, 2017

Entity Framework Core (EF Core) is the new(ish) version of Microsoft’s widely-used database access library. EF Core allows you to map .NET classes and properties to database tables and columns respectively, and makes writing database access code much simpler than other alternatives, like ADO.NET.

EF Core maps classes to the database by building a model of the database structure, based on the software classes and configuration you set. This means that when you change the software structure or the database structure, known as the schema, then something must happen to make them match. EF Core provides various commands to do this – you can either change the database schema based on the software structure, or change the software structure based on the database schema.

In this article, I describe an alternative way to handle a change to a database’s schema from EF Core. My approach provides a tool that will check that EF Core’s model of the database matches the actual database schema. This allows a developer to take full control of the software structure and the database structure with the confidence that the tool will catch any mistakes they make in matching the two. I will now:

  • List the two ways EF Core can update the database, and add my alternate method.
  • Describe how my alternate method works
  • Compare each approach and look at their advantages and disadvantages
  • Detail my experience of using my alternative approach in real projects

Who controls the database schema?

In EF Core terms, there are two ways in which the database scheme can be changed, or in EF Core terms ‘migrated’.

  1. Code-first: Here the classes that are mapped to the database (known as entity classes), and the EF Core’s configuration commands define what the database schema looks like. You can then run a command, such as Add-Migration, that creates code can update the database’s schema.
  2. Database-first: For the situation where you have an existing database EF Core has a command called Scaffold-DbContext that will read the schema of a given database and then produce the entity classes and the EF Core configuration to match the database.

I call my alternate approach SQL-first, for reasons that will become obvious.

  1. SQL-first. In this approach SQL change scripts controls the database structure changes. Then I use a tool I have developed to compare EF Core’s entity classes and configuration to highlight any places where they differ from the database (or the database differs from the code).

The use of SQL change scripts to change a database’s schema is well known (and liked) by many DevOps people. In fact, Microsoft recommend using SQL change scripts, created via EF Core’s migration feature, to update a production database.

The problem with writing your own SQL change scripts is making sure your changes to the database match what EF Core thinks the database looks like. This is where my new tool comes in, and in the next section I describe how I make the SQL-first approach work.

How does the SQL-first approach work?

In this part I am going to describe the steps in using the SQL-first approach. Let’s assume I want to make a change to my application to support a new feature. I work out how that affects the database and then make the following changes:

  • I change my EF Core entity classes and configuration to the form I want.
  • I create a new SQL change script that will update an existing database schema with the same change.

It doesn’t matter if I change the classes first, or the database first. What does matter is: does the database’s schema and EF Core’s internal model of the database schema match? That is where my EF Core tool, which I refer to as the EfSchemaCompare feature, comes into play.

The following figure shows the steps to check a SQL change script matches what the EF Core’s view of the database. If you follow the 1 to 5 items on the diagram it will step you through the process of ensuring the database and the software match.

As you can see, my EfSchemaCompare tool checks that my changes to the EF Core code and the update that my SQL script applied to the database match. If they don’t, then I get a list of error messages that point out what is wrong. I fix them, either in the database or my software, and run the process again.

I have not defined how you apply the SQL change scripts, but there are plenty of packages that can do that (and more). I personally have used the open-source library DbUp, which I found very useful. Whatever script application package you use needs to know what scripts has been applied, and only apply scripts that aren’t already applied to the database.

How to run the EfSchemeCompare feature

I have added the EfSchemeCompare feature to my NuGet/GitHub project EfCore.TestSupport. This library contains a whole range of methods and classes to help you unit test an application that used EF Core for database accesses. As I want to check that the EF Core’s view of the database matchs the test database in my unit tests, then putting the feature inside the EfCore.TestSupport library makes sense to me.

The code below shows an xUnit unit test that checks new database against EF Core’s view of the database. In this case there should be no errors, because the database was created using EF Core.

public void CompareViaContext()
    using (var context = new BookContext(_options))
        //These two commands ensure I have a new database that matches EF Core

        var comparer = new CompareEfSql();

        //This will compare EF Core model of the database 
        //with the database that the context's connection points to
        var hasErrors = comparer.CompareEfWithDb(context);

        //The CompareEfWithDb method returns true if there were errors.
        //The comparer.GetAllErrors property returns a string
        //where each error is on a separate line

In this case the hasErrors variable will be false, because the database was created by EF Core and therefore matches the EF Core Model. But in cases where the test fails I have arranged, via the comparer.GetAllErrors property, that the xUnit runner will list all the errors in the unit test output window. (The ShouldBeFalse extension method is part of the xUnit Fluent asserts. It takes a second parameter that is a string, which is output if the test fails).

Here is an alternative method where I supply a connection string to a database that I want to check. The connection string could point to any database you like, such as your production database. That gives you early warning that this version of your application cannot be deployed without first running a migration on the database.

public void CompareViaConnection()
    using (var context = new BookContext(_options))
        var comparer = new CompareEfSql();

        //The connection string points to a database that you want to check
        var hasErrors = comparer.CompareEfWithDb(connectionString, context);


If this process finds an error(s) you get a series of error messages. Here are some examples

  1. DIFFERENT: MyEntity->Property ‘MyString’, column type. Expected = varchar(max), found = nvarchar(max)
    This says it found a difference in the column type of the property MyString in my entity class, MyEntity. EF Core thinks the SQL type should be varchar(max), that is a 8-bit ASCII storage type, while the actual database is nvarchar(max), that is a 16-bit Unicode storage type.
  2. NOT IN DATABASE: Entity ‘LineItem’, table name. Expected = LineItems
    This says the table LineItems that entity class called LineItem maps to wasn’t found in the database.
  3. EXTRA IN DATABASE: MyEntity->PrimaryKey ‘PK_MyEntites’, column name. Found = MyEntityId
    This tells you that a column called MyEntityId, which is (part of) a primary key was found in the database, but wasn’t in EF Core’s list of primary key properties in the entity class, MyEntity.

Version 1.1.2 of the EfSchemaCompare feature uses about 20 different types of tests, of which seven are applied to each column. This isn’t the place to describe how the EfSchemaCompare feature works (see here for a detailed description) but it is quite versatile, with lots of options. For instance, you can suppress any errors that you don’t care about – say you didn’t care about any EXTRA IN DATABASE errors, then you can add that type of error to the LogsToIgnore.

The Pros and Cons of the three approaches

I like to critique any approach I create, because there is always so sort of trade-offs when there are multiple ways to do something. Here is a table that summarises the advantages and disadvantages of each approach, with a reference in brackets, e.g. (see 1), to sections with more information.

Approach Advantages / Disadvantages
1. Code-first Advantages:
·         Easy for a software developer
·         Can use all EF Core featuresDisadvantages:
·         EF Migrations can be difficult (see 1)
·         Can’t access all SQL features (see 2)
2. Database-first Advantages:
·         Easy for a software developerDisadvantages:
·         Creates “poor” entity classes (see 3)
·         Can’t use some EF Core features (see 4)
3. SQL-first Advantages:
·         Can use all SQL features
·         Can use all EF Core features
·         Easy to use in automated updateDisadvantages:
·         Developer needs to know SQL (see 5)

Note: Currently the database-first approach in EF Core 2.0.0 requires you to delete all the code that EF Core generated when you used the Scaffold-DbContext command. That makes importing updates to the database rather tedious and time consuming. But a command that will “tweak” the generated code is on the EF Core roadmap and should be out sometime in 2018.

1. EF Migrations can be difficult

EF migrations work well for a simple system, but they start to get complex when used on a live site. There are two issues:

  1. If you have changes to the database that will lose data or change data, then EF Core can’t handle that. You will need to write extra code to the migration to deal with that.
  2. The recommendation of the EF Core team is that migrations be applied via scripts, or a command line tool, in production. This means you are into handling SQL scripts anyway.

2. Can’t access all SQL features

EF Core uses the main SQL commands, like ALTER, to change a database’s schema. EF Core is richer in what SQL parts it can do than EF6.x, with commands to set column defaults or computed columns, but there are still things it won’t do, like place a CONSTRAINT on a value column.

In addition, if a database change requires data must be moved or reformatted, then EF Core can’t help – you have to write SQL code to do that.

3. Creates “poor” entity classes

EF Core’s Scaffold-DbContext command produces entity classes which has a navigational property for all the links between tables. Personally, I only define a navigational link from one entity class to another if there is a good business/development reason for it to be there. Having every navigational link possible in an entity class obscures the correct, business-based connections that I should be using.

4. Can’t use some EF Core features

EF Core’s Scaffold-DbContext command produces entity classes that are generic in nature, with every column mapped to a property which has public read/write. This means you can’t use some of EF Core’s features that allow you to ‘hide’ data and/or relationships. These can be very useful in building robust systems by limiting how certain data is accessed.

In addition, EF Core provides a number of table mapping features, table-per-hierarchy, table splitting and owned types, which the scaffolder can’t take advantage of.

5. Developer needs to know SQL

Yes, the SQL-first approach does need you as developer to understand SQL. For some that is a down side – it’s another language to learn. For those who don’t want to do that the EF Core methods are best.

But if you do want to use SQL and don’t know it that well (I’m not an expert), then you can get some help from EF Core. In EF Core, there are two commands meant for unit testing called Database.EnsureDeleted and Database.EnsureCreated. By running those two, one after the other, you can get EF Core to output SQL to create all the tables, constraints, indexes etc. that are needed to produce a database that exactly matches what EF Core need. You can capture this SQL via EF Core’s logging (my library has an extension method to help with that), which will give you a set of SQL commands to act as template for your SQL change script.

For small changes, like adding a column to an existing table, you need to snip out the relevant part from the SQL that EF Core produced and re-craft it into a SQL ALTER TABLE command. But at least you have the correct names, types, constraints that EF Core expect to see.

My experience of using the SQL-first approach

This isn’t the first time I have developed or used a EfSchemaCompare feature – I produced such a tool for EF6.x back early 2016, and I have used it in anger on two projects. It worked well, even on a multi-person project. The benefits were:

  • Complete control over the database schema
  • Complete control over the EF entity classes and configuration
  • Ability to add/alter data via SQL scripts
  • Can produce a new database much quicker than EF can (useful for testing)
  • SQL Scripts are great for production
  • I can use the EfSchemaCompare tool to check my production database matches before deploying new code
  • If done carefully then migrations can be run out-or-order – useful when different developers are working on different parts of the database.

The new EF Core version of the EfSchemaCompare feature is far superior to my EF6.x version because of EF Core. Firstly, EF6.x’s access to the database model is poor, while EF Core’s is brilliant – it allows access to every aspect of the database mapping. This mean I can handle column SQL defaults, computed columns, and the table mapping features, table-per-hierarchy, table splitting and owned types. My EF6.x couldn’t handle any of those.

Secondly, I could access EF Core’s scaffold service that is used by the Scaffold-DbContext command. This means my compare feature will work with any relational database that EF Core supports (my old EF6.x version only worked with SQL Server). That is a great feature to have.

Also, my second implementation is better because I learnt what I needed from using the EF6.x version. My EF Core version now provides more comprehensive coverage, optional suppression of specific errors, handles multiple DbContext’s mapped to the same database (known as bounded contexts). It has a small number of limitations – see here for the latest list.


In this article, I have described the alternative SQL-first approach to changing a database’s schema when working with EF Core as your database access library. The SQL-first approach has some pros and cons, but I like it because it allows me to use all the great features of EF Core, and exactly define my SQL database the way I want to.

My experience of using the SQL-first approach with EF6.x was very positive, which made it a priority to create a EF Core version, which I have now done. EF Core’s features, such as the Model property and access to EF Core internal services, made it easier to create a EfSchemaCompare tool and that tool provides a better checks than the EF6.x version did.

The EfSchemaCompare tool is available via the open-source (MIT licence) EfCore.TestSupport package on NuGet, and the code and documentation can be found on GitHub.

Happy coding.

EF Core – Combining SQL and NoSQL databases for better performance

Last Updated: December 15, 2017 | Created: October 24, 2017

This is the second article in my series on Entity Framework Core (EF Core) performance tuning. In this article I take that performance even further than part 1 did, by changing the applications to a two-database CQRS (Command Query Responsibility Segregation) architecture, where one database is SQL and the other is NoSQL (I explain what that architecture looks like later in the article).

  1. Entity Framework Core performance tuning – a worked example.
  2. EF Core – Combining SQL and NoSQL for better performance (this article).

Note: This article is from chapter 14 (which will be available from the end of November 2017) in the book “Entity Framework in Action” that I am writing for Manning publications. Chapter 14 contains more on the EF Core parts on the application mentioned in this article. The book is on early access, which means you can buy the book now, and you get the chapters as I write them – plus the finished book when the book is complete.

NOTE: Manning Publications have offered the coupon to get 37% off my book, Entity Framework in Action” – use fccefcoresmith

Executive summary for those in a hurry

I have an example ASP.NET Core application that uses EF Core to display a set of books – called my book selling site. In this article, I look at the performance book selling site which I have converted to a CQRS (explained below) architecture. In this I write data to a primary SQL Server database, but also build a read-side NoSQL database which the application queries use to show the list of books.

I run various queries to test the differences in performance, but the main one is sorting the books by their book reviews and then showing the top 100. I do this for three sizes of database

  • 100,000 books, which has ½ million book reviews.
  • 250,000 books, which has 1.4 million book reviews.
  • 500,000 books, which has 2.7 million book reviews.

The “sort by average review votes” is a particularly difficult query for a SQL-only database, as it must dynamically calculate the average votes value of all the nook reviews in the database – and as you can see my test data has a lot of reviews!

The figure below shows the difference in performance of the “best-SQL” (hand-tuned SQL run by Dapper) against what I call the SQL+NoSQL CQRS version of my application, for the three sizes of databases.

You can try the SQL+NoSQL CQRS at http://cqrsravendb.efcoreinaction.com/. It has 250,000 books in its database.

Note: The site cqrsravnedb.efcoreinaction.com isn’t running on some massively powerful cloud system. The NoSQL database is running courtesy of www.RavenHQ.com (thanks to them for doing that) and is running on the lowest level of server, so you won’t get the same performance as shown in the chart above.

Setting the scene – my example application

Note: It you read the first article then you know this – skip to the next section called “Explaining the new CQRS architecture and the SQL+NoSQL databases”.

Right from the second chapter of my book I introduce an example book selling site, and as the book progresses I add more attributes to the application. Admittedly the site only has a few features, but I purposely included some sorting and filtering features that can be a challenge when the site has a lot of books to show. The diagram below shows you the features of the site.

What makes the book list a challenge?

I tried to create an example application which was at least credible, and included some complicated queries. The diagram below shows you the listing of a book, and some information on how easy or complex it is to get that data from the database.

I ran a series of tests checking on sorting, filtering and paging, but with this design they all had the same performance (unlike the previous article). But I did run tests will different numbers of books and related data in the database.

Note: I did all my testing on my development PC using the ASP.NET Core example book selling site running in debug mode. The timings are from Chrome, and are the total time it takes for the page to return (including downloading the content). For comparison, accessing the About page, which has no database accesses in it, typically takes 11 ms to load, so the database accesses are always going to be slower than that.

Explaining the new CQRS architecture and the SQL+NoSQL databases

This article is about me upgrading my example book selling site to what is known as a CQRS (Command Query Responsibility Segregation) architecture using a polyglot database structure. Those terms might not that useful if you haven’t heard them before, so let me give you a diagram and then explain those terms in a bit more detail.

The CQRS architecture acknowledges that the read-side of an application is different from the write-side. Reads are often complicated, drawing in data from multiple places, while the write side is often much simpler. Separating the code for each part can help you focus on the specific features of each part.

A “polyglot database structure” is one which uses a combination of different storage types, for instance, relational databases, NoSQL databases, flat files, and so on. The idea is that each database type has its strengths and weaknesses, and by used two or more different database types you can obtain a better overall system.

I think a polyglot CQRS architecture, with SQL for the write-side database and NoSQL for the read-side database makes a lot of sense for business applications where there are more reads than writes, for instance, eCommerce applications, job sites, calender/todo etc. etc.

The SQL write-side database makes sense because business application often use relational. Think about a real book selling site: it would have a LOT of complex, linked data, to handle all the parts of the system, such as suppliers, inventory, pricing, orders, payment, delivery, tracking, audits and so on. I think a relational (e.g. SQL) database with its high level of data integrity for relational data, is the right choice for many business problems.

But those relationships and some aspect of database can make a SQL database slow at retrieval of data. My “sort by average votes” is particularly hard on a SQL database because of how many book reviews it needs to look at to sort by average votes. But I have seen equally complex queries in other systems I have built that don’t scale well either.

One answer is to create a read-side database that contains the exact data that the user wants to see, or sort/filter on. This could be another SQL database (I did that in Part3 of the previous article), but it makes a lot more sense for that database to be a NoSQL database. Why? Because:

  1. A NoSQL database is simpler to read, as it each entry is normally self-contained, with no links outside it. This should make it quicker at reading than a SQL database.
  2. NoSQL databases are normally easier to scale out, that is, have multiple databases containing the “same” data. This is because a write to a NoSQL database is dealing in writing one complete entry. (This is a big topic, and I’m not going to discuss this – see this Stack Overflow answer for part of the reason).
  3. I think a CQRS + NoSQL read-side database does a better job of holding “cached values” than can be done with a SQL/relational database does – I cover this later in the section about the complexity and robustness of the software.

As you will see in the rest of this article these features make my SQL+NoSQL CQRS architecture perform very well, and still have a high level of data integrity.

Why I chose RavenDB as my NoSQL database

I’m not an expert of NoSQL databases but when my first article went out I got a tweet from Oren Eini (Twitter @ayende), who is the main force behind the RavenDb NoSQL database. He quite rightly felt that his RavenDB database would be faster than the SQL equivalent at sorting/searching this data.

His tweet gave me an opportunity to get some help with my lack of NoSQL knowledge, so I sent him the tweet below, and the choice of NoSQL database was made. And it’s turned out to be a very good choice indeed, as you will see.

RavenDB has a .NET library that supports LINQ queries, which meant I could use the LINQ queries I had developed for EF Core with almost no changes. RavenDB also allowed me to create indexes on certain parameters, and would then automatically generate indexes for the various combinations. Indexes are vital to fast performance and most NoSQL systems have them.

I want to thank Oren Eini (see https://ayende.com/blog/) for his help and support, and Jonathan Matheus of RavenHQ , who has kindly provided me with free hosting of the RavenDB database that I use in my live site – http://cqrsravendb.efcoreinaction.com/.

How I added the CQRS architecture to my example book selling site

I only wanted to move the book list view part to the NoSQL read-side database, and not for the order processing part, because it’s only the book list view that has a performance issue. It turns out that while adding CQRS does require a fair amount of work, it is very simple to only apply the CQRS architecture to part of my application. The figure below shows the design of my changed example book selling site application, with the book list implemented as a two-database CQRS part.

My example book selling site is an ASP.NET Core application with EF Core used for the database access code. It is a layered application, as shown in the diagram above. The book selling site isn’t that complicated but I still needed to be careful as I modified the application over to a CQRS read-side database. I therefore wanted a design that minimized the changes, and isolated the new parts.

I came up with a design that keeps all the NoSQL/RavenDB parts separate. In my final design, the EF Core doesn’t know, or care, what database is being used for the read-side part of the CQRS system. This made the update simpler, plus offered the possibility of changing the NoSQL database I used – I really like RavenDB, with its support of LINQ, but EF Core version 2.1.0 will support Azure’s NoSQL database, Cosmos, so that might be an interesting alternative.

By keeping as much of the new database code in the NoSQLData layer, and using interfaces, kept the impact of my changes to a minimum. Figure 14.3 shows how I hid the NoSQL code behind interfaces to keep the NoSQL code isolated. I used dependency injection to provide both the DataLayer and the ServiceLayer with methods that allowed access to the database.

Here is a typical entry in the NoSQL database. As you can see I have pre-calculated the number of Reviews on this book, ReviewsCount, and the average of the six review votes, ReviewsAverageVotes. These two properties, plus the Id (not shown – RavenDB keeps that separate), the PublishedOn, and the ActualPrice all have indexes.

    "Title": "Adobe AIR in Action",
    "PublishedOn": "2008-07-01T00:00:00.0000000",
    "Price": 379,
    "ActualPrice": 189.5,
    "PromotionPromotionalText": "today only - 50% off! ",
    "AuthorsOrdered": "Joey Lott, Kathryn Rotondo, Sam Ahn, Ashley Atkins",
    "ReviewsCount": 6,
    "ReviewsAverageVotes": 3.5

Is changing my application to a SQL+NoSQL CQRS design worth the effort?

Implementing this two-database CQRS architecture wasn’t simple and took over a week to develop, which is long for me. Admittedly the main part was learning how to handle the RavenDB database approach, but there were also some complex EF Core parts to write to. So, was it worth the effort? To answer that question, I’m going to look at three distinct aspects.

  1. Was book list display, with its various sorts and filters, quicker?
  2. What was the impact writing to the database performance?
  3. What effect did it have on software, both in effort, complexity, and robustness?

1. Was book list display quicker?

Well, you have already seen the graph, but I’ll show it again. It beats the “best-SQL” version on sort by votes hands-down.

Now, this isn’t the whole story as in the first article I covered four levels of performance-tuning, and there was a faster version after the “best-SQL” version. In that version, referred to Part 3 (SQL cached values), I created “cached values” very like the values you see in the NoSQL database. In fact, it was the similarity of the Part 3 (SQL cached values) book query to getting json data that made me think of using a NoSQL database with CQRS.

If we compare the Part 3 (SQL cached-values) solution with the SQL+NoSQL CQRS version, then the difference is very much smaller, as shown by the graph below.

Note: You can access a live site with the Part 3 (SQL cached values) at http://efcoreinaction.com/. This site has 100,000 books in it.

This second graph shows that it’s the cached values, with their indexes, that makes the difference. RavenDB still has the edge of the Part 3 (SQL cached-values) version, but there is another aspect that I didn’t measure here, that is scalability.

Scalability is about how many simultaneous users an application can handle while still providing a ‘good’ performance. Because NoSQL databases such as RavenDB are dealing with a single entry containing all the information then a read or write is simpler than the SQL case. In general, this makes NoSQL databases easier to duplicate, that is, have multiple databases all containing the same data. The effect of having multiple databases on scalability can be significant. Not only can you spread database access across multiple databases, but you can locate databases geographically around the world to provide shorter access times. This final point clinches the performance issue for me – the SQL+NoSQL CQRS solution wins on performance.

SQL+NoSQL CQRS read performance summary: Excellent

2. What was the impact writing to the database performance?

I said earlier on that a SQL+NoSQL CQRS architecture is going to be slower on writes, as it must write to two databases. I have measured this in my solution and there is an effect, but it’s pretty small.

Taking a common case, which is a user adding a review to a book the results are as follows:

Solution type Total time Notes
Part 3 (SQL cached-values) 13 ms Simple addition of Review entity to the database and a recalculate of the cached values.
SQL+NoSQL CQRS 35 ms The extra time is mainly around writing to the NoSQL database. I measured the RavenDB update taking 25ms, which is quite long compared to a SQL write.

For a function that takes less than 50ms to return to the user, it’s not worth performance tuning. But in applications with more complex updates this time might get too long. But there are plenty of ways to handle this – one way is to pass the update a background task to be executed so that the application returns immediately to the user. The only down side of that approach is the user may be shown out-of-date data, which could be confusing – these are the trade-offs you must think through.

SQL+NoSQL CQRS write performance summary: Good, and could be made better if needed.

3. What effect did it have on software, both in effort, complexity and robustness?

The table below compares the development effort, complexity and robustness of the two designs: The Part 3 (SQL cached-values) solution and the SQL+NoSQL CQRS.

Solution type Effort Complexity Robustness
Part 3 (SQL cached-values) ~ 3 days Complex concurrency Good
SQL+NoSQL CQRS ~ 8 days Complex structure Very good (see below)

The effort was bigger for the SQL+NoSQL CQRS solution, but still in a very sensible time. Remember too, that is the time to performance tune an existing, SQL-only, EF Core application – 8 days to improve performance isn’t that bad in my opinion.

On the added complexity, the two solutions added to the original application was about the same –the added complexity was just in different areas in the two solutions.

The big plus for me was on the on robustness of the application. Keeping cached values correct is really difficult, and I think the SQL+NoSQL CQRS did a much better job of that than the Part 3 (SQL cached-values) did. This is because the SQL+NoSQL CQRS’s architecture “designs-out” any concurrency issues. The diagram compares Part 3 (SQL cached-values) handling of concurrency issues that could make the cached values out of date, with the same problem in the SQL+NoSQL CQRS solution.

Unlike the Part 3 (SQL cached values) solution, where I had to consider each cached value separately, and devise a different solution for each of them, the CQRS design handles all potential problems in one go – it effectively designs them out.

NOTE One other cache-invalidating error I can envisage is if the two updates to the NoSQL database arrive in the wrong order, say due to a retry. But that problem is simpler to fix, say by adding timestamps to each update and never allowing an older update to be applied.

SQL+NoSQL CQRS software summary: it took a bit more develop time, but it’s worth the effort.


I have altered my example book selling site from a SQL-only solution to one that uses a CQRS (Command Query Responsibility Segregation) architecture with a separate read-side and write-side database.  The write-side database I kept as a SQL/relational database because of its ability to track relationships with a high level of data integrity. On the read-side database I used a NoSQL database, RavenDB. I called this my SQL+NoSQL CQRS solution.

The SQL+NoSQL CQRS solution won by a large factor the on read-performance against the “best-SQL” database. Even against a SQL solution that pre-calculated cached values then SQL+NoSQL CQRS solution was only slightly faster, but has a much better chance to scale horizontally (i.e. multiple duplicate databases) than its SQL counterpart. This means the SQL+NoSQL CQRS won on all fronts for read-side queries.

The SQL+NoSQL CQRS solution did introduce a drop in performance for writes to the database, as two databases had to be updated. In my application that was small enough to ignore, but if it did become a problem there are ways around that.

The SQL+NoSQL CQRS solution took the most development time of any of the performance tuning I did – taking ~8 days to implement compared to ~3 days for the Part 3 (SQL cached values) solution took. However, I think the performance gain was impressive and the overall solution is more robust.

I hope you have found this article and the previous article interesting. Please do add comments and suggestions. I’m always learning.

Further reading

Happy coding!

Entity Framework Core performance tuning – a worked example

Last Updated: January 3, 2018 | Created: September 22, 2017

This is the first in a series of articles on performance tuning EF Core. In this article I take an example book selling site that I have built for my book and see how fast I can get it to perform. I have got some really excellent performance out of Entity Framework Core (EF Core) and I wanted to share it with you all.

  1. Entity Framework Core performance tuning – a worked example (this article).
  2. EF Core – Combining SQL and NoSQL for better performance.

Note: This article is from chapter 13 of my book, which  will be available from the end of October 2017, and has lot more detail (its 33 pages long). The book is on early access, which means you can buy the book now, and you get the chapters as I write them – plus the finished book when the book is complete.

NOTE: Manning Publications have offered the coupon to get 37% off my book, Entity Framework in Action” – use fccefcoresmith

Executive summary for those in a hurry

I build an example book selling web site using EF Core version 2.0.0 and filled it with 100,000 books and ½ million reviews. I then went through four staged of performance tuning. The figure below shows you the results through these stages for two different, but demanding areas of performance:

  • Displaying 100 books out of 100,000 possible books.
  • Displaying 100 books out of 100,000 possible books, sorted by average votes
    (timings are how long the web request took to finish, as measured by Chrome on my development PC).

The four stages on the graph are:

  1. Part 1: Working with EF Core built queries
    1. I get my LINQ code in a form that EF Core can turn it into good SQL
    2. I performance tune the existing EF Core query by adding a DbFunction.
  2. Part 2: I build a SQL query by copying EF Core’s SQL and ‘tweaked’ one bit to get some extra performance
  3. Part 3: I added cached values to the database that held hard-to-calculate values, such as the average votes for a book.

You can try the final, Part 3, “tuned up” example book selling site at  http://efcoreinaction.com/.

Note: The site www.efcoreinaction.com isn’t running on some massively powerful cloud system, but on shared hosting site that costs £3.50/month. (see Web Wiz, who are a great hosting site for .NET. They have always been early supporters of new .NET features.)

Setting the scene – my example application

Right from the second chapter of my book I introduce an example book selling site, and as the book progresses I add more feature to the application. Admittedly the site only has a few features, but I purposely included some sorting and filtering features that can be a challenge when the site has a lot of books to show. The diagram below shows you the features of the site.

What makes the book list a challenge?

I tried to create an example application which was at least credible, and included some complicated queries. The diagram below shows you the listing of a book, and some information on how easy or complex it is to get that data from the database.

On top of that the sorting, filtering and paging commands adds another level of complexity. I came up with four sort and filter tests that got progressively difficult, each done with page size of 10 and 100 to see how much showing more books effected performance. Here are the four tests.

  1. Easy – just show the page, with default everything.
  2. Moderate – sort the books by their publication date.
  3. Hard – sort by average votes, which is calculated from all the reviews linked to a book.
  4. Double hard – filter by average votes and the sort by price, both of which are calculated values.

Note: I did all my testing on my development PC using the ASP.NET Core example book selling site running in debug mode. The timings are from Chrome, and are the total time it takes for the page to return (including downloading the content). For comparison, accessing the About page, which has no database accesses in it, typically takes 11 ms to load, so the database accesses are always going to be slower than that.

My four steps in performance tuning

I went through four iterations of tuning of the application, and I will cover each one in this article. The stages are:

  1. Tuning up my EF Core query
    1. Making sure EF Core could translate all parts of my LINQ query to SQL
    2. Tweaking my query by adding a user defined function
  2. Going SQL – moving the query to Dapper
  3. Modifying the database – adding cached values (this is running on efcoreinaction.com)

Part 1a – Getting my query to translate into SQL

I started this performance tuning process by looking at what EF Core can do with well written code. Throughout my book, I have recommended the LINQ Select query, as this creates the most efficient database access. The Select query works well for the book display as it needs to get data from other tables, this the Reviews, and it doesn’t need some of the columns in the Books table, such as the book’s Description column, which could be very long. Here is my Select query object.

public static IQueryable<BookListDto>           
    MapBookToDto(this IQueryable<Book> books)   
    return books.Select(p => new BookListDto
        BookId = p.BookId,                      
        Title = p.Title,                        
        Price = p.Price,                        
        PublishedOn = p.PublishedOn,            
        ActualPrice = p.Promotion == null       
                ? p.Price                       
                : p.Promotion.NewPrice,         
        PromotionPromotionalText =              
                p.Promotion == null             
                  ? null                        
                  : p.Promotion.PromotionalText,
        AuthorNamesOrdered = p.AuthorsLink
                .OrderBy(q => q.Order)
                .Select(q => q.Author.Name),
        ReviewsCount = p.Reviews.Count,        
        ReviewsAverageVotes =                  
            p.Reviews.Select(y => (double?)y.NumStars).Average() 

Note the calculation at the end of the MapBookToDto code, which calculates the average votes from all the reviews linked to this book. I took me a bit of time to get the LINQ command in the right format – finally solved it by asking a question of the EF Core issues page. But the result is that average is done in the database using the SQL command AVG. This was critical in getting the performance I needed, not only in calculating the value to so in each book info, but also to make the “sort by average votes” as quick as it was.

Here are the results for 100,000 books (same data as on the www.efcoreinaction.com site).

Test 10 books 100 books Comment
All books 45 ms 230 ms Sorts on primary key
Sort by publication dates 45 ms 230 ms Sorts on PublishedOn, with index
Sort by votes 500 ms 650 ms Must average ½ million reviews!
Sort by price, filter by 4+ votes 450 ms 600 ms Less books, so quicker

The “sort on votes” performance, at 500ms sounds slow, but it was a lot quicker than I expected. EF Core produced the ideal SQL, using the SQL AVG command, but it had to be called 100,000 times, and had a total of over ½ million Reviews to process. For some applications, this level of performance would be acceptable, but for this example I deem that this is too slow, and I will be looking again at the “sort on votes” in Part 2 and Part 3.

The one item that did seem very slow was the display of 100 books. I analysed that and, because the Authors names came back as a collection, EF Core ran a separate query for each book – so, instead of 2 database access it had 12 database accesses for 10 books, and 102 database accesses for the 100 books.

I know that every separate database access costs, so I wanted to remove those extra database accesses. And because I knew something that EF Core didn’t know, then I could fix it, as you will see in the next Part.

Part 1b – tweak the query with the DbFunction feature

So, in my first performance tweak, I wanted to see if I keep all the good SQL that EF Core has produced, but inject a bit of my own SQL to improve the performance of the AuthorNamesOrdered part. What I knew was I only wanted the names to concatenate them into a comma-delimited string, and I could do that in SQL very easily.

EF Core 2.0.0, released in August 2017, has a great new feature contribution by Paul Middleton. This allows you to register a SQL user defined function (UDF) with EF Core and then use it in the query. This is a really useful tool, and I expect to be using DbFunctions a lot in the future.

So, with help from Stack Overflow I produced a UDF, as shown below.

CREATE FUNCTION AuthorsStringUdf (@bookId int)
-- Thanks to https://stackoverflow.com/a/194887/1434764
SELECT @Names = COALESCE(@Names + ', ', '') + a.Name
FROM Authors AS a, Books AS b, BookAuthor AS ba 
WHERE ba.BookId = @bookId
      AND ba.AuthorId = a.AuthorId 
      AND ba.BookId = b.BookId
ORDER BY ba.[Order]

I could then replace the extraction of a list of author’s names with a call to my UDF, as shown below.

public static IQueryable<BookListDto>            
    MapBookToDto(this IQueryable<Book> books)    
    return books.Select(p => new BookListDto
        //… other code removed for clarity
        AuthorsOrdered = UdfDefinitions.AuthorsStringUdf(p.BookId),
        //… etc.

The UDF returns a single string, so there is no need for extra database accessed. After this change, there were only two databases accessed, a count for paging and then the real query, no matter how many books in a page.

Note: I have not covered how to register the UDF with EF Core, nor the best way to add the UDF to the EF Core created database. There just isn’t room in this article. There is some information on this in the this GitHub documentation. I do have more information on DbFunctions in  chapter 8 of my book, and how to add UDFs and other SQL code to an EF Core database in chapter 11.

Before I show you the improved performance figures let me first show you the new SQL that EF Core produces. EF Core now produces very readable SQL, almost like how a human would lay it out (I did a couple of minor changes to the layout to make the diagram work, but is mostly as EF Core created it).

The only change in the SQL from the Part 1a version is the call to the UDF, but the effect on performance is great, especially on the 100-book display, is good. See the figures below

Here are the results for using the 100,000 book test data, with the ‘Improvement’ columns showing how much faster the Part 1b implementation is compared to Part 1a’s performance.

Test 10 books Improvement 100 books Improvement
All books 30 ms 150% 80 ms 250%
Sort by publication dates 30 ms 150% 80 ms 250%
Sort by votes 480 ms minor 530 ms minor
Sort by price, filter by 4+ votes 420 ms minor 470 ms minor

Part 2 – converting the query to hand-written SQL

When I studied the SQL from part 1b, I noticed that EF Core repeated any calculation in the SQL ORDER BY section, rather than referring to the SELECT’s alias name of the calculated value. I did some tests and they showed using the SELECT’s alias name, ReviewsAverageVote, which holds the calculated average vote, in the ORDER BY section would improve performance.

Based on this I decided to replace the EF Core query with a fully, hand-code SQL command. That meant I had to write the SQL code, but because EF Core’s SQL is so great (and readable), I just copied it (!), and replaced the one area, the ORDER BY part, with some tweaked SQL as shown below – see the ReviewsAverageVotes value used in the ORDER BY at the end of the SQL

SELECT TOP(@pageSize) [b].[BookId], [b].[Title], 
     [b].[Price], [b].[PublishedOn], 
    WHEN [p.Promotion].[PriceOfferId] IS NULL
    THEN [b].[Price] ELSE [p.Promotion].[NewPrice]
END AS [ActualPrice], 
[p.Promotion].[PromotionalText] AS [PromotionPromotionalText], 
[dbo].AuthorsStringUdf([b].[BookId]) AS [AuthorsOrdered], 
    FROM [Review] AS [r]
    WHERE [b].[BookId] = [r].[BookId]
) AS [ReviewsCount], (
    SELECT AVG(CAST([y].[NumStars] AS float))
    FROM [Review] AS [y]
    WHERE [b].[BookId] = [y].[BookId]
) AS [ReviewsAverageVotes]
FROM [Books] AS [b]
LEFT JOIN [PriceOffers] AS [p.Promotion] 
    ON [b].[BookId] = [p.Promotion].[BookId]
WHERE ([b].[SoftDeleted] = 0) 
ORDER BY [ReviewsAverageVotes] DESC

The next problem was how I would execute the SQL. I could have used ADO.NET but the well-respected mini-OR/M called Dapper. Dapper has a feature that maps the data returned from the SQL to a .NET class by matching the class property names to the column names returned. EF Core also has a feature that can provide the correct type of connection for Dapper to access the database, which makes using Dapper pretty easy.

The unit test code below shows how to use Dapper with an EF Core’s application DbContext – the Dapper part is the line “Query<Book>(“SELECT * FROM Books”)” in the code below.

public void DapperReadBooksEfCoreSqlServer()
    //… I removed the options and database setup 
    //so that it’s easier to see the Dapper part 
    using (var context = new EfCoreContext(options))
        var books = context.Database 
            .Query<Book>("SELECT * FROM Books"); 


The book list query I implemented in chapter 2 was split up into four separate query objects: MapBookToDto, OrderBooksBy, FilterBooksBy, and a generic paging method. That meant each part was easy to understand and could be separately tested. Duplicating this query in Dapper requires me to sacrifice the elegance of query objects and move to a design that combines snippets of SQL to create the query. But that’s what I need to do to squeeze out the best performance.

The results of swapping over to this improved SQL is shown in the table below, with the big change in bold (Sort on Average votes).

Test scenario Part 1b – EF Core Part 2 – SQL Comment
Show books  10 =30 ms
100 = 80 ms
10 =30 ms
100 = 85 ms
No real change – within normal variations
Sort on PublishedOn 10 = 30 ms
100 = 80 ms
10 = 30 ms
100 = 90 ms
No real change – within normal variations
Sort by average votes 10 = 500 ms
100 = 530 ms
10 = 325 ms
100 = 390 ms
The improved SQL is about 40% faster than EF Core
Filter by average votes (4 or more), sort by price 10 = 440 ms
100 = 490 ms
10 = 455 ms
100 = 520 ms
No real change – within normal variations

As you can see, the only part that improved was the “sort by average votes” situation. What I found interesting is that none of the other more mundane queries had improved. I kind of expected them to because Dapper’s current site states its ten times faster or more than EF (most likely EF6.x), which didn’t show up in these results.

I therefore tried a simple “read one book row” test, and yes, in that case Dapper is about seven times faster than EF Core. That makes sense, as EF Core has several things to do on loading, like relational fixup, while Dapper just executes the SQL. But it turns out, when it comes to large or complex queries where the database execution is the limiting factor, then the performance is down to the quality SQL produced, and not the library you use. The fact is, small accesses aren’t normally the ones that need performance tuning, as they are quick anyway, using Dapper or EF Core only has a difference if the SQL is different, which in the ”sort by average votes” case it was.

But the question is, “was all that effort to swap to SQL worth it?” Before you decide that you should read Part 3, where I use another approach that take more work, but improves performance much more than my hand-tuned SQL does.

Part 3 –  Modifying the database to increase performance

The final stage I decided to try modifying the database and adding cached values for the calculations that take a lot of time. These are:

  1. The Price, which varies depending on whether there is a PriceOffer promotion linked to the book.
  2. The average votes and number of reviews for this book
  3. The AuthorsString, containing the comma-delimited list of authors of a book.

Caching is a great tool for improving performance, but ensuring the cached value is up to date needs very careful handing. This is especially true in a database, where you may have a simultaneous update of the same row, which could make the cached value incorrect.

In chapter 13 in my book I cover how I handled the three types of caching, which need different techniques. But for this article I’m going to focus on the most difficult one of the three, that is the caching of the average votes and number of reviews for this book. This is difficult because I still need to keep the actual Review entities so that users can look at the Review’s comments on a book. This means I have two versions of the average votes value: one that is found by averaging the reviews dynamically, as I have done in Part 1a to 2, or the cached value held in the AverageVotes in my modified Book entity class. Keeping these in set is a challenge.

To keep my cached values up to date I need to make some fairly large changes to how things work, as summarised below:

  1. I still need the Reviews property, which sets up the one-to-many relationship between a Book entity and its Review But I must ensure that Review entities cannot be added or removed from the Reviews property, that is, all adds or removes must go through my AddReview and RemoveReview methods.I also I need to “lock down” the Review entity class by making its properties have private setters, and stop any methods outside of the DataLayer from being able to create a Review entity instance.
  2. Because the AverageVotes and ReviewsCount properties are cached values then there is the possibility of a concurrent adding or removing of a Review, which could invalidate the cached values.

I will deal with each of these two stages separately

Lock down access to Reviews collection property

EF Core allows navigational properties to be set up which EF Core will use them when doing certain read and writes to the database. My initial Book entity class had a property called Reviews that contained a list of Review entities for this book instance. I would then add or remove a Review instance from the list to change what reviews where linked to the Book.

For my caching of the values related to the Reviews to be kept up to date I need intercept all changes to the Book’s Reviews. To do that I use a feature in EF Core called backing fields. This allow me to ‘hide’ the Reviews list, and provide a IEnumerable<Review> Reviews property, which a developer cannot add or remove from. Here is what the updated Book entity class looks like

public class Book
    private readonly List<Review> _reviews 
        = new List<Review>(); 

    //… other properties removed for clarity

    public IEnumerable<Review> Reviews => 

Now I can add two methods, AddReview and RemoveReview, to the Book class that update its backing field _reviews and at the same time recalculate the AverageVotes and ReviewCount cache properties.

The other back door to changing the Reviews is if the develop loaded a Review entity and altered its values, say changing the reviews vote from 1 to 5. To stop the developer from do that I set all the Review class property setters to private, and I make the constructor internal, so only my DataLayer can create a new instance of the Review class.

The figure below shows the backing field in the Book class and its AddReview method in blue, with the “locked down” Review class in purple.

Handling simultaneous additions of a Reviews to the same book

The other problem is multiple users adding Reviews at the same time. There is a small window of time between the loading of the current Reviews linked to a Book and the saving of the new review collection and the associated cached values. In that time window, another Review could be added/removed from the same book by another user, which would make the cached values out of step with the database.

The solution to this problem is to use EF Core’s concurrency conflict feature. What I do is add the attribute [ConcurrencyCheck] to the AverageVotes and ReviewsCount properties. This means that EF Core will check they haven’t changed since the Book instance was loaded, which is window where other updates could make my calculations invalid. If they have been changed then I can write some code that recalculates the cached values using the current state of the database.  I’m not going to show the code for that because the article is already very long. You can find my concurrency code here.

The performance results from Part 3 – cached values

The results from all this is impressive, as you can see in the table below

Test 10 books Improvement over Part 2 (SQL) 100 books Improvement over Part 2 (SQL)
All books 30 ms no improvement 80 ms no improvement
Sort by publication dates 30 ms no improvement 80 ms no improvement
Sort by votes 30 ms 12 times better 80 ms 5 times better
Sort by price, filter by 4+ votes 30 ms 12 times better 80 ms 5 times better

Extra notes on performance:

  • I tried ½ million books: the 10-book time was 85 ms and the 100-book time was 150 ms. For this amount of books the database server is starting to be a limiting factor.
  • In Parts 1a, 1b and 2 if the sort by price, filtered by +1 votes, then the timings were all over a second in length. But this Part 3 implementation stills comes in at 30ms.
  • I took the SQL produced by EF Core in Part 3 and used Dapper to execute it, but there was no further performance improvement so it wasn’t worth doing.

Obviously, the standout improvement are the last two items, which were the ones I was targeting. You can see this code in action, with 100,000 books, at http://efcoreinaction.com/

Summary of the whole performance tuning work

Below is a repeat of the chart I had at the beginning that shows you the results for Part 1a to 3 for the 100-book display and 100-book display with sort on average votes.

However, what this graph does convey is that the Part 3 performance improvements make all the 10-book displays, or any sort or filter, come in at around 30 ms. As the 10-book display is the default then the site is now really fast. Also, the 30 ms is the time for the ASP.NET page to display, which includes other overheads outside the database access – in actual fact, the database accesses are between 10 ms to 15 ms.

The other questions I asked myself was how much effort did each stage take and where there any problem or adverse effects to the application? The table below gives my answers to these questions.

Part Effort Comments
1a. Straight EF Core Few days Finding the correct format for the LINQ average wasn’t easy. I had to ask the EF Core team via EF Core’s GitHub issue page. But once I got this right EF Core produced great SQL code.
1b. +DbFunction ½ day Very easy. I like the DbFunction a lot, and in this case, it made a significant improvement to the performance.
2. Better SQL One day Fairly easy, as I simply copied the SQL code that EF Core produces and tweaked the bit that my experiments said would help. But only useful if you can come up with some better SQL, i.e. the difference in speed of Dapper over EF Core only helps on small, very quick database accesses, which you don’t need to performance tune anyway!
3. Cached values Three days Definitely hard work, but also a fantastic result. I think the performance changes to the code don’t hinder future development of the application at all, and in some cases improves it.

Overall I am very pleased with the process. EF Core produces great SQL code from the get go, but only because I made sure my LINQ queries in a way such that EF Core can translate every part into SQL. I had to persevere with the LINQ Average method to get EF Core to translate that to SQL.

Note I think plenty of developer will fall foul of this problem and think EF Core is slow – but before you blame EF Core check your logs for QueryClientEvaluationWarning warnings, which tell you part of your LINQ query didn’t translate to SQL and is being run in software. See the section “Understanding the limitations of Client vs. Server evaluation” in my  article about EF Core’s Client vs. Server evaluation feature.

But when I had to go beyond straight EF Core I found there was plenty I could do. And I didn’t go as far as I wanted to due to time and limits on the chapter length – my next step was to change the application’s architecture (I give a few ideas at the end of chapter 13).


I took a query that was designed around a fictitious, but believable example, which wasn’t a walk over to implement. I got the best out of EF Core, and I think that performance was quite credible. I then pushed through two more stages, ending with a complex, but eminently implementable solution where the slow parts of the query were between eight and fifteen times faster from where I started. But more importantly all the queries for a 10-book display are at 30 ms.

You most likely selected EF Core for your database access library because it allowed you to develop your application rapidly (good choice!). That means you should have a working application quickly, and can then you can performance tune the parts that need it.

When you start performance tuning I would recommend persevering with conventional EF Core (see my note about QueryClientEvaluationWarning). But at some point you might decide that the standard EF Core approach won’t produce the performance you need. That’s when this article could be useful to you, so bookmark this URL for that day (or buy my book 🙂 ).

If you are interested in .NET in general I recommend the ASP.NET weekly newsletter, curated by Jerrie Pelser. This contains a list of articles on ASP.NET, Entity Framework and .NET in general (mostly .NET Core versions)  – sign up at https://www.getrevue.co/profile/aspnetweekly

Talking at Nottingham, UK .NET Meetup – July 2017

Last Updated: July 17, 2017 | Created: July 16, 2017

For anyone in the UK, I am talking at a .NET Meetup group in Nottingham UK on Monday, July 31st 2017. I was contacted by Marvin Rounce who co-ordinates the group as a result of the book in I writing for Manning Publishing,  “Entity Framework Core in Action“.

The meeting is open to anyone (but there may be a limit on numbers) and is free – you just need to sign up to the Meetup via this link.

Note: Entity Framework Core is an open-source library, built by a Microsoft and other contributors, that provides a O/RM (Object-Relational Mapper) that allows you to access a database as in an Object-Oriented way. It runs on a .NET platform, including the new .NET Core, which can run on multiple platforms such as Windows, Linux and Apple.

The topics I will be covering are:

1. Building Better Entity Framework Applications

This talk cover some of the topics that I cover in my article “Six ways to build better Entity Framework (Core and EF6) applications“, but going into more detail on the overall design, and a number of useful packages I have found for improving the speed of development.

The aim is to help people who have not used Entity Framework (EF) before, or people who what to explore different ways to build applications with EF quickly and robustly. I also will talk about how to structure your code so that it is easier to performance tune.

2. DevOps and Entity Framework – migrating your production database

This talk looks at what feature EF provides for changing the structure of a database, that is, altering what tables, columns, foreign keys, constraints etc. that make up the database schema.

EF provides a number of approaches to changing a database schema, the best known is EF’s migration feature, but that approach does have some limitations.  I will describe some other options for handling database migrations, including a method I have used successfully on some a few of my projects.

I will also look at the issues around updating a production database, and what you need to think about so that you don’t lose any of your live data!

Entity Framework Core: Client vs. Server evaluation

Last Updated: May 17, 2017 | Created: May 17, 2017

Entity Framework Core (EF Core) is a new, ground-up rewrite of the highly successful Entity Framework 6 (EF6.x) database access library for the .NET platform.  I have been commissioned by Manning Publishing to write a book on EF Core called Entity Framework Core in Action. In writing this book I have looked at any and all aspects of EF Core, and I am really impressed with the topic I am going to cover in this article, that is, Client vs. Server evaluation.

Client vs. Server evaluation is a feature that makes it easier to write queries to the database, because it allows you to include methods in your query that cannot be run on the database, that is, methods that EF Core cannot convert to SQL commands.

In my Entity Framework Core in Action book I am writing I use a database containing various technical books as an example (seemed appropriate) – you can see a live site at http://efcoreinaction.com/. The diagram below shows you just the the Book table with a many-to-many link to each Book’s Author(s).

Now I want to list a book, with its primary key, title and the Author(s) as a comma delimited string. The code below will do this, in this case I take the first book in the database:

var book = context.Books
    .Select(p => new
        AuthorsString = string.Join(", ",
            .OrderBy(q => q.Order)
            .Select(q => q.Author.Name)),

A typical output would be

BookId Title AuthorsString
1 Beginning Scala Vishal Layka, David Pollak

What happens is that EF Core converts the query into SQL to run on the database, but one method, string.Join, isn’t supported as a SQL command by EF Core’s SQL database provider, and at this point Client vs. Server evaluation comes in.

What happens is that EF Core can sort the commands into two groups:

  1. Commands that can be converted into SQL commands and run on the database server
  2. Commands that cannot be converted into SQL, so therefore must be run in software in the application, known as the client.

The figure below gives you a better view of this process. This shows the effect of running the EF Core query I just listed on a book that had two authors, Jack and Jill. As you can see Authors property will be set to “Jack, Jill”, while the BookId and Title would be set to the value of the corresponding columns in the Books Table.

Stage 1, the translation of the LINQ query, is where (with the help of the database provider) EF Core splits up the LINQ commands and other methods into either “run on database” (Server) or “run in software” (Client).

Stage 2 is where the SQL commands are run on the database server and the values are read back. The SQL that EF Core creates is given below.

SELECT TOP(1) [p].[BookId], [p].[Title]
FROM [Books] AS [p]

SELECT [q.Author].[Name]
FROM [BookAuthor] AS [q]
INNER JOIN [Authors] AS [q.Author] ON 
    [q].[AuthorId] = [q.Author].[AuthorId]
WHERE @_outer_BookId = [q].[BookId]
ORDER BY [q].[Order]

Note that EF Core always reads collections using a separate SQL command (EF6.x would have output one commands to read it all, but that can be inefficient in cases where the collection is large).

Stage 3 is run because EF Core detected commands that had to be run on the Client. It gets the data it needs from the SQL data and runs the software-only methods on the Client.

In stage 4 you see that EF Core combines the Server data and the Client data and provides the composite data back to the caller.

The Client vs. Server evaluation feature gives you as a developer the ability to create complex queries, and EF Core will optimize the query to run as much it can on the database Server. But if there is some method in your query that cannot be run on the database server then the query won’t fail, but EF Core will apply that method after SQL Server has done its part.

The example I showed is simple, but the possibilities are endless. But there are a few things to watch out for.

What about Entity Framework 6.x (EF6.x)?

The Client vs. Server evaluation feature is new in EF Core, and is not available in EF6.x. This means that I have to add extra stages in my query if I want to use methods such as string.Join. This makes it harder work in those cases. Obviously in EF Core this gets easier.

If you are still on EF6.x then I can recommend Alexander Zaytsev’s DelegateDecompiler (DD for short) which provides a similar feature (see this article, where I use the DD). While I have used DD, it does require me to add a command at the end of every query. Also DD is not (currently) available on .NET Core.

Understanding the limitations of Client vs. Server evaluation

I think the Client vs. Server evaluation feature is a very useful addition to EF. But, like all powerful features, it is best to understand what is going on so that you can use it in the right way.

Firstly, the obvious thing is the method you provide is run on every entity (row) you read from the database. If you have 10,000 rows in the database and don’t filter/limit what is loaded then, as well as a SQL command that takes a long time, your processor will spend a long time running your method 10,000 times.

The second point is subtler: The Client vs. Server evaluation feature blurs the lines between what is run in the database and what is run in the client. It would be possible to create a query that worked, but was slower than it could be because it had to use client-side evaluation. To give you some context, in EF6.x this form of mixed client/server query would have failed because it did not support that. That meant in EF6.x you had to do something about it – often by changing the query to better suit the database. Now your query may work, but could perform worse than one you write such that EF Core can convert it directly to SQL commands.

One extreme example of the problem is that Client vs. Server evaluation allows you to sort on a client-side evaluated property, which means the sorting is done in the client rather than in the database server. I tried this by replacing the .First() command with .Sort(p => p. AuthorsString) in listing 2.9 and returning a list of books. In that case EF Core produces SQL code that reads all the books, then read each row individually, twice, which is definitely not optimal.

Tips on performance tuning Client vs. Server evaluation

EF Core’s Client vs. Server evaluation feature has some very useful features you can use to make sure you aren’t producing poor-performing.

Firstly, EF Core will log a warning the first time a Client vs. Server evaluation will adversely affect performance, and this logging is intelligent. In the case of my book query that used string.Join then no warning is given because the string.Join does not affect the performance. However if I added the .Sort(p => p. AuthorsString) command then I get a warning.

Secondly, if you want to stop any poorly performing Client vs. Server evaluation queries from happening then you can configure EF Core to throw an exception if a poorly performing query is found.  See code below:

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    optionsBuilder .UseSqlServer(

Note: The Microsoft documentation for Client vs. Server evaluation can be found at https://docs.microsoft.com/en-us/ef/core/querying/client-eval


EF Core has a load of nice features and, at the moment, misses some EF6.x features that that I rather liked. But I think Client vs. Server evaluation is one of those great features that will help me to build queries quickly, and often at no reduction of performance. For that I am very grateful.

Happy coding

Six ways to build better Entity Framework (Core and EF6) applications

Last Updated: February 25, 2018 | Created: March 8, 2017

In this article I describe six different approaches to building applications that use Microsoft’s database access framework, Entity Framework (EF). All six approaches are based on software principles and patterns that many of you will be familiar with. This article shows how I applied these principles and patterns in real-world applications that use EF. The six principles and patterns are:

  1. Separation of concerns – building the right architecture.
  2. The Service Layer – separating data actions from presentation action.
  3. Repositories – picking the right sort of database access pattern.
  4. Dependency injection – turning your database code into services.
  5. Building business logic – using Domain-Driven Design with EF.
  6. Performance tuning EF – get it working, but be ready to make it faster if you need to.

I have an interesting perspective when it comes to software principles and patterns. I started out as a programmer, but then moved into technical management 1988. I came back to programming in 2009, a 21-year gap, and the whole software world had changed. Thankfully over those years some seriously clever people had been thinking about the science of programming, and I avidly read their articles and books.

While it was great to learn things from these software giants, I found it took some time to find the right ways to apply their ideas in the applications I developed. This article brings together the combination of ideas from some of the great software thinkers with my years of learning as I have applied and refined their approaches in building applications that used EF for database access.

All the software and figures in this article come from a book I am writing for Manning Publication called Entity Framework Core in Action that is now on early-access release i.e. if you buy now you will get the chapters as I write them, and the full book when it’s finished. The code is therefore based on the new .NET Core frameworks: EF Core and ASP.NET Core. However, these software principles and patterns apply equally well to the older Entity Framework, version 6 (EF6.x) framework and ASP.NET MVC5.

Note: I’m going to assume you know about Entity Framework already. If you don’t then I recommend you read the first chapter of my book, which is free, or look at Microsoft’s EF Core documentation which includes a number of example applications.

Before I start I need to say that the descriptions I show are my way of implementing the principles and patterns I talk about. There are plenty of other ways to implement each of six topics, mine are just one of them. Have fun yourself developing your own techniques and libraries that will improve and accelerate your development.

1. Principle: Separation of concerns – building on the right architecture

There is a software principal called Separation of Concerns (SoC), which says that you should:

  • Put code that has a similar, or strongly-related functions should be grouped together – in .NET terms put in separate projects. This called cohesion.
  • Make each group/project as self-contained as possible. Each piece of code should have a clear interface and scope of work that is unlikely to change to because of other callers changing what they do. This is called low coupling.

NOTE: Here is a more in-depth article on Separation of Concerns if you want to look into SoC some more.

For simplicity most examples of EF code on the web tend to show EF Core database commands being called directly from whatever application type they are using. This doesn’t follow SoC and nor does it really represent how real applications are written. I took the decision in my book to use a more representative software architecture for my example code, and I used a layered architecture. Using this does make it a little more difficult for the reader, but I build up the application structure over a number of chapter. Here is the final architecture of my example book selling site.

I could have used a number of different architectures, see this link for a good list, but the layered approach works well for small to medium applications. A layered architecture is also a good fit for cloud hosting, where cloud providers can spin up more instances of the web application if it is under a heavy load, that is, it will run multiple copies of a web application and place a load balancer to spread the load over all the copies. This known as scale out on Microsoft’s Azure and auto scaling on Amazon’s AWS.

The figure below shows how I apply SoC to my database access code. It shows the same software architecture, but with all my EF database access code highlighted in bubbles. The size of the bubbles relates to the amount of EF code you will find in each layer. Notice that the ASP.NET Core project and the pure business logic (BizLogic) project have no EF Core query/update code in them at all.

As I go through this article I will refer back to SoC, as it is an integral part of how I approach database applications.

2. Pattern: The Service Layer – separating data actions from presentation actions

One of the influential books I read when I came back to programming was Dino Esposito and Andrea Saltarello’s book Microsoft .NET: Architecting Applications for the Enterprise, published in 2009. Some of the technology he covered then is now superseded, but the patterns are still relevant today (Dino has written a number of newer books, see this link). This book introduced me to the use of Service Layer, which Martin Fowler previously described in his classic book in 2002.

Dino Esposito says in his book that the Service Layer “sets a boundary between two interfacing layers” and Martin Fowler’s site link says the Service Layer “Defines an application’s boundary with a layer of services that establishes a set of available operations and coordinates the application’s response in each operation”. That all sounds great, but how does that help my applications? Quite a bit actually.

I’m going to describe how the Service Layer acts as an adapter in this section. Later, in the section of business logic, I will cover the second way the Service Layer can serve me, by being in command of running my business code.

The Service Layer as an adapter

In a layered architecture there is often a data mismatch between the database/business logic and the presentation layer. The Domain-Driven Design (DDD) approach, which I describe later, says that the database and the business logic should be focused on the business rules, while the Presentation Layer is about giving the user a great user experience, or in the case of a web service, providing a standard and simple API.

For this reason, the Service Layer becomes a crucial layer, as it can be the layer that understands both sides and can transform the data between the two worlds. This keeps the business logic and the database uncluttered by the presentation needs, like drop down list and json AJAX calls. Similarly, by having the Service Layer deliver pre-formatted data in exactly the form the presentation layer needs then it makes it much simpler for the presentation layer to show that data.

When dealing with database accesses via EF the Service Layers uses an adapter pattern to transform from the data layer/business logic layers to/from the presentation layer. Databases tend to minimise duplication of data and maximises the relational links between data, while the presentation layer is about showing the data in a form that the user finds useful.

The figure below shows an example of this difference in approach. The image comes from the list of books produced by the example book selling site that I create in my book. You can see how I have to pick data from lots of different tables in the database, and do some calculations, to form a summary of a book in my book list display.

Note: You can see this book list in action on the live site that hosts the example book selling site, at http://efcoreinaction.com/

EF provides a way of building queries, called select loading, that can ‘pick out’ the relevant columns from each table and combine them into a DTO/ViewModel class that exactly fits the user view. I apply this transform in the Service Layer, alone with other sorting, filtering and paging features. The listing below is the select query using EF Core to build the book summary you just saw in the figure above.

public static IQueryable<BookListDto> 
    MapBookToDto(this IQueryable<Book> books)   
    return books.Select(p => new BookListDto
        BookId = p.BookId,                      
        Title = p.Title,                        
        Price = p.Price,                        
        PublishedOn = p.PublishedOn,            
        ActualPrice = p.Promotion == null       
                ? p.Price : p.Promotion.NewPrice,         
        PromotionPromotionalText =              
                p.Promotion == null             
                  ? null : p.Promotion.PromotionalText,
        AuthorsOrdered = string.Join(", ",      
                .OrderBy(q => q.Order)          
                .Select(q => q.Author.Name)),   
        ReviewsCount = p.Reviews.Count,         
        ReviewsAverageVotes =                   
                p.Reviews.Count == 0            
                ? null                          
                : (decimal?)p.Reviews           
                    .Select(q => q.NumStars).Average()

Note: The code above wouldn’t work with EF6.x because it includes the command string.Join that cannot be converted into SQL but EF6.x. EF Core has a called Client vs. Server Evaluation, which allows methods that cannot be translated to SQL to be included. They are run after the data has been returned from the database.

Yes, this code is complex, but to build the summary we need to pull data from lots of different places and do some calculations at the same time, so that’s what you get. I have built a library called GenericServices (currently only available for EF6.x) with automates the building of EF select loading commands like this by using a LINQ Mapper called AutoMapper. This significantly improves the speed of development of these complex queries.

3. Pattern: Picking the right sort of database access pattern

There are a number of different ways we can form your EF database access inside an application, with different levels of hiding the EF access code from the rest of the application. In the figure below I show four different data access patterns.

UPDATE: If you are interested in this topic I have written a new article called “Is the repository pattern useful with Entity Framework Core?” where cover this in more detail.

The four types of database access patterns are:

  1. Repository + Unit of Work (Repo+UOW). This hides all the EF Core behind code that provides a different interface to EF. The idea being you could replace EF with another database access framework with no change to the methods that call the Repo+UOW.
  2. EF repository. This is a repository patterns that doesn’t try and hide the EF code like the Repo+UOW pattern does. EF repositories assume that you as developer know the rules of EF, such as using tracked entities and calling SaveChanges for updates, and you will abide by them.
  3. Query Object. Query objects encapsulate the code for a database query, that is a database read. They hold the whole code for a query or for complex queries it might hold part of a query. Query objects are normally built as extension methods with IQueryable<T> inputs and outputs so that they can be chained together to build more complex queries.
  4. Direct calls to EF. This represents the case where you simply place the EF code you need in the method that needs it. For instance, all the EF code to build a list of books would be in the ASP.NET action method that shows that list.

NOTE: AS I said earlier I have created a library called GenericServices for EF6.x (and EF Core in the future). This is a form of EF repository.

I used the Repo+UOW pattern, which was the recommended approach at the time, in a big project in 2014 – and I found it was really hard work. I and many others realised Repo+UOW wasn’t the way to go – see my article ‘Is the Repository pattern useful with Entity Framework Core?’. The Repo+UOW can be a valid pattern in some cases where hiding of the certain part of the data is needed, but I think there are better ways to do this with some of the new EF Core features, such as backing fields.

At the other end of the spectrum is the direct calls to EF in the method that needs it. This fails the separation of concerns principal because the database code is mixed in with other code not directly involved in database issues.

So, having ruled out the two extremes I would recommend:

  • Query Objects for building queries, often breaking down large queries into a series of query objects. The previous listing in this article of the method called MapBookToDto, which is a query object. I cover query objects in chapter 2 of my book.
  • For Create, Update and Delete (and business logic which I cover later) I use DDD-style access methods, that is, I create a method in the entity class to update the property or relationship. This isolates the EF code and makes it easier to refactor or performance tune that code.

The listing below shows a class with two EF repository methods for changing the publication date of a book in my example book selling site. I cover this in chapter 3 of my book.

public class ChangePubDateService : IChangePubDateService
    private readonly EfCoreContext _context;

    public ChangePubDateService(EfCoreContext context)
        _context = context;

    public ChangePubDateDto GetOriginal(int id)    
        return _context.Books
            .Select(p => new ChangePubDateDto      
                BookId = p.BookId,                 
                Title = p.Title,                   
                PublishedOn = p.PublishedOn        
            .Single(k => k.BookId == id);          

    public Book UpdateBook(ChangePubDateDto dto)   
        var book = _context.Books.Find(dto.BookId);
        book.PublishedOn = dto.PublishedOn;        
        return book;                               

4. Pattern: Turning your database code into services

I have used dependency injection (DI) for years and I think it’s really useful approach. I want to show you a way you can inject your database access code into an ASP.NET Core application.

Note: If you haven’t used DI before have a look at this article for an introduction, or this longer article from another of the great thinker, Martin Fowler.

The benefits of doing this are twofold. Firstly, DI will dynamically link together your database access to into the parts of the presentation/web API code that need it. Secondly, because I am using interfaces, it is very easy to replace the calls to the database access code with mocks for unit testing.

I haven’t got the space in this article to give you all the details (I takes five pages in chapter 5 of my book to cover this), but here are the main steps, with links to online documentation if you want to follow it up. Here are the steps:

  1. You need to make each of your database access code into thin repositories. That is a class containing a method, or methods, that the front-end code needs to call. See the ChangePubDateService class listed above.
  2. You need to add an interface to each EF repository class. You can see the IChangePubDateService interface applied to the ChangePubDateService class listed above.
  3. You need to register your EF repository class against its interface in the DI provider. This will depend on your application. For ASP.NET Core see this article.
  4. Then you need to inject it into the front-end method that needs it. In ASP.NET Core you can inject into an action method using the [FromServices] Note: I use this DI parameter injection rather than the more normal constructor injection because it means I only create the class when I really need it, i.e. it is more efficient this way.

Note: I realise that is a lot to take in. If you need more information can look at the GitHub repo associated with my book. Here are some useful links:

At the end of this you have a method you can call to access the database. The listing below shows an ASP.NET Core action method that calls the UpdateBook method of the ChangePubDateService class that I listed previously. Line 4 has the [FromServices] attribute that tells the DI provider to inject the ChangePubDateService class into the parameter called service.

public IActionResult ChangePubDate(ChangePubDateDto dto,
   [FromServices]IChangePubDateService service)
      return View("BookUpdated",
         "Successfully changed publication date");

NOTE: There is a way to do parameter injecting into an ASP.NET MVC action method, but it involves you having to override the default Binder. See The section “How DI is used in SampleMvcWebApp” at the bottom of this page, and my DiModelBinder in the associated GitHub repo.

The result of all this is that database code is nicely separated into its own class/method and your front-end code just has to call the method, not knowing what it contains. And unit testing is easy, as you can check the database access code on its own, and replace the same code in your front-end call with a mocking class that implements the interface.

5. Pattern: Building business logic – using Domain-Driven Design

Real-world applications are built to supply some sort of service, ranging from holding a simple list of things on your computer through to managing a nuclear reactor. Every different real-world problem has a set of rules, often referred to as business rules, or by the more generic name, domain rules.

Another book that I read some time ago that had a big impact on me was “Domain-Driven Design” by Eric Evans. The Domain-Driven Design (DDD) approach says that the business problem you are trying to solve must drive the whole of the development. Eric then goes on to explain how the business logic should be isolated from everything else other that the database classes so that you can give all your attention to what Eric Evans calls the “difficult task” of writing business logic.

There are lots of debates about whether EF Core is suitable for a DDD approach, because the business logic code is normally separate from the EF entity classes which it maps to the database. However, Eric Evans is pragmatic on this point and says in the section entitled “Working within your (database access) Frameworks” that, and I quote:

“In general, don’t fight your framework. Seek ways to keep the fundamentals of domain-driven design and let go of the specifics when your framework is antagonistic”
Page 157, Domain-Driven Design, by Eric Evans, 2004.

Note: I had to look up the word antagonistic: it means “showing or feeling active opposition or hostility towards someone or something”.

Over the years I have developed a DDD approach that works with EF and I have dedicated the whole of chapter 4 of my book to the topic of business logic because it is so important to get right. Here is a summary of the guidelines in that chapter:

  1. The business logic has first call on how the database structure is defined

Because the problem I am trying to solve, called the “Domain Model” by Eric Evans, is the heart of the problem then it should define the way the whole application is designed. Therefore, I try to make the database structure, and the entity classes, match my business logic data needs as much as I can.

  1. The business logic should have no distractions

Writing the business logic is difficult enough in itself, so I isolate it from all the other application layers, other than the entity classes. That means when I write the business logic I only have to think about the business problem I am trying to fix. I leave the task of adapting the data for presentation to the Service Layer in my application.

  1. Business logic should think it is working on in-memory data

This is something Eric Evans taught me – write your business logic as if the data was in-memory. Of course there needs to be some a ‘load’ and ‘save’ parts, but for the core of my business logic I treat, as much as is practical, the data as if it is a normal, in-memory class or collection.

  1. Isolate the database access code into a separate project

This fairly new rule came out of writing an e-commerce application with some complex pricing and deliver rules. Before this I used EF directly in my business logic, but I found that it was hard to maintain, and difficult to performance tune. Now I have another project, which is a companion to the business logic, and holds all the database access code.

  1. The business logic should not call EF Core’s SaveChanges directly

The business logic does not call EF Core’s SaveChanges method directly. I have a class in the Service Layer whose job it is to run the business logic – this is a case of the Service Layer implementing the command pattern. and, if there are no errors, it calls SaveChanges. The main reason is to have control of whether to write the data out, but there are other benefits that I describe in the book.

The figure below shows the original software architecture, but with the focus on how the business logic is handled. The five numbers, with comments, match the numbered guidelines above.

In my book I use the processing of an order for books as an example of a piece of business logic. You can see this business logic in action by going to the companion live site, http://efcoreinaction.com/, where you can ‘buy’ a book. The site uses an HTTP cookie to hold your basket and your identity (saves you having to log in). No money needed – as the terms and conditions says, you aren’t actually going to buy a book.

The code is too long to add to this article, but I have written another article called Architecture of Business Layer working with Entity Framework (Core and v6) which covers the same area in more detail and contains plenty of code examples.

6. Principle: Get your EF code working, but be ready make it faster if you need to.

The recommended approach to developing software is to get it to work, and then worry about making it faster. Another more nuanced approach, attributed to Kent Beck, is Make it Work. Make it Right. Make it Fast. Either way, these principle says we should leave performance tuning to the end. I would add a second part: you should only performance tune if you need to.

In this article I am talking about database accesses via EF. I can develop pretty complex database accesses in EF really quickly – at least five times faster than using ADO.NET or Dapper. That covers the “get it working part”. The down side is that EF doesn’t always produce the best performing SQL commands: sometimes it’s because EF didn’t come up with a good SQL translation, and sometimes it’s because the LINQ code I wrote isn’t as efficient as I thought it was. The question is: does it matter?

For example, I developed a small e-commerce site (the code took me 10 months) which had a little over a 100 difference database accesses and about 20 tables. More than 60% of the database accessed were on the admin side, with maybe 10% of accesses that really mattered to the paying user.

To show this graphically I have picked out three features from my example book selling site and then graded them by two scales:

  • vertically, what the user expects in terms of performance.
  • Horizontally, how difficult is the database access.

This gives you the figure below, with top right highlighted as area where we really need to think about performance tuning.

My analysis says that only the book search query needs work to improve it. The user is used to fast searches thanks to Google etc. and will get frustrated if my application is too slow. Looking at the complexity of the book search, which includes filtering on things like average user votes, I can see that it produces some rather complex database access commands.

It turns out that the EF Core code for my book search performs badly, but there is plenty I can do about it. In fact, I have mapped out a whole section towards the end of my book where I show how I can improve the book search in a series stages, each getting more complex and taking more development time. They are:

  1. Can I improve the basic EF commands by rearranging or refining my EF code?
  2. Can I convert some or all of the EF code into direct SQL commands, calculated columns, store procedures etc.?
  3. Can I change the database structure, such as de-normalising the database, to improve the search performance?

Improving the book search will take quite a bit of time, but it’s worth it in this application. Quite honestly the other features aren’t worth the effort, as they are fast enough using standard EF Core commands.

Planning for possible performance tuning

While I totally agree with the idea that you shouldn’t try to performance tune too early, it is sensible to plan that you might have to performance tune. All the other approaches I have described, especially the encapsulation of the databases accesses, means that my database code is a) clearly isolated, and b) open for performance tuning.

So, my practice is to develop database accesses quickly with EF, but organise the code so that I encapsulate the database code cleanly. Then, if I have a performance problem then I can tune the database access code with minimal impact on other parts of your application.


When I came back to programming in 2009 some of my early code ended up with long methods which were intertwined with each other – they worked, but the code was hard to understand, debug and maintain. Since then I have taken the principles and practices from the software giants and worked out how to apply them real applications. With these I can write code that is robust, easy to test and easy to improve – and I write code much faster too.

However, for me, there is a gap between seeing a great idea and working out how it could help me. Some books I instantly loved, like Eric Evans “Domain-Driven Design” but it has still taken two or three projects before I had an implementation that works really well for me. I don’t think I’m unusual in this, as learning any new technique takes some time for it to become smooth and natural.

There are many great software thinkers out there and some great principles and practices. So the next time you think “there must be a better way of doing this” then go and have a look for other people’s thoughts. It’s a journey, but the reward is becoming a better developer.

Happy coding!

More information on my book, Entity Framework Core in Action

Last Updated: March 11, 2017 | Created: March 6, 2017

Quite a bit is going on around the book I am writing, called Entity Framework Core in Action, now that it is on early release. In this short article I want to point you to some resource you might like to look at, and also give you a bit more on the planned content of the book.

Resources available for the book

The following resources are available:

  • A slideshare presentation giving you an overview of what’s in the book.
    IMPORTANT: discount code in the slideshare presentation! See first slide.
  • The book page, with links to the first chapter, which is free.
  • The GitHub repo with all the code that goes with the book.
    NOTE: I use a branch-per-chapter, so you need to select a branch to see the code.
  • The live site, at http://efcoreinaction.com/ where you can see the example book selling site that I build in chapters 1 to 5, and improve in later chapters.

Note: You can buy an early-access version of my book. You will get the chapters as I write them, and the full book when the book is released. This means you have early access to the information and you get the final book with any updates I make to the chapters before it is releases.

Table of Content for the book

Here is the current table of contents for my book. It is quite possible that a couple of chapter in part 2 and 3 might get too big and be split into two.

Part 1:  Introduction to Entity Framework Core

By the end of part 1 you should be able to build a .NET application that uses a relational database, but how the database is organized is left to EF Core; for instance, EF Core’s default configuration will set the type and size of the database columns, which can be a bit wasteful on space. I also look at building and deploying a web application that uses EF Core to handle its database.

The chapters are:

  1. Introduction to Entity Framework Core
  2. Querying the database
  3. Updating the database: Create, Update & Delete
  4. Using EF Core in Business Logic
  5. Using EF Core in ASP.NET Core web applications

Part 2:  Entity Framework Core In Depth

Part 2 is more of a EF Core reference section. By the end of part 2 you will be able to create a database with the exact type and size of each column, or link to an existing database which has a specific schema. I go deeper into the commands in EF Core’s DbContext and look at different ways of migrating a database.

The chapters are:

  1. Modelling options for scalar properties
  2. Modelling options for relationships
  3. Going deeper into the DbContext
  4. Handling database migrations

Part 3: Using Entity Framework in real-world applications

Part 3 is all about improving your skills and making you a better developer, and debugger, of EF Core applications. I look at real-world applications of EF Core starting with a range of known patterns and practices that you can use with EF Core with chapters on Unit Testing EF Core applications, extending ER Core, and most importantly how to find and fix EF Core performance issues.

The chapters are:

  1. Patterns for using EF Core
  2. Unit Testing EF Core applications
  3. EF Core performance Tuning
  4. Extending EF Core


A brief introduction to LINQ


A pretty short and sweet post, but I hope it helpful. My next post will be much bigger.

Happy coding!


Updating many to many relationships in Entity Framework Core

Last Updated: November 18, 2017 | Created: March 1, 2017

I wrote an article called Updating many to many relationships in entity framework back on 2014 which is still proving to be popular in 2017. To celebrate the release of my book Entity Framework Core in Action I am producing an updated version of that article, but for Entity Framework Core (EF Core).

All the information and the code comes from Chapter 2 of my book. In the book I use a book selling site, with books linked to authors via a many-to-many relationship. You can see the live example site and if you scroll down you will see books with multiple authors.

All the unit tests and classes I use in this article can be found in the Chapter03 branch of the Git repo associated with this book.

NOTE: If you are in a hurry I have added summaries for each section – just search for the word Summary to get the answers without needing to look at the code examples.

Creating a many-to-many relationship in EF Core

The example I am going to use is one taken from the book I am writing. It has a Book entity linked its Author(s) entities via a BookAuthor table. The database diagram below shows the three tables, with the foreign key links from the BookAuthor linking table to the other two tables.

A linking table like this has foreign keys (FK in the diagram) which link to the primary key(s) (PK in the disgarm) of each end of the relationship – in this case the BookId of the Book entity and the AuthorId of the Author entity. These foreign keys then form a composite primary key for the BookAuthor table.

EF6.x created this table for you when you defined a many-to-many relationship, but EF Core, which takes a leaner approach, doesn’t – you need to do it. Its not hard, so let me give you the code to define the BookAuthor entity class.

public class BookAuthor               
    public int BookId { get; set; }   
    public int AuthorId { get; set; } 
    public byte Order { get; set; }   


    public Book Book { get; set; }    
    public Author Author { get; set; }

NOTE: I have a property called Order because the order of the authors matter. If the order of items in your many-to-many list don’t matter then you can leave that out.

EF Core will find the relationships using its by convention rules because I have used names that it understands:

  • I used the same names, BookId and AuthorId, of the primary keys in the Book class and the Author class.
  • Because I used the classes Book and Author, which EF Core knows are part of the database, then it knows its a relationship.

Now, the one thing that EF Core can’t work out is what the primary key of the BookAuthor table because they don’t follow the normal convension. There are a number of ways to define the primary key(s) of an entity, but I have used EF Core’s Fluent API that I access via the OnModelCreating method inside my DbContext, as shown below:

public class EfCoreContext : DbContext
    public DbSet<Book> Books { get; set; }            
    public DbSet<Author> Authors { get; set; } 
    public DbSet<PriceOffer> PriceOffers { get; set; }       

    public EfCoreContext(                             
        DbContextOptions<EfCoreContext> options)      
        : base(options) {}

    protected override void
        OnModelCreating(ModelBuilder modelBuilder)    
            .HasKey(x => new {x.BookId, x.AuthorId});

You can see the .HasKey method towards the bottom of the code where I define the composite key consisting of the BookId and the AuthorId.

The easiest approach to setting up relationships is by using EF Core’s conventions, which is what I have done. But if you want to explicitly define the relationships you can, using the .HasOne/.HasMany Fluent API commands, as shown below.

Note: There are loads of configuration rules, which I cover in chapter 6 and 7 of my book, or you can look at the EF Core docs.

//NOTE: I only show the OnModelCreating part of the DbContext
protected override void
    OnModelCreating(ModelBuilder modelBuilder) 
       .HasKey(x => new {x.BookId, x.AuthorId});

    //If you name your foreign keys correctly, then you don't need this.
        .HasOne(pt => pt.Book)
        .WithMany(p => p.AuthorsLink)
        .HasForeignKey(pt => pt.BookId);

        .HasOne(pt => pt.Author) 
        .WithMany(t => t.BooksLink)
        .HasForeignKey(pt => pt.AuthorId);

NOTE: You can set the keys, but not the relationships, via attributes. The [Key] and [Order] attributes allow you to define the primary key, but I find the Fluent API is quite clean and easy to understand.

Summary – how to add a many-to-many relationship

To add a many-to-many relationship in EF Core you need to:

  1. Create a linking entity class that has the foreign keys(s) of the two entities you want to form a many-to-many link between. In my example these are BookId and AuthorId.
  2. It is efficient (but not necessary) to make these two foreign keys into the composite primary key. This will ensure that there is only ever one link between the two entities you want. It is also more efficient in terms of size of the database table.
    1. If you do want the two foreign keys to be the composite primary key you need to configure this in EF Core, either via Fluent API or by [Key]/[Order] attributes, because it cannot find them using its normal “by convention” method.
    2. If you use properties for the foreign key that don’t  have the same name as the primary key it links to, then you must manually configure the relationship. There are a number of ways to do that, but EF Core’s Fluent API is quick and fairly straightforward to understand.
  3. The linking class can, but don’t have to, have relationship links to the two entities it links. In my example my AuthorLinks class has a property Book of type Book, and a property Author of type Author.
  4. You most likely want a collection relationship to the linking table in one or both of the linked entities. In my example I have property called AuthorLinks with the type ICollection<BookAuthor> in my Book class, and a property called BooksLink of type ICollection<BookAuthor> in my Author class.

Adding a Book many-to-many Author link

Creating new many-to-many link is fairly easy in EF Core (but not as easy as EF6.x). The code below creates a new book with a new author.

Note: the variable context used at the bottom of the code below is an instance of the application’s DbContext, called EfCoreContext in this case (see its definition above).

var book = new Book
  Title = "Quantum Networking",
  Description = "faster-than-light data communications",
  PublishedOn = new DateTime(2057, 1, 1),
  Price = 220
var author = new Author { Name = "Future Person" };
book.AuthorsLink = new List<BookAuthor>
  new BookAuthor { 
    Author = author, 
    Book = book,
    Order = 0

//Now add this book, with all its relationships, to the database
context.Books.Add( book);

Updating a many-to-many relationship

It turns out there are two scenarios under which you want to update a relationship. They are:

  1. Connected state: This is where the load of the data and the update is done in one go, that is within one instance of the application’s DbContext. You find this sort of update that happens in a console application, or inside your business logic (see this link on business logic and EF Core)
  2. Disconnected state: This is where the update is split into two halves: a) select the entities you want to change, b) and make the change. Each stage has a different instance of the application’s DbContext. This happens on web sites, where in the first stage the user picks what they want done and posts it back. The second stage then has to re-read the data and update it.

I am going to describe these two approaches separately, so you can go to the one that fits the application you are building.

1. Connected state update

In the connected state we read in the Book entity and update it immediately, i.e. within the same instance of the DbContext. For my Book/Author example I am going to read in the Book, with its linking entities, BookAuthor, as tracked entities. – that means that EF Core will take a copy of the entities we read in so that it can spot any changes when you call SaveChanges.

The listing below is one of my unit tests which adds the existing author “Martin Fowler” to the book called “Quantum Networking”, which currently has one author called “Future Person”. After the test has finished the book “Quantum Networking” has two authors, “Future Person” and “Martin Fowler”

NOTE: I am using a Sqlite, in-memory database, which I seed with four books with known titles and authors.

public void TestAddExtraAuthorOk()
    var inMemDb = new SqliteInMemory();

    using (var context = inMemDb.GetContextWithSetup())

        var book = context.Books                          
            .Include(p => p.AuthorsLink)                  
            .Single(p => p.Title == "Quantum Networking");

        var newAuthor = context.Authors                   
            .Single(p => p.Name == "Martin Fowler");

        book.AuthorsLink.Add(new BookAuthor
            Book = book,
            Author = newAuthor,
            Order = (byte) book.AuthorsLink.Count

        var bookAgain = context.Books
            .Include(p => p.AuthorsLink)
            .Single(p => p.BookId == book.BookId);
        var authorsInOrder = bookAgain.AuthorsLink.OrderBy(p => p.Order).ToList();
        authorsInOrder.First().Author.Name.ShouldEqual("Future Person");
        authorsInOrder.Last().Author.Name.ShouldEqual("Martin Fowler");

The most important parts of the code are lines 11 to 13. Here I load the Book entity using an Include method to load the AuthorLinks at the same time, as tracked entities. The AuthorLinks property holds a a collection of BookAuthor entities linking to the Author entities.

Note: The default way of loading data in EF Core is as tracked entities, that is, EF Core takes a copy of the loaded data so that it can detect if any changes have happened since they were loaded. You can turn off tracking by including the method .AsNoTracking to your query (useful in read-only queries, as it makes them a little faster).

In lines 15 to 16 I also load, again as a tracked entity, the Author entity who’s name is “Martin Fowler” as that the author I want to add to this book.

Then on lines 18 to 23 I create a new BookAuthor linking entry linking the Book entity I loaded to the Author entity of “Martin Fowler”.

When SaveChanges is called on line 24 it finds that the AuthorLinks property  of the book instance it loaded has changed. Because this is a relationship it looks at the entities and finds that there is a new BookAuthor entity, which it adds to the database, and a author which is already in the database so it doesn’t have to add it.

Summary – Connected state many-to-many update

To update a many-to-many relationship in the connected state:

  1. Load an entity at one end of the relationship (doesn’t matter which end) with the collection of links in the many-to-many linking table. In my example I loaded the Book entity I wanted to change along with its AuthorLinks property, which holds the collection of the linking entities in the BookAuthor table.
  2. Alter the linking table collection, either by adding or removing entries from the collection. In my example I added a new BookAuthor class to the AuthorLinks property collection.
    1. Note: if you want to replace the whole collection then you can simply assign a new list to the linking collection (there is an example of this in my unit tests – see test named TestChangeAuthorsNewListOk).

2. Disconnected state

The disconnected state happens when the initial read of the data and update of the data are done separately, i.e. they use different instances of the applications DbContext. This happens in a web application which has two stages:

  1. The first stage is where the user is presented with the book, its current author(s) and a list of possible authors to choose from. Once they have chosen the author to add then press a button which takes their choice back to the web application
  2. In the second stage the web application needs to re-read the data and execute the database update.

The unit test below does the same as the previous example, that is, it adds the existing author “Martin Fowler” to the book called “Quantum Networking”, which starts out with one author called “Future Person”. After the test has finished the book “Quantum Networking” has two authors, “Future Person” and “Martin Fowler”.

I simulate the disconnected state by having two instances of the application’s DbContext and passing the primary keys of the Book and the new Author to add via a small class called ChangeAuthorDto.

Note: This unit test uses two, separate instances of the DbContext to simulate the disconnected state. With the help of Daria (see comment below) I have found a way to use Sqlite in-memory with multiple instances of the DbContext. Daria pointed out that the in-memory part is held in the Sqlite connection, so I create one options (which uses the Sqlite connection) and used that to create two separate DbContext instances  – thanks Daria.

public void TestAddAuthorDisconnectedOk()
    var options = SqliteInMemory.CreateOptions<EfCoreContext>();
    ChangeAuthorDto dto;
    using (var context = new EfCoreContext(options))
        var book = context.Books
            .Include(p => p.AuthorsLink)
            .Single(p => p.Title == "Quantum Networking");

        var newAuthor = context.Authors
            .Single(p => p.Name == "Martin Fowler");
        dto = new ChangeAuthorDto
            BookId = book.BookId,
            NewAuthorId = newAuthor.AuthorId

    using (var context = new EfCoreContext(options))
        var book = context.Books
            .Include(p => p.AuthorsLink)
            .Single(p => p.BookId == dto.BookId);
        var newAuthor = context.Authors.Find(dto.NewAuthorId);

        book.AuthorsLink.Add(new BookAuthor
            Book = book,
            Author = newAuthor,
            Order = (byte)book.AuthorsLink.Count

        var bookAgain = context.Books
            .Include(p => p.AuthorsLink).ThenInclude(p => p.Author)
            .Single(p => p.BookId == dto.BookId);
        var authorsInOrder = bookAgain.AuthorsLink.OrderBy(p => p.Order).ToList();
        authorsInOrder.First().Author.Name.ShouldEqual("Future Person");
        authorsInOrder.Last().Author.Name.ShouldEqual("Martin Fowler");

The important lines are 27 to 30, where I load in the Book, with its BookAuthor linking collection accessed via the Book’s AuthorLinks property, and I read in the Author that the user wants to add.

Having loaded that book I can use the same code I had in the connected state, that is, I  create a new BookAuthor linking entry linking the Book entity I loaded to the Author entity of “Martin Fowler” and add it to the existing AuthorLinks collection – see lines 32 to 37.

Summary – Disconnected state many-to-many update

To update a many-to-many relationship in the disconnected state:

  1. In the first stage you should return the primary key(s) of the two ends of the many-to-many relationships that you want to link/unlink. In my example I provide the BookId and the AuthorId I want to add to the book.
  2. In stage two you load one end of the relationship (doesn’t matter which end) with the collection of links the the linking table. In my example I loaded the Book entity using the BookId key along with its AuthorLinks property, which held the collection of the linking entities in the BookAuthor table.
  3. Now load the other end of the relationship. In my example the Author entity, using the AuthorId returned from the first stage.
  4. Alter the linking table collection, either by adding or removing entries from the collection. In my example I added a new BookAuthor class to the AuthorLinks property collection.
    1. Note: if you want to replace the whole collection then you can simply assign a new list to the linking collection with the new BookAuthor class.

2a. Another, quicker way to update a many-to-many relationships in the disconnected state

In chapter 3 of my book, Entity Framework Core in Action , I point out that you can often change relationships using the foreign keys, and it can me more efficent. This is especially true of many-to-many relationships because you can directly add/delete entries from the linking table, in this case BookAuthor.

By directly adding/deleting the linking table entries then it saves you having to load the Book entity with its AuthorLinks property and the Author you want to add. Coupled with the fact that the first stage is going to supply you with the primary keys anyway then it saves quite a few database queries.

The unit test below shows this in action. The first stage of the unit test hands back primary keys of the Book and the new Author to add via the ChangeAuthorDto class. The second stage, which uses a new instance of the DbContext, uses those primary keys to build a new BookAuthor linking entity to add to the database.

NOTE: The one downside of this approach for my Book/Author example is the setting the Order property, as it should be set to get the correct order of the authors. For this example know there is one author with an Order set to 0 so I simply set it to 1. In real application the first stage would have to define what order the authors should be shown in.

public void TestAddAuthorViaForeignKeyOk()
    var options =

    ChangeAuthorDto dto;
    using (var context = new EfCoreContext(options))
        var book = context.Books
            .Include(p => p.AuthorsLink)
            .Single(p => p.Title == "Quantum Networking");

        var newAuthor = context.Authors
            .Single(p => p.Name == "Martin Fowler");
        dto = new ChangeAuthorDto
            BookId = book.BookId,
            NewAuthorId = newAuthor.AuthorId

    using (var context = new EfCoreContext(options))
        context.Set<BookAuthor>().Add(new BookAuthor    
            BookId = dto.BookId,                        
            AuthorId = dto.NewAuthorId,                 
            Order = 1                                   

        var bookAgain = context.Books
            .Include(p => p.AuthorsLink).ThenInclude(p => p.Author)
            .Single(p => p.BookId == dto.BookId);
        var authorsInOrder = bookAgain.AuthorsLink.OrderBy(p => p.Order).ToList();
        authorsInOrder.First().Author.Name.ShouldEqual("Future Person");
        authorsInOrder.Last().Author.Name.ShouldEqual("Martin Fowler");

The important lines are 27 to 32, where I create a new BookAuthor class using the BookId and the AuthorId of the new author I want to add to the book. If you compare this to the last disconnected example you will see this avoids three database loads:

  1. The Book we wanted to update
  2. The Books AuthorLinks collection of BookAuthor entities
  3. And the Author that I wanted to add.

In this case I added an extra author. I could have removed an author by deleting the appropriate BookAuthor entry.

I do recommend this approach for many-to-many relationship updates because it is more efficient and still quite clear in its intent, i.e. its not complex so its obvious what is happening.

Summary – quicker way to update a many-to-many relationships in the disconnected state

For updates to many-to-many relationships in the disconnected state you can:

  1. In the first stage you should return the primary key(s) of the two ends of the many-to-many relationships that you want to link/unlink. In my example I provide the BookId and the NewAuthorId I wanted to add to the book.
  2. In the second stage you should add/delete entries from the linking table directly. For instance to add a new many-to-many relationship you add a new entry in the linking table. In my example I added a new BookAuthor link  with its BookId property set from the BookId returned by the first stage, and its AuthorId property set to the NewAuthorId value from the first stage.


Setting up and changing many-to-many links in EF Core is harder than in EF6.x. EF 6.x used the same approach as EF Core, but “hid” the linking table and the manipulation of that table from you. That certainly made it easier, but as the continued access to my original article on updating many-to-many in EF6.x shows, it can cause confusion sometimes.

At the moment EF Core makes you configure the linking table and update that linking table yourself. The EF Core roadmap says it will look at automating this in the future, but not for now. But I think there are some benefits in knowing what is going on in the database, and, as my last disconnected example shows, you can sometimes be more efficient that EF6.x is at updating many-to-many relationships.

I cover this an much more in my book Entity Framework Core in Action and you can access the unit test code you see here via the Chapter03 branch of the Git repo that goes with the book – see the test class Ch03_ManyToManyUpdate.

Happy coding!