EF Core In depth – what happens when EF Core reads from the database?

Last Updated: July 31, 2020 | Created: May 31, 2020

This article gives an “under the hood” view of what happens when EF Core reads in data from a database. I look at two types of database read: a normal query and a query that contains the AsNoTracking method in it. I also show how a bit of experimenting on my part solved a performance problem that one of my client’s had.

I do assume you know EF Core, but I start with a look at using EF Core to make sure we have the basics covered before I dive into the depths of EF Core. But this is a “deep dive” so be ready for lots of technical detail, hopefully described in a way you can understand.

This article is part of a “EF Core In depth” series. Here is the current list of articles in this series:

Other older articles in this series are

This “EF Core In depth” series is inspired by what I found while updating my book “Entity Framework Core in Action” to cover EF Core 5. I am also added a LOT of new content from my experiences of working with EF Core on client applications over the last 2½ years.

NOTE: There is a companion GitHub repo at https://github.com/JonPSmith/EfCoreinAction-SecondEdition. This has a simple e-commerce site called Book App that you can run. Also, there are unit tests that go with the content in this article – look for unit test class whose name start with “Ch01_”, “Ch02_” etc.

TL;DR – summary

  • EF Core has two ways to read data from the database (known as a query): a normal LINQ query and a LINQ query that contains the method AsNoTracking.
  • Both types of query return classes (referred to as entity classes) with links to any other entity classes (known as navigational properties) loaded at the same time. But how and what they are linked to is different between the two types of queries.
  • The normal query also takes a copy of any data it reads in inside the application’s DbContext – the entity classes are said to be tracked. This allows the loaded entity classes to take part in commands to update the database.
  • This normal query also has some sophisticated code called relational fixup which fills in any links between the entity classes read in, and any other tracked entities.
  • The AsNoTracked query doesn’t take a copy so it isn’t tracked – this means its faster than a normal query. This also means it won’t be considered for database writes.
  • Finally, I show a little-known feature of EF Core’s normal query as an example of how clever it is in linking up relationship via its navigational properties.

Setting the scene – the basics of EF Core writing to the database

TIP: If you already know EF Core then you can skip this section – it’s just an example of how you read a database.

In my book I have created a small web application that sells books – think super-simple Amazon. In this introduction I am going to describe the database structure and then give you a simple example of writing to that database.

a. The classes/tables I’m going to work with

My Book App as I call it starts out in chapter 2 with the following five tables shown in the figure below. I chose this because a) the data/concepts are easy to understand because of sites like Amazon etc. and b) it has one of each of the basic relationships that can exist between tables.

Theses tables are mapped to classes with similar names, e.g. Book, BookAuthor, Author, with properties with the same name as the columns shown in the tables. I’m not going to show the classes because of space, but you can see these classes here in my GitHub repo.

b. A look at what you need to read this database via EF Core

For EF Core to write to the database I have shown you need 5 parts

  1. A database server, such as SQL Server, Sqlite, PostgreSQL…
  2. An existing database with data in it.
  3. A class, or classes, to map to your database – I refer to these as entity classes.
  4. A class which inherits EF Core’s DbContext class, which contains the setup/configuration of EF Core
  5. Finally, the commands to read from the database.

The unit test code below comes from the EfCoreinAction-SecondEdition GitHub repo and shows a simple example of reading in a set of four Books, with their BookAuthor and Authors entity classes from an existing database.

The example database contains four books, where the first two books have the same author, Martin Folwer.

[Fact]
public void TestBookCountAuthorsOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<EfCoreContext>();
    //code to set up the database with four books, two with the same Author
    using (var context = new EfCoreContext(options))
    {
        //ATTEMPT
        var books = context.Books
            .Include(r => r.AuthorsLink)
                .ThenInclude(r => r.Author)
            .ToList();

        //VERIFY
        books.Count.ShouldEqual(4);
        books.SelectMany(x => x.AuthorsLink.Select(y => y.Author))
            .Distinct().Count().ShouldEqual(3);
    }
}

Now, if we link unit test code to the list of 5 parts, it goes like this

  1. A database server – Line 5: I have chosen a Sqlite database server, and in this case the SqliteInMemory.CreateOptions method, which comes from my EfCore.TestSupport NuGet package, sets up a new, in-memory database (in-memory database are great for unit testing as you can set up a new, empty database just for this test – see chapter 17 of my book for more).
  2. An existing database with data – Line 6: I deal with writing to the database in the next article, for now just assume there is a database with four books, two of which have the same author.
  3. A class, or classes – not shown but classes found here, but there is a Book entity class, with relationships to an Author entity class, via a many-to-many linking entity class called BookAuhor.
  4. A class inherits DbContext – Line 7:  the EfCoreContext class inherits the DbContext class and configures the links from the classes to the database (you can see this class here in my GitHub repo).
  5. Commands to read from the database – Lines 10 to 13 – this is a query:
    1. Line 10 – the EfCoreContext instance called context gives you access to the database, and adding Books says you want to access the Books table.
    1. Line 11  – The Include is known as eager loading and tells EF Core that when it loads a Book is should also load all the BookAuthor entity classes that are linked to that book.
    1. Line 12  – The ThenInclude is part of the eager loading and tells EF Core that when it loads a BookAuthor it should also load the Author entity classes that are linked to that BookAuthor.

The result of all of this is a set of books, with normal properties, like the Title of the Book, filled in and the navigational properties that link the entity classes, like the AuthorsLink property in the Book, filled in with a reference to the correct instance of the entity class it links to.  And the last few lines after the //VERIFY comment are some simple checks that there are four books have, between them, three distinct authors.

This example is known as a query, and one of the four types of database accesses, which are known as CRUD (Create, Read, Update, and Delete). I cover the Create and Update in the next article.

How EF Core represents data when reading from the database

When you query a database EF Core goes thought various steps to convert from data returned by the database into entity classes with navigational links filled in. In this section we will look at those steps for two types of queries – a normal query (i.e. without AsNoTracking, also known as a read-write query) and a query with the AsNoTracking method added (known as a read-only query).

But first we look at the initial part which takes your LINQ command, converts it to the relevant commands for the database type you are using, and gets the data back. This is common to the two types of query we are going to look at. See the following figure for this first part.

There is some very complex code that converts your LINQ into database commands, but there really isn’t a lot to say other than if your LINQ can’t be translated you will get an exception from EF Core with a message that contains ‘could not be translated’. Also, when the data is coming back features like Value Converters may adapt the data.

NOTE: In chapter 6 of my book I cover some of the more complex LINQ commands and what you should do to help EF Core to translate to database commands.

This section has shown the first part of the query, where your LINQ is turned into database commands and returns all the correct values. Now we look at the second part of the query where EF Core takes the returned values and turns them into instances of the entity classes and filling in any navigational properties. There are two types of queries which we will look at.

  1. A normal query (read-write query)
  2. An AsNoTracking query, which has the AsNoTracking method added (read-only query).

1. Normal query – a read-write query

A normal query reads in the data in such a way that the data can be edited, which is why I refer to it as a read-write query. It doesn’t automatically update data (See next article for how to write to the database), but unless your query read-write then you won’t be about to update the data you have just read in.

The example I gave you in the introduction does a normal query that reads in the four example books with the links to their authors. Here is the query code part of that example

var books = context.Books
    .Include(r => r.AuthorsLink)
        .ThenInclude(r => r.Author)
    .ToList();

Then EF Core goes through three steps to convert those values back into entity classes with navigational properties filled in. The figure below shows the three steps and the resulting entity classes with their navigational links.

Let’s look at three steps:

  1. Create classes and fill in data. This takes the values that came back for the database and fills in the non-navigational (known as scalar) properties, fields, etc. In the Book entity class this would be properties like BookId (Book’s primary key), Title, etc. – see bottom left, light blue rectangles.
          There can be a lot of other issues around here, such as how EF Core used constructors, backing fields, shadow properties, adapting data, client-side calculations to name but a few. Chapters 2, 6 and 7 cover these issues.
  2. Relational fixup. The first step will have filled in the primary keys and foreign keys, which define how the data is connected to each other. EF Core then uses these keys to set up the navigational properties between the entity classes (shown as thick blue lines in the figure).
            This Relational fixup’s linking feature goes beyond the entity classes just read in by the query, it looks at every tracked entity in the DbContext and fills in any navigational properties. This is a powerful feature, but if you have lots of tracked entities then it can take some time – that’s why the AsNoTracking query exists, to be quicker.
  3. Creating a tracking snapshot. The tracking snapshot is a copy of the entity classes that are passed back to the user, plus other things like a link to each entity class that it shadows – an entity is said to be tracked, which means it can be used in database writes.

2. AsNoTracking query – read-only query

An AsNoTracking query is a read-only query. That means anything you read in won’t be looked at when the SaveChanges method is called. The AsNoTracking option is there because it makes the query perform better. I cover this and other differences from a normal query in the next section.

Following the example in the introduction I alter the query code to add the AsNoTracking method below (see line 2)

var books = context.Books
    .AsNoTracking()
    .Include(r => r.AuthorsLink)
        .ThenInclude(r => r.Author)
    .ToList();

The LINQ query goes through the two of the three steps shown in the normal query figure above. The step that is left out is the 3. Tracking snapshot, and the relational fixup step is slightly different. The following figures shows the steps for an AsNoTracking query.

Let’s look at three steps:

  1. Create classes and fill in data. (same as normal query) This takes the values that came back for the database and fills in the non-navigational (known as scalar) properties, fields, etc. (known as scalar properties). In the Book entity class this would be properties like BookId (Book’s primary key), Title, etc. – see bottom left, light blue rectangles.
  2. Relational fixup. (different from normal query) The first step will have filled in the primary keys and foreign keys, which define how the data is connected to each other. EF Core can use that to fill in the navigational properties (shown as thick blue lines) between the entity classes, but NOT looking outside the query to tracked entities.
  3. Creating a tracking snapshot. (NOT RUN)

c. Differences between normal and AsNoTracking queries

Now let’s compare the two query types and highlight the differences.

  1. The AsNoTracking query performs better. The main reason for having the AsNoTracking feature is about performance. The AsNoTracking query is
    1. Slightly faster and uses slightly less memory because it doesn’t have to create the tracking snapshot.
    1. Not having the tracking snapshot of the queried data improved the performance of SaveChanges because it doesn’t have to inspect the tracking snapshot for changes.
    1. Slightly faster because the relational fixup doesn’t what is called identity resolution. This is why you get two author instances with the same data in them.
  2. The AsNoTracking query relational fixup only links entities in the query. In the normal query I already said that the relational fixup linked both to entities in the query AND entities that are currently tracked. But the AsNoTracking query only filled in the navigational properties between entities in the query.
  3. The AsNoTracking query doesn’t always represent the database relationships. Another difference in relational fixup between the two types of queries is that the AsNoTracking query uses a quicker fixup without identity resolution. This can produce multiple instances for the same row in the database – see the blue Author entities and comment in the bottom right of the previous figure. That difference doesn’t matter if you are just showing the data to a user, but if you have business logic then the multiple instances doesn’t correctly reflect the structure of the data and could cause problems.

Useful relational fixup feature with hierarchical data

The relational fixup step is quite clever, especially in a normal query. This allows all sorts of clever things in a normal query and I wanted to show you how I used relational fixup to solve a performance problem I had in a client’s project.

I worked for a company where a lot of their dat was hierarchical, that is data that has a series of linked entity classes with an indeterminate depth. The problem was I had to parse the whole hierarchy before I could display it. I initially did this by eager loading the first two levels and then used explicit loading for deeper level. It worked but the performance was very slow, and the database was overloaded with lots of single database accesses.

This got me thinking, if the normal query relational fixup is so clever could it help me improve the performance of the query? – and it could! Let me give you an example using an example of employees in a company. The figure below shows you a possible hierarchical structure of a company we want to load.

NOTE: You can see the Employee class here, but the basic idea is it has Manger navigation Manager navigational property (single), which linked to their boss (or null if top person) and a WorksForMe navigational property (collection), which has all the employees that work for this employee (can be none). It also has employee info like their Name and what department(s) they work for.

You could use .Include(x => x.WorksForMe).ThenInclude(x => x.WorksForMe)… and so on, but it turns out that a single .Include(x => x.WorksForMe) is enough, as the relational fixup can work out the rest! That is surprizing, but very useful.

For instance, if I wanted to select all the people that work in Development (each Employee has a property with the name WhatTheyDo with a type Role which has the department(s) they work in) I could write this code.

var devDept = context.Employees                         
    .Include(x => x.WorksFromMe)                        
    .Where(x => x.WhatTheyDo.HasFlag(Roles.Development))
    .ToList();

This creates one query that loads all the employees who work in Development, and the relational fixup filled in the WorksFoMe navigational property (collection) and the Manager navigational property (single) on all the employees in the returned employees. This improves both the time the query takes and reduced the load on the database server by only sending one query (comareed with my original query that used explicit loading).

NOTE: You do need to work out which relationship to Include. In this case I have a Manager navigational property (single) and a WorksForMe navigational property (collection). It turns out that including the WorksForMe property fills in both the WorksForMe collection and the Manager property. But including the Manager navigational property means that the WorksForMe collection is only created if there are entities to link to, otherwise it is null. I don’t know why – that’s why I test everything to test what works.

Conclusion

You have seen two types of queries, which I called a) a normal, read-write query, and b) an AsNoTracking, read-only query. For each query type I showed you what EF Core does “under the hood” and the structure of the data read in. And the differences in how they work shows their strengths and weaknesses.

The AsNoTracking query is the obvious solution for read-only queries, as its faster than the normal, read-write query. But you should keep in mind its limitations of the relational fixup, which can create multiple instances of classes where the database only has one relationship.

The normal, read-write query is the solution for loading tracked entities, which means you can use them in Create, Update and Delete database changes. The normal, read-write query does take up more resources of time and memory, but is has some useful features such as linking automatically to other tracked instances of entity classes.

I hope you have found this article useful. You can find a must longer and detailed version of this in chapters 1 to 6 of my book Entity Framework in Action, section edition.

Happy coding.

0 0 vote
Article Rating
Subscribe
Notify of
guest
2 Comments
Newest
Oldest Most Voted
Inline Feedbacks
View all comments
Bruno Renan Lima
Bruno Renan Lima
1 month ago

Great post!

Much of the problems using Entity Framework occurs because developers don’t known how to it works “under the hood”.With that, they use it wrong. This post helps the process of understanding EF Core a lot.

Congrats!

Jon P Smith
1 month ago

Thanks Bruno. Glad you liked it.