EF Core In depth – what happens when EF Core writes to the database?

Last Updated: June 13, 2020 | Created: June 13, 2020

This article is the second “under the hood” view of what happens when you use EF Core. The first was about reading from the database, and this article is about writing to the database. This is CUD part of the four CRUD (Create, Read, Update, and Delete) accesses.

I do assume you know EF Core, but I start with a look at using EF Core to make sure we have the basics covered before I dive into the depths of EF Core. But this is a “deep dive” so be ready for lots of technical detail, hopefully described in a way you can understand.

This article is part of a “EF Core In depth” series. Here is the current list of articles in this series:

This “EF Core In depth” series is inspired by what I found while updating my book “Entity Framework Core in Action” to cover EF Core 5. I am also added a LOT of new content from my experiences of working with EF Core on client applications over the last 2½ years.

NOTE: There is a companion GitHub repo at https://github.com/JonPSmith/EfCoreinAction-SecondEdition. There are unit tests that go with the content in this article – look for unit test class whose name start with “Ch06_”.

TL;DR – summary

  • EF Core can create a new entry in the database with new or existing relationships. To do this it has to set the right order to write out the classes so that it can get set the any linked class. This makes it easy for the developer to write out classes with complex links between them.
  • When you call the EF Core command Add to create a new entry many things happen
    • EF Core finds all the links from the added class to other classes. For each linked class it works out if it needs to create a new row in the database, or jus link to an existing row in the database.
    • It also fills in any foreign key, either with the actual key of an existing row, or a pseudo-key for links to new classes.
  • EF Core can detect when you change a property in a class you read in from the database. It does this by holding a hidden copy of the class(es) read in. When you call SaveChanges it compares each clear read in with its original value and only creates commands to change the specific class/property that was changed.
  • EF Core’s Remove method will delete the row in the database pointed to by the primary key value in the class you provide as a parameter. If the deleted class has relationships then the database can sort out what to do, but you can change the delete rules.

Setting the scene – the basics of EF Core writing to the database

TIP: If you already know EF Core then you can skip this section – it’s just an example of how you write to a database.

For the examples in my book I have created a small web application that sells books – think super-simple Amazon. In this introduction I am going to describe the database structure and then give you a simple example of writing to that database.

a. The classes/tables I’m going to work with

My Book App as I call it starts out in chapter 2 with the following five tables shown in the figure below. I chose this because a) its easy to understand because of Amazon etc. and b) it had each of the basic relationships that can exist between tables.

Theses tables are mapped to classes with similar names, e.g. Book, BookAuthor, Author, with properties with the same name as the columns shown in the tables. I’m not going to show the classes because of space, but you can see these classes here in my GitHub repo.

b. A look at what you need to access this database via EF Core

For EF Core to write to the database I have shown you need 5 parts

  1. A database server, such as SQL Server, Sqlite, PostgreSQL…
  2. A class, or classes, to map to your database – I refer to these as entity classes.
  3. A class which inherits EF Core’s DbContext class, which contains the setup/configuration of EF Core
  4. A way to create a database
  5. Finally, the commands to write to the database.

The unit test code below comes from the EfCoreinAction-SecondEdition GitHub repo and shows a simple example of writing out a set of books, with their authors, reviews etc. to a database.

[Fact]
public void TestWriteTestDataSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<EfCoreContext>();
    using (var context = new EfCoreContext(options))
    {
        context.Database.EnsureCreated();

        //ATTEMPT
        var book = new Book
        {   
            Title = "Test",  
            Reviews = new List<Review>() 
        };       
        book.Reviews.Add(new Review { NumStars = 5 }); 
        context.Add(book);   
        context.SaveChanges();  

        //VERIFY
        var bookWithReview = context.Books
            .Include(x => x.Reviews).Single()
        bookWithReview.Reviews.Count.ShouldEqual(1);
    }
}

Now, if we link unit test code to the list of 5 parts, it goes like this

  1. A database server – Line 5: I have chosen a Sqlite database server, and in this case the SqliteInMemory.CreateOptions method, which comes from my EfCore.TestSupport NuGet package, sets up a new, in-memory database (in-memory database are great for unit testing as you can set up a new, empty database just for this test – see chapter 17 of my book for more).
  2. A class, or classes – not shown, but there is a Book entity class, with relationships to an a Review entity class.
  3. A class inherits DbContext – Line 6:  the EfCoreContext class inherits the DbContext class and configures the links from the classes to the database (you can see this class here in my GitHub repo).
  4. A way to create a database – Line 8: because it’s a new database I use this command to create the correct SQL tables, keys, indexes etc. The EnsureCreated method is used for unit tests, but for real applications you most likely will use EF Core migrations.
  5. Commands to write to the database – Lines 17 and 18
    1. Line 17: the Add method tells EF Core that a new book with its relationships (in this case, just a Review), needs to be written to the database.
    1. Line 18: In this case the SaveChange method creates new rows in Books and Review tables in the database.

The last few lines after the //VERIFY comment are some simple checks that the books have been written to the database.

In this example you added new entries (SQL command INSERT INTO) to the database, but EF Core will also handle updates and deletes to the database. The next section covers this create example and then moves onto other examples of Create, Update and Delete.

What happens when EF Core writes in the SQL database?

I’m going to start with creating a new Book entity class, with one, new Review entity class. I chose this as the simplest write which has a relationship.  You have just as I did in the unit test above, but if you skipped this here it the important part again:

var book = new Book              
{                                
    Title = "Test",              
    Reviews = new List<Review>() 
};                               
book.Reviews.Add(new Review { NumStars = 1 });
context.Add(book);               
context.SaveChanges();           

To add these two linked entities to the database EF Core has to

  1. Work out what order it should create these new rows – in this case it has to create a row in the Books table so that it has the primary key of the Book.
  2. Copy any primary keys into the foreign key of any relationships – this this case it copies the Books row’s primary key, BookId, into the foreign key in the new Review row.
  3. Copy back any new data creating in the database so that the entity classes properly represent the database – in this case it must copy back the BookId and update the BookId property in both the Book and Review entity classes and the ReviewId for the Review entity class.

So, let’s see the SQL from this create, as shown in the following listing

-- first database access
SET NOCOUNT ON; 
-- This inserts a new row into the Books table. 
-- The database generates the Book’s primary key                            
INSERT INTO [Books] ([Description], [Title], ...)
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6);     

-- This returns the primary key, with checks to ensure the new row was added
SELECT [BookId] FROM [Books]                        
WHERE @@ROWCOUNT = 1 AND [BookId] = scope_identity();

-- second database access
SET NOCOUNT ON;
-- This inserts a new row into the Review table. 
-- The database generates the Review’s primary key
INSERT INTO [Review] ([BookId], [Comment], ...)
VALUES (@p7, @p8, @p9, @p10);

-- This returns the primary key, with checks to ensure the new row was added
SELECT [ReviewId] FROM [Review]
WHERE @@ROWCOUNT = 1 AND [ReviewId] = scope_identity();

The important point is that EF Core handles the problem of writing out the entity classes in the correct order so that it can fill in any foreign keys. This is simple example but for a client I had to build a very complex piece of data consisting of about 15 different entity classes, some new entity classes were added, some were updated and some removed, but one call to SaveChanges and EF Core will work out what to do, in the right order, to update the database. So, EF Core makes writing complex data to a database easy for the developer.

I mention this because I have seen EF Core code where the developer used multiple calls of the SaveChanges method to obtain the primary key from the first create to set the foreign key for the related entity. For instance.

var book = new Book              
{                                
    Title = "Test"
}; 
context.Add(book);               
context.SaveChanges();           
var review = new Review { BookId = book.BookId, NumStars = 1 }
context.Add(review);               
context.SaveChanges();           

That would have the same effect as the previous code, but it has a weakness – if the second SaveChanges fails, then you have partial updated database, which might not be a problem in this case, but in my client’s case (which was a security system) that could be very bad indeed!

So, the take-away from this is – you don’t need to copy primary keys into foreign keys, because you can use set up the navigational properties and EF Core will sort out the foreign keys for you. So, if you think you need to call SaveChanges twice, then it normally means you haven’t set up the right navigational properties to handle that case.

What happens in the DbContext when EF Core writes to the database?

In the last section you saw what EF Core does at the database end, but now you are going to look at what happens inside EF Core. Most of the time you don’t need to know this, but there are times that knowing this is very important – for instance if you are catching changes during a call to SaveChanges, then you only get its State before SaveChanges is called, but you only have the primary key of a newly created entity after the call to SaveChanges.

The example is a little more complex from the last one. In this example I want to show you the different way EF Core handles new instances of an entity class over an instance of an entity that has been read from the database. The code in listing below creates a new Book, but with an Author that is already in the database. The code has comments saying Stage 1, Stage 2 and Stage 3 and I then describe what happens after each stage using diagrams.

//STAGE1                                             
var author = context.Authors.First();                
var bookAuthor = new BookAuthor { Author = author }; 
var book = new Book                                  
{                                                    
    Title = "Test Book",                             
    AuthorsLink = new List<BookAuthor> { bookAuthor }
};                                                   

//STAGE2
context.Add(book);                                   

//STAGE3
context.SaveChanges();                               

The next three figures show you what is happening inside the entity classes and their tracked data at each stage. Each figure shows the following data at the end of its stage.

  • The State of each entity instance at each stage of the process (shown above each entity class).
  • The Book and BookAuthor classes are brown to show that these are new instances of the classes and need to be added to the database, while the Author entity class is blue to represent that instance was read from the database.
  • The primary and foreign keys with the current value in brackets. If a key is (0), then it hasn’t been set yet.
  • The navigational links are shown as connections from the navigational property to the appropriate entity class that it is linked to.
  • Changes between each stage are shown by bold text or thicker lines for the navigational links.

The following figure shows the situation after Stage 1 has finished. This is your initial code that sets up a new Book entity class (left) with a new BookAuthor entity class (middle) which links the Book to an existing Author entity class (right).

The figure above show the condition of the three entity classes after Stage 1 has finished, that is your code has set up the entity classes to represent the new book you want add to the database. This is the starting point before you call any EF Core methods.

The next figure shows the situation after the line context.Add(book) is executed. The changes are shown in bold, and with thick lines for the added navigational links.

You may be surprised how much happened when the Add method was executed (I was!). What it seems to be doing is getting the entities as close to the position that they will be after the SaveChanges is called. Here are the things that happen when the Add method is called.

It sets the State of the entity provided as a parameter to Added – in this example that is the Book entity. Then it looks all entities linked to the entity provided as a parameter, either by navigational properties or by foreign key values. For each linked entity it does the following (NOTE: I don’t know the exact order they are done in).

  • If the entity is not tracked, that is its current State of Detached, it sets its State to Added. In this example, that is the BookAuthor entity – the Author’s State isn’t updated because that entity it is tracked. 
  • It fills in any foreign keys for the correct primary keys. If the linked primary key isn’t yet available, it puts a unique negative number in the CurrentValue properties the tracking data for the primary key and the foreign key. You can see that in figure above.
  • It fills in any navigational properties that aren’t currently set up – shown as thick blue lines in the figure above.

NOTE The call of the Add method can take some time! In this example, the only entities to link to are set by your code, but Add’s relational fixup stage can link to any tracked entity. But if have lots of tracked entity classes in the current DbContext it can take a long time. There are ways you can control this which I cover in chapter 14, EF Core performance tuning, in my book “Entity Framework Core in Action” second edition.

The final stage, stage 3, is what happens when the SaveChanges method is called, as shown in the next figure.

You saw in the “what happens in the SQL database” section that any columns that were set/changed by the database are copied back into the entity class so that the entity matches the database. In this example the Book’s BookId and the BookAuthor’s BookId where updated to have the key value created in the database. Also, now that all the entities involved in this database write now match the database their States are set to Unchanged.

Now that may have seemed a long explanation to something that “just works”, and many times you don’t need to know that. But when something doesn’t work correctly, or you want to do something complex like logging entity class changes, then knowing this is very useful.

What happens when EF Core updates the database?

The last example was about adding new entities to the database, but there was no update going on. So, in this section I am going to show what happens when you update something that is already in the database. The update relies on the normal query that I covered in the first article “what happens when EF Core reads from the database?”

The update is a simple one, just three lines, but it shows the three stages in the code: read, update, save.

var books = context.Books.ToList();
books.First().PublishedOn = new DateTime(2020, 1, 1);
context.SaveChanges();  

The following figure shows the three stages.

As you can see, the type of query you use matters – the normal query loads the data and keeps a “tracking snapshot” of the data returned to the calling code. The returned entity classes are said to be “tracked”. If an entity class isn’t tracked, then you can’t update it.

NOTE: The Author entity class in the last section was also “tracked”. In that example the tracked state of the Author told EF Core that the Author was already in the database so that it wasn’t created again.

So, if you change any properties in the loaded, tracked entity classes, then when you call SaveChanges it compares ALL the tracked entity classes against their tracking snapshot. For each class it goes through all the properties that are mapped to the database, and any properties that link to other entity classes (known as navigational properties). This process, known as change tracking, will detect up every change in the tracked entities, both for non-relational properties like Title, PubishedOn, and navigational links, which will be converted to changes to foreign keys that link tables together.

In this simple example there are only four books, but in real applications you might have loaded lots entity classes all linked to each other. At that point the comparison stage can take a while. Therefore, you should try to only load the entity classes that you need to change.

NOTE: There is an EF Core command called Update, but that is used in specific cases where you want every property/column updated. EF Core’s change tracking is the default approach as it only updates the property/column that has changed.

Each update will create a SQL UPDATE command, and all these UPDATEs will be applied within a SQL transaction. Using a SQL transaction means that all the updates (and any other changes EF Coure found) are applied as a group, and if any one part fails then any database changes in the transaction are not applied. That isn’t important in our simple example, but once you start changing relationships between tables then its important that they all work, or all fail.

What happens when EF Core deleting data in the database?

The last part of the CRUD is delete, which in some ways is simple, you just call context.Remove(myClass), and in other ways its complex, e.g. what happens when you delete an entity class that another entity class relies on? I’m going to give you a quick answer to the first part, and a much longer to the second part.

The way to delete an entity class mapped to the database is to use the Remove method. Here an example where I load a specific book and then delete it.

var book = context.Books
    .Single(p => p.Title == "Quantum Networking");
context.Remove(book); 
context.SaveChanges();

The stages are:

  1. load the book entity class that you want to delete. This gets all its data, but for a delete you only really need the entity class’s primary key.
  2. The call to the Remove method set’s the State of the book to Deleted. This information is sorted in the tracking snapshot for this book.
  3. Finally, the SaveChanges creates a SQL Delete command that is sent to the database, with any other database changes, and in a SQL transaction (see update description for more on a transaction).

That looks straightforward, but there is something going on here that is important, but not obvious from that code. It turns out that the book with the title of “Quantum Networking” has some other entity classes (database: rows) linked to it – in this specific test case the book with the title of “Quantum Networking” has links to the following entity classes:

  • Two Reviews
  • One PriceOffer
  • One BookAuthor linking to its Author.

Now, the Reviews, PriceOffer and BookAuthor (but not the Author) entity classes are only relevant to this book – the terms we use is they are dependent on the Book entity class. So, if the Book is deleted, then these Reviews, PriceOffer, and any BookAuthor linking rows should be deleted too. If you don’t delete these, then the database’s links are incorrect, and a SQL database would throw an exception. So, why did this delete work?

What happens here is the database relationships between the Books table and the three dependent tables have been set up with a Delete rule called cascade delete. Here is an example of the SQL commands EF Core would produce to create the Review table.

CREATE TABLE [Review] (
    [ReviewId] int NOT NULL IDENTITY,
    [VoterName] nvarchar(max) NULL,
    [NumStars] int NOT NULL,
    [Comment] nvarchar(max) NULL,
    [BookId] int NOT NULL,
    CONSTRAINT [PK_Review] PRIMARY KEY ([ReviewId]),
    CONSTRAINT [FK_Review_Books_BookId] FOREIGN KEY ([BookId]) 
         REFERENCES [Books] ([BookId]) ON DELETE CASCADE
);

The highlighted part is the constraint (think it as a rule) that says the review is linked to a row in the Books table via its BookId column. On the end of that constraint you will see the works ON DELETE CASCADE. That tells the database that if the book it is linked to is deleted, then this review should be deleted too. That means that the delete of the Book is allowed as all the dependent rows are deleted too.

That is very helpful, but maybe to want to change the delete rules for some reason? In chapter four of my book  I add some business logic to allow a user to buy a book. I decided I didn’t want to allow a book to be deleted if it existed in a customer’s order. To do this I added some EF Core configuration inside the DbContext to change the delete behaviour to Restrict – see below

public class EfCoreContext : DbContext
{
    private readonly Guid _userId;                                   

    public EfCoreContext(DbContextOptions<EfCoreContext> options)                         
        : base(options)

    public DbSet<Book> Books { get; set; }
    //… other DbSet<T>s removed

    protected override void OnModelCreating(ModelBuilder modelBuilder
    {
        //… other configurations removed 

        modelBuilder.Entity<LineItem>()
            .HasOne(p => p.ChosenBook) 
            .WithMany()
            .OnDelete(DeleteBehavior.Restrict);
    } 
}

Once this change to the configuration is migrated to the database the SQL DELETE CASCADE would be removed. That means the SQL constraint changes to a setting of NO ACTION. This means if you try to delete a book that is in a customer’s Order (with uses a LineItem table to keep each item in an order), then the database would return an error, which EF Core would turn into an exception.

This gives you a good idea of what is going on, but there is quite a bit more that I haven’t covered (but I do cover in my book). Here are some things about Delete that I haven’t covered.

  • You can have required relationships (dependent) and optional relationships and EF Core uses different rules for each type.
  • EF Core contains DeleteBehaviors that brings some of the work that the database would do into EF Core. This is useful to avoid problems when your entity classes’ relationships are circular – some databases throw an error if they find a circular delete loop.
  • You can delete an entity class by providing the Remove method with a new, empty class with just the primary key set. That can be useful when working with a UI/WebAPI that only returns the primary key.

Conclusion

So, I have covered the Create, Update and Delete part of the CRUD, with the previous article handling the Read part.

As you have seen the creation of new data in the database using EF Core is easy to use, but complex inside. You don’t (usually) need to know about what happens inside EF Core or the database, but having some idea is going help you take full advantage of EF Core’s cleverness.

Updates are also easy – just change a property or properties in the entity class you read in and when you call SaveChanges EF Core will find the changed data and build database commands to update the database to the same settings. This works for non-relational properties, like the book Title, and for navigational properties, where you could change a relationship.

Finally, we looked at a delete. Again, easy to use but a lot can be happening underneath. Also, look out for the next article where I talk about what is called “Soft Delete”. This is were you set a flag and EF Core won’t see that entity class anymore – it’s still in the database, but its hidden. Many of my clients have used this as it helps if their users inadvertently delete something because they can un-(soft) delete it.

I hope you found this useful and look out for more articles in this series.

Happy coding.