An in-depth study of Cosmos DB and the EF Core 3.0 database provider

Last Updated: November 8, 2019 | Created: November 4, 2019

This article looks at the capabilities of the Cosmos DB database when used via the new EF Core 3.0 database provider. It covers various differences between Cosmos DB, which is a NoSQL database, and a SQL database, and it also looks at the limitations of the EF Core 3.0 Cosmos DB provider.

I have been waiting for this release for a while because I think both SQL and NoSQL databases have a role in building high-performance applications. I have a CQRS approach which uses both types of databases and I wanted to update it (see the section “Building something more complex with Cosmos DB” for more info). But when I started using Cosmos DB and the new EF Core database provider, I found things were more different than I expected!

This article forms part of a series I am writing to update my book, “Entity Framework Core in Action”, with the changes in EF Core 3. The list of articles in this series is:

NOTE: If you are not sure of the differences between SQL (relational) and NoSQL databases I recommend googling “sql vs nosql difference” and reading some of the articles. I found this article gave a good, high-level list of differences at the start, but the rest of it was a bit out of date.

TL;DR; – summary

A quick introduction to using EF Core’s Cosmos DB provider

I thought I would start with a simple example that demonstrates how to use the Cosmos DB database provider in EF Core 3. In this example I have a Book (my favourite example) with some reviews. Here are the two classes and the DbContext.

using System;
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

public class CosmosBook
{
    public int CosmosBookId { get; set; }
    public string Title { get; set; }
    public double Price { get; set; }
    public DateTime PublishedDate { get; set; }

    //----------------------------------
    //relationships 

    public ICollection<CosmosReview> Reviews { get; set; }
}
[Owned]
public class CosmosReview
{
    public string VoterName { get; set; }
    public int NumStars { get; set; }
    public string Comment { get; set; }
}
public class CosmosDbContext : DbContext
{
    public DbSet<CosmosBook> Books { get; set; }

    public CosmosDbContext(DbContextOptions<CosmosDbContext> options)
        : base(options) { }
}

If you have used SQL databases with EF Core, then the code above should be very familiar, as it's pretty much the same as for a SQL database. The unit test below is also similar to a SQL unit test, but I point out a few things at the end that are different for Cosmos DB.

[Fact]
public async Task TestAddCosmosBookWithReviewsOk()
{
    //SETUP
    var options = this.GetCosmosDbToEmulatorOptions<CosmosDbContext>();
    using var context = new CosmosDbContext(options);
    await context.Database.EnsureDeletedAsync();
    await context.Database.EnsureCreatedAsync();

    //ATTEMPT
    var cBook = new CosmosBook
    {
        CosmosBookId = 1,      //NOTE: You have to provide a key value!
        Title = "Book Title",
        PublishedDate = new DateTime(2019, 1,1),
        Reviews = new List<CosmosReview>
        {
            new CosmosReview{Comment = "Great!", NumStars = 5, VoterName = "Reviewer1"},
            new CosmosReview{Comment = "Hated it", NumStars = 1, VoterName = "Reviewer2"}
        }
    };
    context.Add(cBook);
    await context.SaveChangesAsync();

    //VERIFY
    (await context.Books.FindAsync(1)).Reviews.Count.ShouldEqual(2);
}

Here are some comments on certain lines that don’t follow what I would do with a SQL database.

  • Line 5: The GetCosmosDbToEmulatorOptions method comes from my EfCore.TestSupport library and sets up the Cosmos DB connection to the Azure Cosmos DB Emulator – see the “Unit testing Cosmos DB code” section at the end of this article.
  • Lines 7 and 8: These create an empty database so that my new entry doesn’t clash with any existing data.
  • Line 13: This is the first big difference with Cosmos DB – it won’t generate unique primary keys for you. You have to provide a unique key value, so most people use GUIDs.
  • Lines 23 and 26: Notice I use the async versions of the EF Core commands. There is a warning in the EF Core Cosmos DB documentation to only use async methods.

And finally, here is what is placed in the Cosmos DB database:

{
    "CosmosBookId": 1,
    "Discriminator": "CosmosBook",
    "Price": 0,
    "PublishedDate": "2019-01-01T00:00:00",
    "Title": "Book Title",
    "id": "CosmosBook|1",
    "Reviews": [
        {
            "Comment": "Great!",
            "Discriminator": "CosmosReview",
            "NumStars": 5,
            "VoterName": "Reviewer1"
        },
        {
            "Comment": "Hated it",
            "Discriminator": "CosmosReview",
            "NumStars": 1,
            "VoterName": "Reviewer2"
        }
    ],
    "_rid": "+idXAO3Zmd0BAAAAAAAAAA==",
    "_self": "dbs/+idXAA==/colls/+idXAO3Zmd0=/docs/+idXAO3Zmd0BAAAAAAAAAA==/",
    "_etag": "\"00000000-0000-0000-8fc9-8720a1f101d5\"",
    "_attachments": "attachments/",
    "_ts": 1572512379
}

Notice that there are quite a few extra lines of data that EF Core/Cosmos DB adds to make this all work. Have a look at this link for information on what these extra properties do.

Building something more complex with Cosmos DB

There are plenty of simple examples of using the EF Core Cosmos DB database provider, but in my experience it’s not until you try to build a real application that you hit the problems. In my book, “Entity Framework Core in Action”, I built a CQRS architecture database in chapter 14 using both a SQL and a NoSQL database; as Cosmos DB wasn’t available then, I used another NoSQL database, RavenDB. I did this to get better read performance on my example book sales web site.

I wanted to redo this two-database CQRS architecture using EF Core’s Cosmos DB provider. I had a go with the pre-release Cosmos DB provider in EF Core 2.2, but that version had (big) limitations. Once EF Core 3 came out I started rebuilding the two-database CQRS architecture with its stable release of the Cosmos DB provider.

NOTE: The new version is in the master branch of the EfCoreSqlAndCosmos repo, with the older version in the branch called NetCore2_2Version.

This example application, called EfCoreSqlAndCosmos, is a site selling books, with various sorting, filtering, and paging features. I have designed the application so that I can compare the performance of a SQL Server database against a Cosmos DB database, both being queried via EF Core 3. Here is a picture of the application to give you an idea of what it looks like – you can swap between the SQL Books page (shown) and the NoSQL Books page accessed via the “Books (NoSQL)” menu item.

NOTE: This code is open source and is available on GitHub. You can clone the repo and run it locally. It uses localdb for the SQL Server database and Azure Cosmos DB Emulator for the local Cosmos DB database.

What I found when I rebuilt my EfCoreSqlAndCosmos application

It turns out both Cosmos DB and the EF Core Cosmos DB provider have some changes/limitations compared with what I am used to with a SQL (relational) database. Some of the changes are because the Cosmos DB database has a different focus – it’s great at scalability and performance but poor when it comes to complex queries (that’s true of most NoSQL databases too). But also, the EF Core 3.0 Cosmos DB provider has quite a few limitations that impacted what I could do.

This article has therefore morphed into a look at what you can, and can’t, do with the EF Core 3 Cosmos DB provider, using my book app as an example. What I’m going to do is walk you through the obstacles I encountered while trying to build my book app and explain what the problem was and how I got around it.

Issue #1: No unique key generation or constraints in Cosmos DB

In the initial example I showed you that Cosmos DB doesn’t generate primary keys in the way that SQL databases can. That means you are likely to use something like a GUID, or a GUID string, as the key when you want to add an entity to a Cosmos DB database.

You also don’t get the sort of checking of primary/foreign keys that you have in SQL databases. For instance, if I add a book with the same primary key as an existing one, I don’t get a constraint error from Cosmos DB – it just doesn’t save the new book.

My TestAddCosmosBookAddSameKey unit test shows that if you Add (i.e. try to create) a new entry with the same key as an existing entry in the database then the Add is ignored. You don’t get an exception, and the only indications of that are (see the sketch after this list):

  1. The number of changes that SaveChangesAsync returns doesn’t include that failed Add.
  2. The EF Core State of the added class turns to “Unchanged” after the SaveChangesAsync.
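
Below is a minimal sketch of how you might detect this in your own code, using the CosmosDbContext from the first example. The duplicate-handling branch is an assumption on my part – what you do there is up to your application.

var book = new CosmosBook { CosmosBookId = 1, Title = "Duplicate key" };
context.Add(book);
var numChanges = await context.SaveChangesAsync();
//If an entry with key 1 already exists the Add is silently ignored,
//so the returned change count won't include the failed Add
if (numChanges == 0)
{
    //handle the "silent write fail", e.g. retry with a new unique key
}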

I quite like SQL constraints because they ensure that my database keeps its referential integrity, which means when a change is applied to a SQL database it makes sure all the primary keys and foreign keys are correct against the database constraints. But Cosmos DB, like most NoSQL databases, has no constraints like a SQL database. This is another feature that NoSQL databases drop to make the database simpler and therefore faster.

In the end the way that Cosmos DB works is fine, as you can use the Guid.NewGuid() method to get a unique primary key value. But be aware of this “silent write fail” issue if you find you are missing some data.

NOTE: There was a bug about silently failing to write to an unknown database, but that is fixed in EF Core 3.1.

Issue #2: Counting number of books in Cosmos DB is SLOW!

In my original application I had a pretty standard paging system, with selection of page number and page size. This relies on being able to count how many books there are in the filtered list. But with the changeover to Cosmos DB this didn’t work very well at all.

First, the EF Core 3.0 Cosmos DB provider does not support aggregate operators, e.g. Average, Sum, Min, Max, Any, All, and importantly for me, Count. There is a way to implement Count, but it’s really slow. I thought this was a real limitation, until I realised something about Cosmos DB – you really don’t want to count entries if you can help it – let me explain.

In a SQL database counting things is very fast (100,000 books takes about 20 ms on my PC). But in Cosmos DB counting means accessing the data, and you are a) charged for all the data you access and b) you have to pay for the level of throughput you want to provision. Both of these are measured in RU/second, with the starting point being 400 RU/second for a Cosmos DB database on Azure.

The Azure Cosmos DB Emulator tells me that counting 100,000 of my book class using the best Count approach (i.e. the Cosmos DB SQL: SELECT VALUE Count(c) FROM c) takes 200 ms (not rate limited) and uses 1,608 RU/second – that is a costly query to provision for!

NOTE: The current EF Core solution, noSqlDbContext.Books.Select(_ => 1).AsEnumerable().Count(), is worse. It takes 1 second to read 100,000 books.

So, the learning here is that some things in Cosmos DB are faster than SQL and some things are slower, and more costly, than SQL. And you have to handle that.

This made me change the NoSQL side of the EfCoreSqlAndCosmos application to drop normal paging and use a Prev/Next page approach instead (see picture and the sketch below).
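
Here is a minimal sketch of the Prev/Next idea, assuming pageNum and pageSize values coming from the UI: reading one extra entry tells you whether a “Next” page exists, without ever counting the whole set.

var books = await context.Books
    .OrderBy(x => x.CosmosBookId)
    .Skip(pageNum * pageSize)
    .Take(pageSize + 1)       //read one extra entry
    .ToListAsync();
var hasNextPage = books.Count > pageSize;
if (hasNextPage)
    books.RemoveAt(pageSize); //drop the extra entry before display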

You might notice that Amazon uses a limited next/previous approach and says things like “1-16 of over 50,000 results for…” rather than counting the exact number of entries.

Issue #3: Complex queries can be a problem in Cosmos DB

In my “filter by year” option I need to find all the years where books were published, so that the user can pick the year they are looking for from a dropdown. When I ran the code taken from my SQL example I got an exception with a link to this EF Core issue, which says that Cosmos DB has some limitations on queries.

See the two code snippets below from my filter dropdown code to see the difference between the Cosmos DB (left) and SQL Server (right) code. Notice the Cosmos DB query needed the second part of the query to be done in the application.
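
As a rough sketch of the Cosmos DB side (assuming the YearPublished int property that issue #5 below adds to CosmosBook), the query has to be split like this – the Select runs in the database, while everything after the AsEnumerable() call runs in the application:

var years = context.Books
    .Select(x => x.YearPublished)
    .AsEnumerable()       //the Distinct/OrderBy parts run in the application
    .Distinct()
    .OrderByDescending(x => x)
    .ToList();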

The learning from this example is that Cosmos DB doesn’t support complex queries. In fact, the general rule for NoSQL databases is that they don’t have the range of query capabilities that a SQL database has.

Issue #4: You can’t sort on a mixture of nullable and non-nullable entries

My SQL queries to show books contain a LINQ command to work out the average review stars for a book. SQL databases return null if there aren’t any reviews for a book (see the note at the end of this section for the technicalities of why that is).

So, when I built my Cosmos DB version, I naturally added a nullable double (double?) to match. But when I tried to order by the reviews it threw an exception because I had a mix of nulls and numbers. Now, I don’t know if this is a bug in EF Core (I have added an issue to the EF Core code) or an overall limitation.

UPDATE: It turns out it’s a bug in Cosmos DB which hasn’t been fixed yet. I suspect the fix won’t get into EF Core 3.1 but will hopefully be in EF Core 5.

The solution for my application was easy – I just set a null value to 0, as my ReviewsCount property told me whether there were any reviews (see the sketch below). But be aware of this limitation if you’re using nulls, especially strings. PS. Saving/returning nulls works fine, and Where clauses work too – it’s just OrderBy that has a problem.
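
Here is a minimal sketch of that fix, assuming the CosmosBook class has ReviewsCount and ReviewsAverageVotes properties (names made up for this example):

cosmosBook.ReviewsCount = book.Reviews?.Count ?? 0;
cosmosBook.ReviewsAverageVotes = cosmosBook.ReviewsCount == 0
    ? 0   //store 0 rather than null so that OrderBy works
    : book.Reviews.Average(x => x.NumStars);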

NOTE: In my SQL version I use the LINQ command … p.Reviews.Select(y => (double?)y.NumStars).Average(), which converts into the SQL command AVG(CAST([r1].[NumStars] AS float)). The (double?) cast is necessary because the SQL AVG command returns null if there are no rows to average over.

Issue #5: Missing database functions

In my SQL filter by year I have the following piece of code:

var filterYear = int.Parse(filterValue);      
return books.Where(x => x.PublishedOn.Year == filterYear 
     && x.PublishedOn <= DateTime.UtcNow);

That returns all the books in the given year, but excludes books not yet published. I ran into two issues with the Cosmos DB version.

Firstly, the PublishedOn.Year part of the query gets converted into DATEPART(year, [b].[PublishedOn]) in a SQL database, but Cosmos DB doesn’t have that capability. So, to make the filter-by-year feature work I had to add an extra property called “YearPublished” to hold the year.

The second part was that DateTime.UtcNow gets converted to GETUTCDATE() in a SQL database. Now, Cosmos DB does have a GetCurrentDateTime() method, but at the moment EF Core 3.0 does not support that, or many other Cosmos database functions (subscribe to this EF Core issue to track progress on this).

All is not lost. By adding a new int property called “YearPublished” to my CosmosBook and getting the UtcNow value from the client I got the query to work – see below:

var now = DateTime.UtcNow;
var filterYear = int.Parse(filterValue);      
return books.Where(x => x.YearPublished == filterYear 
     && x.PublishedOn <= now);

This is another example of the different query features between SQL databases (which are very similar because of the definition of the SQL language) and the range of different NoSQL databases, which don’t have any standards to follow. On the plus side, Cosmos DB automatically indexes every property by default, which helps my application.

Issue #6: No relationships (yet)

We now have a stable, usable Cosmos DB database provider in EF Core 3.0, but it’s certainly not finished. The EF Core Azure Cosmos DB Provider Limitations page in the EF Core docs says (last updated 09/12/2019):

Temporary limitations

  • Even if there is only one entity type without inheritance mapped to a container it still has a discriminator property.
  • Entity types with partition keys don’t work correctly in some scenarios
  • Include calls are not supported
  • Join calls are not supported

Looking at the last two items, that means, for now, you will have to put all the data into the one class using [Owned] types (see the example I did right at the beginning). That’s OK for my example application, but it does cut out a number of options for now. It will be interesting to see what changes are made to the Cosmos DB provider in EF Core 5 (due out in November 2020).

Issue #7: Permanent limitations due to Cosmos DB design

The Cosmos DB database has certain limitations compared with what you are used to with a SQL database. The two big ones are:

1. No migrations – can cause problems!

Cosmos DB, like many NoSQL databases, saves a JSON string, which Cosmos DB calls a “Document” – I showed you that in the first example. Cosmos DB doesn’t control the content of that Document, it just reads or writes to it, so there’s no ‘migration’ feature that you can use to change all the Documents in a database. And this can cause problems you need to be aware of!

For instance, say I updated my CosmosBook class to add a new property of type double called “Discount”. If you now tried to read the ‘old’ data, i.e. data that was written to the database before you added the “Discount” property, then you would get an exception. That’s because EF Core expects a property called “Discount” to fill in the non-nullable property in your class.

As you can see, this means you need to think about migrations yourself: either run special code to add a default value for the new, non-nullable property, or make your new properties nullable.
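
Here is a sketch of the second option, using the hypothetical “Discount” property from the example above:

public class CosmosBook
{
    //... existing properties as before ...

    //New property added after documents already exist in the database.
    //Making it nullable means reading 'old' documents won't throw;
    //a non-nullable double would need every old document updated first.
    public double? Discount { get; set; }
}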

NOTE: Taking a property out of a class is fine – the data is still in the document, but it just isn’t read. When you update a document, it replaces all the old data with the new data.

2. No client-initiated transactions

Cosmos DB only provides Document-level transactions. That means it updates the whole document, or doesn’t change anything in the document. That way you can be sure the document is correct. But the sort of clever transaction features that SQL databases provide are not in Cosmos DB (nor in most, but not all, NoSQL databases).

This makes some things that you can do in SQL quite hard, but if you really need transactions then you either need a NoSQL database that has that feature (a few do) or should go back to a SQL database.

However, there are ways to implement a substitute for a transaction, but it’s quite complex. See the series called Life Beyond Distributed Transactions: An Apostate’s Implementation, by Jimmy Bogard, for a way to work around this.

UPDATE: Cosmos DB is continually changing, and in the November 2019 “what’s new” article they added some form of transactional feature to Cosmos DB (and support for GroupBy). So my comments about no transactions might change in the future.

Unit testing Cosmos DB code

I know this article is long, but I just wanted to let you know about the very useful Azure Cosmos DB Emulator for testing and developing your Cosmos DB applications. The Emulator runs locally and stores your results using localdb. It also comes with a nice interface to see/edit the data.

NOTE: You could use a real Cosmos DB on Azure, but they aren’t cheap and you will need lots of databases if you want to run your unit tests in parallel.

I have added a couple of extensions to my EfCore.TestSupport library to help you set up the options for accessing the Azure Cosmos DB Emulator. Here is a very simple example of a unit test using my GetCosmosDbToEmulatorOptions<T> method, with comments at the end.

[Fact]
public async Task ExampleUsingCosmosDbEmulator()
{
    //SETUP
    var options = this.GetCosmosDbToEmulatorOptions<CosmosDbContext>();
    using var context = new CosmosDbContext(options);
    await context.Database.EnsureDeletedAsync();
    await context.Database.EnsureCreatedAsync();

    //ATTEMPT
    var cBook = new CosmosBook {CosmosBookId = 1, Title = "Book Title"};
    context.Add(cBook);
    await context.SaveChangesAsync();

    //VERIFY
    (await context.Books.FindAsync(1)).ShouldNotBeNull();
}

Comments on certain lines.

  • Line 5: This method in my library links to your local Azure Cosmos DB Emulator and creates a database using the name of the test class. Having a name based on the test class means only the tests in this class use that database. That’s important because xUnit can run test classes in parallel and you don’t want tests in other classes writing all over the database you are using for your test.
  • Lines 7 & 8: I found deleting and then recreating a Cosmos DB database is a quick (< 2 secs on my PC) way to get an empty database. That’s much better than a SQL Server database, which can take around 8 to 10 seconds on my PC.

NOTE: The GetCosmosDbToEmulatorOptions method has a few other options/calls. You can find the full documentation about this here. Also, the EfCoreSqlAndCosmos repo has lots of unit tests you can look at to get the idea – see the Test project.

Conclusion

Wow, this article is very long and took me way longer than I expected to write. I really hope you find it useful. I certainly learnt a lot about what Cosmos DB can do and the state of the EF Core 3 Cosmos DB database provider.

This is the first non-preview version of the EF Core Cosmos DB database provider and many features aren’t available, but it had enough for me to use it successfully. It’s going to be very interesting to see where both the EF Core Cosmos provider and Cosmos DB itself will be when NET Core 5 comes out in November 2020.

As I said at the beginning of the article, I think both SQL and NoSQL databases have a role in building modern applications. Each approach has different strengths and weaknesses, and this article is about seeing what they are. But it’s not until I write the next article, comparing the performance of various approaches, that some of the strengths of Cosmos DB will come through.

In the meantime, I hope this article and its companion EfCoreSqlAndCosmos repo help you in understanding and using Cosmos DB.

Happy coding!

If you have an ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.

NET Core 3 update to “Entity Framework Core in Action” book

Last Updated: November 3, 2019 | Created: October 19, 2019

With the release of EF Core 3 I wanted to provide updates to my book, “Entity Framework Core in Action”. This article covers the whole of the book and provides the updated information. But I also plan to write two more articles that go with the big, under-the-hood changes in EF Core 3: a) the query overhaul and b) the addition of the NoSQL database, Cosmos DB, as a database provider. The list of articles in this series is:

  • Part 1: NET Core 3 update to “Entity Framework Core in Action” book (this article)
  • Part 2: Entity Framework Core performance tuning – reworked example with NET Core 3 (not written yet)
  • Part 3: Building a robust CQRS database with EF Core and Cosmos DB – NET Core 3 (not written yet)

NOTE: This article is written to supplement/update the book “Entity Framework Core in Action”. It is not a general introduction to EF Core 3 – please look at the Microsoft documentation for the changes in EF Core 3.

TL;DR; – summary

  • The release of EF Core 3 made small changes to the code you write, but there are big changes inside, i.e. query overhaul and support for NoSQL databases (Cosmos DB).
  • For the most part the “Entity Framework Core in Action” book (which covered up to EF Core 2.1) doesn’t change a lot.
  • There are 19 small changes in EF Core 3 that affect the book. In my opinion only two of them are really important.
  • The big changes in EF Core 3 which warrant their own articles are:
    • Query overhaul, where the way that LINQ is converted to database commands has been revamped. I mention this in this article but will go into more detail in another article.
    • Support for NoSQL databases (Cosmos DB) which I will cover in separate articles.

List of changes by chapter

I’m now going to go through each chapter in my book and tell you what has changed. But first here is a summary:

  • Chapter 1 – 3 changes (medium, setup)
  • Chapter 2 – 1 change (important)
  • Chapter 3 – no changes
  • Chapter 4 – no changes
  • Chapter 5 – 2 changes (small, code)
  • Chapter 6 – 1 change (small, config)
  • Chapter 7 – 3 changes (small, config)
  • Chapter 8 – 1 change (optional, config)
  • Chapter 9 – 2 changes (small, rename)
  • Chapter 10 – no changes
  • Chapter 11 – 1 change (optional)
  • Chapter 12 – 2 changes (medium)
  • Chapter 13 – see a new article looking at the performance changes.
  • Chapter 14 – 1 change (medium)
  • Chapter 15 – 2 changes (small)

NOTE: There are other changes in EF Core 3.0 that I don’t list because the topic was not covered in my book. Typically these other changes are about features that are too deep to cover in a 500-page book. For a full list of all the breaking changes in EF Core 3 see this link to the Microsoft documentation.

Chapter 1 – Introduction to Entity Framework Core

This chapter introduced EF Core, described what it does, and gave a simple application that accesses a database to start. Much of this stays the same, but here are the changes:

Pages 9 to 11, Installing, creating application, and downloading example

The description uses VS2017, but if you want to use EF Core 3 you should use VS2019 instead (note: installing VS2019 should also install the NET Core 3 SDK). The steps are the same, but VS2019 is a bit easier.

NOTE: If you want to download an EF Core 3 version of the companion repo EfCoreInAction then I have created a new branch called Chapter01-NetCore3 that you can download.

Pages 17 to 19, Reading data from the database

In listing 1.2 I introduce the EF Core method called AsNoTracking. This tells EF Core that you don’t want to update this data, which reduces the memory it takes and makes it slightly faster.

EF Core 3 adds another performance improvement by returning individual instances of each entity class, rather than returning a single instance for duplicate entities, i.e. classes of the same type and primary key (if that doesn’t make sense then read the EF Core docs on this).

Now that sounds OK, but it can produce some very subtle changes to your application, mainly in your business logic. Taking the Book/Author example in the book, if I read in two books with the same author using AsNoTracking, I would get different results in EF Core 2 and EF Core 3. Here is the code and what is returned:

var entities = context.Books
    .Where(x => x.AuthorsLink.Author.Name == "Author of two books")
    .Include(x => x.AuthorsLink)
         .ThenInclude(x => x.Author)
    .ToList();

  • EF Core 2: I get two instances of the Book class and ONE instance of the Author class (because both books have the same author).
  • EF Core 3: I get two instances of the Book class and TWO instances of the Author class, one for each book.

This shouldn’t be a problem if you are reading data to show to the user, but it may be a problem in your business logic if you were relying on the instances being unique. I found it in my Seed from Production tests when I was saving production data to JSON.

Pages 22 to 26 – Should you use EF Core in your next project?

Much of what I said there still applies, but EF Core is now much more mature and used by millions in real products. EF Core’s stability and coverage are much higher now than when I started to write this book.

Chapter 2 – Querying the database

There is a very significant under-the-hood change to the client vs. server evaluation feature in EF Core 3. I described client vs. server evaluation (pages 43 to 45) and at the end I warned you that it could produce really inefficient database accesses, and that by default EF Core wouldn’t alert you to this (you could turn on an event that would throw an exception, but by default this was off).

It turns out it is pretty easy to write a LINQ query that can’t be properly translated to a database access and so is run in software, which can be very slow. In EF Core 3 the team decided to fix this problem as part of the query overhaul by restricting client vs. server evaluation to remove these badly performing queries. In EF Core 3 you now get an InvalidOperationException thrown if the query can’t be translated into a single, valid database access command.

See the Conclusion section at the end of this article for some more information on how to handle such an exception.

NOTE: Client vs. server evaluation still works at the top level (and is very useful) – see the EF Core restricted client evaluation docs to see what works.

Chapter 3 – Changing the database content

No changes.

Chapter 4 – Using EF Core in business logic

No changes.

Chapter 5 – Using EF Core in ASP.NET Core

Obviously, the big change is using ASP.NET Core 3, which changes the code you write in ASP.NET Core’s Program and Startup classes. But all of the concepts, and nearly all of the code, in this chapter stay the same. The only code changes are the ones in the Program and Startup classes mentioned above.

NOTE: In the companion repo EfCoreInAction I have created a branch called Chapter05-NetCore3. In this I have updated all the projects to NetStandard2.1/netcoreapp3.0. I also swapped out AutoFac for the NetCore.AutoRegisterDi library – see this article about why I did that.

Chapter 6 – Configuring nonrelational properties

There is a small change to backing fields affecting how data is loaded (pages 172 and 173). In EF Core 2 it defaulted to loading data via the property, if present. In EF Core 3 it defaults to loading data via the private field, which is a better default – see this link for more on this.
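
As a quick, hedged illustration of why the field default is better, assume a property with validation logic in its setter:

public class Book
{
    private string _title;

    public string Title
    {
        get => _title;
        //this validation should only run on new values set by your code;
        //in EF Core 3 loading existing data writes straight to _title
        set => _title = value ?? throw new ArgumentNullException(nameof(value));
    }
}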

Chapter 7 – Configuring relationships

Pages 195 to 199 – Owned types

EF Core 3 changes how you map an Owned type to another table (page 198). Now you have to provide the table name via the second parameter of the Fluent API’s OwnsOne method. See this link for more information.
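
Here is a hedged sketch of what that configuration looks like (the OrderInfo/DeliveryAddress names are made up for illustration):

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //EF Core 3: the table mapping is supplied in the second parameter,
    //rather than chaining ToTable after OwnsOne as in EF Core 2
    modelBuilder.Entity<OrderInfo>()
        .OwnsOne(p => p.DeliveryAddress, b => b.ToTable("Addresses"));
}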

NOTE: In appendix B, page 470, I cover the [Owned] attribute that EF Core 2.1 provided, which can be used instead of Fluent API configuration. I find this a simpler way to configure a class as an Owned type than using the Fluent API, but it can’t change the table mapping.

Pages 199 to 203 – Table per hierarchy section

In the table per hierarchy section I had to configure the discriminator’s AfterSaveBehavior behaviour to Save (see the bold content in listing 7.22). In NET Core 3 you cannot do that anymore because the Metadata property is read-only, but the issue which required me to set the AfterSaveBehavior property seems to have been fixed, so I don’t need that feature.

Pages 203 to 205 – Table splitting

By request, the way that table splitting works has changed: the dependent part (the BookDetail class in my example) is now optional. This is a slightly complicated change, so I refer you to the EF Core 3 breaking changes section that covers this.

Chapter 8 – Configuring advanced features and handling concurrency conflicts

There is one small improvement around user-defined functions (pages 209 to 213). In listing 8.4 I said you always needed to define the schema name when configuring user-defined functions. In EF Core 3 you don’t need to set the schema name if you are using the default schema name.
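
A hedged sketch of that configuration in EF Core 3, where MyUdfMethods.AverageVotes is a made-up static method representing the user-defined function:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //no .HasSchema("dbo") call is needed when the function
    //lives in the database's default schema
    modelBuilder.HasDbFunction(
        typeof(MyUdfMethods).GetMethod(nameof(MyUdfMethods.AverageVotes)));
}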

Chapter 9 – Going deeper into the DbContext

Pages 256 to 259 – Using raw SQL commands in EF Core

The EF Core SQL commands have been renamed to make the commands safer from SQL injection attacks. There are two versions of each method: one using C#’s string interpolation and one that uses {0} and parameters, e.g.

context.Books
    .FromSqlRaw("SELECT * FROM Books WHERE Price < {0}", priceLimit)
    .ToList();
context.Books
    .FromSqlInterpolated($"SELECT * FROM Books WHERE Price < {priceLimit}")
    .ToList();

My unit tests also show that you can’t add (and don’t need) the IgnoreQueryFilters() method before an EF Core SQL command (you did need to in EF Core 2). It seems that methods like FromSqlRaw don’t have the query filter added to the SQL. That’s a nice tidy-up, as the filter often made the SQL command invalid, but please remember to add the query filter to your SQL if you need it.

Page 261 – Accessing EF Core’s view of the database

In EF Core 3 access to the database schema information has changed. Instead of using the Relational method you can now directly access database information with methods such as GetTableName and GetColumnName. For an example of this have a look at the DatabaseMetadata extensions in the Chapter13-Part2-NetCore3 branch.
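
Here is a minimal sketch of the EF Core 3 form, assuming an entity class called Book with a Title property:

var entityType = context.Model.FindEntityType(typeof(Book));
var tableName = entityType.GetTableName();
var columnName = entityType
    .FindProperty(nameof(Book.Title))
    .GetColumnName();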

Chapter 10 – Useful software patterns for EF Core applications

No changes.

Chapter 11 – Handling database migrations

Pages 303 to 306 – Stage 1: creating a migration

In figure 11.1 I showed you how to create an IDesignTimeDbContextFactory if you have a multi-project application. But if your EF Core 3 code is in an ASP.NET Core 3 application you don’t need to do this, because EF Core 3 will call CreateHostBuilder in ASP.NET Core’s Program class to get an instance of your application DbContext. That’s a small but nice improvement.

Chapter 12 – EF Core performance tuning

All the suggestions for improving performance are just as useful in EF Core 3, but here are a few adjustments for this chapter.

Pages 338 to 339 – Accessing the logging information produced by EF Core

The code that showed how to set up logging of EF Core is superseded. The correct way to get logging information from EF Core is to use UseLoggerFactory when defining the options. Here is an example of how this could work (see the full code in my EfCore.TestSupport library).

var logs = new List<LogOutput>();
var options = new DbContextOptionsBuilder<BookContext>()
    .UseLoggerFactory(new LoggerFactory(new[]
    {
        new MyLoggerProviderActionOut(log => logs.Add(log))
    }))
    .UseSqlite(connection)
    .Options;
using (var context = new BookContext(options))
//… your code to test

In this section I also talked about the QueryClientEvaluationWarning, which is replaced in EF Core 3 with a hard exception if the query can’t be translated to a single, valid database access command.

NOTE: My EfCore.TestSupport library has methods to setup Sqlite in-memory and SQL Server database with logging – see this documentation page for examples of how to do this.

Pages 347 to 348 – Allowing too much of a data query to be moved into the software side

With EF Core 3’s more stringent approach to query translation, any use of client vs. server evaluation will cause an exception. That means that “allowing too much of a data query to be moved into the software side” can’t happen, but I do suggest using unit tests to check your queries so that you don’t get exceptions in your live application.

Chapter 13 – A worked example of performance tuning

I plan to write a separate article to look at EF Core performance, but in summary EF Core 3 is better at handling relationships that are collections, e.g. one-to-many and many-to-many, because it loads the collections with the primary data. But there are still areas where you can improve EF Core’s performance.

NOTE: I wrote an article “Entity Framework Core performance tuning – a worked example” soon after I finished the book which summarises chapter 13. I plan to write a new article to look at the performance tuning that EF Core can offer.

Chapter 14 – Different database types and EF Core services

Pages 403 to 405 – Finding book view changes

In table 14.1 I show the BookId and ReviewId having temporary key values after a context.Add method has been executed. In EF Core 2 these temporary key values were put in the actual key properties, but in EF Core 3 the temporary values are stored in the entity’s tracking information, leaving the key properties unchanged. To find out why, see this explanation.

Pages 413 to 416 – Replacing an EF Core service with your own modified service

This section is out of date because the internals of EF Core 3 have changed a lot. The concept is still valid but the actual examples don’t work any more, as the names or signatures of the methods have changed.

NOTE: I have updated my EfSchemaCompare feature in my EfCore.TestSupport library. This calls part of the scaffolder feature in EF Core to read the model of the database we are checking. You can find the EF Core 2 and EF Core 3 code that creates the DatabaseModel service that reads the database’s Schema.

Chapter 15 – Unit testing EF Core applications

Simulating a database—using an in-memory database (pages 430 to 431)

In listing 15.7 I talk about setting the QueryClientEvaluationWarning to catch poorly performing queries. This isn’t needed any more, as the new query translation always throws an exception if the query can’t be translated to a single, valid database access command.

Capturing EF Core logging information in unit testing (pages 443 to 446)

Just as I pointed out in chapter 12, the way to capture EF Core logging information has changed (see the chapter 12 sub-section for the corrected code).

NOTE: I recommend using the EfCore.TestSupport NuGet package which was built as part of chapter 15. I have updated this library to handle both EF Core >=2.1 and EF Core >=3.0. This library has various …CreateOptionsWithLogging methods that set up options with logging included – see the docs for these methods for an example of how this works.

Extra information

One of the big (biggest?) changes in EF Core 3 is the support of the NoSQL database, Cosmos DB. This is a big development and means EF Core can now support both SQL and NoSQL databases. I expect to see more NoSQL database providers in the future.

In Chapter 14 I built a CQRS architecture system using two databases: SQL Server for all writes and some reads and a NoSQL database (RavenDb) for the high-performance read side. I plan to write a new article doing the same thing, but using EF Core 3 and Cosmos DB. 

NOTE: I already wrote an article “Building a robust CQRS database with EF Core and Cosmos DB” when the first version of the Cosmos DB database provider was out in EF Core 2.2. It worked, but was very limited and slow. I plan to write a new version with EF Core 3 and look at the performance.

List of EF Core 3 updates to the EfCoreInAction repo

I have updated three branches of the companion EfCoreInAction repo. They are:

  • Chapter01-NetCore3: This is useful for people who want to start with a simple application. It now runs as a netcoreapp3.0 console application with EF Core 3.0.
  • Chapter05-NetCore3: This provides you with an ASP.NET Core 3.0/EF Core 3.0 application and the related unit tests.
  • Chapter13-Part3-NetCore3: I mainly did this to get all the unit tests from chapters 2 to 13 updated to EF Core 3. That way I could ensure I hadn’t missed any changes.

NOTE: If you have Visual Studio Code (VSCode) then there is a great extension called GitLens which allows you to compare GIT branches. You might like to use that to compare the Chapter13-Part3-NetCore21 and Chapter13-Part3-NetCore3 to see for yourself what has changed.

Conclusion

As you have seen, the EF Core 3 update hardly changes the way you write your code, but there are very big changes inside. Looking at the breaking changes, I can see the reasons for all of them, and quite a few were things I had bumped up against. Overall, I think updating an application from EF Core 2 to EF Core 3 shouldn’t be that difficult from the coding side.

The part that could catch you out is the query overhaul. This could cause some of your queries to fail with an exception. If you already checked for QueryClientEvaluationWarning in EF Core 2 you should be OK but if you didn’t then EF Core 3 will throw an exception on any queries that don’t translate properly to database commands.

You can get around the exception by breaking the query into two parts using AsEnumerable (see this example and the sketch below); the exception message provides information on what part of the query is causing a problem. The upside of this change is that you now know none of your queries are running parts of the query in your application instead of in the database.
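
Here is a minimal sketch of that split, with MyCustomFilter standing in for whatever part of the query can’t be translated:

var results = context.Books
    .Where(b => b.Price > 10)      //translated to SQL, runs in the database
    .AsEnumerable()                //everything after this runs in the app
    .Where(b => MyCustomFilter(b))
    .ToList();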

Opening up EF Core to NoSQL is potentially an even bigger change, as NoSQL databases are a very important part of the database scene. In chapter 14 of my book I used a CQRS database architecture with a NoSQL database (RavenDb) for the read-side database, which significantly improved the performance of my application (see this article). Please do look out for the article “Building a robust CQRS database with EF Core and Cosmos DB – NET Core 3” when it comes out, which repeats this approach but uses Cosmos DB via EF Core 3.

Happy coding!

If you have an ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.

Part 7 – Adding the “better ASP.NET Core authorization” code into your app

Last Updated: September 29, 2019 | Created: September 28, 2019

I have written a series about a better way to handle authorization in ASP.NET Core which adds extra ASP.NET Core authorization features – things like the ability to change role rules without having to edit and redeploy your web app, or the ability to impersonate a user without needing their password. This series has quickly become the top set of articles on my web site, with lots of comments and questions.

A lot of people like what I describe in these articles, but they have problems extracting the bits of code needed to implement the features. This article, with its improved PermissionAccessControl2 repo, is here to make it much easier to pick out the code you need and put it into your ASP.NET Core application. I then give you a step-by-step example of selecting a “better authorization” feature and putting it into an ASP.NET Core 3.0 app.

I hope this will help people who like the features I describe but found the code really hard to understand. Here is a list of all the articles in the series so that you can find information on each feature:

TL;DR; – summary

  • The “better authorisation” series provides a number of extra features to ASP.NET Core’s default Role-based authorisation system.
  • I have done a major refactor to the companion PermissionAccessControl2 repo on GitHub to make it easier for someone to copy the code they need for the features they want to put in their code.
  • I have also built separate authorization setup methods for the different combinations of my authorization features. This makes it easier for you to pick the feature set that works for you.
  • I have also improved/simplified some of the code to make it easier to understand.
  • I then give a step-by-step description of what you need to do to add a “better authorization” feature/code into your ASP.NET Core application.
  • I have made a copy of the code I produced in these steps available via the PermissionsOnlyApp repo in GitHub.

Setting the scene – the new structure of the code

Over the six articles I have added new features on top of the features that I had already implemented. This meant the final version in the PermissionAccessControl2 repo was really complex, which made it hard work for people to pick out the simple version that they needed. In this big refactor I have split the code into separate projects so that it’s easy to see what each part of the application does. The diagram below shows the updated application projects and how they link to each other.

This structure may look complex, but many of these projects contain fewer than ten classes. These projects allow you to copy classes etc. from the projects that cover the features you want, and ignore the projects that cover features you don’t want.

The other problem area was the database. Again, I wanted the classes relating to the added authorization code kept separate from the classes I used to provide a simple, multi-tenant example. This led me to have a non-standard database that was hard to create/migrate. The good news is you only need to copy over parts of the ExtraAuthorizeDbContext DbContext into your own DbContext, as you will see in the next sections.

The Seven versions of the features

Before I get into the step-by-step guide, I wanted to list the seven different classes that provide different mixes of features, so that you can easily pick and choose.

In fact, the PermissionAccessControl2 application allows you to configure the different parts of the application so that you can try different features without editing any code. The feature selection is done by the “DemoSetup” section in the appsettings.json file. The AddClaimsToCookies class reads the “AuthVersion” value on startup to select what code to use to set up the authorization. Here are the seven classes that can be used, with their “AuthVersion” value.

  1. Applied when you log in (via IUserClaimsPrincipalFactory)
    1. LoginPermissions: AddPermissionsToUserClaims class.
    2. LoginPermissionsDataKey: AddPermissionsDataKeyToUserClaims class.
  2. Applied via Cookie event
    1. PermissionsOnly: AuthCookieValidatePermissionsOnly class
    2. PermissionsDataKey: AuthCookieValidatePermissionsDataKey class
    3. RefreshClaims: AuthCookieValidateRefreshClaims class
    4. Impersonation: AuthCookieValidateImpersonation class
    5. Everything: AuthCookieValidateEverything class

Having all these versions makes it easier for you to select the feature set that you need for your specific application.

Step-by-Step addition to your ASP.NET Core app

I am now going to take you through the steps of copying over the most asked-for feature in my “better authorization” series – the Roles-to-Permissions system, which I describe in the first article. This gives you two features that aren’t in ASP.NET Core’s Roles-based authorization, that is:

  • You can change the authorization rules in your application without needing to edit code and re-deploy your application.
  • You have simpler authorization code in your application (see section “Role authorization and its limitations” for more on this).

I am also going to use the more efficient UserClaimsPrincipalFactory method (see this section of article 3 for more on this) of adding the Permissions to the user’s claims. This adds the correct Permissions to the user’s claims when they log in.

And because .NET Core 3 is now out, I’m going to show you how to do this for an ASP.NET Core 3.0 MVC application.

NOTE: I have great respect for Microsoft’s documentation, which has become outstanding with the advent of NET Core. The amount of information and updates on NET Core has been especially good – fantastic job everyone. I have also come to love Microsoft’s step-by-step format, which is why I have tried to do the same in this article. I don’t think I made it as good as Microsoft’s, but I hope it helps someone.

Step 1: Is your application ready to take the code?

Firstly, you must have some form of authorisation system, i.e. something that checks the user is allowed to log in. It can be any form of authorisation system (my original client used Auth0 with the OAuth2 setup in ASP.NET Core). The code in my example PermissionAccessControl2 repo has some approaches that work with most authorisation systems, but some of the more complex ones only work with systems that store claims in a cookie (although the approach could be altered to work with tokens etc.).

In my example I’m going to go with an ASP.NET Core 3.0 MVC app with Authentication set to “Individual User Accounts->Store user accounts in-app”. This means there is a database added to your application, and by default it uses cookies to hold the logged-in user’s Claims. Your system may be different, but with this refactor it’s easier for you to work out what code you need to copy over.

Step 2: What code will we need from the PermissionAccessControl2 repo?

You need to start out by deciding which parts of the PermissionAccessControl2 projects you need. This will depend on what features you want. Because the projects have names that fit the features, it is easier to find things.

In my example I’m only going to use the core Roles-to-Permissions feature so I only need the code in the following projects:

  • AuthorizeSetup: just the AddPermissionsToUserClaims class
  • FeatureAuthorize: All the code.
  • DataLayer: A cut-down version of the ExtraAuthorizeDbContext DbContext (no DataKey, no IAuthChange) and some, but not all, of the classes in the ExtraAuthClasses folder.
  • PermissionParts: All the code.

That means I only need to look at about 50% of the projects in the PermissionAccessControl2 repo.

Step 3: Where should I put the PermissionAccessControl2 code in my app?

This is an architectural/style decision and there aren’t any firm rules here. I lean towards separating my apps into layers (see the diagram in this section of an article on business logic). There are lots of ways I could do it, but I went for a simple design as shown below.

Step 4: Moving the PermissionsParts into DataLayer

That was straightforward – no edits needed and no NuGet packages needed to be installed.

Step 5: Moving the ExtraAuthClasses into DataLayer

The ExtraAuthClasses contain code for all features, which makes this part the most complex step as you need to remove parts that aren’t used by features you want to use. Here is a list of what I needed to do.

5.a Remove unwanted ExtraAuthClasses.

I start by removing any ExtraAuthClasses that I don’t need because I’m not using those features. In my example this means I delete the following classes:

  • TimeStore – only needed for “refresh claims” feature.
  • UserDataAccess, UserDataAccessBase and UserDataHierarchical – these are about the Data Authorize feature, which we don’t want.

5.b Fix errors

Because I didn’t copy over projects/code for features I don’t want, some things showed up as compile errors, and some just needed to be changed/deleted.

  • Make sure the Microsoft.EntityFrameworkCore NuGet package has been added to the DataLayer.
  • You will also need the Microsoft.EntityFrameworkCore.Relational NuGet package in DataLayer for some of the configuration.
  • Remove the IChangeEffectsUser and IAddRemoveEffectsUser interfaces from classes – they are for the “refresh claims” feature.
  • Remove any using statements that reference the left-out/moved projects.

NOTE: I used some classes/interfaces from my EfCore.GenericServices library to handle error handling in some of the methods in my ExtraAuthClasses. But I am building this app three days after the release of NET Core 3.0 and I haven’t (yet) updated GenericServices to NET Core 3.0 so I added a project called GenericServicesStandIn to host the Status classes.

5.c Adding ExtraAuthClasses to your DbContext

I assume you will have a DbContext that you will use to access the database. Your application DbContext might be quite complex, but in my example PermissionOnlyApp I started with a basic application DbContext as shown below.

public class MyDbContext : DbContext
{
    //your DbSet<T> properties go here

    public MyDbContext(DbContextOptions<MyDbContext> options)
        : base(options)
    { }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //Your configuration code will go here
    }
}

Then I needed to add the DbSet<T> for the ExtraAuthClasses I need, plus some extra configuration too.

public class MyDbContext : DbContext
{
    //your DbSet<T> properties go here

    //Add the ExtraAuthClasses needed for the feature you have selected
    public DbSet<UserToRole> UserToRoles { get; set; }
    public DbSet<RoleToPermissions> RolesToPermissions { get; set; }
    public DbSet<ModulesForUser> ModulesForUsers { get; set; }

    public MyDbContext(DbContextOptions<MyDbContext> options)
        : base(options)
    { }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //Your configuration code will go here

        //ExtraAuthClasses extra config
        modelBuilder.Entity<UserToRole>().HasKey(x => new { x.UserId, x.RoleName });
        modelBuilder.Entity<RoleToPermissions>()
            .Property("_permissionsInRole")
            .HasColumnName("PermissionsInRole");
    }
}

Now you can replace the references to ExtraAuthorizeDbContext in some of the ExtraAuthClasses with references to your own application DbContext.

Step 6: Move the code into the RolesToPermissions project

Now we need to move the last of the Roles-to-Permissions code into the RolesToPermissions project. Here are the steps I took.

6a. Move code into this project from the PermissionAccessControl2 version.

What you move in depends on what features you want. If you have some of the complex features, like refreshing the user’s claims on auth changes or user impersonation, you might want to put each part in a separate folder. But in my example I only need the code from one project plus one class from another. Here is what I did.

  • I moved all the code in the FeatureAuthorize project (including the code in the PolicyCode folder).
  • I then copied over the specific setup code I needed for the feature I wanted. In this case this was the AddPermissionsToUserClaims class.

At this point there will be lots of errors, but before we sort these out you have to select the specific setup code you need.

6b. Add project and NuGet packages needed by the code

This code accesses a number of support classes for it to work. The error messages on the “usings” in the code will show you what is missing. The most obvious is that this project needs a reference to the DataLayer. Then there are NuGet packages – I only needed three, but you might need more.

6c. Fix references and “usings”

At this point I still have compile errors, either because the application’s DbContext has a different name, or because the code has some old (bad) “usings”. The steps are:

  • The code refers to ExtraAuthorizeDbContext, but now it needs to reference your application DbContext. In my example that is called MyDbContext. You will also need to add a “using DataLayer” to reference that.
  • You will have a number of incorrect “usings” at the top of various files – just delete them.

Step 7 – setting up the ASP.NET Core app

We have added all the Roles-to-Permissions code we need, but the ASP.NET Core code doesn’t use it yet. In this section I will add code to the ASP.NET Core Startup class to link the AddPermissionsToUserClaims class into the UserClaimsPrincipalFactory. I also assume you have a database you already use, to which I have added the ExtraAuthClasses. Here is the code that does all this:

public void ConfigureServices(IServiceCollection services)
{
    // … other standard service registration removed for clarity

    //This registers your database, which now includes the ExtraAuthClasses
    services.AddDbContext<MyDbContext>(options =>
        options.UseSqlServer(
              Configuration.GetConnectionString("DemoDatabaseConnection")));

    //This registers the code to add the Permissions on login
    services.AddScoped<
         IUserClaimsPrincipalFactory<IdentityUser>, 
         AddPermissionsToUserClaims>();

    //Register the Permission policy handlers
    services.AddSingleton<IAuthorizationPolicyProvider,
         AuthorizationPolicyProvider>();
    services.AddSingleton<IAuthorizationHandler, PermissionHandler>();
}

NOTE: In my example app I also set the options.SignIn.RequireConfirmedAccount property to false in the AddDefaultIdentity method that registers ASP.NET Core Identity. This allows me to log in via a user I add at startup (see next section). This demo user won’t have its email verified, so I need to turn off that constraint. In a real system you might not need that.

At this point all the code is linked up, but we need an EF Core migration to add the ExtraAuthClasses to your application DbContext. Here is the command I ran in Visual Studio’s Package Manager Console – note that it has extra parameters because there are two DbContexts (the other one is the ASP.NET Core Identity ApplicationDbContext).

Add-Migration AddExtraAuthClasses -Context MyDbContext -Project DataLayer

Step 8 – Add Permissions to your Controller actions/Razor Pages

All the code is in place, so now you can use Permissions to protect your ASP.NET Core Controller actions or Razor Pages. I explain the Roles-to-Permissions concept in detail in the first article, so I’m not going to cover that again. I’m just going to show you how to a) add a Permission to the Permissions Enum and then b) protect an ASP.NET Core Controller action with that Permission.

8a. Edit the Permissions.cs file

In the code you copied into the PermissionsParts folder in the DataLayer you will find a file called Permissions.cs, which defines the enum called “Permissions”. It has some example Permissions which you can remove, apart from the one named “AccessAll” (that is used by the SuperAdmin user). Then you can add the Permission names you want to use in your application. I typically give each Permission a number, but you don’t have to – the compiler will do that for you. Here is my updated Permissions Enum.

public enum Permissions : short //I set this to short because the PermissionsPacker stores them as Unicode chars 
{
    NotSet = 0, //error condition

    [Display(GroupName = "Demo", Name = "Demo", Description = "Demo of using a Permission")]
    DemoPermission = 10,

    //This is a special Permission used by the SuperAdmin user. 
    //A user who has this permission can access any other permission.
    [Display(GroupName = "SuperAdmin", Name = "AccessAll", Description = "This allows the user to access every feature")]
    AccessAll = Int16.MaxValue, 
}

8b. Add HasPermission Attribute to a Controller action

To protect an ASP.NET Core Controller action or Razor Page you add the “HasPermission” attribute to it. Here is an example from the PermissionOnlyApp.

public class DemoController : Controller
{
    [HasPermission(Permissions.DemoPermission)]
    public IActionResult Index()
    {
        return View();
    }
}

For this action to be executed the caller must a) be logged in and b) either have the Permission “DemoPermission” in their set of Permissions, or be the SuperAdmin user who has the special Permission called “AccessAll”.

At this point you are good to go apart from creating the databases.

Step 9 – extra steps to make it easier for people to try the application

I could stop there, but I like to make sure someone can easily run the application to try it out. I am therefore going to add:

  • Some code in Program to automatically migrate both databases on startup (NOTE: This is NOT the best way to do migrations as it fails in certain circumstances – see my articles on EF Core database migrations).
  • I’m going to add a SuperAdmin user to the system on startup if they aren’t there. That way you always have a admin user available to log in on startup – see this section from the part 3 article about the SuperAdmin and what that user can do.

I’m not going to detail this because you will have your own way of setting up your application, but it does mean you can just run this application from VS2019 and it should work OK (it does use the SQL localdb).

NOTE: The first time you start the application it will take a VERY long time to start up (> 25 seconds on my system). This is because it is applying migrations to two databases. The second time will be much faster.

Conclusion

Quite a few people have contacted me with questions about how they can add the features described in this series into their code. I’m sorry it was so complex, and I hope this new article helps. I also took the time to improve/simplify some of the code. Hopefully this will make it easier to understand and transfer the ideas and code that go with these articles.

All the best with your ASP.NET Core applications!

Adding user impersonation to an ASP.NET Core web application

Last Updated: September 28, 2019 | Created: August 21, 2019

If you supply a service via a web application (known as Software as a Service, SaaS for short) then you need to support users that have problems with your site. One useful tool for your support team is to be able to access your service as if you were the customer who is having the problem. Clearly you don’t want to use their password (which you shouldn’t know anyway) so you need another way to do this. In this article I describe what is known as user impersonation and describe one way to implement this feature in an ASP.NET Core web app.

NOTE: User impersonation does have privacy issues around user data, because the support person will be able to access their data. Therefore, if you do use user impersonation you need to either get specific permission from a user before using this feature, or add it to your T&Cs.

I have done a lot of work on ASP.NET Core authorization features for both my clients and readers, and this article adds another useful feature to a long series looking at feature and data authorization. The whole series consists of:

There is an example web application in a GitHub repo called PermissionAccessControl2 which goes with articles 3 to 6. This is an open-source (MIT licence) application that you can clone and run locally. By default it uses in-memory databases which are seeded with demo data on start-up. That means it’s easy to run and see how the “user impersonation” feature works in practice.

NOTE: To try out the “user impersonation” you should run the PermissionAccessControl2 ASP.NET Core project. Then you need to log in as the SuperAdmin user (email Super@g1.com with password Super@g1.com) and go to Impersonation -> Pick user. You can see the changes by using the User’s menu, or by using one of the Shop features.

TL;DR; – summary

  • The user impersonation feature allows a current user, normally a support person, to change their feature and data authorization settings to match another user. This means that the support user will experience the system as if they are the impersonated user.
  • User impersonation is useful in SaaS systems for investigating/fixing problems that customers encounter. This is especially true in systems where each organization/user has their own data, as it allows the support person to access the user’s data.
  • I add this feature to my “better authorization” system as described in this series, but the described approach can also be applied to ASP.NET Core Identity systems using Roles etc.
  • All the code, plus a working ASP.NET Core example is available via the GitHub repo called PermissionAccessControl2. It is open-source.

Setting the scene – what should a “user impersonation” feature do?

User impersonation gives you access to the services and data that a user has. Typical things you might use user impersonation for are:

  1. If a customer is having problems, then user impersonation allows you to access a customer’s data as if you were them.
  2. If a customer reports that there is a bug in your system you can check it out using their own settings and data.
  3. Some customers might pay for enhanced support where your staff can enter data or fix problems that they are struggling with.

NOTE: In the conclusion I list some other benefits that I have found in development and testing.

So, to do any of these things the support person must have:

  • The same authorization settings, e.g. ASP.NET Core Roles, that the user has (especially for items 1 and 2) – known as feature authorization, which in my system is handled by Permissions.
  • Access to the same data that the user has – known as data authorization, which in my system is handled by a DataKey.

NOTE: Many SaaS systems have separate data per organization and/or per user. This is known as a multi-tenant system and I cover this in Part 2 and Part 4 articles in the “better authorization” series.

But I don’t want to change current user’s “UserId”, which holds a unique value for each user (e.g. a string containing a GUID). By keeping the support user’s UserId I can use it in any logging or tracking of changes to the database. That way if there are any problems you can clearly see who did what.

There is another part of the user’s identity called “Name”, which typically holds the user’s email address or name. Because of data privacy laws such as GDPR I don’t use this in logging or tracking.

So, taking all these parts of the user’s identity, here is a list of what I want to impersonate and what I leave as the original (support) user’s setting:

  • Feature authorization, e.g. Roles, Permissions – Use impersonated user’s settings (see note below)
  • Data authorization, e.g. data key – Use impersonated user’s settings
  • UserId, i.e. unique user value – Keep support user’s UserId
  • Name, e.g. their email address – Depends: in my system I keep support user’s UserName

NOTE: In most impersonation cases the feature authorizations should change to the settings of the user you are impersonating – that way the support user will see what the user sees. But sometimes it’s useful to keep the feature authorization of the support person who is doing the impersonating – that might unlock more powerful features that help in quickly fixing some more complex issues. I provide both options in my implementation.

Now I will cover how to add a “user impersonation” feature to an ASP.NET Core web application.

The architecture of my “user impersonation” feature

To implement my “user impersonation” feature I have tapped into ASP.NET Core’s application cookie event called OnValidatePrincipal (I use this a lot in the “better authorization” series). This event happens on every HTTP request and provides me with the opportunity to change the user’s Claims (all the user’s settings are held in Claim classes, which are stored as key/value strings in the authentication cookie or token).

My user impersonation code is controlled by the existence of a cookie defined in the class ImpersonationCookie. As shown in the diagram below, the code linked to the OnValidatePrincipal event looks for this cookie, goes into impersonation mode while that cookie is present, and reverts back to normal when the impersonation cookie is deleted.

I will describe this process in the following stages:

  1. A look at the impersonation cookie
  2. A look at the ImpersonationHandler
  3. The adaptions to the code called by the OnValidatePrincipal event.
  4. The impersonation services.
  5. Making sure that the impersonation feature is robust.

1. A look at the impersonation cookie

I use an ImpersonationCookie to control impersonation: it holds the data needed to set up the impersonation, and its existence keeps the impersonation going. The code handling the OnValidatePrincipal event can read that cookie via the event’s CookieContext, which includes the HttpContext. The cookie payload (a string) comes from a class called ImpersonationData that holds this data and can convert it to/from a string. The three values are:

  1. The UserId of the user to be impersonated
  2. The Name of the user to be impersonated (only used to display the name of the user being impersonated)
  3. A Boolean called KeepOwnPermissions (see next paragraph).

While impersonating a user the support person usually takes on the Permissions of the impersonated user, so that the application reacts as if the impersonated user was logged in. However, I have seen situations where it’s useful to keep the support user’s more comprehensive Permissions, for instance to access extra commands that the impersonated user wouldn’t normally have access to. That is why I have the KeepOwnPermissions value, which if true will keep the support user’s Permissions.

All three of these values have some (admittedly low) security implications, so I use ASP.NET Core’s Data Protection feature to encrypt/decrypt the string holding the ImpersonationData.
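
As a sketch, the protect/unprotect calls look like this – the purpose string is my assumption, and the real code wraps this up inside the cookie class.

//create a protector from the injected IDataProtectionProvider
var protector = _protectionProvider.CreateProtector("ImpersonationCookie");
//encrypt the packed ImpersonationData before it is written to the cookie
var encrypted = protector.Protect(impersonationData.GetPackImpersonationData());
//…and decrypt it when the cookie is read back
var decrypted = protector.Unprotect(encrypted);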

2. A look at the ImpersonationHandler

I have put as much of the code that handles the impersonation as possible into one class called ImpersonationHandler. On being created it a) checks if the impersonation cookie exists and b) checks if an impersonation claim is found in the current user’s claims. From these two tests it can work out what state the current user is in. The states are listed below, followed by a sketch of how they can be worked out:

  1. Normal: Not impersonating
  2. Starting: Starting impersonation.
  3. Impersonating: Is impersonating
  4. Stopping: Stopping impersonation
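
Here is a minimal sketch of how those two tests combine to give the state. This is my own simplified version – the names are assumptions, not the actual PermissionAccessControl2 code.

public enum ImpersonationStates { Normal, Starting, Impersonating, Stopping }

private static ImpersonationStates CalcState(
    bool cookieExists, bool impersonationClaimExists)
{
    if (cookieExists)
        //a cookie without the claim means impersonation is just starting
        return impersonationClaimExists
            ? ImpersonationStates.Impersonating
            : ImpersonationStates.Starting;
    //a claim without the cookie means impersonation is stopping
    return impersonationClaimExists
        ? ImpersonationStates.Stopping
        : ImpersonationStates.Normal;
}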

The only time the Permissions and DataKey need to be recalculated is when the state is Starting or Stopping, and the ImpersonationHandler has a property called ImpersonationChange which is true in those cases. This minimises the recalculations to provide good performance. (Note: a recalculation will also happen if you are using the “refreshing user’s claims” feature described in the Part 5 article.)

The recalculation of the Permissions and DataKey needs a UserId, and there are two public methods, GetUserIdForWorkingOutPermissions and GetUserIdForWorkingDataKey, which provide the correct UserId based on the impersonation state and the “KeepOwnPermissions” setting (see step 1). (I used two separate methods for the Permissions and DataKey because the “KeepOwnPermissions” setting affects the UserId returned for the Permissions, but doesn’t affect the DataKey’s UserId.)

The other public method needed to set the user claims is called AddOrRemoveImpersonationClaim. Its job is to add or remove the “Impersonalising” claim. This claim is used to a) tell the ImpersonationHandler whether it is already impersonating and b) hold the Name of the user being impersonated, which gives a visual display of which user you are impersonating.
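
A minimal sketch of that method, reusing the state enum from the sketch above (the _state and _impersonatedUserName fields are my assumptions):

public void AddOrRemoveImpersonationClaim(List<Claim> newClaims)
{
    if (_state == ImpersonationStates.Starting)
        //add the claim holding the impersonated user's Name
        newClaims.Add(new Claim("Impersonalising", _impersonatedUserName));
    else if (_state == ImpersonationStates.Stopping)
        //make sure the claim doesn't survive into the new ClaimsPrincipal
        newClaims.RemoveAll(x => x.Type == "Impersonalising");
}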

3. The adaptions to the code called by the OnValidatePrincipal event.

Anyone who has been following this series will know that I tap into the authorization cookie’s OnValidatePrincipal event. This event happens on every HTTP request and allows the claims and the authorization cookie to be changed. For performance reasons you do need to minimise what you do in this event, as it runs so often.

I have already described in detail the code called by the OnValidatePrincipal event here in Part 1. Below is the updated PermissionAccessControl2 code, now with the impersonation added. There are some notes at the end that only describe the parts added to provide the impersonation feature.

public async Task ValidateAsync(CookieValidatePrincipalContext context)
{
    var extraContext = new ExtraAuthorizeDbContext(
        _extraAuthContextOptions, _authChanges);
    var rtoPLazy = new Lazy<CalcAllowedPermissions>(() => 
        new CalcAllowedPermissions(extraContext));
    var dataKeyLazy = new Lazy<CalcDataKey>(() => new CalcDataKey(extraContext));

    var originalClaims = context.Principal.Claims.ToList();
    var impHandler = new ImpersonationHandler(context.HttpContext, 
        _protectionProvider, originalClaims);
    
    var newClaims = new List<Claim>();
    if (originalClaims.All(x => 
            x.Type != PermissionConstants.PackedPermissionClaimType) ||
        impHandler.ImpersonationChange ||
        _authChanges.IsOutOfDateOrMissing(AuthChangesConsts.FeatureCacheKey, 
            originalClaims.SingleOrDefault(x => 
                 x.Type == PermissionConstants.LastPermissionsUpdatedClaimType)?.Value,
            extraContext))
    {
        var userId = impHandler.GetUserIdForWorkingOutPermissions();
        newClaims.AddRange(await BuildFeatureClaimsAsync(userId, rtoPLazy.Value));
    }

    if (originalClaims.All(x => x.Type != DataAuthConstants.HierarchicalKeyClaimName) 
        || impHandler.ImpersonationChange)
    {
        var userId = impHandler.GetUserIdForWorkingDataKey();
        newClaims.AddRange(BuildDataClaims(userId, dataKeyLazy.Value));
    }

    if (newClaims.Any())
    {
        newClaims.AddRange(RemoveUpdatedClaimsFromOriginalClaims(
             originalClaims, newClaims)); 
        impHandler.AddOrRemoveImpersonationClaim(newClaims);
        var identity = new ClaimsIdentity(newClaims, "Cookie");
        var newPrincipal = new ClaimsPrincipal(identity);
        context.ReplacePrincipal(newPrincipal);
        context.ShouldRenew = true;             
    }
    extraContext.Dispose();
}

The changes to add impersonation to the ValidateAsync code are:

  • Lines 10 to 11: I create a new ImpersonationHandler to use throughout the method.
  • Lines 16 and 27: the “impHandler.ImpersonationChange” property will be true if impersonation is starting or stopping, which are the times when the user’s claims need to be recalculated.
  • Lines 22 and 29: the UserId used to calculate the Permissions and DataKey can alter when in impersonation mode. These impHandler methods control which UserId value is used (the support user’s UserId or the impersonated user’s UserId).
  • Line 37: working out the impersonation state relies on having an “Impersonalising” claim while impersonation is active. This method makes sure the “Impersonalising” claim is added on starting impersonation, or removed when stopping impersonation.

The other thing to note is that the code above contains the “refresh claims” feature described in the Part 5 article. This means that while impersonating a user the claims will be recalculated. The ImpersonationHandler is designed to handle this, and it will continue to impersonate a user during a “refresh claims” event.

NOTE: Even if you don’t need the “refresh claims” feature you might like to take advantage of the “How to tell your front-end that the Permissions have changed” I describe in Part 5. This allows a front-end framework, like React.js, Angular.js, Vue.js etc. to detect that the Permissions have changed so that it can show the appropriate displays/links.

4. The Impersonation services

The ImpersonationService class has two methods: StartImpersonation and StopImpersonation. They have some error checks, but the actual code is really simple, because all they do is create or delete the impersonation cookie respectively. The code for both methods is shown below:

public string StartImpersonation(string userId, string userName, 
    bool keepOwnPermissions)
{
    if (_cookie == null)
        return "Impersonation is turned off in this application.";
    if (!_httpContext.User.Identity.IsAuthenticated)
        return "You must be logged in to impersonate a user.";
    if (_httpContext.User.Claims.GetUserIdFromClaims() == userId)
        return "You cannot impersonate yourself.";
    if (_httpContext.User.InImpersonationMode())
        return "You are already in impersonation mode.";
    if (userId == null)
        return "You must provide a userId string";
    if (userName == null)
        return "You must provide a username string";

    _cookie.AddUpdateCookie(new ImpersonationData(
           userId, userName, keepOwnPermissions)
               .GetPackImpersonationData());
    return null;
}

public string StopImpersonation()
{
    if (!_httpContext.User.InImpersonationMode())
        return "You aren't in impersonation mode.";

    _cookie.Delete();
    return null;
}

The only thing of note in this code is the keepOwnPermissions parameter in the StartImpersonation method. This controls whether the impersonated user’s Permissions or the current support user’s Permissions are used.

I have also added two Permissions, Impersonate and ImpersonateKeepOwnPermissions, which are used in the ImpersonationController to control who can access the impersonation feature.
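
For instance, an action that starts the impersonation could be protected like this (the PickUser action name is illustrative):

public class ImpersonationController : Controller
{
    [HasPermission(Permissions.Impersonate)]
    public IActionResult PickUser()
    {
        return View();
    }
}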

5. Making sure that the impersonation feature is robust

There are a few things that could cause problems or security risks, such as someone logging out while in impersonation mode and the next user logging in and inheriting the impersonation mode. For these reasons I set up a few things to fail safe.

Firstly, I set the CookieOptions as shown below, which makes the cookie secure and gives it the lifetime of the client browser session.

_options = new CookieOptions
{
    Secure = false,  //ONLY FOR DEMO!!
    HttpOnly = true,
    IsEssential = true,
    Expires = null,
    MaxAge = null
};

Here are some notes on these options:

  • Line 3: In real life you would want this to be true because you would be using HTTPS, but for this demo I allow HTTP
  • Line 4: This says JavaScript can’t read it
  • Line 5: This is an essential cookie – setting IsEssential to true means it is allowed without the user’s consent.
  • Lines 6 and 7: These two settings make the cookie a session cookie, which means it is deleted when the client (e.g. browser) is shut down.

The other thing I do is delete the impersonation cookie when the user logs out. I do this by capturing another event, called OnSigningOut.

services.ConfigureApplicationCookie(options =>
{
    options.Events.OnValidatePrincipal = authCookieValidate.ValidateAsync;
    //This ensures the impersonation cookie is deleted when a user signs out
    options.Events.OnSigningOut = authCookieSigningOut.SigningOutAsync;
});
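
Here is a sketch of that sign-out handler – I’m assuming the ImpersonationCookie class can delete its cookie when given the HttpContext (the constructor arguments are my assumption).

public class AuthCookieSigningOut
{
    //registered to the cookie's OnSigningOut event (see above)
    public Task SigningOutAsync(CookieSigningOutContext context)
    {
        //delete the impersonation cookie so the next user can't inherit it
        var cookie = new ImpersonationCookie(context.HttpContext, null);
        cookie.Delete();
        return Task.CompletedTask;
    }
}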

Finally, I update the _LoginPartial.cshtml to make it clear you are in impersonation mode and who you are impersonating. I replace “Hello @User.Identity.Name” with “@User.GetCurrentUserNameAsHtml()”, which shows, say, “Impersonating joe@gmail.com” using the bootstrap class text-danger. Here is the code that does that:

public static HtmlString GetCurrentUserNameAsHtml(this ClaimsPrincipal claimsPrincipal)
{
    var impersonalisedName = claimsPrincipal.GetImpersonatedUserNameMode();
    var nameToShow = impersonalisedName ??
                     claimsPrincipal.Claims.SingleOrDefault(x => 
                          x.Type == ClaimTypes.Name)?.Value ?? "not logged in";

    return new HtmlString(
        "<span" + (impersonalisedName != null ? 
                  " class=\"text-danger\">Impersonating " : ">Hello ")
             + $"{nameToShow}</span>");
}

The downsides of the User Impersonation feature

There aren’t any big extra performance issues with the impersonation feature, but of course the use of the authorization cookie OnValidatePrincipal event does add a (very) small extra cost on every HTTP request.

One downside is that it’s quite complex with various classes, services and startup code. If you want a simpler impersonation approach I would recommend Max Vasilyev’s “User impersonation in Asp.Net Core” article that directly uses ASP.NET Core’s Identity features.

Another downside is the security issue of a support user being able to see and change a user’s data. I strongly recommend adding logging around the impersonation feature and also marking all data with the person who created/edited it using the UserId, which I make sure is the real (support) user’s UserId. That way if there is a problem you can show who did what.

Conclusion

Over the years I have found user impersonation really useful (I wrote about this on ASP.NET MVC back in 2015). This article now provides user impersonation for ASP.NET Core and also fits in with my “better authorization” approach. All the code, with a working example web application, is available in the GitHub repo called PermissionAccessControl2.

As well as allowing you to provide good help to your users, I have found the impersonation feature great for helping in the early stages of development. For instance, say you haven’t yet got a feature safe enough for your SaaS users to use themselves, then you can have your support people manage that until you have implemented a version suitable for SaaS users.

User impersonation is also really useful for live testing, because you can quickly swap between different types of users to check that the system is working as you designed (You might also like to look at my “Getting better data for unit testing your EF Core applications” article on how to extract and anonymize personal data from a live system too).

You may not want to use my “better authorization” approach, but instead use ASP.NET Core’s Roles. In that case the approach I use, with the authorization cookie’s OnValidatePrincipal event, is still valid. It’s just that you need to alter the Roles claim in the user’s Claims. The ImpersonationHandler class will still be useful for you, as it simply detects the impersonation and provides the UserId of the impersonated user – from there you can look up the impersonated user’s Roles and replace them in the user’s claims.

Happy coding!

If you have an ASP.NET Core or Entity Framework Core problem that you want help on then please contact me via my Contact page. I’m a freelance contractor specializing in ASP.NET Core, EF Core and Azure.

Part 5: A better way to handle authorization – refreshing user’s claims

Last Updated: August 22, 2019 | Created: July 29, 2019

This article focuses on the ability to update a logged-in user’s authorization as soon as any of the authorization classes in the database are changed – I refer to this as “refresh claims” (see “Setting the scene” below for a longer explanation). This article was inspired by a number of people asking for this feature in my alternative feature/data authorization approach, originally described in the first article in this series.

The original article is very popular, with lots of questions and comments. I therefore came back about six months after the first article to answer some of the more complex questions by creating a new PermissionAccessControl2 example web application and the following three extra articles:

UPDATE: See my NDC Oslo 2019 talk which covers these three articles.

You can find the original articles at:

NOTE: You can see the “refresh claims” feature in action by cloning the PermissionAccessControl2 example web application and then running the PermissionAccessControl2 project. By default, it is set up to use in-memory databases seeded with demo data and the “refresh claims” feature. See the “refresh Claims” menu item.

TL;DR; – summary

  • Typically, when you log in to an ASP.NET Core web app the things you can do, known as authorization, are “frozen”, i.e. fixed for however long you stay logged in.
  • An alternative is to update a logged-in user’s authorization whenever the internal, database versions of the authorization is updated. I call this “refreshing claims” because authorization data is stored in a user’s Claims.
  • This article describes how I have added this “refresh claims” feature to my alternative feature/data authorization approach described in this series.
  • While the “refresh claims” code I show applies to my alternative feature/data authorization code the same approach can be applied to any standard ASP.NET Core Role or Claim authorization system.
  • There are some (small?) downsides to adding this feature, around complexity and performance, which I cover near the end of this article.
  • The code in this article can be found in the open-source PermissionAccessControl2 GitHub repo, which also includes a runnable example web application.

Setting the scene – why refresh the user’s claims?

If you are using the built-in ASP.NET Core Identity system, then when you log in you get a set of Roles and Claims which define what you can do – known as authorisation in ASP.NET. By default, once you are logged in your authorisation is fixed for however long you stay logged in. This means that if anyone changes the internal authorisation settings, you need to log out and log back in again before you inherit these new settings. Most systems work this way because it’s simple and covers most of the authentication/authorization requirements of standard web apps.

But there are situations where you need any change to authorization to be immediately applied to the user – what I call “refreshing claims”. For instance, in a high-security system like a bank you might want to be able to revoke certain authorization features immediately from a logged-in user/users. Another use case would be where users can trial a paid-for feature, but once the trial period ends you want the feature to turn off immediately.

So, if you need a refresh feature then how can you implement it? One approach would be to recalculate the user’s authorisation settings every time they access the system – that would work, but it would add a performance hit due to all the extra database accesses and recalculations required on every HTTP request. Another approach would be to revoke/time-out the authentication token or cookie and have the system recalculate the authentication token or cookie again.

In the next section I describe how I added the “refresh claims” feature to my feature authorization approach.

The architecture of my “refresh claims” feature

In the earlier articles I described a replacement authorization system which has an advantage over ASP.NET Core’s Roles-based authorisation: the Admin user can change all aspects of the user’s authorisation (with ASP.NET Core’s Roles-based authorisation you need to edit/redeploy the code to alter what controller methods a Role can access).

The figure below shows an abbreviated version of the feature part of the authorisation process which is run on login (see the Part 3 article for a more in-depth explanation).

But to implement the “refresh claims” feature I need a way to alter the permissions while the user is logged in. My solution is to use an authorization cookie event that happens every HTTP request. This allows me to change the user’s authorisations at any time.

To make this work I set a “LastUpdated” time when any of the database classes that manages authorization are changed. This is then compared with the “LastUpdated” claim in the user’s Claims – see the diagram below which shows this process. Parts in blue bold text show what changes over time.

I’m going to describe the stages involved in this in the following order

  1. How to detect that the Roles/Permissions have changed.
  2. How to store the last time the Roles/Permissions changed.
  3. Linking the authorization code to the database cache.
  4. How to detect/update the user’s permissions Claim.
  5. How to tell your front-end that the Permissions have changed.

1. How to detect that the Roles/Permissions have changed.

A user’s Permissions could be out of date whenever the User Roles, the Role’s Permissions, or the User Modules are changed. To do this I add some detection code to the SaveChanges/ SaveChangesAsync methods in DbContext that manages those database classes, called ExtraAuthorizeDbContext.

NOTE: Putting the detection code inside the SaveChanges and SaveChangesAsync methods provides a robust solution, because it doesn’t rely on the developer adding code to all the services that change the authorization database classes.

Here is the code for the SaveChanges method:

// I only have to override these two versions of SaveChanges,
// as the other two SaveChanges versions call these 
public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    if (ChangeTracker.Entries().Any(x => 
        (x.Entity is IChangeEffectsUser && x.State == EntityState.Modified) || 
              (x.Entity is IAddRemoveEffectsUser && 
                    (x.State == EntityState.Added || 
                     x.State == EntityState.Deleted))))
    {
        _authChange.AddOrUpdate(this);
    }
    return base.SaveChanges(acceptAllChangesOnSuccess);
}

The things to note are:

  • Lines 6 to 9 This is looking for changes that could affect an existing user.
  • Line 11: If a change is found it calls the AddOrUpdate method in the IAuthChanges instance that is injected into the ExtraAuthorizeDbContext. I describe the AuthChanges class in section 3.

2. How to store the last time the Roles/Permissions changed.

Once the SaveChanges method has detected a change, we need to store the time that the change happened. This is done via a class called TimeStore, which is shown below.

public class TimeStore
{
    [Key]
    [Required]
    [MaxLength(AuthChangesConsts.CacheKeyMaxSize)]
    public string Key { get; set; }

    public long LastUpdatedTicks { get; set; }
}

This is a Key/Value cache, where the Value is a long (Int64) containing the time, as ticks, when the item was changed. I did it this way because I use this same store to record changes in any of my hierarchical DataKeys (see the Part 4 article), which I don’t cover in this article.

NOTE: In the Part 4 article I describe a multi-tenant system which is hierarchical. In that case if I move a SubGroup (e.g. the West Coast division) to a different parent, then the DataKey would change, along with all its “child” data. In this case you MUST refresh any logged-in user’s DataKey, otherwise a logged-in user would have access to the wrong data. That is why I used a generalized TimeStore, so that I could add a per-company “LastUpdated” value.

I also add the ITimeStore interface to the ExtraAuthorizeDbContext, which the AuthChanges class (see next section) can use. The ITimeStore defines two methods (sketched after this list):

  1. GetValueFromStore, which reads a value from the TimeStore
  2. AddUpdateValue, which adds or updates an entry in the TimeStore
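
Based on how these methods are used in the code that follows, a minimal version of the interface would look like this (the exact signatures are my assumption):

public interface ITimeStore
{
    //returns the ticks value stored under the key, or null if no entry exists
    long? GetValueFromStore(string key);
    //adds a new key/ticks entry, or updates the existing one
    void AddUpdateValue(string key, long ticks);
}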

You will see these being used in the next section.

3. Linking the authorization code to the database cache.

I created a small project called CommonCache which lives at the bottom of the solution structure, i.e. it doesn’t reference any other project. This contains the AuthChanges class, which links the database to the code handling the authorization.

This AuthChanges class provides a method that the authorization code can call to check if the user’s authorization Claims are out of date. And at the database end it creates the correct cache key/value when the database detects a change in the authorization database classes.

Here is the AuthChanges class code.

public class AuthChanges : IAuthChanges
{
    public bool IsOutOfDateOrMissing(string cacheKey, 
        string ticksToCompareString, ITimeStore timeStore)
    {
        if (ticksToCompareString == null)
            //if there is no time claim then you do need to reset the claims
            return true;

        var ticksToCompare = long.Parse(ticksToCompareString);
        return IsOutOfDate(cacheKey, ticksToCompare, timeStore);
    }

    private bool IsOutOfDate(string cacheKey, 
        long ticksToCompare, ITimeStore timeStore)
    {
        var cachedTicks = timeStore.GetValueFromStore(cacheKey);
        if (cachedTicks == null)
            throw new ApplicationException(
                $"You must seed the database with a cache value for the key {cacheKey}.");

        return ticksToCompare < cachedTicks;
    }

    public void AddOrUpdate(ITimeStore timeStore)
    {
        timeStore.AddUpdateValue(AuthChangesConsts.FeatureCacheKey, 
             DateTime.UtcNow.Ticks);
    }
}

The things to note are:

  • Lines 3 to 12: The IsOutOfDateOrMissing method is called by the ValidateAsync method (described in the next section) to find out if the user’s claims need recalculating, i.e. it returns true if the user’s “LastUpdated” claim is missing, or it is earlier than the database “LastUpdated” time. You can see the cache read in line 17, inside the private method that does the time compare.
  • Lines 25 to 29: The AddOrUpdate method makes sure the ITimeStore has an entry under the key defined by FeatureCacheKey which holds the current time in ticks. This is referred to as the database “LastUpdated” value.

4. How to detect/update the user’s permissions Claim

In the Part1 article I showed how you can add claims to the user at login time via the Authentication Cookie’s OnValidatePrincipal event, but these claims are “frozen”. However, this event is perfect for our “refresh claims” feature because the event happens on every HTTP request. So, in the new version 2 PermissionAccessControl2 code I have altered the code to add the “refresh claims” feature. Below is the new version of the ValidateAsync method, with comments on the key parts of the code at the bottom.

public async Task ValidateAsync(CookieValidatePrincipalContext context)
{
    var extraContext = new ExtraAuthorizeDbContext(
        _extraAuthContextOptions, _authChanges);
    //now we set up the lazy values - I used Lazy for performance reasons
    var rtoPLazy = new Lazy<CalcAllowedPermissions>(() => 
        new CalcAllowedPermissions(extraContext));
    var dataKeyLazy = new Lazy<CalcDataKey>(() => 
        new CalcDataKey(extraContext));

    var newClaims = new List<Claim>();
    var originalClaims = context.Principal.Claims.ToList();
    if (originalClaims.All(x => 
        x.Type != PermissionConstants.PackedPermissionClaimType) ||
        _authChanges.IsOutOfDateOrMissing(AuthChangesConsts.FeatureCacheKey, 
            originalClaims.SingleOrDefault(x => 
                 x.Type == PermissionConstants.LastPermissionsUpdatedClaimType)?.Value,
            extraContext))
    {
        var userId = originalClaims.SingleOrDefault(x => 
             x.Type == ClaimTypes.NameIdentifier)?.Value;
        newClaims.AddRange(await BuildFeatureClaimsAsync(userId, rtoPLazy.Value));
    }

    //… removed DataKey code as not relevant to this article

    if (newClaims.Any())
    {
        newClaims.AddRange(RemoveUpdatedClaimsFromOriginalClaims(
              originalClaims, newClaims));
        var identity = new ClaimsIdentity(newClaims, "Cookie");
        var newPrincipal = new ClaimsPrincipal(identity);
        context.ReplacePrincipal(newPrincipal);
        context.ShouldRenew = true;             
    }
    extraContext.Dispose(); //be tidy and dispose the context.
}

private IEnumerable<Claim> RemoveUpdatedClaimsFromOriginalClaims(
    List<Claim> originalClaims, List<Claim> newClaims)
{
    var newClaimTypes = newClaims.Select(x => x.Type);
    return originalClaims.Where(x => !newClaimTypes.Contains(x.Type));
}

private async Task<List<Claim>> BuildFeatureClaimsAsync(
    string userId, CalcAllowedPermissions rtoP)
{
    var claims = new List<Claim>
    {
        new Claim(PermissionConstants.PackedPermissionClaimType, 
             await rtoP.CalcPermissionsForUserAsync(userId)),
        new Claim(PermissionConstants.LastPermissionsUpdatedClaimType,
             DateTime.UtcNow.Ticks.ToString())
    };
    return claims;
}

The things to note are:

  • Lines 13 to 18: This checks if the PackedPermissionClaimType claim is missing, or the LastPermissionsUpdatedClaimType claim’s value is either out of date or missing. If either of these is true then it has to recalculate the user’s Permissions, which you can see in lines 19 to 23.
  • Lines 46 to 57: This adds the two claims needed: the PackedPermissionClaimType Claim with the user’s recalculated Permissions, and the LastPermissionsUpdatedClaimType Claim which is given the current time.

5. How to tell your front-end that the Permissions have changed

If you are using some form of front-end framework, like React.js, Angular.js, Vue.js etc. then you will use the Permissions in the front-end to select what buttons, links etc to show. In the Part3 article I showed a very simple API to get the Permissions names, but now we need to know when to update the local Permissions in your front end.

My solution is to add a header to every HTTP response that gives you the “LastUpdated” time when the current user’s Permissions were updated. By saving this value in the JavaScript SessionStorage you can compare the time provided in the header with the last value you had – if they are different then you need to re-read the permissions for the current user.

It’s pretty easy to add a header; here is the code that goes inside the Configure method of the Startup class in your ASP.NET Core project (with thanks to this SO answer https://stackoverflow.com/a/48610119/1434764).

//This should come AFTER the app.UseAuthentication() call
if (Configuration["DemoSetup:UpdateCookieOnChange"] == "True")
{
    app.Use((context, next) =>
    {
        var lastTimeUserPermissionsSet = context.User.Claims
            .SingleOrDefault(x => 
                 x.Type == PermissionConstants.LastPermissionsUpdatedClaimType)
            ?.Value;
        if (lastTimeUserPermissionsSet != null)
            context.Response.Headers["Last-Time-Users-Permissions-Updated"] 
                 = lastTimeUserPermissionsSet;
        return next.Invoke();
    });
}

The downsides of adding “refresh claims” feature

While the “refresh claims” feature is useful, it does have some downsides. Firstly, it is a lot more complex than the UserClaimsPrincipalFactory approach explained in the Part 3 article. Complexity makes the application harder to understand and can make it harder to refactor.

Also, I only got the “refresh claims” feature to work for Cookie authentication, while the “frozen” implementation I showed in the Part 3 article works with both Cookie and Token authentication. If you need a token solution then a good starting point is the https://www.blinkingcaret.com/ blog (you might find this article useful: “Refresh Tokens in ASP.NET Core Web Api”).

The other issue is performance. For every HTTP request a read of the TimeStore is required. Now, that request is very small and only takes about 750ns on my i7, 4GHz Windows development PC, but with lots of simultaneous users you would be loading up the database. The good news is that using a database means it automatically works with multiple instances of a web application (known as scale-out).

NOTE: I did try adding an ASP.NET Core distributed memory cache to improve local performance, but because the OnValidatePrincipal event lives outside the dependency injection you end up with different instances of the memory cache (took me a while to work that out!). You could add a cache like Redis, because it relies on configuration rather than the same instance, but it does add another level of complexity.

The other performance issue is that it has to refresh EVERY logged-in user, as it doesn’t have enough information to target the specific users that need an update. If you have thousands of concurrent users that will bring a higher-than-normal load on the application and the database. Overall, recalculating the Permissions isn’t that onerous, but it may be worth changing any roles and permissions outside the site’s peak usage times.

Overall, I would suggest you think hard as to whether you need the “refresh claims” feature. Most authentication systems don’t have a “refresh claims” feature as standard, so remember the Yagni (“You Aren’t Gonna Need It”) rule.

Conclusion

This article has focused on one specific feature that readers of my first article felt was needed. I believe my solution to the “refresh claims” feature is robust, but there are some (small?) downsides which I have listed. You can find all the code in this article, and a runnable example application, in the GitHub repo PermissionAccessControl2.

When I first developed the whole feature/data authorization approach for one of my clients we discussed whether we needed the “refresh claims” feature. They decided it wasn’t worth the effort, and I think that was the right decision for their application.

But if your application/users need the refresh claims feature then you now have a fully worked out approach which will still work even on web apps that scale out, i.e. run multiple instances of the web app to give better scalability.

Happy coding!

PS. Have a look at Andrew Lock’s excellent series “Adding feature flags to an ASP.NET Core app” for another useful feature to add to your web app.

If you have an ASP.NET Core or Entity Framework Core problem that you want help on then I am available as a freelance contractor. Please send me a contact request via my Contact page and we can talk some more on Skype.

Part 4: Building a robust and secure data authorization with EF Core

Last Updated: September 28, 2019 | Created: July 9, 2019

This article covers how to implement data authorization using Entity Framework Core (EF Core), that is, only returning data from a database that the current user is allowed to access. The article focuses on how to implement data authorization in such a way that the filtering is very secure, i.e. the filter always works, and the architecture is robust, i.e. the design of the filtering doesn’t allow a developer to accidentally bypass that filtering.

This article is part of a series and follows a more general article on data authorization I wrote six months ago. That first article introduced the idea of data authorization, but this article goes deeper and looks at one way to design a data authorization system that is secure and robust. It uses a new, improved example application, called PermissionAccessControl2 (referred to as “version 2”), and a series of new articles which cover other areas of improvement/change.

UPDATE: See my NDC Oslo 2019 talk which covers these three articles.

Original articles:

TL;DR; – summary

  • This article provides a very detailed description of how I build a hierarchical, multi-tenant application where the data a user could access depends on which company and what role they have in that company.
  • The example application is built using ASP.NET Core and EF Core. You can look at the code and run the application, which has demo data/users, by cloning this GitHub repo.
  • This article and its new (version 2) example application are a lot more complicated than the data authorization described in the original (Part 2) data authorisation article. If you want to start with something simpler, then read the original (Part 2) article first.
  • The key feature that makes it work is EF Core’s Query Filters, which provide a way to filter data in ALL EF Core database queries.
  • I break the article into two parts:
    • Making it Secure, which covers how I implemented a hierarchical, multi-tenant application that filters data based on the user’s given data access rules.
    • Making it robust, which is about designing an architecture that guides developers so that they can’t accidently bypass the security code that has been written.

Setting the scene – examples of data authorization

Pretty much every web application with a database will filter data – Amazon doesn’t show you every product it has, but tries to show you things you might be interested in. But this type of filtering is for the convenience of the user and is normally part of the application code.

Another type of database filtering is driven by security concerns – I refer to this as data authorization. This isn’t about filtering data for user convenience, but about applying strict business rules that dictate what data a user can see. Typical scenarios where data authorization is needed are:

  • Personal data, where only the user can read/alter their personal data.
  • Multi-tenant systems, where one database is used to support multiple, separate user’s data.
  • Hierarchical systems, for instance a company with divisions where the CEO can see all the sales data, but each division can only see their own sales data.
  • Collaboration systems like GitHub/BitBucket where you can invite people to work with you on a project etc.

NOTE: If you are new to this area then please see this section of the original article, where I have a longer introduction to data protection, both what it’s about and what the different parts are.

My example is both a multi-tenant and a hierarchical system, which is what one of my clients needed. The diagram below shows two companies, 4U Inc. and Pets2 Ltd., with Joe, our user, in charge of the LA division of 4U’s outlets.

The rest of this article will deal with how to build an application which gives Joe access to the sales and stock situation in both LA outlets, but no access to any of the other divisions’ data or other tenants like Pets2 Ltd.

In the next sections I look at the two aspects: a) making my data authorization design secure, i.e. the filter always works, and b) making its architecture robust, i.e. the design of the filtering doesn’t allow a developer to accidentally bypass that filtering.

A. Building a secure data authorization system

I start with the most important part of the data authorization – making sure my approach is secure, that is, it will correctly filter out the data the user isn’t allowed to see. The cornerstone of this design is EF Core’s Query Filters, but to use them we need to set up a number of other things too. Here is a list of the key parts, which I then describe in detail:

  1. Adding a DataKey to each filtered class: Every class/table that needs data authorization must have a property that holds a security key, which I call the DataKey.
  2. Setting the DataKey property/column: The DataKey needs to be set to the right value whenever a new filtered class is added to the database.
  3. Add the user’s DataKey to their claims: The user claims need to contain a security key that is matched in some way to the DataKey in the database.
  4. Filter via the DataKey in EF Core: The EF Core DbContext has the user’s DataKey injected into it and it uses that key in EF Core Query Filters to only return the data that the user is allowed to see.

A1. Adding a DataKey to each filtered class

Every class/table that needs data authorization must have a property that holds a security key, which I call the DataKey.

In my example application there are multiple different classes that need to be filtered. I have a base IDataKey interface which defines the DataKey string property, and all the classes to be filtered inherit this interface.
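
The interface is tiny – here is a minimal sketch based on how the DataKey is used in the rest of this article:

public interface IDataKey
{
    //holds the security key used to filter this class/table
    string DataKey { get; }
}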

As you will see the DataKey is used a lot in this application.

A2. Setting the DataKey property/column

The DataKey needs to be set to the right value whenever a new filtered class is added to the database.

NOTE: This example is complex because of the hierarchical data design. If you want a simple example of setting a DataKey see the “personal data filtering” example in the original, Part 2 article.

Because of the hierarchical nature of my example the “right value” is a bit complex. I have chosen to create a DataKey that is a combination of the primary keys of all the layers in the hierarchy. As you will see in the next subsection, this allows the query filter to target different levels in the multi-tenant hierarchy.

The diagram below shows you the DataKeys (bold, containing numbers and |) for all the levels in the 4U Company hierarchy.

The hierarchy consists of three classes, Company, SubGroup and RetailOutlet, which all inherit from an abstract class called TenantBase. This allows me to create relationships between any of the class types that inherit from TenantBase (EF Core treats these classes as Table-Per-Hierarchy, TPH, and stores all the different types in one table).

But setting the DataKey on creation is difficult, because the DataKey needs the primary key, which isn’t set until the entity is saved to the database. My way around this is to use a transaction. Here is the method in the TenantBase class that is called by the Company, SubGroup or RetailOutlet static creation factories.

protected static void AddTenantToDatabaseWithSaveChanges
    (TenantBase newTenant, CompanyDbContext context)
{
    // … Various business rule checks left out

    using (var transaction = context.Database.BeginTransaction())
    {
        //set up the backward link (if Parent isn't null)
        newTenant.Parent?._children.Add(newTenant);
        context.Add(newTenant);  //also need to add it in case it's the company
        // Add this to get primary key set
        context.SaveChanges();

        //Now we can set the DataKey
        newTenant.SetDataKeyFromHierarchy();
        context.SaveChanges();

        transaction.Commit();
    }
}
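
The SetDataKeyFromHierarchy method isn’t shown in this article, but based on the DataKey format described earlier (the primary keys of each layer separated by |) a minimal sketch would be:

public void SetDataKeyFromHierarchy()
{
    //the Parent is null for a top-level Company, giving e.g. "1|";
    //otherwise this appends the tenant's primary key, giving e.g. "1|2|5|"
    DataKey = (Parent?.DataKey ?? "") + TenantItemId + "|";
}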

The Stock and Sales classes are easier to handle, as they use the user’s DataKey. I override EF Core’s SaveChanges/SaveChangesAsync methods to do this, using the code shown below.

public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    foreach (var entityEntry in ChangeTracker.Entries()
        .Where(e => e.State == EntityState.Added))
    {
        if (entityEntry.Entity is IShopLevelDataKey hasDataKey)
            //DataKey is the user's key injected into this DbContext (see A4)
            hasDataKey.SetShopLevelDataKey(DataKey);
    }
    return base.SaveChanges(acceptAllChangesOnSuccess);
}

A3. Add the user’s DataKey to their claims:

The user claims need to contain a security key that is matched in some way to the DataKey in the database.

My general approach, as detailed in the original data authorization article, is to have a claim in the user’s identity that is used to filter data. For personal data this can be the user’s Id (typically a string containing a GUID), or for a straightforward multi-tenant system it would be some form of tenant key stored in the user’s information. For this you might have the following class in your extra authorization data:

public class UserDataAccessKey
{
    public UserDataAccessKey(string userId, string accessKey)
    {
        UserId = userId ?? throw new ArgumentNullException(nameof(userId));
        AccessKey = accessKey;
    }

    [Key]
    [Required(AllowEmptyStrings = false)]
    [MaxLength(ExtraAuthConstants.UserIdSize)]
    public string UserId { get; private set; }

    [MaxLength(DataAuthConstants.AccessKeySize)]
    public string AccessKey { get; private set; }
}

In this hierarchical and multi-tenant example it gets a bit more complex, mainly because the hierarchy could change, e.g. a company might start with a simple hierarchy of company to shops, but as it grows it might move a shop to a sub-division like the west coast. This means the DataKey can change dynamically.

For that reason, I link to the actual Tenant class that holds the DataKey that the user should use. This means that we can look up the current DataKey of the Tenant when the user logs in.

NOTE: In Part 5 I talk about how to dynamically update the user’s DataKey claim of any logged in user if the hierarchy they are in changes.

public class UserDataHierarchical
{
    public UserDataHierarchical(string userId, TenantBase linkedTenant)
    {
        if (linkedTenant.TenantItemId == 0)
            throw new ApplicationException(
                "The linkedTenant must be already in the database.");

        UserId = userId ?? throw new ArgumentNullException(nameof(userId));
        LinkedTenant = linkedTenant;
    }

    public int LinkedTenantId { get; private set; }

    [ForeignKey(nameof(LinkedTenantId))]
    public TenantBase LinkedTenant { get; private set; }

    [Key]
    [Required(AllowEmptyStrings = false)]
    [MaxLength(ExtraAuthConstants.UserIdSize)]
    public string UserId { get; private set; }

    [MaxLength(DataAuthConstants.AccessKeySize)]
    public string AccessKey { get; private set; }
}

This dynamic DataKey example might be an extreme case but seeing how I handled this might help you when you come across something that is more complex than a simple value.

At login you can add the feature and data claims to the user’s claims using the code I showed in Part 3, where I add to the user’s claims via a UserClaimsPrincipalFactory, as described in this section of the Part 3 article.

The diagram below shows how my factory method would lookup the UserDataHierarchical entry using the User’s Id and then adds the current DataKey of the linked Tenant.
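
In code, that lookup might look like the sketch below. I’m assuming a claims factory in the style of the Part 3 article, with the ExtraAuthorizeDbContext injected; the class name AddDataKeyToUserClaims is illustrative.

public class AddDataKeyToUserClaims : UserClaimsPrincipalFactory<IdentityUser>
{
    private readonly ExtraAuthorizeDbContext _extraContext;

    public AddDataKeyToUserClaims(UserManager<IdentityUser> userManager,
        IOptions<IdentityOptions> optionsAccessor,
        ExtraAuthorizeDbContext extraContext)
        : base(userManager, optionsAccessor)
    {
        _extraContext = extraContext;
    }

    protected override async Task<ClaimsIdentity> GenerateClaimsAsync(
        IdentityUser user)
    {
        var identity = await base.GenerateClaimsAsync(user);
        //look up the user's data-access entry, including the linked tenant
        var dataUser = _extraContext.Set<UserDataHierarchical>()
            .Include(x => x.LinkedTenant)
            .SingleOrDefault(x => x.UserId == user.Id);
        if (dataUser != null)
            //add the tenant's current DataKey as the user's DataKey claim
            identity.AddClaim(new Claim(
                DataAuthConstants.HierarchicalKeyClaimName,
                dataUser.LinkedTenant.DataKey));
        return identity;
    }
}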

A4. Filter via the DataKey in EF Core

The EF Core DbContext has the user’s DataKey injected into it and it uses that key in EF Core Query Filters to only return the data that the user is allowed to see.

EF Core’s Query Filters are a new feature added in EF Core 2.0 and they are fantastic for this job. You define a query filter in the OnModelCreating configuration method inside your DbContext and it will filter ALL queries, that comprises of LINQ queries, using Find method, included navigation properties and it even adds extra filter SQL to EF Core’s FromSql method (FromSqlRaw or FromSqlInterpolated in EF Core 3+). This makes Query Filters a very secure way to filter data.

For the version 2 example, here is a look at the CompanyDbContext class, with the query filters set up by the OnModelCreating method towards the end of the code.

public class CompanyDbContext : DbContext
{
    internal readonly string DataKey;

    public DbSet<TenantBase> Tenants { get; set; }
    public DbSet<ShopStock> ShopStocks { get; set; }
    public DbSet<ShopSale> ShopSales { get; set; }

    public CompanyDbContext(DbContextOptions<CompanyDbContext> options,
        IGetClaimsProvider claimsProvider)
        : base(options)
    {
        DataKey = claimsProvider.DataKey;
    }

    //… override of SaveChanges/SaveChangesAsync left out

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //… other configurations left out

        AddHierarchicalQueryFilter(modelBuilder.Entity<TenantBase>());
        AddHierarchicalQueryFilter(modelBuilder.Entity<ShopStock>());
        AddHierarchicalQueryFilter(modelBuilder.Entity<ShopSale>());
    }

    //this can't be static, as the filter captures the DataKey field of this instance
    private void AddHierarchicalQueryFilter<T>(EntityTypeBuilder<T> builder) 
         where T : class, IDataKey
    {
        builder.HasQueryFilter(x => x.DataKey.StartsWith(DataKey));
        builder.HasIndex(x => x.DataKey);
    }
}

As you can see, the Query Filter uses the StartsWith method to compare the user’s DataKey with the DataKeys of the tenants and their Sales/Stock data. This means that if Joe has the DataKey of 1|2|5| then he can see the Stock/Sales data for the shops “LA Dress4U” and “LA Shirt4U” – see the diagram below.

NOTE: You can try this by cloning the PermissionAccessControl2 repo and running it locally (by default it uses an in-memory database to make it easy to run). Pick different users to see the different data/features you can access.

B. Building a robust data authorization architecture

We could stop here, because we have covered all the code needed to secure the data. But I consider data authorization to be a high-risk part of my system, so I want to make it “secure by design”, i.e. it shouldn’t be possible for a developer to accidentally write something that bypasses the filtering. Here are the things I have done to make my code robust and guide other developers on how to do things.

  1. Use Domain-Driven Design database classes: Some of the code, especially creating the Company, SubGroup and RetailOutlet DataKeys, is complicated. I use DDD-styled classes which provide only one way to create or update the various DataKeys and relationships.
  2. Filter-only DbContext: I build a specific DbContext to contain all the classes/tables that need to be filtered.
  3. Unit tests to check you haven’t missed anything: I build unit tests that ensure the classes in the filter-only DbContext have a query filter.
  4. Fail-safe checks: With an evolving application some features are left for later. I often add small fail-safe checks in the original design to make sure any new features follow the original design approach.

B1. Use Domain-Driven design (DDD) database classes

DDD teaches us that each entity (a class in the .NET world) should contain the code to create and update itself and its aggregates. Furthermore, it should stop any other external code from being able to bypass these methods, so that you are forced to use the methods inside the class. I really like this because it means there is one, and only one, method you can call to get certain jobs done.

The effect is I can “lock down” how something is done and make sure everyone uses the correct methods. Below is the TenantBase abstract class, which all the tenant classes inherit from, showing the MoveToNewParent method that moves a tenant to another parent, for instance moving a RetailOutlet to a different SubGroup.

public abstract class TenantBase : IDataKey
{
    private HashSet<TenantBase> _children;

    [Key]
    public int TenantItemId { get; private set; }
    [MaxLength(DataAuthConstants.HierarchicalKeySize)]
    public string DataKey { get; private set; }
    public string Name { get; private set; }

    // -----------------------
    // Relationships 

    public int? ParentItemId { get; private set; }
    [ForeignKey(nameof(ParentItemId))]
    public TenantBase Parent { get; private set; }
    public IEnumerable<TenantBase> Children => _children?.ToList();

    //----------------------------------------------------
    // public methods

    public void MoveToNewParent(TenantBase newParent, DbContext context)
    {
        void SetKeyExistingHierarchy(TenantBase existingTenant)
        {
            existingTenant.SetDataKeyFromHierarchy();
            if (existingTenant.Children == null)
                context.Entry(existingTenant).Collection(x => x.Children).Load();

            if (!existingTenant._children.Any())
                return;
            foreach (var tenant in existingTenant._children)
            {                   
                SetKeyExistingHierarchy(tenant);
            }
        }

        //… various validation checks removed

        Parent._children?.Remove(this);
        Parent = newParent;
        //Now change the data key for all the hierarchy from this entry down
        SetKeyExistingHierarchy(this);
    }
    //… other methods/constructors left out
}

The things to note are:

  • Line 3: The Children relationship is held in a private field which cannot be altered by outside code. It can be read via the IEnumerable<TenantBase> Children property (line 17) but you can’t add or remove children that way.
  • Lines 6 to 16: Similarly, all the properties have private setters, so they can only be changed by methods inside the TenantBase class.
  • Lines 22 to the end: This has the code to change the relationship and then immediately runs a recursive method to change all the DataKeys of the other tenants underneath this one.

B2. Build a filter-Only DbContext

One possible problem could occur if a non-filtered relationship was present, say a link back to some authorization code (there is such a relationship linking a tenant to a UserDataHierarchical class). If that happened, a developer could write code that exposes data the user shouldn’t see – for instance, accessing the UserDataHierarchical class could expose the user’s Id, which I wish to keep secret.

My solution was to create a separate DbContext for the multi-tenant classes, with a different (but overlapping) DbContext for the extra authorization classes (see the diagram below for what this looks like). The effect is a multi-tenant DbContext that only contains the filtered multi-tenant data. For a developer this makes it clear what classes they can access when using the multi-tenant DbContext.
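To make the split concrete, below is a minimal sketch of what the second, authorization-focused DbContext could look like (the CompanyDbContext was shown earlier). The exact DbSet list and constructor here are my assumptions for illustration – check the PermissionAccessControl2 repo for the real code.

public class ExtraAuthorizeDbContext : DbContext
{
    //authorization-only classes - deliberately NOT in CompanyDbContext
    public DbSet<UserDataHierarchical> UserDataHierarchical { get; set; }
    public DbSet<RoleToPermissions> RolesToPermissions { get; set; }

    //overlapping class: Tenants also appears in CompanyDbContext, because
    //this context needs it to calculate a user's DataKey
    public DbSet<TenantBase> Tenants { get; set; }

    public ExtraAuthorizeDbContext(
        DbContextOptions<ExtraAuthorizeDbContext> options)
        : base(options) { }
}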

NOTE: having multiple DbContexts with a shared table can make database migrations a bit more complicated. Have a look at my article “Handling Entity Framework Core database migrations in production” for different ways to handle migrations.

B3. Using unit tests to check you haven’t missed anything

With feature and data authorization I add unit tests that check I haven’t left a “hole” in my security. Because I have a DbContext specifically for the multi-tenant data I can write a test to check that every class mapped to the database has a Query Filter applied to it. Here is the code I use for that.

[Fact]
public void CheckQueryFiltersAreAppliedToEntityClassesOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<CompanyDbContext>();
    using (var context = new CompanyDbContext(options, 
        new FakeGetClaimsProvider("accessKey")))
    {
        var entities = context.Model.GetEntityTypes().ToList();

        //ATTEMPT
        var queryFilterErrs = entities.CheckEntitiesHasAQueryFilter().ToList();

        //VERIFY
        queryFilterErrs.Any().ShouldBeFalse(string.Join('\n', queryFilterErrs));
    }
}
public static IEnumerable<string> CheckEntitiesHasAQueryFilter(
    this IEnumerable<IEntityType> entityTypes)
{
    foreach (var entityType in entityTypes)
    {
        if (entityType.QueryFilter == null &&
             entityType.BaseType == null  && //not a TPH subclass
             entityType.ClrType
                 .GetCustomAttribute<OwnedAttribute>() == null && //not an owned type
             entityType.ClrType
                 .GetCustomAttribute<NoQueryFilterNeeded>() == null) //Not marked as global
            yield return 
                 $"The entity class {entityType.Name} does not have a query filter";
    }
}

B4. Adding fail-safe checks

You need to be careful of breaking the Yagni (You Aren’t Gonna Need It) rule, but a few fail-safe checks on security make me sleep better at night. Here are the two small things I did in this example, which will cause an exception if the DataKey isn’t set properly.

Firstly, I added a [Required] attribute to the DataKey property (see below) which tells the database that the DataKey cannot be null. This means if my code fails to set a DataKey then the database will return a constraint error.

[Required] //This means SQL will throw an error if we don't fill it in
[MaxLength(DataAuthConstants.HierarchicalKeySize)]
public string DataKey { get; private set; }

My second fail-safe is also to do with the DataKey, but in this case I’m anticipating a future change to the business rules that could cause problems. The current business rules say that only the users that are directly linked to a RetailOutlet can create new Stock or Sales entries, but what happens if (when!) that business rule changes and divisional managers can create items in a RetailOutlet? The divisional managers don’t have the correct DataKey, but a new developer might miss that and you could “lose” data.

My answer is to add a safety check to the retail outlet’s DataKey. A retail outlet has a slightly different DataKey format – it ends with a * instead of a |. That means I can check that a retail-outlet-format DataKey is used in the SetShopLevelDataKey method and throw an exception if it’s not in the right format. Here is my code that catches this possible problem.

public void SetShopLevelDataKey(string key)
{
    if (key != null && !key.EndsWith("*"))
        //The shop key must end in "*" (or be null and set elsewhere)
        throw new ApplicationException(
             "You tried to set a shop-level DataKey but your key didn't end with *");

    DataKey = key;
}

This is a very small thing, but because I know that change is likely to come and I might not be around, it could save someone a lot of head scratching working out why data doesn’t end up in the right place.

Conclusion

Well done for getting to the end of this long article. I could have made the article much shorter if I only dealt with the parts on how to implement data authorization, but I wanted to talk about how handling security issues should affect the way you build your application (what I refer to as having a “robust architecture”).

I have to say that the star feature in my data authorization approach is EF Core’s Query Filters. These Query Filters cover ALL possible EF Core based queries with no exceptions. The Query Filters are the cornerstone of my data authorization approach, to which I then add a few more features to manage users’ DataKeys and do the clever things needed to handle the hierarchical features my client wanted.

While you most likely don’t need all the features I included in this example it does give you a look at how far you can push EF Core to get what you want. If you need a nice, simple data authorization example, then please look at the Part 2 article “Handling data authorization in ASP.NET Core and Entity Framework Core” which has a personal data example which uses the User’s Id as the DataKey.

Part 3: A better way to handle ASP.NET Core authorization – six months on

Last Updated: September 28, 2019 | Created: July 2, 2019

About six months ago I wrote the article “A better way to handle authorization in ASP.NET Core” which quickly became the top article on my web site, with lots of comments and questions. I also gave a talk at the NDC Oslo 2019 conference on the same topic which provided even more questions. In response to all the questions I have developed a new, improved example application, called PermissionAccessControl2 (referred to as “version 2”), with a series of new articles that cover the changes.

The version 2 articles cover improvements/changes based on questions and feedback.

UPDATE: See my NDC Oslo 2019 talk which covers these three articles.

Original articles:

NOTE: Some people had problems using the code in the original web application (referred to as version 1) in their applications, mainly because of the use of in-memory databases. The version 2 example code is more robust and supports real databases (e.g. SQL Server).

TL;DR; – summary (and links to sections)

  • This article answers comments/questions raised by the first version and covers the following changes from the original feature authorization article. If you are new to this topic the original article is a better place to start as it explains the whole system.
  • The changed parts covered in this article are:
    • A simpler way to add the user’s claims: Since the first article I have found a much simpler way to set up the user’s claims on login, if you don’t need a refresh of the user’s claims.
    • Roles: In the original article I used ASP.NET Core’s identity Roles feature, but that adds some limitations. The version 2 example app has its own UserToRole class.
    • Using Permissions in the front-end: I cover how to use Permissions in Razor pages and how send the permissions to a JavaScript type front-end such as AngularJS, ReactJS etc.
    • Provides a SuperAdmin user: Typically, a web app needs a user with super-high privileges to allow them to set up a new system. I have added that to the new example app.
  • The version 2 example app is a super-simple SaaS (Software as a Service) application which provides stock and sales management to companies with multiple retail outlets.

Setting the scene – my feature/data authorization approach

As I explained in the first article in this series, I was asked to build a web application that provided a service to various companies that worked in the retail sector. The rules about what the users could access were quite complex and required me to go beyond the out-of-box features in ASP.NET Core. Also, the data was multi-tenant and hierarchical, i.e. each company’s data must be secured such that a user only sees the data they are allowed to access.

Many (but not all) of the issues I solved for my client are generally useful so, with the permission of my client, I worked on an open-source example application to capture the parts of the authentication and authorization features that I feel are generally useful to other developers.

If you are not familiar with ASP.NET’s authorization and authentication features, then I suggest you read the ”Setting the Scene” section in the first article.

Below is a figure showing the various parts/stages in my approach to feature/data authorization. I use the ASP.NET Core authentication part, but I replace the authorization stage with my own code.

My replacement authorization stage provides a few extra features over ASP.NET Core Role-based authorization.

  • I use Roles to represent the type of user that is accessing the system. Typical Roles in a retail system might be SalesAssistant, Manager, Director, with other specific-job Roles like FirstAider, KeyHolder.
  • In the ASP.NET Core code I use something I call Permissions, which represent a specific “use case”, such as CanProcessSale, CanAuthorizeRefund. I place a single Permission on each ASP.NET Core page/web API I want to protect. For instance, I would add the CanProcessSale Permission to all the ASP.NET Core pages/web APIs that are used in the process sale use case.
  • The DataKey is more complex because of the hierarchical data, and I cover all the bases on how to be sure that this multi-tenant system is secure – see the Part 4 article.

There are other parts that I am not going to cover in this article as they are covered in other articles in this series. They are:

  • Why using Enums to implement Permissions was a good idea (see this section in first article).
  • How I managed paid-for features (see this section in the first article).
  • How I set up and use a DataKey to segregate the data – see the Part 4 article.

What the rest of this article does is deal with improvements I have made in the version 2 example application.

A simpler way to add to the User’s Claims

My approach relies on adding some claims to the User, which is of type ClaimsPrincipal. In the original article I did this by using an Event in ApplicationCookie, mainly because that is what I had to use for my client. While that works, I have found a much simpler way that adds the claims on login. This approach is much easier to write, works for Cookies and Tokens, and is more efficient. Thanks to https://korzh.com/blogs/net-tricks/aspnet-identity-store-user-data-in-claims for writing about this feature.

We do this by implementing a UserClaimsPrincipalFactory and registering it as a service. Here is my implementation of the UserClaimsPrincipalFactory.

public class AddPermissionsToUserClaims :
    UserClaimsPrincipalFactory<IdentityUser>
{
    private readonly ExtraAuthorizeDbContext _extraAuthDbContext;

    public AddPermissionsToUserClaims(UserManager<IdentityUser> userManager, 
        IOptions<IdentityOptions> optionsAccessor,
        ExtraAuthorizeDbContext extraAuthDbContext)
        : base(userManager, optionsAccessor)
    {
        _extraAuthDbContext = extraAuthDbContext;
    }

    protected override async Task<ClaimsIdentity> 
        GenerateClaimsAsync(IdentityUser user)
    {
        var identity = await base.GenerateClaimsAsync(user);
        var userId = identity.Claims
           .SingleOrDefault(x => x.Type == ClaimTypes.NameIdentifier).Value;
        var rtoPCalcer = new CalcAllowedPermissions(_extraAuthDbContext);
        identity.AddClaim(new Claim(
            PermissionConstants.PackedPermissionClaimType,
            await rtoPCalcer.CalcPermissionsForUser(userId)));
        var dataKeyCalcer = new CalcDataKey(_extraAuthDbContext);
        identity.AddClaim(new Claim(
            DataAuthConstants.HierarchicalKeyClaimName, 
            dataKeyCalcer.CalcDataKeyForUser(userId)));
        return identity;
    }
}

You can see on line 17 I get the original claims by calling the base GenerateClaimsAsync method. I then use the UserId to calculate the Permissions and the DataKey, which I add to the original claims. After this method has finished, the rest of the login code builds the Cookie or Token for the user.

To make this work you need to register this class in the ConfigureServices method inside the Startup class using the following code:

services.AddScoped<IUserClaimsPrincipalFactory<IdentityUser>,
    AddPermissionsToUserClaims>();

NOTE: The original code and this approach means that the claims are fixed until the user logs out and logs back in again. In the Part 5 article I show ways to refresh the user’s claims when the roles/permissions have been changed by an admin person.

Adding our own UserToRole class

In the version 1 example app I used ASP.NET Core’s Role/RoleManager for a quick setup. But in a real application I wouldn’t do that, as I only want ASP.NET Core’s Identity system to deal with the authentication part.

The main reason for providing my own user-to-role class is that you can then do away with ASP.NET Core’s built-in identity database if you are using something like the OAuth 2 API (which my client’s system used). Also, if you are splitting the authorization part from the authentication part, then it makes sense to have the User-to-Roles links with all the other classes in the authorization part.

Note: Thanks to Olivier Oswald for his comments where he asked why I used the ASP.NET Core’s Role system in the first version. I was just being a bit lazy, so in version 2 I have done it properly.

In the version 2 example app I have my own UserToRole class, as shown below.

public class UserToRole
{
     private UserToRole() { } //needed by EF Core

     public UserToRole(string userId, RoleToPermissions role) 
     {
         UserId = userId;
         Role = role;
     }
 
     public string UserId { get; private set; }
     public string RoleName { get; private set; }

     [ForeignKey(nameof(RoleName))]
     public RoleToPermissions Role { get; private set; }
    
     //… other code left out
}

The UserToRole class has a primary key which is formed from both the UserId and the RoleName (known as a “composite key”). I do this to stop duplicate entries linking a User to a Role, which would make managing the User’s Roles more difficult – see the configuration sketch below.
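The composite key isn’t visible in the class itself because it is configured in the DbContext. Here is a minimal sketch of what that configuration could look like in OnModelCreating (the real example app may word this slightly differently).

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //composite primary key: the database rejects a second row
    //linking the same User to the same Role
    modelBuilder.Entity<UserToRole>()
        .HasKey(x => new { x.UserId, x.RoleName });
}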

I also have a method inside the UserToRole class called AddRoleToUser. This adds a Role to a User with a check, because adding a duplicate Role to a User would cause a database exception, so I catch that early and send a user-friendly error message to the user.
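The AddRoleToUser method isn’t shown above, but its duplicate check could look something like the sketch below. The signature and the error handling are my assumptions – the real method in the example app returns a status rather than throwing.

public static UserToRole AddRoleToUser(string userId, string roleName,
    ExtraAuthorizeDbContext context)
{
    //Find takes the composite key values in the order they were defined
    if (context.Find<UserToRole>(userId, roleName) != null)
        //catch the duplicate early, before SaveChanges throws a database exception
        throw new InvalidOperationException(
            $"The user already has the Role '{roleName}'.");

    var role = context.Find<RoleToPermissions>(roleName);
    return new UserToRole(userId, role);
}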

Using Permissions in the front-end

The authorization side of ASP.NET Core returns HTTP 403 (forbidden) if a user isn’t allowed access to that method. But to make a better experience for the user we typically want to remove any links, buttons etc. that the user isn’t allowed to use. So how do I do that with my Permissions approach?

Here are the two ways you might be implementing your front-end, and how to handle each.

1. When using Razor syntax

If you are using ASP.NET Core in MVC mode or Razor Page mode, then you can use my extension method called UserHasThisPermission. The code below comes from the version 2 _layout.cshtml file and controls whether the Shop menu appears, and what sub-menu items appear.

@if (User.UserHasThisPermission(Permissions.SalesRead))
{
    <li class="nav-item">
        <div class="dropdown">
            <a class="nav-link text-dark dropdown-toggle" role="button"
               id="dropdown1MenuButton" 
               data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
                Shop
            </a>
            <div class="dropdown-menu" aria-labelledby="dropdown1MenuButton">
                @if (User.UserHasThisPermission(Permissions.SalesSell))
                {
                    <a class="nav-link text-dark" asp-area="" 
                         asp-controller="Shop" asp-action="Till">Till</a>
                }
                <a class="nav-link text-dark" asp-area="" 
                       asp-controller="Shop" asp-action="Stock">Stock</a>
                @if (User.UserHasThisPermission(Permissions.SalesRead))
                {
                    <a class="nav-link text-dark" asp-area="" 
                         asp-controller="Shop" asp-action="Sales">Sales</a>
                }
            </div>
        </div>
    </li>
}

You can see me using the UserHasThisPermission method on lines 1, 12 and 17.

2. Working with a JavaScript front-end framework

In some web applications you might use a JavaScript front-end framework, such as AngularJS, ReactJS etc., to manage the front-end. In that case you need to pass the Permissions to your front-end system so that you can add code to control what links/buttons are shown to the user.

It is very simple to get access to the current user’s Permissions via the HttpContext.User variable which is available in any controller. I do this via a Web API and here is the code from my FrontEndController in my version 2 application.

[HttpGet]
public IEnumerable<string> Get()
{
    var packedPermissions = HttpContext.User?.Claims.SingleOrDefault(
        x => x.Type == PermissionConstants.PackedPermissionClaimType);
    return packedPermissions?.Value
        .UnpackPermissionsFromString()
        .Select(x => x.ToString());
}

This action returns an array of Permission names. Typically you would call this after a login and store the data in SessionStorage for use while the user is logged in. You can try this in the version 2 application – run the application locally and go to http://localhost:65480/swagger/index.html. Swagger will then display the FrontEnd API, which has one command to get the user’s permissions.

NOTE: In part 5, where the permissions can change dynamically I show a way that the front-end can detect that the permissions have changed so they can update their local version of the Permissions.

Enabling a SuperAdmin User

In most web applications I have built you need one user that has access to every part of the system – I call this user SuperAdmin. And typically I have some code that makes sure there is a SuperAdmin user in any system the application runs on. That way you can run the app with a new database and then use the SuperAdmin user to set up all the other users you need.

A logged-in SuperAdmin user needs all the Permissions so that they can do anything, but that list would be hard to keep updated as new permissions are added. Therefore, I added a special Permission called AccessAll and altered the UserHasThisPermission extension method to return true if the current user has the AccessAll Permission.

public static bool UserHasThisPermission(
    this Permissions[] usersPermissions, 
    Permissions permissionToCheck)
{
    return usersPermissions.Contains(permissionToCheck) 
        || usersPermissions.Contains(Permissions.AccessAll);
}

Now we have the concept of a SuperAdmin we need a way to create the SuperAdmin user. I do this via setup code in the Program class. This method makes sure that the SuperAdmin user is in the current user database, i.e. it adds a SuperAdmin user if one isn’t already there. Note: I am using C# 7.1’s async Main feature to run my startup code.

public static async Task Main(string[] args)
{
    (await BuildWebHostAsync(args)).Run();
}

public static async Task<IWebHost> BuildWebHostAsync(string[] args)
{
    var webHost = WebHost.CreateDefaultBuilder(args)
        .UseStartup<Startup>()
        .Build();

    //Because I might be using in-memory databases I need to make 
    //sure they are created before my startup code tries to use them
    SetupDatabases(webHost);

    await webHost.Services.CheckAddSuperAdminAsync();
    await webHost.Services.CheckSeedDataAndUserAsync();
    return webHost;
}

The CheckAddSuperAdminAsync method obtains the SuperAdmin email and password from the appsettings.json file (see the Readme file for more information).
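To show the idea, here is a rough sketch of what a CheckAddSuperAdminAsync extension method could do. The configuration key names and the exact checks are my assumptions – the real code also refuses to add a SuperAdmin if any user already holds the SuperAdmin role (see the NOTE below).

//needs Microsoft.Extensions.DependencyInjection and Microsoft.Extensions.Configuration
public static async Task CheckAddSuperAdminAsync(this IServiceProvider services)
{
    using (var scope = services.CreateScope())
    {
        var config = scope.ServiceProvider.GetRequiredService<IConfiguration>();
        var userManager = scope.ServiceProvider
            .GetRequiredService<UserManager<IdentityUser>>();

        var email = config["SuperAdmin:Email"]; //assumed setting names
        if (await userManager.FindByEmailAsync(email) != null)
            return; //a SuperAdmin already exists, so don't add another

        var superUser = new IdentityUser { UserName = email, Email = email };
        await userManager.CreateAsync(superUser, config["SuperAdmin:Password"]);
        //... plus code to link this user to the SuperAdmin role
    }
}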

NOTE: The SuperAdmin user is very powerful and needs to be protected. Use a long, complicated password and make sure you provide the SuperAdmin email and password by overriding the appsettings.json values on deployment. My startup code also doesn’t allow a new SuperAdmin user to be added if there is a user already in the database that has the SuperAdmin role. That stops someone adding a new SuperAdmin with their own email address once a system is live.

Other things new in version 2 of the example application

The PermissionAccessControl2 application is designed to work like a real application, using both feature and data authorization. The application pretends to be a super-simple retail sales application service for multiple companies, i.e. a multi-tenant application. I cover the multi-tenant data authorization in part 4.

The differences in version 2 from version 1 that I have not already mentioned are:

  • The code will work with either in-memory databases or SQL Server databases, i.e. it will check if users and data are already present and if so won’t try to add the users/data again
  • You can choose various application setups, such as database type and simple/complex claims setup. This is controlled by data in the appsettings.json file – see Readme file for more information on this.
  • The multi-tenant data access is hierarchical, and is much more complex and robust than in version 1 – see Part 4: Building a robust and secure data authorization with EF Core for more on this.
  • There are some unit tests. Not a lot, but enough to give you an idea of what is happening.

Conclusion

The first article on my approach to authorization in ASP.NET Core has been very popular, and I had great questions via my blog and at my NDC Oslo talk. This caused me to build a new version of the example app, available via a GitHub repo, with many improvements and some new articles that explain the changes/improvements.

In this article (known as Part 3) I focused on the ASP.NET Core authorization part, provided a few improvements over the version 1 application, and added explanations of how to use these features. In the Part 4 article I cover data authorization with EF Core, and the Part 5 article covers the complex area of updating a user’s permissions/data key dynamically.

Hopefully the whole series, with the two example applications, will help you design and build your own authorization systems to suit your needs.

Getting better data for unit testing your EF Core applications

Last Updated: June 5, 2019 | Created: June 5, 2019

I was building an application using Entity Framework Core (EF Core) for a client and I needed to set up test data to mimic their production system. Their production system was quite complex with some data that was hard to generate, so writing code to set up the database was out of the question. I tried using JSON files to save data and restore data to a database and after a few tries I saw a pattern and decided to build a library to handle this problem.

My solution was to serialize the specific data from an existing database and store it as a JSON file. Then in each unit test that needed that data I deserialized the JSON file into classes and used EF Core to save it to the unit test’s database. And because I was copying from a production database, which could have personal data in it, I added a feature that can anonymize personal data so that no privacy laws, such as GDPR, are breached.

Rather than writing something specific for my client I decided to generalise this feature and add it to my open-source library EfCore.TestSupport. That way I, and you, can use this library to create better test data for unit/performance tests, and my client gets a more comprehensive library at no extra cost.

TL; DR; – summary

  • The feature “Seed from Production” allows you to capture a “snapshot” of an existing (production) database into a JSON file which you can use to recreate the same data in a new database for testing your application.
  • This feature relies on an existing database containing the data you want to use. This means it only works for updates or rewrites of an existing application.
  • The “Seed from Production” is useful in the following cases.
    • For tests that need a lot of complex data that is hard to hand-code.
    • It provides much more representative data for system tests, performance tests and client demos.
  • The “Seed from Production” feature also includes an anonymization stage so that no personal data is stored/used in your application/test code.
  • The “Seed from Production” feature relies on EF Core’s excellent handing of saving classes with relationships to the database.
  • The “Seed from Production” feature is part of my open-source library EfCore.TestSupport (Version 2.0.0 and above).

Setting the scene – why do I need better unit test data?

I spend a lot of time working on database access code (I wrote the book “Entity Framework Core in Action”) and I often have to work with large sets of data (one client’s example database was 1Tbyte in size!). If you use a database in a test, then the database must be in a known state before you start your test. In my experience what data the unit tests need breaks down into three cases:

  1. A small number of tests you can start with an empty database.
  2. Most tests only need few tables/rows of data in the database.
  3. A small set of tests need more complex set of data in the database.

Item 1, an empty database, is easy to arrange: either delete/create a new database (slow) or build a wipe method to “clean” an existing database (faster). For item 2, a few tables/rows, I usually write code to set up the database with the data I need – these are often extension methods with names like SeedDatabaseFourBooks or CreateDummyOrder (see the sketch below). But when it comes to item 3, which needs complex data, it can be hard work to write and a pain to keep up to date when the database schema or data changes.
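As an illustration of the item 2 style of seed method, here is a sketch using the Book entity from my book app. The method name is hypothetical, and I am assuming a Book class with a settable Title property (as used in the seed example later in this article).

public static List<Book> SeedDatabaseDummyBooks(this BookContext context)
{
    var books = new List<Book>
    {
        new Book { Title = "Book1" },
        new Book { Title = "Book2" }
    };
    context.AddRange(books);
    context.SaveChanges(); //EF Core sets the primary keys during the save
    return books;          //returned so the test can refer to the seeded data
}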

I have tried several approaches in the past (real database and run test in a transaction, database snapshots, seed from Excel file, or just write tests to handle unknown data setup). But EF Core’s excellent approach to saving classes with relationships to a database allows me to produce a better system.

Personally, I have found having data that looks real makes a difference when I am testing at every level. Yes, I can test with the book entitled “Book1” with author “Author1”, but having a book named “Domain-Driven Design” with author “Eric Evans” is easier to spot errors in. Therefore, I work to produce “real looking” data when I am testing or demonstrating an application to a client.

One obvious limitation of the “Seed from Production” approach is you do need an existing database that contains data you can copy from! Therefore, this works well when you are updating or extending an existing application. However, I have also found this useful when building a new application as the development will (should!) soon produce pre-production data that you can use.

NOTE: Some people would say you shouldn’t be accessing a database as part of a unit test, as that is an integration test. I understand their view and in some business logic I do replace the database access layer with an interface (see this section in my article about business logic). However, I am focused on building things quickly and I find using a real database makes it easier/quicker to write tests (especially if you can use an in-memory database) which means my unit test also checks that my database relationships/constraints work too.

How my “seed from production” system works

When I generalised the “seed from production” system I listed what I needed.

  • A way to read data from another database and store it in a file. That way the “snapshot” data becomes part of your unit test code and is covered by source control.
  • The data may come from a production database that contains personal data. I need a way to anonymise that data before it’s saved to a file.
  • A way to take the stored “snapshot” and write it back out to a new database for use in tests.
  • I also needed the option to alter the “snapshot” data before it was written to the database, for cases where a particular unit test needs a property(s) set to a specific value.
  • Finally, I needed a system that made updating the “snapshot” data easy, as the database schema/data is likely to change often.

My “Seed from Production” feature handles all these requirements by splitting the process into two parts: an extract part, which is done once, and the seed part, which runs before each test to setup the database. The steps are:

  1. Extract part – only run if database changes.
    1. You write the code to read the data you need from the production database data.
    2. The DataResetter then:
      1. Resets the primary keys and foreign keys to default values so that EF Core will create new versions in the test database.
      2. Optionally you can anonymise specific properties that need privacy, e.g. you may want to anonymise all the names, emails, addresses etc.
    3. Converts the classes to a JSON string.
    4. Saves this JSON string to a file, typically in your unit test project.
  2. Seed database part – run at start of every unit test
    1. You need to provide an empty database for the unit test.
    2. The DataResetter reads the JSON file back into classes mapped to the database.
    3. Optionally you can “tweak” any specific data in the classes that your unit test needs.
    4. Then you add the data to the database and call SaveChanges.

That might sound quite complicated but most of that is done for you by the library methods. The diagram below shows the parts of the two stages to make it clearer – the parts shown as orange are the parts you need to write while the parts in blue are provided by the library.

Show me the code!

This will all make more sense when you see the code, so in the next subsections I show you the various usage of the “Seed from Production” library. They are:

  1. Extract with no anonymisation.
  2. Extract showing an example of anonymisation.
  3. Seed a unit test database, with optional update of the data.
  4. Seed database when using DDD-styled entity classes.

In all the examples I use my book app database, which I used in the book I wrote for Manning, “Entity Framework Core in Action”. The book app “sells” books and therefore the database contains books, with authors, reviews and possible price promotions – you can see this in the DataLayer/BookApp folder of my EfCore.TestSupport GitHub project.

NOTE: You can see a live version of the book app at http://efcoreinaction.com/

1. Extract with no anonymisation

I start with the extract-data-from-a-database stage, which only needs to be run when the database schema or data has changed. To make it simple to run I make it a unit test, but I define that test in such a way that it only runs in debug mode (that stops it being run when you run all the tests).

[RunnableInDebugOnly(DisplayName = "Needs database XYZ")]
public void ExampleExtract()
{
    var sqlSetup = new SqlServerProductionSetup<BookContext>
        ("ExampleDatabaseConnectionName");
    using (var context = new BookContext(sqlSetup.Options))
    {
        //1a. Read in the data you want to seed the database with
        var entities = context.Books
            .Include(x => x.Reviews)
            .Include(x => x.Promotion)
            .Include(x => x.AuthorsLink)
                .ThenInclude(x => x.Author)
            .ToList();

        //1b. Reset primary and foreign keys
        var resetter = new DataResetter(context);
        resetter.ResetKeysEntityAndRelationships(entities);

        //1c. Convert to JSON string
        var jsonString = entities.DefaultSerializeToJson();
        //1d. Save to JSON local file in TestData directory
        sqlSetup.DatabaseName.WriteJsonToJsonFile(jsonString);
    }
}

The things to note are:

  • Line 1: I use the RunnableInDebugOnly attribute (available in my EfCore.TestSupport library) to stop the unit test being run in a normal run of the unit tests. This method only needs to be run if the database scheme or data changes.
  • Line 4: the SqlServerProductionSetup class takes the name of a connection in the appsettings.json file and sets up the options for the given DbContext so that you can open it.
  • Lines 9 to 14: Here I read in all the books with all their relationships that I want to save.
  • Lines 17 and 18: In this case the Resetter resets the primary keys and foreign keys to their default value. You need to do this to ensure EF Core works out the relationships via the navigational properties and creates new rows for all the data.
  • Line 21: This uses a default setting for Newtonsoft.Json’s SerializeObject method. This works in most cases, but you can write your own if you need different settings.
  • Line 23: Finally, it writes the data into a file in the TestData folder of your unit test project. You can supply any unique string, which is used as part of the file name – typically I use the name of the database it came from, which the SqlServerProductionSetup class provides.

2. Extract showing an example of using anonymisation

As I said before you might need to anonymise names, emails, addresses etc. that were in your production database. The DataResetter has a simple, but powerful system that allows you to define a series of properties/classes that need anonymising.

You define a class and a property in that class to anonymise and the DataResetter will traverse the whole sequence of relationships and will reset every instance of the class+property. As you will see you can define lots of classes/properties to be anonymised.

The default anonymisation method uses GUIDs as strings, so the name “Eric Evans” would be replaced with something like “2c7211292f2341068305309ff6783764”. That’s fine but it’s not that friendly if you want to do a demo or testing in general. That is why I provide a way to replace the default anonymisation method, which I show in the example (but you don’t have to if GUID strings are OK for you).

The code below is an example of what you can do by using an external library to provide random names and places. In this implementation I use the DotNetRandomNameGenerator NuGet package and create a few different formats you can call for, such as FirstName, LastName, FullName etc.

public class MyAnonymiser
{
    readonly PersonNameGenerator _pGenerator;

    public MyAnonymiser(int? seed = null)
    {
        var random = seed == null ? new Random() : new Random((int)seed);
        _pGenerator = new PersonNameGenerator(random);
    }

    public string AnonymiseThis(AnonymiserData data, object objectInstance)
    {
        switch (data.ReplacementType)
        {
            case "FullName": return _pGenerator.GenerateRandomFirstAndLastName();
            case "FirstName": return _pGenerator.GenerateRandomFirstName();
            case "LastName": return _pGenerator.GenerateRandomLastName();
            case "Email": //… etc. Add more versions as needed
            default: return _pGenerator.GenerateRandomFirstAndLastName();
        }
    }
}

The things to note are:

  • Lines 5 to 9: I designed my anonymiser to take an optional number to control the output. If no number is given, then the sequence of names has a random start (i.e. it produces different names each time it is run). If a number is given, then you get the same random sequence of names every time. Useful if you want to check properties in your unit test.

NOTE: See the Seed from Production Anonymization documentation for more on the AnonymiserFunc and its features. There are several pieces of information I have not described here.

The code below shows an extract method which is very similar to the first version, but with some extra code to a) link in the MyAnonymiser, and b) define the class+property items that need to be anonymised.

[RunnableInDebugOnly(DisplayName = "Needs database XYZ")]
public void ExampleExtractWithAnonymiseLinkToLibrary()
{
    var sqlSetup = new SqlServerProductionSetup<BookContext>
        ("ExampleDatabaseConnectionName");
    using (var context = new BookContext(sqlSetup.Options))
    {
        //1a. Read in the data you want to seed the database with
        var entities = context.Books
            .Include(x => x.Reviews)
            .Include(x => x.Promotion)
            .Include(x => x.AuthorsLink)
                .ThenInclude(x => x.Author)
            .ToList();

        //1b-i. Set up resetter config to use own method
        var myAnonymiser = new MyAnonymiser(42);
        var config = new DataResetterConfig
        {
            AnonymiserFunc = myAnonymiser.AnonymiseThis
        };
        //1b-ii. Add all class/properties that you want to anonymise
        config.AddToAnonymiseList<Author>(x => x.Name, "FullName");
        config.AddToAnonymiseList<Review>(x => x.VoterName, "FirstName");
        //1b. Reset primary and foreign keys and anonymise given class+property
        var resetter = new DataResetter(context, config);
        resetter.ResetKeysEntityAndRelationships(entities);

        //1c. Convert to JSON string
        var jsonString = entities.DefaultSerializeToJson();
        //1d. Save to JSON local file in TestData directory
        "ExampleDatabaseAnonymised".WriteJsonToJsonFile(jsonString);
    }
} 

The things to note are:

  • Line 17: I create MyAnonymiser, and in this case I provide a seed number. This means the same sequence of random names will be created whenever the extract is run. This can be useful if you access the anonymised properties in your unit test.
  • Lines 18 to 21: I override the default AnonymiserFunc by creating a DataResetterConfig class and setting the AnonymiserFunc property to my replacement AnonymiserFunc from my MyAnonymiser class.
  • Lines 23 and 24: I add two class+property items that should be anonymised via the AddToAnonymiseList<T> method in the DataResetterConfig instance. As you can see you can provide a string that defines what type of replacement you want. In this case the Author’s Name needs a full name, e.g. “Jon Smith” and the Review’s VoterName just needs a first name, e.g. Jane.
  • Line 26: The creation of the DataResetter now has a second parameter with the DataResetterConfig instance with the new AnonymiserFunc.

All the rest of the method is the same.

NOTE: The ResetKeysEntityAndRelationships method follows all the navigational links so every instance of the given class+property that is linked to the root class will be reset. It also uses Reflection, so it can anonymise properties which have private setters.

3. Seed a unit test database from the JSON file

Now I show you a typical unit test where I seed the database from the data stored in the JSON file. In this case I use a SQLite in-memory database, which is very fast to set up and run (see my article “Using in-memory databases for unit testing EF Core applications” for when and how you can use this type of database for unit testing).

[Fact]
public void ExampleSeedDatabase()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using (var context = new BookContext(options))
    {
        //2a. make sure you have an empty database
        context.Database.EnsureCreated();
        //2b. read the entities back from the JSON file
        var entities = "ExampleDatabase".ReadSeedDataFromJsonFile<List<Book>>();
        //2c. Optionally “tweak” any specific data in the classes that your unit test needs
        entities.First().Title = "new title";
        //2d. Add the data to the database and save
        context.AddRange(entities);
        context.SaveChanges();

        //ATTEMPT
        //... run your tests here

        //VERIFY 
        //... verify your tests worked here
    }
}

The things to note are:

  • Line 9. In this case I am using an in-memory database, so it will be empty. If you are using a real database, then normally you would clear the database before you start so that the unit test has a “known” starting point. (Note that you don’t have to clear the database for the seed stage to work – it will just keep adding a new copy of the snapshot every time, but your unit test database will grow over time.)
  • Line 11: my ReadSeedDataFromJsonFile extension method reads the JSON file with the reference “ExampleDatabase” (which was the name of the database that was imported – see extract code) and uses Newtonsoft.Json’s DeserializeObject method to turn the JSON back into entity classes with relationships.
  • Line 13: Optionally you might need to tweak the data specifically for the test you are going to run. That’s easy as you have access to the classes at this stage.
  • Line 15. You use Add, or AddRange if it’s a collection, to add the new classes to the database.
  • Line 16. The last step is to call SaveChanges to get all the entities and relationships created in the database. EF Core will follow all the navigational links, like Reviews, to work out what is linked to what and set up the primary keys/foreign keys as required.
  • Lines 19 onward. This is where your tests and asserts go in your unit test.

4. Seed database when using DDD-styled entity classes

If you have read any of my other articles you will know I am a great fan of DDD-styled entity classes (see my article “Creating Domain-Driven Design entity classes with Entity Framework Core” for more about this). So, of course, I wanted the Seed from Production feature to work with DDD-styled classes, which it does now, but you do need to be careful so here are some notes about things.

Problems occur if Newtonsoft.Json can’t find a way to set a property at serialization time. This fooled me for quite a while (see the issue I raised on the Newtonsoft.Json GitHub). The solution I came up with was adding a setting to the serialization (and deserialization) that tells Newtonsoft.Json that it can set properties via private setters. This works for me (including private fields mapped to IEnumerable), but in case you have a more complex state there are other ways to set this up. The most useful is to create a private constructor with parameters that match the properties by type and name, and then place a [JsonConstructor] attribute on that constructor (there are other ways too – look at the Newtonsoft.Json docs).
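For reference, the usual way to tell Newtonsoft.Json that it may use private setters is a custom contract resolver, something like the sketch below. This is a common community pattern, not necessarily the exact settings my library uses internally.

//needs: using System.Reflection; using Newtonsoft.Json.Serialization;
public class PrivateSetterContractResolver : DefaultContractResolver
{
    protected override JsonProperty CreateProperty(
        MemberInfo member, MemberSerialization memberSerialization)
    {
        var jsonProp = base.CreateProperty(member, memberSerialization);
        if (!jsonProp.Writable && member is PropertyInfo propInfo)
            //allow Json.NET to call a non-public setter if one exists
            jsonProp.Writable = propInfo.GetSetMethod(true) != null;
        return jsonProp;
    }
}

You then pass an instance of this resolver in via the ContractResolver property of the JsonSerializerSettings used for both SerializeObject and DeserializeObject.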

NOTE: The symptoms of Newtonsoft.Json failing to serialize because it can’t access a property aren’t that obvious. In one case Newtonsoft.Json threw an unexplained “Self referencing loop detected” exception. And when I changed the JsonSerializerSettings to ignore self-referencing loops it incorrectly serialized the data by adding a duplicate (!). You can see the gory details in the issue I raised on Newtonsoft.Json.

Aside – how does this seeding work?

NOTE: This is an optional section – I thought you might like to learn something about EF Core and how it handles classes with relationships (known as “navigational links” in EF Core).

You may be wondering how this seeding feature works – basically it relies on some magic that EF Core performs.

  1. Firstly, EF Core must work out what State a class is in, e.g. is it an Added (new) entry or an existing (tracked) entry, which tells it whether it needs to create a new entry or just refer to an existing entry in the database. In this case EF Core sets the state of the class instances coming from the JSON to Added and will write them out to the database.
  2. The second part is how EF Core works out the relationships between each class instance. Because the ResetKeysEntityAndRelationships method resets the foreign keys (and the primary keys), EF Core relies on the navigational properties to work out the relationships between each class instance.

These two things mean that the database is updated with the correct data and foreign keys, even if the relationships are complex. This feature makes EF Core so nice to work with, not just in this seeding feature but for any adding or updating of linked classes.

Here is a simple example taken from one of my talks, with the code and two diagrams showing you the before and after. In this case I create a new book, with a new many-to-many BookAuthor link to an existing Author.

var existingAuthor = context.Authors.First();
var newBook = new Book { Title = "New Book" };
newBook.AuthorsLink = new List<BookAuthor>
{ 
     new BookAuthor
     { 
          Book = newBook, 
          Author = existingAuthor, 
          Order = 1
     }
};

After that the classes look like this: note the red classes are new, while the blue ones have been read from the database (i.e. are tracked by EF Core).

Then we save this to the database with the following code.

context.Add(newBook);
context.SaveChanges();

After that the classes would look like this, with all the primary and foreign keys filled in and all navigational links set up.

EF Core has done the following to make this work:

  1. During the Add stage it sets up all the navigational links and copies the primary keys of existing instances into the correct foreign keys (in this case the Author’s existing primary key into the AuthorId foreign key in the BookAuthor class).
  2. In the SaveChanges part it does the following within a transaction:
    1. It inserts a new row for this Book, which sets its primary key
    2. It then copies the Book’s primary key into the BookId foreign key in the BookAuthor class.
    3. It then inserts a new row for the BookAuthor class.

This makes handling linked classes so simple in EF Core.

Conclusion

As I said earlier, I have tried different ways to set up complex databases over the years, both with ADO.NET and EF6.x. EF Core has a number of features (good access to the database schema and better handling of adding linked classes to a database) which made it much easier to implement a generic solution to this problem.

For my client the “Seed from Production” feature works really well. Their database contains data that is hard to create manually, and a pain to keep up to date as the application grows. By copying a database set up by the existing application we captured the data to use in our unit tests, and in some performance tests too. Also, the test data becomes part of the test code and is therefore covered by the standard source control system. Another bonus is it makes it simple to run any tests in a DevOps pipeline, as the test database can be created and seeded automatically, which saves us from having to have a specific database available in the DevOps path.

You won’t need this feature much, as most unit tests should use very basic data, but for those complex systems where setting up the database is complicated this library can save you (and me) lots of time. And with the anonymisation stage happening before the JSON file is created you don’t have to worry about having personal data in your unit tests.

Happy coding!


GenericServices Design Philosophy + tips and techniques

Last Updated: May 22, 2019 | Created: April 3, 2019

I read Jimmy Bogard’s article called “AutoMapper’s Design Philosophy”, which he wrote to help people understand what job AutoMapper was designed to do. This sort of article helps people who get frustrated with AutoMapper because they are trying to use it to do something it wasn’t really designed to do. I use AutoMapper in my libraries and I was glad to see my usage is right in line with what AutoMapper is designed to do.

I thought I would write a similar article for my GenericServices library (see the summary for what GenericServices does) to help anyone who uses, or wants to use, my GenericServices library (and associated libraries). While the use of my GenericServices library is tiny compared to AutoMapper (about 0.05% of AutoMapper’s downloads!) I too have issues or requests for features that don’t fit into what GenericServices is designed to do. Hopefully this article will help people understand my GenericServices library better, and I also add a few tips and techniques that I have found useful too.

Other related articles in this series:

TL; DR; – Summary

  • The GenericServices library is designed to speed up the development of front-end Entity Framework 6 and Entity Framework Core (EF Core) database accesses.
  • GenericServices does this by automating the mapping of database classes to DTOs (Data Transfer Object, also known as a ViewModel in ASP.NET) in a way that builds efficient database accesses.
  • From my personal experience I would say that my GenericServices library saved me 2 months of development time over a 12-month period.
  • GenericServices also has a feature where it can work with Domain-Driven Design (DDD) styled EF Core database classes. It can find and call methods or constructors inside a DDD-styled EF Core database class. That gives very good control over creates and updates.
  • This article tells you what GenericServices can, and cannot, do.
  • I then list five GenericServices tips and techniques that I use when using this library.

Setting the scene – what does GenericServices do?

TIP: This is a shortened version of a section from the introduction article on EFcore.GenericServices. The original has more code in it.

GenericServices is designed to make writing front-end CRUD (Create, Read, Update and Delete) EF Core database accesses much easier. It handles both the database access code and the “adaption” of the database data to what the front-end of your application needs. It does this by providing a library with methods for Create, Read, Update and Delete that uses either a EF Core database class or a DTO (Data Transfer Object, also known as a ViewModel in ASP.NET) to define what EF Core class is involved and what data needs to be read or written.

I’m going to use a page that updates a database class to describe the typical issues that come up. My example application is an e-commerce site selling technical books, and I implement a feature where an authorised user can add a sales promotion to a book by reducing its price. The ASP.NET Core web page is shown below with the user’s input in red and comments on the left about the Book properties and how they are involved in the update.

In a web/mobile application a feature like this consists of two stages:

1. Read data to display

The display to the user needs five properties taken from the Book, and I use a DTO (Data Transfer Object, also known as a ViewModel) that contains the five properties I want out of the Book entity class. GenericServices uses AutoMapper to build a LINQ query which EF Core turns into an efficient SQL SELECT command that just reads the five columns. Below is the DTO, with the empty interface ILinkToEntity<TEntity> (see line 1) that GenericServices uses to find and map the DTO to the EF Core class.

public class AddPromotionDto : ILinkToEntity<Book>
{
    [HiddenInput]
    public int BookId { get; set; }

    [ReadOnly(true)] //Tells GenericServices not to copy this back to the database
    public decimal OrgPrice { get; set; }

    [ReadOnly(true)] //Tells GenericServices not to copy this back to the database
    public string Title { get; set; }

    public decimal ActualPrice { get; set; }

    [Required(AllowEmptyStrings = false)]
    public string PromotionalText { get; set; }
}

Below is the GenericService code that reads the data into the DTO, with the id holding the Book’s primary key (see this link for the full list of all the code).

var dto = _service.ReadSingle<AddPromotionDto>(id);

NOTE: AutoMapper is great for “Flattening” relationships, that is it can pick properties/columns in related classes – see the article “Building high performance database queries using Entity Framework Core and AutoMapper” for more on this.
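As a tiny illustration of flattening, a DTO like the sketch below needs no extra AutoMapper configuration – the property names are matched by convention. The property names here are just for illustration, based on the Book’s Reviews and Promotion navigation properties used elsewhere in this article.

public class BookListDto : ILinkToEntity<Book>
{
    public int BookId { get; set; }
    public string Title { get; set; }
    //"flattened": AutoMapper matches this to Book.Reviews.Count
    public int ReviewsCount { get; set; }
    //"flattened": matched to a property on the related Promotion class
    public string PromotionPromotionalText { get; set; }
}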

2. Update the data

The second part is the update of the Book class with the new ActualPrice and the PromotionalText. This requires a) the Book entity to be read in, b) the Book entity to be updated with the two new values, and c) the updated Book entity to be written back to the database. Below is the GenericService code that does this (see this link for the full list of all the code).

_service.UpdateAndSave(dto);

Overall the two GenericService calls replace about 15 lines of hand-written code that does the same thing.
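To show what is being replaced, here is a sketch of the hand-written equivalent of the update side, assuming a non-DDD Book class with settable properties – this is my illustration, not the library’s internal code.

public void AddPromotion(AddPromotionDto dto)
{
    //a) read in the Book entity
    var book = _context.Find<Book>(dto.BookId);
    if (book == null)
        throw new InvalidOperationException("Could not find the book.");

    //b) update it with the two new values
    book.ActualPrice = dto.ActualPrice;
    book.PromotionalText = dto.PromotionalText;

    //c) write it back to the database
    _context.SaveChanges();
}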

The problem that GenericServices is aimed at solving

I built GenericServices to make me faster at building .NET applications and to remove some of the tedious coding (e.g. LINQ Selects with lots of properties) around building front-end CRUD EF Core database accesses. Because I care about performance I designed the library to build efficient database accesses using the LINQ Select command, i.e. only loading the properties/columns that are needed.

With the release of EF Core I rewrote the library (GenericServices (EF6) -> EfCore.GenericServices) and added new features to work with Domain-Driven Design (DDD) styled database classes. DDD-styled database classes give much better control over how creates and updates are done.

GenericServices is meant to make the simple-to-moderate complexity database reads easy to build. It can also handle all deletes and some single-class creates and updates with normal database classes, but because EfCore.GenericServices supports DDD-styled database classes it can call constructors/methods which can handle every type of create or update.

Overall, I find GenericServices will handle more than 60% of all front-end CRUD accesses, but with DDD-styled database classes this goes up to nearly 80%. It’s only the really complex reads/writes that can be easier to write by hand, and some of those writes should really be classed as business logic anyway. The trick is to know when to use GenericServices, and when to hand-code the database access – I cover that next.

What EfCore.GenericServices can/cannot handle

OK, let’s get down to the details of what GenericServices can and cannot do, with a list of good/bad usages.

GenericServices is GOOD at:

  • All reads that use flattening (see Note1)
  • All deletes
  • Create/Update
    • Normal (i.e. non-DDD) database classes: of a single class (see Note2)
    • DDD-styled database classes: any create/update.

GenericServices is BAD at:

  • Any read that needs extra EF Core commands like .Include(), .Load(), etc. (see Note1)
  • Create/Update
    • Normal (i.e. non-DDD) database classes: with relationships (see Note2)

Note1: Read – flatten, not Include.

The GenericServices reads are designed for sending to a display page or a Web API, and I can normally implement any such read by using AutoMapper’s “Flattening” feature. However, sometimes setting up special AutoMapper configurations (see docs) can take more effort than just hand-coding the read. Don’t be afraid to build your own read queries if that is simpler for you.

You cannot use GenericServices for reads that need .Include(), .Load(), etc. Typically that sort of read is used in business logic, and I have a separate library called EfCore.GenericBizRunner for handling that (see the articles “A library to run your business logic when using Entity Framework Core” and “Architecture of Business Layer working with Entity Framework (Core and v6)” for more about handling business logic).

NOTE: Using .Include(), .Load() or Lazy Loading for a simple read is inefficient, as it means you are loading data you don’t need and/or making multiple trips to the database, which is slow.
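To show the difference, here is a sketch using the example Book entity (the exact names are illustrative):

//Include: loads the whole Book row plus every related Review row
var book = context.Books
    .Include(b => b.Reviews)
    .Single(b => b.BookId == id);

//Select projection: loads only what the display needs, in one trip
var display = context.Books
    .Where(b => b.BookId == id)
    .Select(b => new { b.Title, ReviewsCount = b.Reviews.Count })
    .Single();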

Note2: Create/Update – single or relationships

When using normal (non-DDD) database classes GenericServices will only create/update a single class mapped to the database via EF Core. However, you can get around this because GenericServices is designed to work with a certain style of DDD entity classes, i.e. GenericServices can find and call constructors or methods inside your EF Core class to do a create or update, which allows your code to handle any level of complexity of a create or update.

GenericServices also gives you the option to validate the data that is written to the database (off by default – turn on via the GenericServiceConfig class). This, coupled with DDD constructor/methods, allows you to write complex validation and checks. However, if I think the code is getting too much like business logic then I use EfCore.GenericBizRunner.

Tips and Techniques

Clearly, I know my library very well, and I can do things others might not think of. This is a list of things I do that you might find useful. Here is the list of headings, to save you scrolling down to see what’s there.

  a. Try using DDD-styled entity classes with EfCore.GenericServices
  b. Don’t try to use GenericServices for business logic database accesses
  c. How to filter, order, page a GenericService read query
  d. Helper library for using GenericServices with ASP.NET Core Web API
  e. How to unit test your GenericServices code

a. Try using DDD-styled entity classes with EfCore.GenericServices

Personally, I have moved over to using DDD-styled database classes with EF Core, so let me explain the differences/advantages of DDD.

Non-DDD classes have properties with public setters, i.e. anyone can alter a property, while DDD-styled classes have private setters, which means you must use a constructor or a method to create or update properties. DDD-styled classes therefore “lock down” any changes, so no one can bypass the create/update code in the class (see my article “Creating Domain-Driven Design entity classes with Entity Framework Core” for more on this).

Yes, DDD-styled database classes do take some getting used to, but they give you an unparalleled level of control over creates/updates, including altering not only properties but relationships as well. EfCore.GenericServices works with DDD-styled EF Core classes by finding constructors/methods whose parameter names/types match the DTO (see the GenericServices DDD docs here).
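As a minimal sketch (simplified and illustrative, not the full classes from my articles), a DDD-styled Book for the AddPromotion feature might look like this – GenericServices would match the AddPromotionDto’s non-read-only properties to the method’s parameters by name/type:

public class Book
{
    public int BookId { get; private set; }
    public string Title { get; private set; }
    public decimal OrgPrice { get; private set; }
    public decimal ActualPrice { get; private set; }
    public string PromotionalText { get; private set; }

    //Private setters mean the only way to add a promotion is via
    //this method, so the checks here cannot be bypassed
    public void AddPromotion(decimal actualPrice, string promotionalText)
    {
        if (string.IsNullOrEmpty(promotionalText))
            throw new InvalidOperationException(
                "You must provide promotional text.");
        ActualPrice = actualPrice;
        PromotionalText = promotionalText;
    }
}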

b. Don’t try to use GenericServices for business logic database accesses

When I think about database accesses in an application I separate them into two types:

  • CRUD database accesses done by the front-end, e.g. read this, update that, delete the other.
  • Business logic database accesses, e.g. create an order, calculate the price, update the stock status.

The two types of accesses are often different – front-end CRUD is about simple and efficient database accesses, while business logic database accesses are about rules and processes. GenericServices is designed for front-end CRUD database accesses and won’t do a good job for business logic database accesses – I use my GenericBizRunner library for that.

Sure, it can get hazy as to whether a database access is a simple CRUD access or business logic – for instance, is changing the price of an item a simple CRUD update or a piece of business logic? However, there are some actions, like updating the stock status (which can trigger a restocking order), that are clearly business logic and should be handled separately (see my article “Architecture of Business Layer working with Entity Framework (Core and v6)” on how I handle business logic).

There are two things that GenericServices + DDD-styled database classes can’t do:

  1. GenericServices doesn’t support async calls to methods in a DDD-styled database class. I could support it, but I have held off for now. If I feel I need async I use my GenericBizRunner, which has very good async handling throughout.
  2. The constructors/methods in a DDD-styled database class can’t easily have dependency injection added (you could, but you would be pushing the whole DDD pattern a bit too far). You might like to read my article “Three approaches to Domain-Driven Design with Entity Framework Core” and make your own mind up as to whether you want to do that.

c. How to filter, order, page a GenericService read query

The EfCore.GenericServices ReadManyNoTracked<T>() method returns an IQueryable<T> result, which allows you to filter, order and page the data after it has been projected into the DTO. When you add LINQ commands after the ReadManyNoTracked method, EF Core turns them into efficient SQL commands. You then end the query with something like .ToList() or .ToListAsync() to trigger the database access.
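For example (a sketch – BookListDto is an illustrative DTO name, and the LINQ methods need the usual System.Linq / EF Core usings):

var pagedBooks = await _service.ReadManyNoTracked<BookListDto>()
    .Where(b => b.ActualPrice < 10)       //filter -> SQL WHERE
    .OrderBy(b => b.PublishedOn)          //order  -> SQL ORDER BY
    .Skip(pageNum * pageSize)             //page   -> SQL OFFSET
    .Take(pageSize)                       //       -> SQL FETCH NEXT
    .ToListAsync();                       //triggers the database access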

Filtering etc. after the mapping to the DTO normally covers 90% of your query manipulation, but what happens if you need to filter or change the read prior to the projection to the DTO? Then you need ProjectFromEntityToDto<TEntity,TDto>(preDtoLinqQuery) – there is a sketch of it in use after the list below.

The ProjectFromEntityToDto method is useful if you:

  • Want to filter on entity properties that aren’t in the DTO.
  • Want to apply the .IgnoreQueryFilters() method to the entity to turn off any Query Filter on the entity, say if you were using a Query Filter for soft delete.
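Here is a sketch of both uses. It assumes the preDtoLinqQuery parameter is a transform applied to the entity query before the projection (check the docs for the exact form):

//Filter on an entity property that isn't in the DTO (here Publisher)
var dtos = _service.ProjectFromEntityToDto<Book, BookListDto>(
        books => books.Where(b => b.Publisher == "Manning"))
    .ToList();

//Turn off a soft-delete Query Filter before the projection
var allDtos = _service.ProjectFromEntityToDto<Book, BookListDto>(
        books => books.IgnoreQueryFilters())
    .ToList();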

NOTE: If you are using Query Filters then all the EfCore.GenericServices methods obey the query filter, apart from the DeleteWithActionAndSave method. That method turns OFF any query filters so that you can delete anything – you should provide an action that checks the user is allowed to delete the specific entry.

d. Helper library for using GenericServices with ASP.NET Core Web API

I have used ASP.NET a lot over the years and I have developed several patterns for handling GenericServices (and GenericBizRunner), especially around Web APIs. I have now packaged these patterns into a companion library called EfCore.GenericServices.AspNetCore.

For ASP.NET Core MVC and Razor Pages, EfCore.GenericServices.AspNetCore has a CopyErrorsToModelState extension method that copies GenericServices’ status errors into ASP.NET Core’s ModelState so they become validation errors.
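A typical MVC POST action might use it like this (a sketch – check the library’s docs for the exact overloads):

[HttpPost]
[ValidateAntiForgeryToken]
public IActionResult AddPromotion(AddPromotionDto dto,
    [FromServices] ICrudServices service)
{
    service.UpdateAndSave(dto);
    if (service.IsValid)
        return RedirectToAction("Index");

    //Copy the GenericServices errors into ModelState so the page
    //redisplays with the errors shown against the inputs
    service.CopyErrorsToModelState(ModelState, dto);
    return View(dto);
}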

The features for Web API are quite comprehensive.

  • GenericServices supports JSON Patch for updates – see my article “Pragmatic Domain-Driven Design: supporting JSON Patch in Entity Framework Core” for full details of this feature.
  • For Web API it can turn GenericServices’ status into the correct response type, with HTTP code, success/errors parts and any result to send. This makes for very short Web API methods with a clearly defined output type for Swagger – see the example below.
[HttpGet("{id}")]
public async Task<ActionResult<WebApiMessageAndResult<TodoItem>>> 
    GetAsync(int id, [FromServices]ICrudServicesAsync service)
{
    return service.Response(
        await service.ReadSingleAsync<TodoItem>(id));
}

For more on EfCore.GenericServices and ASP.NET Core Web APIs have a look at my article “How to write good, testable ASP.NET Core Web API code quickly”.

e. How to unit test your GenericServices code

I’m a big fan of unit testing, but I also want to write my tests quickly. I have therefore built methods to help unit test code that uses EfCore.GenericServices. I also have a whole library called EfCore.TestSupport to help with unit testing any code that uses EF Core.

EfCore.GenericServices has a number of methods that will set up the data that GenericServices would normally get via dependency injection (DI) – see the SetupSingleDtoAndEntities call in the code below for one such method. Other methods, like SqliteInMemory.CreateOptions, come from my EfCore.TestSupport library.

[Fact]
public void TestProjectBookTitleSingleOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<EfCoreContext>();
    using (var context = new EfCoreContext(options))
    {
        context.Database.EnsureCreated();
        context.SeedDatabaseFourBooks();

        var utData = context.SetupSingleDtoAndEntities<BookTitle>();
        var service = new CrudServices(context, utData.Wrapped);

        //ATTEMPT
        var dto = service.ReadSingle<BookTitle>(1);

        //VERIFY
        service.IsValid.ShouldBeTrue(service.GetAllErrors());
        dto.BookId.ShouldEqual(1);
        dto.Title.ShouldEqual("Refactoring");
    }
}

I also added a ResponseDecoders class containing a number of extension methods to my EfCore.GenericServices.AspNetCore library that turn a Web API response created by that library back into its component parts. This makes testing Web API methods simpler. This link to a set of unit tests gives you an idea of how you could use the extension methods in integration testing.

Also see the unit testing section of my article “How to write good, testable ASP.NET Core Web API code quickly”.

Conclusion

I hope this article helps people to get the best out of my EfCore.GenericServices library and associated libraries like EfCore.GenericServices.AspNetCore and EfCore.GenericBizRunner. All these libraries were built to make me faster at developing applications, and also to remove some of the tedious coding so I can get on with coding the parts that need real thought.

The important section is “What EfCore.GenericServices can/cannot handle”, which tells you what the library can and cannot do. Also note my comments on the difference between front-end CRUD (GenericServices) and business logic (GenericBizRunner). If you stay in the “sweet spot” of each of these libraries, then they will work well for you. But don’t be afraid to abandon either library and write your own code if it’s easier or clearer – pick the approach that is clear, but fast to develop.

I also hope the tips and techniques will alert you to extra parts of the EfCore.GenericServices library that you might not know about. I have used my libraries on many projects and learnt a lot. The list covers some things I learnt to look out for, plus links to other libraries/techniques that help me be a fast developer.

Happy coding.

Decoding Entity Framework Core logs into runnable SQL

Created: March 20, 2019

This isn’t one of my long articles, just a bit of fun I had trying to convert Entity Framework Core’s (EF Core) CommandExecuted logs back into real SQL. EF Core’s logging is much improved over EF6.x and it returns very readable SQL (see the example below).

var id = 1;
var book = context.Books.Single(x => x.BookId == id);

Produces this log output

Executed DbCommand (1ms) [Parameters=[@__id_0='1'],
    CommandType='Text', CommandTimeout='30']
SELECT TOP(2) [p].[BookId], [p].[Description], [p].[ImageUrl]
    , [p].[Price], [p].[PublishedOn], [p].[Publisher]
    , [p].[SoftDeleted], [p].[Title]
FROM [Books] AS [p]
WHERE ([p].[SoftDeleted] = 0) AND ([p].[BookId] = @__id_0)

Now, I spend quite a bit of time understanding and performance tuning EF Core code, so it’s very useful if I can copy & paste the SQL into something like Microsoft’s SQL Server Management Studio (SSMS) to see how it performs. The problem is I have to hand-edit the SQL to replace any parameters with their actual values (see @__id_0 in the last code).

So, in my spare time (??), I decided to try to create some code that would automatically replace the parameter references with the actual values. It turns out it’s quite difficult and you can’t quite get everything right, but it’s good enough to help in lots of places. Here is the story of how I added this feature to my EfCore.TestSupport library.

The steps to building my DecodeMessage method

The steps I needed to do were:

  1. Capture EF Core’s logging output
  2. Turn on EnableSensitiveDataLogging
  3. Catch any EF Core CommandExecuted logs
  4. Decode the Parameters
  5. Replace any property references in the SQL with the ‘correct’ parameter value

Now I did say I was going to keep this article short so I’m going to give you some code that handles the first three parts. You can see the EnableSensitiveDataLogging method near the end of building the options.

var logs = new List<LogOutput>();
var options = new DbContextOptionsBuilder<BookContext>()
    .UseLoggerFactory(new LoggerFactory(new[] 
          { new MyLoggerProviderActionOut(l => logs.Add(l))}))
    .UseSqlite(connection)
    .EnableSensitiveDataLogging()
    .Options;
using (var context = new BookContext(options)) 
{
    //… now start using context

NOTE: Sensitive data logging is fine in your unit tests, but you should NOT have sensitive data logging turned on in production. Logging the actual data used is a security risk and could break user privacy rules like GDPR.
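With all the logs captured in the list, picking out the CommandExecuted logs (step 3) is just a filter. Here is a sketch – it assumes the LogOutput class exposes the log’s EventId and Message (and needs Microsoft.EntityFrameworkCore.Diagnostics for RelationalEventId):

//RelationalEventId.CommandExecuted identifies the SQL logs we want
var sqlCommandLogs = logs
    .Where(l => l.EventId.Id == RelationalEventId.CommandExecuted.Id)
    .Select(l => l.Message)
    .ToList();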

In fact, I have methods in my EfCore.TestSupport library that handle building the options and turning on sensitive logging, plus a load of other things. Here is an example of one helper that creates in-memory database options, with logging.

var logs = new List<LogOutput>();
var options = SqliteInMemory.CreateOptionsWithLogging
     <BookContext>(log => logs.Add(log));
using (var context = new BookContext(options)) 
{
    //… now start using context

The EfCore.TestSupport library has another version of this that works with SQL Server. It creates a unique database name per class, or per method, because xUnit (the favourite unit test framework for .NET Core) runs each test class in parallel.
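For example (these are the helper names as I recall them – check the EfCore.TestSupport docs for the exact forms):

//One SQL Server database per test class (xUnit runs classes in parallel)
var options = this.CreateUniqueClassOptions<BookContext>();

//Or one database per test method, if the tests within a class clash
//var options = this.CreateUniqueMethodOptions<BookContext>();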

NOTE: The EfCore.TestSupport library uses a logging provider that calls an action method for every log. This makes it easy to write logs to the console, or capture them into a list.

Decoding the Parameters

Having captured EF Core’s logs, I now need to decode the first line, which holds the parameters. There are a few permutations, but it’s clear that Regex is the way to go. The problem is I’m not an expert on Regex, but LinqPad came to my rescue!

LinqPad 5.36 has a very nice Regex tool – the best I have found so far. Here is a screenshot of its regex feature, which is called up via Ctrl+Shift+F1.

WARNING: It’s a great tool, but I thought that if I saved the code it would keep the pattern I had created – it doesn’t. I spent hours getting the regex right and then lost it when I entered something else. It’s all OK now, but be warned.

All my trials ended up with the following Regex code:

new Regex(@"(@p\d+|@__\w*?_\d+)='(.*?)'(\s\(\w*?\s=\s\w*\))*(?:,\s|\]).*?");

If you don’t know regex then it won’t mean anything to you, but it does the job of finding a) the param name, b) the param value, and c) extra information on the parameter (like its size). You can see the whole decode code here.
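To show the regex doing its job, here it is run over the parameter part of the earlier log, with group 1 holding the parameter name and group 2 its value (needs using System.Text.RegularExpressions):

var paramPart = "[Parameters=[@__id_0='1'], CommandType='Text', CommandTimeout='30']";
var regex = new Regex(@"(@p\d+|@__\w*?_\d+)='(.*?)'(\s\(\w*?\s=\s\w*\))*(?:,\s|\]).*?");
foreach (Match match in regex.Matches(paramPart))
{
    var name = match.Groups[1].Value;    //@__id_0
    var value = match.Groups[2].Value;   //1
    //step 5 then replaces the name in the SQL with the quoted value
    Console.WriteLine($"{name} -> '{value}'");
}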

Limitations of the decoding

It turns out that EF Core’s logged data doesn’t quite give you all you need to perfectly decode the log back to correct SQL. Here are the limitations I found:

  1. You can’t distinguish between an empty string and a null string; both are represented by ''. I decided to make '' return NULL.
  2. You can’t work out if it’s a byte[] or not, so byte[] is treated as a SQL string. This will FAIL in SQL Server.
  3. You can’t tell if something is a Guid, DateTime etc., which in SQL Server need quotes around them. In the end I wrapped most things in quotes, including numbers. SQL Server accepts numbers as strings (but other databases won’t).

Example of a different decoded SQL

If we go back to the book lookup at the start of this article, then the decoded result is shown below.

SELECT TOP(2) [p].[BookId], [p].[Description]
   , [p].[ImageUrl], [p].[Price], [p].[PublishedOn]
   , [p].[Publisher], [p].[SoftDeleted], [p].[Title]
FROM [Books] AS [p]
WHERE ([p].[SoftDeleted] = 0) AND ([p].[BookId] = '1') 

As you can see on the last line, the integer is represented as a string. This isn’t the normal way to do this, but it works in SQL Server. I took the decision to wrap anything I didn’t think was a string in quotes, because that is what is needed to make other types, such as GUIDs, DateTimes etc., work.

My really complex test contained lots of different .NET types, and here is the output.

SET NOCOUNT ON;
INSERT INTO [AllTypesEntities] ([MyAnsiNonNullString]
    , [MyBool], [MyBoolNullable], [MyByteArray], [MyDateTime]
    , [MyDateTimeNullable], [MyDateTimeOffset], [MyDecimal]
    , [MyDouble], [MyGuid], [MyGuidNullable], [MyInt]
    , [MyIntNullable], [MyString], [MyStringEmptyString]
    , [MyStringNull], [MyTimeSpan])
VALUES ('ascii only', 1, NULL, '0x010203', '2000-01-02T00:00:00',
NULL, '2004-05-06T00:00:00.0000000+01:00', '3456.789', '5678.9012', 
'ba65d636-65d4-4c07-8ddc-50c615cef539', NULL, '1234', NULL, 
'string with '' in it', NULL, NULL, '04:05:06');
SELECT [Id]
FROM [AllTypesEntities]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity(); 

In this complex version the parts that fail are:

  1. The MyByteArray value has quotes around it and FAILS – taking off the string delimiters fixes that.
  2. The MyStringEmptyString is set to NULL instead of an empty string.

Not perfect, but quite usable.

How can you access this code?

If you just want to use this feature, it’s built into the latest EfCore.TestSupport NuGet package (1.7.0 to be precise). It’s built into the LogOutput class, which is used by the loggers in this library. There are methods that create options for SQLite (in-memory) and SQL Server databases that allow logging. There are plenty of examples of these in the library – have a look at the unit tests in the TestEfLoggingDecodeBookContext class.

If you want to play with the code yourself, then take a copy of the EfCoreLogDecoder class, which contains the decode parts.

Conclusion

Well, it was a bit of fun – maybe not something I would do on a job, but still a useful tool. I was a bit disappointed I couldn’t decode the log completely, but what it does is still useful to me. Maybe you will find it useful too.

Now I need to get back to my real work for my clients. See you on the other side!

Happy coding.