How to safely apply an EF Core migrate on ASP.NET Core startup

Last Updated: December 1, 2021 | Created: December 1, 2021

There are many ways to migrate a database using EF Core, and one of the most automatic approaches is to call EF Core’s MigrateAsync method on startup of your ASP.NET Core application – that way you don’t forget it. But the “migrate on startup” approach has a big problem if your ASP.NET Core application is running multiple instances of your app (known as Scale Out in Azure and Horizontal Scaling in Amazon). That’s because trying to apply multiple migrations at the same time doesn’t work – either it will fail or, worse, it might corrupt the data in your database.

This article describes a library designed to update global resources, e.g. a database, on startup of an application that runs multiple instances. Migration is one thing it can do, but as you will see there are other uses, such as adding an admin user when you first deploy your application.

This open-source library is available as the NuGet package called Net.RunMethodsSequentially and the code is available on https://github.com/JonPSmith/RunStartupMethodsSequentially. I use the shorter name of RunSequentially to refer to this library.

TL;DR; – Summary of this article

  • The RunSequentially library manages the execution of the code you run on the startup of your ASP.NET Core application. You only need this library when you have multiple instances running on your web host, because it makes sure that your startup code, which runs in every instance, isn’t run at the same time.
  • The library takes a lock on a global resource (i.e. a resource all the app instances can access) so that your startup code is executed serially, and never in parallel. This means you can migrate and/or seed your database without nasty things happening, e.g. running multiple EF Core Migrations at the same time could (will!) cause problems.
  • But be aware, each instance of your application will run your startup code; the library just guarantees they won’t run at the same time. That means if you are seeding a database you need to check the data hasn’t already been added.
  • To use the RunSequentially library you have to do three things:
    • You write the code you want to run within a global lock – these are referred to as startup services.
    • Select a global resource you can lock on – a database is a good fit, but if it’s not yet created you can fall back to locking on a FileSystem Directory, like ASP.NET Core’s wwwRoot directory.
    • You add code to the ASP.NET Core configuration code in the Program class (net6) to register and configure the RunSequentially setup.
  • Because the RunSequentially library uses ASP.NET Core’s HostedService, it is run before the main host starts, which is perfect. But HostedServices don’t give good feedback if you have an exception in the RunSequentially code. To help with this I have included a tester class in the library which allows you to check your code / configuration works before you publish your application to production.

Setting the scene – why did I create this library?

Back in December 2018 I wrote an article called “A better way to handle authorization in ASP.NET Core” that described several additions I added to ASP.NET Core to provide management of users in a large SaaS (Software as a Service) I created for a client. This article is still the top article on my blog three years on, and many people have used this approach.

Many developers asked me to create a library containing the “better ways” features, but there were a few parts that I didn’t know how to do in a generic library that anyone could use. One of those parts was adding setup data to the database when multiple instances of ASP.NET Core were running. But in March 2021 GitHub @zejji provided a solution I hadn’t known about, using the DistributedLock library.

The DistributedLock library made it possible to create a “better ways” library, which is called AuthPermissions.AspNetCore (shortened to AuthP). The AuthP library is very complex, so I released version 1 in August 2021, which only works with single instances of ASP.NET Core, knowing I had a plan to handle multiple instances. I am currently working on version 2, which uses the RunSequentially library.

Rather than put the RunSequentially code directly into the AuthP library I gave it its own library, which means other developers can use the RunSequentially library to apply the “migrate on startup” approach in applications that have multiple instances.

How the RunSequentially library works, and what to watch out for

The library allows you to create services, which I refer to as startup services, that are run sequentially on startup of your application. These startup services are run within a DistributedLock global lock, which means your startup services can’t all run at the same time but are run sequentially. This stops the problem of multiple instances of the application trying to update one common resource at the same time.

But be aware, every startup service will be run on every application instance; for example, if your application is running three instances then your startup service will be run three times. This means your startup services should check if the database has already been updated, e.g. if your service adds an admin user to the authentication database it should first check that the admin user hasn’t already been added (NOTE: EF Core’s Migrate method checks if the database needs to be updated, which stops your database being migrated multiple times).
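
To make the idempotency point concrete, below is a minimal sketch (not code from the library) of a startup service that only adds the admin user if it isn’t already there. It assumes ASP.NET Core Identity with the built-in IdentityUser, and the email and password values are placeholders.

public class AddAdminUserService : IStartupServiceToRunSequentially
{
    public int OrderNum { get; }

    public async ValueTask ApplyYourChangeAsync(
        IServiceProvider scopedServices)
    {
        var userManager = scopedServices
            .GetRequiredService<UserManager<IdentityUser>>();

        //Idempotency check: another instance (or an earlier deploy)
        //may have already added the admin user
        if (await userManager.FindByEmailAsync("admin@example.com") != null)
            return;

        var adminUser = new IdentityUser
        {
            UserName = "admin@example.com",
            Email = "admin@example.com",
            EmailConfirmed = true
        };
        await userManager.CreateAsync(adminUser, "Placeholder-Password1!");
    }
}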

Another warning is that you must NOT apply a database migration that changes the database such that the old version of the application’s software won’t work (these types of migrations are known as a breaking change) to an application which is running multiple instances. That’s because Azure etc. replace each of the running instances one by one, which means a migration with breaking changes applied by the first instance will “break” the other running instances that haven’t yet had their software updated.

This breaking change issue isn’t specific to this library, but because this library is all about running applications that have multiple instances, you MUST ensure you are not applying a migration that has a breaking change (see the five-stage app update in this article for how you should handle breaking changes to your database).

Using the RunSequentially library in your ASP.NET Core application

This breaks down into three stages:

  1. Adding the RunSequentially library to your application
  2. Create your startup services, e.g. one to migrate your database and another to add an admin user.
  3. Use the RegisterRunMethodsSequentially extension method to register the library with your dependency injection provider, where you:
    1. Select what global resource(s) you want to lock on.
    2. Register the startup service(s) you want run on startup.

Let’s look at these in turn.

1. Adding the RunSequentially library to your application

This is simple: you add the NuGet package called Net.RunMethodsSequentially to your application. This library targets the Net6 framework.

2. Create your startup services

You need to create what the RunSequentially library calls startup services, which contain the code you want to apply to a global resource. Typically, that’s a database, but it could be a common file store, Azure blob etc.

To create a startup service that will be run while in a lock you need to create a class that implements the IStartupServiceToRunSequentially interface. This interface defines a method, ValueTask ApplyYourChangeAsync(IServiceProvider scopedServices), which is where you put your code to update a shared resource. The example code below is a RunSequentially startup service, i.e. it implements the IStartupServiceToRunSequentially interface, and shows how you would migrate a database using EF Core’s MigrateAsync method.

public class MigrateDbContextService : 
    IStartupServiceToRunSequentially
{
    public int OrderNum { get; }

    public async ValueTask ApplyYourChangeAsync(
        IServiceProvider scopedServices)
    {
        var context = scopedServices
            .GetRequiredService<TestDbContext>();

        //NOTE: The Migrate method will only update
        //the database if there are any new migrations to apply
        await context.Database.MigrateAsync();
    }
}

The ApplyYourChangeAsync method has a parameter holding a scoped service provider, which is a copy of the normal services used in the main application. From this you can get the services you need to apply your changes to the shared resource, in this case a database.

NOTE: You can use the normal constructor DI injection approach, but as a scoped service provider is already set up to run the RunSequentially code you might like to use that service provider instead.

The OrderNum property is a way to define the order in which your startup services are run. If startup services have the same OrderNum value (default == zero), the startup services will be run in the order they were registered.

This means in most cases you just register them in the order you want them to run, but in complex situations the OrderNum value can be useful to define the exact order your startup services are run in, via an OrderBy(service => service.OrderNum) inside the library. For example, in my AuthP library some startup services are optionally registered by the library and other startup services are provided by the developer: in this case the OrderNum value is used to make sure the startup services are run in the correct order.
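
As a simple illustration (a sketch with a made-up class name, not code from the library), the startup service below uses OrderNum to make sure it runs after any services left at the default value of zero:

public class SeedShopStockService : IStartupServiceToRunSequentially
{
    //Services with lower OrderNum values run first; services with
    //equal values run in the order they were registered
    public int OrderNum => 10;

    public ValueTask ApplyYourChangeAsync(IServiceProvider scopedServices)
    {
        //... your seeding code goes here, remembering it must be
        //safe to run once per application instance
        return ValueTask.CompletedTask;
    }
}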

3. Register the RegisterRunMethodsSequentially and your startup services

You need to register the RunSequentially code and your startup services with your dependency injection (DI) provider. For ASP.NET Core this configuration of the application happens in the Program class (net6) (previously in the Startup class before net6). This has three parts:

  1. Selecting/registering the global locks.
  2. Registering the RunSequentially code with your DI provider
  3. Register the startup services

Before I describe each of these parts, the code in section 3.1 below shows the whole registration process; it is based on the Program class of a test ASP.NET Core project in the RunSequentially repo.

And now the detail of the three parts.

3.1 Selecting/registering the global locks

The RunSequentially library relies on creating a lock on a resource that all the web application’s instances can access, and it uses the DistributedLock library to manage these global locks. Typical global resources are databases (DistributedLock supports 6 database types), but the DistributedLock library also supports a FileSystem Directory and Windows WaitHandler.

However, the RunSequentially library only natively supports the three main global lock types, known as TryLockVersions, and a no locking version, which is useful in testing:

  • SQL Server database, AddSqlServerLockAndRunMethods(string connectionString)
  • PostgreSQL database, AddPostgreSqlLockAndRunMethods(string connectionString)
  • FileSystem Directory, AddFileSystemLockAndRunMethods(string directoryFilePath)
  • No Locking (useful for testing): AddRunMethodsWithoutLock()

NOTE: If you want to use another lock type from the DistributedLock library you can easily create your own RunSequentially TryLockVersion. They aren’t that hard to write, and you have three versions in the RunSequentially library to copy from.

GitHub @zejji pointed out there is a problem if the database isn’t created yet, so I created a two-step approach to gaining a lock:

  1. Check if the resource exists. If it doesn’t exist, then try the next TryLockVersion
  2. If the resource exists, then lock that resource and run the startup services

This explains why I use two TryLockVersions in my setup: the first tries to access a database (which is normally already created), but if the database doesn’t exist, then it uses the FileSystem Directory lock (NOTE: the FileSystem Directory lock can be slower than a database lock – see this comment in the FileSystem docs – so I use the database lock first).

So, putting this all together the code in your Program class would look something like this.

var builder = WebApplication.CreateBuilder(args);

// … other parts of the configuration left out.

var connectionString = builder.Configuration
     .GetConnectionString("DefaultConnection");
var lockFolder = builder.Environment.WebRootPath;

builder.Services.RegisterRunMethodsSequentially(options =>
    {
        options.AddSqlServerLockAndRunMethods(connectionString);
        options.AddFileSystemLockAndRunMethods(lockFolder);
    })…
// … other parts of the configuration left out.

The key parts of this code are:

  • Getting the connection string and the lock folder: these lines get the connection string for the database, and the FilePath of the application’s wwwroot directory.
  • The two Add…LockAndRunMethods calls: these register two TryLockVersions. The first will try to take a global lock on the database, but if the database hasn’t been created yet it will try to take a global lock on the wwwroot directory.

3.2 Registering the RunSequentially code with your DI provider

The RunSequentially library needs to register the GetLockAndThenRunServices class as an ASP.NET Core IHostedService, which is run after the configuration, but before the main web host is run (see Andrew Lock’s useful article that explains all about IHostedService). This is done via the RegisterRunMethodsSequentially extension method.

NOTE: For non-ASP.NET Core applications, and for unit testing, you can set the RegisterAsHostedService property in the options to false. This will register the GetLockAndThenRunServices class as a normal (transient) service.
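
For example, a console app or unit test might configure the library like this (a sketch based on the RegisterAsHostedService option described above; the lock folder path is a placeholder):

var services = new ServiceCollection();
var lockFolderPath = "C:\\temp\\shared-lock-folder"; //any directory all instances can see

services.RegisterRunMethodsSequentially(options =>
    {
        //false registers GetLockAndThenRunServices as a normal (transient)
        //service instead of an IHostedService
        options.RegisterAsHostedService = false;
        options.AddFileSystemLockAndRunMethods(lockFolderPath);
    })
    .RegisterServiceToRunInJob<MigrateDbContextService>();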

The code shown in the last section shows that the RegisterRunMethodsSequentially extension method takes in the builder.Services so that it can register itself and the other parts of the RunSequentially library.

3.3 Register the startup services

The final step is to register your startup services, which you do using the RegisterServiceToRunInJob<YourStartupServiceClass> method. There are a couple of rules designed to catch error situations. The library will throw a RunSequentiallyException if:

  • You didn’t register any startup services
  • If you registered the same class multiple times

As explained in the 2. Create your startup services section, by default your startup services are run in the order they are registered. For instance, you should register the startup service that migrates your database first, followed by any startup service that accesses that database.

Using the “run startup services in the order they were registered” approach works in most cases, but in my AuthP library I needed a way to define the run order, which is why in version 1.2.0 I added the OrderNum property described in the 2. Create your startup services section.
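
Putting that together, a typical registration in the Program class might look like the sketch below, which uses the MigrateDbContextService from section 2 plus a hypothetical seeding service:

builder.Services.RegisterRunMethodsSequentially(options =>
    {
        options.AddSqlServerLockAndRunMethods(connectionString);
        options.AddFileSystemLockAndRunMethods(lockFolder);
    })
    //registration order defines the run order (when OrderNum values are equal),
    //so migrate the database before any service that reads or writes it
    .RegisterServiceToRunInJob<MigrateDbContextService>()
    .RegisterServiceToRunInJob<SeedShopStockService>();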

Checking that your use of RunSequentially will work

I have automated tests for the RunSequentially library, but before I revealed this library to the world, I wanted to be absolutely sure that an ASP.NET Core application using the RunSequentially library worked on Azure when scaled out to multiple instances. I therefore created a test ASP.NET Core application where I could try this out on Azure.

NOTE: You can try this out yourself by cloning the RunStartupMethodsSequentially repo and running the WebSiteRunSequentially ASP.NET Core project, either on Azure with scale out, or on your own development PC using the approaches shown in this stack overflow answer.

The WebSiteRunSequentially web app didn’t work at first because I hadn’t set the Azure Database Firewall properly, which caused an exception. The library relies on ASP.NET Core’s IHostedService to run things before the main host starts, but if you have an exception you don’t get any useful feedback! So, I decided to add a tester that allows you to copy the RunSequentially code in your Program class and run it in an automated test.

The code below is from one of my xUnit test projects, where I copied the setup code from the WebSiteRunSequentially Program class and put it in a test. I did have to make a few changes to get a connection string, and the tester class provides a local directory instead of the ASP.NET Core wwwRoot directory. Other than those two changes it’s identical to the Program class code.

[Fact]
public async Task ExampleTester()
{
    //SETUP
    var dbOptions = this.CreateUniqueClassOptions<WebSiteDbContext>();
    using var context = new WebSiteDbContext(dbOptions);
    context.Database.EnsureClean();

    var builder = new RegisterRunMethodsSequentiallyTester();

    //ATTEMPT
    //Copy your setup in your Program here 
    //---------------------------------------------------------------
    var connectionString = context.Database.GetConnectionString();  //CHANGED
    var lockFolder = builder.LockFolderPath;                        //CHANGED

    builder.Services.AddDbContext<WebSiteDbContext>(options =>
        options.UseSqlServer(connectionString));

    builder.Services.RegisterRunMethodsSequentially(options =>
    {
        options.AddSqlServerLockAndRunMethods(connectionString);
        options.AddFileSystemLockAndRunMethods(lockFolder);
    })
        .RegisterServiceToRunInJob<StartupServiceEnsureCreated>()
        .RegisterServiceToRunInJob<StartupServiceSeedDatabase>();
    //----------------------------------------------------------------

    //VERIFY
    await builder.RunHostStartupCodeAsync();
    context.CommonNameDateTimes.Single().DateTimeUtc
        .ShouldBeInRange(DateTime.UtcNow.AddSeconds(-1), DateTime.UtcNow);
}

The RegisterRunMethodsSequentiallyTester class is in the RunSequentially library so it’s easy to access. But be aware: while the code is the same, you might not be using the same database type as your production system, so this type of test might still miss a problem.

If you want a much more complex test, then look at the last test in the AuthP’s TestExamplesStartup test class. This shows you might have a bit of work to do to register the other parts of the system to make it work.

NOTE: You might find my EfCore.TestSupport library helpful as it has methods to set up test databases etc. That’s what I used at the start of the test shown above.

Having fixed the Firewall, I published the WebSiteRunSequentially web app to an Azure App Service plan with scale out manually set to three instances of the application. Its job is to update a single common entity and also add logs of what was done to another entity.

I restarted the Azure App Service, as that seems to reset all the running instances, and the logs tell you what happened:

  1. The first instance of the application runs the “update startup service” and finds that the common entity was last updated a while ago (I set a threshold of 5 minutes), so it assumes it’s the first to run in the set of three and sets the Common entity’s Stage to 1.
  2. When the second instance of the application is allowed to start up it runs the same “update startup service”, but now it finds that the common entity was updated four seconds ago, so it assumes that another instance has just updated it and increments the Stage by 1, which in this case makes Stage equal 2.
  3. When the third (and final) instance of the application starts up, the “update startup service” finds that the common entity was updated about 2 minutes ago, so it too assumes that another instance has just updated it and increments the Stage by 1, which in this case makes Stage equal 3.

That process and its results showed me that the RunSequentially library works as expected, and at the same time I learnt a few things to watch out for. For instance, I hadn’t set the Azure Database Firewall properly, which caused an exception before the ASP.NET Core host was up. This meant I only got a generic “it’s broken” message with no information, which is why I added the tester class to the RunSequentially library.

Conclusion

The RunSequentially library is designed to run code at startup that won’t interfere with the same code running in other instances of your ASP.NET Core application. The RunSequentially library is small, but it still provides an excellent solution to handling updates of a global resource, with “migrate on startup” being one of the main uses. It’s also quite quick: the SQL Server check-and-lock code only took 1 ms on a local SQL Server database.

I’m also using the RunSequentially library in my AuthPermissions.AspNetCore library in a quite complex setup. This can have something like six different startup services available for migrating / seeding various parts of the database: some registered by the AuthP code and some added by the developer. Making that work properly and efficiently required a couple of iterations of the RunSequentially library (currently at version 1.3.0) and only now am I recommending RunSequentially for real use, although there have been nearly 500 downloads of the NuGet package already.

I’m interested in what else other developers might use this library for. I have an idea of managing ASP.NET Core’s BackgroundServices across applications with multiple instances. At the moment the same BackgroundService will run in each instance, which is either wasteful or actually causes problems. I could envision a system using a unique GUID in each instance of the application and a database entity like the one in my test ASP.NET Core app to make sure certain BackgroundServices would only run in one of the application’s instances. This would mean I don’t have to fiddle with WebJobs anymore, which would be nice 😊.

If you come up with a use for this library that is outside the migrate / seed situations, then please leave a comment on this article. I, and other readers, might find that useful.

Updating your ASP.NET Core / EF Core application to NET 6

Last Updated: November 17, 2021 | Created: November 17, 2021

With the release of NET 6 I wanted to update the code in the repo called EfCoreinAction-SecondEdition which contains the code that goes with my book “Entity Framework Core in Action 2nd edition”. This code is based on ASP.NET Core 5 and EF Core 5 and this article describes what I had to do to update the NET 5 version of the code to NET 6, plus a look at any performance improvements.

TL;DR; – Summary of this article

  • I updated a non-trivial ASP.NET Core / EF Core application (22 projects) from EF Core 5 to EF Core 6. Overall, it was pretty easy and only took a few days.
  • You have to use Visual Studio 2022 (or VSCode) – Visual Studio 2019 doesn’t work
  • Any project using ASP.NET Core / EF Core NuGet packages has to have a target framework of net6.0.
  • My ASP.NET Core MVC updated with no changes to the code.
  • The EF Core parts, which are complex, had two compile errors in unusual parts of EF Core and a few runtime errors, mainly around the Cosmos DB changes.
  • I didn’t find any performance improvements, but that’s because my queries were SQL database bound, taking between 10 ms. and 800 ms. to get a result.

Setting the scene – what type of app did I update?

In part 3 of my book, I build a complex ASP.NET Core / EF Core application aimed at testing the performance of EF Core. This code is in the Part3 branch and contains 22 projects that use quite a few of the more intricate EF Core features, such as Global Query Filters, user-defined functions, table splitting, transactions and so on. This means any EF Core breaking changes are likely to be seen in this application. My aim was to:

  1. Convert my book selling web site to NET 6 and get it to compile.
  2. Make the 200-ish xUnit tests run successfully
  3. To make the application run
  4. Test the performance of the updated application

NOTE: I did NOT plan to change the code to use new NET 6 features, such as ASP.NET Core’s simplified Program class or EF Core features such as the new SQL Server Temporal Tables support or improved Cosmos DB support. That’s for another day.

The resulting code can be found in the Part3-Net6 branch, and you can look at all the changes by comparing Part3 branch with Part3-Net6 branch.

The list of things I had to do in order

Overall, the update from EF Core 5 / ASP.NET Core 5 was fairly easy. I didn’t have to change any of the ASP.NET Core code, other than changing its project’s target framework to net6.0 and updating all the NuGet packages. I would say getting the right order for the first four steps wasn’t obvious (old articles said I could use Visual Studio 2019, but you can’t).

Here is a step-by-step list, with links to each part:

  1. Download Net6 SDK and Visual Studio 2022
  2. Update projects that have EF Core / ASP.NET Core packages in them
  3. Update your NuGet packages to version 6
  4. Fix any compile errors caused by the version 6 update
  5. Run your unit tests and fix any changes
  6. Run the application and check it works
  7. Did performance improve?

1. Download Net6 SDK and Visual Studio 2022

To start, you need to download and install the correct NET 6 SDK for your development machine.

You also need to download Visual Studio 2022, as Visual Studio 2019 doesn’t support Net 6. Then run Visual Studio 2022 and open your application. Visual Studio 2022 does support Visual Studio 2019 solutions so you can run any tests or check your application before you start the upgrade.

TIP: Make a new branch for the changes, as it’s useful to be able to check back to the older version if something doesn’t work.

2. Update projects that have EF Core / ASP.NET Core packages in them

To update EF Core to version 6 requires the projects that use EF Core / ASP.NET Core NuGet packages to have a target framework of net6.0. ASP.NET Core tends to be in one single project, but EF Core is likely to be in multiple projects.

Before EF Core 6 came out you could use netstandard2.1 for EF Core 5, or netstandard2.0 for EF Core 3.1 or below. But with EF Core 6 you must update the project’s target framework to net6.0 before you try to update to EF Core 6, otherwise the EF Core 6 NuGet updates will fail.

To update a project in Visual Studio you click on the project, which will open the project’s .csproj file. You can then edit the TargetFramework element to net6.0.
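
For example, the edited .csproj would contain something like this (a minimal project file; your own file will have more settings):

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- was netstandard2.1 (or net5.0); must be net6.0 before updating to EF Core 6 -->
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>

</Project>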

Updating projects to net6.0 has ramifications if you are using any kind of layered architecture, because a project using the netstandard2.1 framework can’t have a project reference to a net6.0 project. In my BookApp I was using a clean architecture approach, so the inner Domain projects (called entities in clean architecture) didn’t contain any EF Core and could stay on the netstandard2.1 framework, but other than that I had to change all the other projects to net6.0.

NOTE: This is part of the change over to .NET and going forward you will be using named .NET versions (e.g. net6.0, net7.0) much more in the lower layers of your applications.

3. Update your NuGet packages to version 6

Once your projects are changed to net6.0, you can use “Manage NuGet Packages for Solution” by right-clicking on the “Solution” at the top of the Solution Explorer. When you are in the NuGet Package Manager, select the “Updates” tab, which should provide a list of the NuGet packages already installed in your application, with suggestions for newer NuGet versions.

As well as EF Core upgrades there may be other NuGet packages you use, such as open-source libraries that you find useful. Be a little careful of libraries that use EF Core inside, as some of them might not work. For instance, I have about ten libraries that work with .NET and four of them needed to be updated, but the rest were OK. I didn’t find that out until I tried to use them, so check any code that uses an external library that hasn’t said it works with NET 6 – a lot will work, but some might not.

4. Fix any compile errors caused by the update

You shouldn’t find many breaking changes that cause a compile error. I did have two, but they were in unusual parts of EF Core.

  • One was a change in the IModelCacheKeyFactory signature – I had to add a new parameter (see the sketch just after this list)
  • Another was around the use of EF Core’s FromSqlRaw method. Because my app works with both SQL Server and Cosmos DB database types I got a compile time error CS0121, “The call is ambiguous between the following methods or properties” (see EF Core issue #26503), which I fixed.
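
I haven’t shown my exact fix, but the sketch below gives an idea of the IModelCacheKeyFactory change: EF Core 6 added a designTime parameter to the Create method, so a custom implementation needs the extra overload (the class name and the key it builds are made up).

public class MyModelCacheKeyFactory : IModelCacheKeyFactory
{
    //EF Core 6 added the designTime parameter
    public object Create(DbContext context, bool designTime)
        => (context.GetType(), designTime);

    //the pre-6 overload can simply forward to the new one
    public object Create(DbContext context)
        => Create(context, false);
}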

5. Run your unit tests and fix any changes

When I ran my xUnit tests, out of 200 tests, the majority of which are integration tests using EF Core, I had only a few errors. The biggest was around EF Core 6’s improved handling of Cosmos DB.

  • There was a change to the configuration of collections of owned types, which I needed for my Cosmos DB code. This has changed (for the better) in EF Core 6.
  • The whole handing of Cosmos DB create / update / delete exceptions had been improved, so I had to change that code.
  • I had some problems when using FromSqlRaw method with Cosmos DB.  That’s because there are big improvements to Cosmos DB support in EF Core 6. I fixed the Cosmos DB code, but I didn’t upgrade my code to use the new features because it’s a lot of work. I hope to look at this later.
  • The last test that failed was checking for a bug in EF Core 5 (see EF Core issue #22701), which I was checking for so I would know when it was fixed. I’m glad to see that bug was fixed, and I removed the unit test that was there to alert me to the change.

NOTE: If you want to test code that uses EF Core, I suggest you look at my library EfCore.TestSupport. This has a lot of useful features that speeds up the writing of tests that need access to your application’s DbContext. Also, my book “Entity Framework Core in Action” has a whole chapter about testing code that uses EF Core.

6. Run the application and check it works

xUnit tests are great, but there are lots of things that are hard to test, especially some ASP.NET Core features. The only problem I found was that my analysis code relied on logging data, and ASP.NET Core had changed its name / event code. I didn’t test every minor point, but I did some detailed performance tests and I didn’t find any other problems.

Did performance improve?

EF Core 6 has had a lot of work done to reduce the overheads of converting LINQ to SQL, and the EF Core team know it has improved performance. I was interested to see if it had any effect on the book selling web site that I created in chapters 15 and 16 (see this EF Core Community Standup video where I cover performance tuning).

I re-ran my performance tests with EF Core 6, and I didn’t find any improvements. Thinking about it, I realised my performance tests took between 10 ms. and 800 ms. of database access time, which means any improvements to the code overhead wouldn’t have a noticeable effect. I did try small queries taking 1 to 3 ms. and there was possibly an improvement, but because the logging only works in 1 ms. steps I wasn’t able to be sure.

So, my takeaway is that reducing the overheads of the EF Core code is a good thing, but don’t expect EF Core 6 to suddenly improve your really slow database accesses.

Conclusion

Overall, updating my book selling web app to net6.0 took a few days, which I think is quite good. I did waste time trying to use Visual Studio 2019 to work with net6.0, because old articles said it would support net6.0, which isn’t true – you need Visual Studio 2022. I also found (the hard way) that I had to update the projects to net6.0 before I could load the new net6 NuGet packages.

My integration tests were wonderful, as they pointed out the parts that didn’t work, especially with the large changes/improvements to Cosmos DB. Finding those issues would have been very difficult using manual testing because they are only triggered in error states, which are difficult to create manually.

I’m also really pleased with the improvements such as SQL Server temporal tables, migration bundles, global model configuration and of course the better support for Cosmos DB (see “what’s new in EF Core 6” for the full list). I look forward to using EF Core 6 in my future applications.

Happy coding.

Using PostgreSQL in dev: Part 2 – Testing against a PostgreSQL database

Last Updated: November 17, 2021 | Created: November 10, 2021

This article describes how to test an application that uses Entity Framework Core (shortened to EF Core) to access a PostgreSQL database. I describe various methods that help you to write xUnit tests that have to access a PostgreSQL database. These new PostgreSQL test methods can be found in the new version of my EfCore.TestSupport library (5.1.0), which has PostgreSQL helpers and also works with both EF Core 5 and EF Core 6.

This article is part of the “Using PostgreSQL in dev” series; Part 1 covers installing PostgreSQL on Windows.

TL;DR; – Summary of this article

  • I needed to test some code that uses PostgreSQL databases so added PostgreSQL methods to the new 5.1.0 version of my EfCore.TestSupport library, which also supports EF Core 6.
  • I show what a PostgreSQL connection string looks like and how you can access a PostgreSQL database using EF Core.
  • xUnit runs each test class in parallel, so each test class needs a unique PostgreSQL database. The new PostgreSQL methods handle this for you by adding the test class name on the end of the base database name.
  • It’s easier to write a test if the database is empty, so I show two ways to make the database empty at the start of a test:
    • The standard approach is slow (e.g., took 13 seconds per test).
    • With help from the EF Core team, I added a quicker approach (e.g., took 4.2 seconds per test).

NOTE: As well as adding the PostgreSQL test helpers I also reinstated the “Seed from Production” feature. I removed it in version 5 because I didn’t think anyone was using it, but some people were using it and they requested that I add it back.

Setting the scene – my approach to testing applications using EF Core

I write a lot of EF Core code and to make sure that code is correct I use lots of automated testing. Over the years I have tried different ways to write and test code (mock repositories, EF6’s Effort, EF Core In-Memory), but since EF Core came out I have found that testing against a real database has a lot of pros – they are:

  1. You can’t Mock EF Core’s DbContext, so you have to use an EF Core supported database in your tests.
  2. If you have a good set of database test helpers, you can write tests much quicker.
  3. Using an actual database also checks your EF Core code is correct too.

The definition of a unit test says that it doesn’t include a database, and the correct name for this is integration testing. Use whatever term you like, but the main point is that the tests are run automatically and return pass/fail results. That means you can quickly check that a) the changes you have made to your code work correctly and b) the changes haven’t broken any other parts of your application.

As point 2 says, having a good set of database test helpers can significantly reduce the amount of test code you need to write. That’s why I created a library called EfCore.TestSupport, and in the recent update to support EF Core 6 I also added PostgreSQL helpers.

NOTE: If you have used EfCore.TestSupport before, then the PostgreSQL methods work the same as the SQL Server methods, but the method name contains “PostgreSql”.

How to access a PostgreSQL database

To access a PostgreSQL database, you need a connection string. This string contains the PostgreSQL host, the database name, and the credentials to access that database. The string below shows these four parts for a PostgreSQL server running in the Windows Subsystem for Linux.

"host=127.0.0.1;Database=YourDatabaseName;Username=postgres;Password=XXXXX"

IMPORTANT UPDATE

I had really poor performance when accessing the main postgres database and I had been trying to improve this. With the help of Shay Rojansky on the EF Core team I found the problem: the host should be 127.0.0.1 and not the localhost that the previous articles suggested. Changing this made a MASSIVE improvement, e.g. clearing the database used to take 13 seconds, now it takes 350 ms! This article has been updated to show this.

NOTE: If you are running your PostgreSQL server in Windows Linux subsystem, then the host value in the connection string should be 127.0.0.1, the username should be postgres, and the password should be the same as the password you provided when running the Linux command: sudo passwd postgres.

You need to create a class that inherits from EF Core’s DbContext class (see a super-simple example below). This contains information about the classes that you want to map to the database.

public class TestDbContext : DbContext
{
    public TestDbContext(DbContextOptions<TestDbContext> options)
        : base(options) { }

    public DbSet<MyEntityClass> MyEntityClasses { get; set; }
}

NOTE: You need a constructor because you are going to provide a different connection string to your tests than you provide to your main application (see this section of the EfCore.TestSupport docs on this)

To create the DbContextOptions<TestDbContext> for your xUnit test you need to install the NuGet package called Npgsql.EntityFrameworkCore.PostgreSQL and build the options to pass to your DbContext. The code below shows how you could manually create the options, but EfCore.TestSupport provides methods which will do this for you (see the next section).

var connectionString =
    "host=127.0.0.1;Database=YourDatabaseName;Username=postgres;Password=XXXXX";
var optionsBuilder = new DbContextOptionsBuilder<TestDbContext>();
optionsBuilder.UseNpgsql(connectionString);
using var context = new TestDbContext(optionsBuilder.Options);
//… your test goes here

But there are issues when using xUnit for your testing which make creating the options a bit more complex; I cover these in the next section.

Using xUnit with databases

The xUnit library is the main test library used by EF Core and ASP.NET Core and is recommended for .NET testing. I really like xUnit’s feature that runs each test class in parallel, because running all the tests finishes much faster. The downside of xUnit running in parallel is that each test class needs its own database, because having multiple tests accessing the same database in parallel is very likely to break your tests.

For this reason, the EfCore.TestSupport database methods return a unique database name for each xUnit test class. They do this by taking the base PostgreSqlConnection from the appsettings.json file (see code below) and adding the name of the test class to the end of the base database name. This gives you a database name unique to each test class.

{
  "ConnectionStrings": {
    "PostgreSqlConnection": "host=127.0.0.1;Database=TestSupport-Test;Username=postgres;Password=your-password",
   // The SQL Server connection string has been removed
  }
}

NOTE: EfCore.TestSupport makes sure the database name in the PostgreSqlConnection connection string in the appsettings.json file ends with the word “Test”. This is useful to differentiate PostgreSQL databases that are used for testing, and therefore can be deleted without losing any application data.

The test shown below comes from the TestPostgreSqlHelpers test class in the EfCore.TestSupport repo. The CreatePostgreSqlUniqueClassOptions extension method creates the PostgreSQL options, using the this parameter to get the test class name to add to the database name. The options are then used to create a new DbContext.

[Fact]
public void TestPostgreSqlUniqueClassOk()
{
    //SETUP
    var options = this
          .CreatePostgreSqlUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);

    // rest of the test left out
}

NOTE: The code above uses the CreatePostgreSqlUniqueClassOptions extension method, but there are other versions that provide logging and method-level database uniqueness. See the EfCore.TestSupport PostgreSQL documentation for more on these methods.

Making sure a test database is empty (quickly!)

It’s possible to write tests that run correctly even if there is data already in the database, but that makes it harder to write a valid test. The best approach is to make sure the database is empty before each test runs, and the simplest way is to delete the database and create a new one, as shown below.

[Fact]
public void TestEnsureDeletedEnsureCreatedOk()
{
    //SETUP
    var options = this.CreatePostgreSqlUniqueDatabaseOptions<BookContext>();
    using var context = new BookContext(options);

    context.Database.EnsureDeleted();
    context.Database.EnsureCreated();

    //ATTEMPT
    context.SeedDatabaseFourBooks();

    //VERIFY
    context.Books.Count().ShouldEqual(4);
}

This is easy to do and works in all situations. Before I found the fix it took about 13 seconds, but now it’s a very respectable 350 ms. Because of the initial slow 13 seconds I wasted a lot of time trying to find ways to improve this, but that’s how it goes in development 🙁

But that work did produce a version that might be of use to you, called EnsureClean. Shay Rojansky, who is the PostgreSQL expert and evangelist on the EF Core team, came back with some ideas and a PostgreSQL version of the SQL Server EnsureClean, which I added in version 5.1.0 of the library.

EnsureClean deletes all the tables, types, extensions etc. and then adds the tables etc. back using a version of the EnsureCreated method. You can see it in use in the test below.

[Fact]
public void TestEnsureCleanOk()
{
    //SETUP
    var options = this.CreatePostgreSqlUniqueDatabaseOptions<BookContext>();
    using var context = new BookContext(options);

    context.Database.EnsureClean(); 

    //ATTEMPT
    context.SeedDatabaseFourBooks();

    //VERIFY
    context.Books.Count().ShouldEqual(4);
}

EnsureClean takes about 80 ms., which is 4 times quicker than the EnsureDeleted plus EnsureCreated approach, but whether it’s worth using is up to you.

Conclusion

I have found automated tests are one of the best ways of delivering good, correct code. But at the same time, I don’t want to spend a lot of time writing tests. So, in 2017 I built a library called EfCore.TestSupport that provides methods that handle setting up and managing test databases. And as of November 2021, there have been nearly 400,000 downloads of the EfCore.TestSupport library.

Up until now the EfCore.TestSupport library supported SQL Server and Cosmos DB databases (and SQLite in-memory databases). But now that I use PostgreSQL databases, I have added support for PostgreSQL databases too. My use of these methods showed me that it was worth adding some extra methods to speed up the emptying of a PostgreSQL database at the start of a test.

I was already working on updating the EfCore.TestSupport library to support EF Core 6, so it was a good time to add these new PostgreSQL features too (see the EfCore.TestSupport ReleaseNotes file for the full details of the changes). I hope these new features will help you in testing EF Core applications that use PostgreSQL databases.

Happy coding.

Using PostgreSQL in dev: Part 1 – Installing PostgreSQL on Windows

Last Updated: November 10, 2021 | Created: October 26, 2021

I started work on a library that uses PostgreSQL, which means I needed a PostgreSQL database to test it against. The best solution is to run a PostgreSQL server on my Windows development PC, as that gives me fast, free access to a PostgreSQL database.

There are three good ways to run a PostgreSQL server on Windows: running a Windows version of PostgreSQL, using Docker, or installing PostgreSQL in the Windows Subsystem for Linux, known as WSL. I used the WSL approach because it’s supposed to be quicker than the Windows version, and this article describes what I had to do to make PostgreSQL run properly.

I didn’t think installing PostgreSQL would be simple, and it wasn’t: I had to do a lot of Googling to find the information I needed. The information was scattered, and some of it was out of date, which is why I decided to write an article listing the steps to get PostgreSQL working on Windows. The second article will then cover how to access a PostgreSQL database with Entity Framework Core and unit test against a PostgreSQL database.

TL;DR; – The steps to installing PostgreSQL on Windows 10

  1. Installing WSL2 on Windows, which provides a Linux subsystem running on Windows. This was easy and painless.
  2. Installing PostgreSQL in the Linux subsystem. This was easy too.
  3. Starting / stopping the PostgreSQL software on Linux. You just have to remember about five Linux commands.
  4. Getting the Windows pgAdmin app up and running. This is where I had a problem and it took quite a while to work out what was wrong and how to fix it.
  5. Creating a PostgreSQL server via pgAdmin. Had to be careful to get the correct values set up, but once I had done that once it remembers it for next time.
  6. See the second article on how to access a PostgreSQL database from a .NET application running on Windows, and some suggestion on how to unit test against a real PostgreSQL database.

Setting the scene – why am I installing PostgreSQL

I’m a software developer, mainly working on ASP.NET Core and EF Core, and I use Windows as my development system. Up until now I have only used PostgreSQL via Azure, but I wanted a local PostgreSQL server to run unit tests for my library AuthPermissions.AspNetCore, which will support SQL Server or PostgreSQL.

I haven’t used Linux before and I don’t really want to learn Linux unless I need to, so I wanted to find ways to use Windows commands as much as possible. Using WSL does need me to learn a few Linux commands, such as how to start / stop the PostgreSQL code, and there is a PostgreSQL Linux app called psql which uses terminal commands to manage the PostgreSQL server etc. But I found an application called pgAdmin that is like Microsoft’s SSMS (SQL Server Management Studio), with a nice GUI front-end, and it even has a Windows version. That means I only need to learn a few Linux commands and can manage PostgreSQL from Windows, which is great.

Here are the steps I took to install PostgreSQL using Windows Subsystem for Linux.

1. Installing WSL2 Linux (Ubuntu) on Windows 10/11

Installing WSL was easy and painless for me, and the Microsoft WSL documentation was excellent. You must be running Windows 10 version 2004 and higher (Build 19041 and higher) or Windows 11, and then run the command wsl --install in PowerShell or Windows Command Prompt run as administrator.

You then need to restart your PC, and after a long time the Linux subsystem will ask for a username and password. You will need the password every time you start the Linux system, so make sure you remember it.

2. Installing PostgreSQL on Linux

Again, I found a great Microsoft document on how to install PostgreSQL (and other databases) in your Linux Subsystem. The list below shows the things I did to install PostgreSQL.

  1. Update your Ubuntu packages: sudo apt update
  2. Install PostgreSQL (and the -contrib package which has some helpful utilities) with: sudo apt install postgresql postgresql-contrib
  3. Set the PostgreSQL password: sudo passwd postgres.
    The nice thing about that is you can change the password any time if you forget it. This password is used whenever you access PostgreSQL’s servers and databases.

TIP: Linux doesn’t use ctrl-c / ctrl-v for copy/paste. I found that I could use the normal ctrl-c to copy on Windows, and then if I right-clicked in the Linux window it would paste – that allowed me to copy commands from the docs into Linux.

3. Starting / stopping PostgreSQL on Linux

The previous Microsoft document on how to install PostgreSQL also provided the 3 commands you need once PostgreSQL is installed:

  • Checking the status of your database: sudo service postgresql status
    It says online if running, or down if stopped.
  • Start running your database: sudo service postgresql start
  • Stop running your database: sudo service postgresql stop

TIP: You don’t need to have the Linux subsystem window open to access PostgreSQL once you have started it – the Linux subsystem will stay running in the background.

4. Getting the Windows pgAdmin app up and running

I found the Windows version of pgAdmin here and installed it. The pgAdmin app started OK, but I immediately had a problem connecting to the PostgreSQL server on the Linux subsystem. When I tried to create a PostgreSQL server, I got the error “password authentication failed for user ‘postgres’” – see below.

Copyright: Stack Overflow – see  this stack overflow question.

This was frustrating and it took quite a while to work out what the problem was. I tried a number of things, but in the end it turns out that the authentication mode of PostgreSQL doesn’t work with the connection from Windows (see this stack overflow answer) and you need to edit PostgreSQL’s pg_hba.conf file to change what is known as the auth-method (see the PostgreSQL docs on the pg_hba.conf file) from md5 to trust for the local IP connections. The code below shows the new version – the METHOD column shows the change from md5 to trust.

# TYPE  DATABASE        USER            ADDRESS                 METHOD
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust

The problem with the stack overflow answers I found is that they were written before WSL was around, so they assume you are running PostgreSQL directly on Windows. To get to the files in the Linux subsystem I needed to use the \\wsl$ prefix to access the WSL files. In the end I found PostgreSQL’s pg_hba.conf file at \\wsl$\Ubuntu\etc\postgresql\12\main (note that the \12\ part refers to the version of the installed PostgreSQL).

NOTE: Editing the pg_hba.conf file needs a high level of access, but I found VSCode could edit it, or you can edit the file via NotePad run as administrator.

5. Creating a PostgreSQL server

Once the password trust problem was fixed, I tried again to create a PostgreSQL server and it worked. When the pgAdmin application is started it asks for a password, which can be anything you like (and you can change it easily). This password lets you save PostgreSQL passwords, which is useful.

To create a PostgreSQL server, I clicked the Quick Link -> Add New Server link on the main page, which then showed a Create – Server popup with various tabs. I had to fill in the following tabs/fields:

  1. On the General tab you need to fill the Name field with your chosen name. I used PostgreServer
  2. On the Connection tab you need to set
    • Host name/address: 127.0.0.1
    • Password: the same password as given in step 2 when setting the PostgreSQL password
    • Save password: worth turning on

You now have a PostgreSQL server, but no databases other than the system database called postgres. You can create databases using pgAdmin, but in the next article I show how your application and unit tests can create databases.

NOTE: I found this article by Chloe Sun which has pictures of the pgAdmin pages, which might be useful.

Conclusion

Most of the steps were fairly easy until I hit step 4, at which point I had to do a lot of digging to find the solution. The Microsoft documents were great, but they didn’t provide all the information I needed. Thanks to all the people that answered stack overflow questions and the people who write articles about this, especially Chloe Sun, who wrote a similar article to this one.

The next article shows you how to access a PostgreSQL database using EF Core and describes the unit test methods I added to my EfCore.TestSupport 6.0.0-preview001. These unit test helpers allow you to create a unique PostgreSQL database for each xUnit test class.

Happy coding.

Finally, a library that improves role authorization in ASP.NET Core

Last Updated: November 14, 2021 | Created: August 10, 2021

In December 2018 I wrote the first article in the series called “A better way to handle authorization in ASP.NET Core”, which described an approach to improving authorization (i.e., what pages/features the logged-in user can access) in ASP.NET Core. These articles were very popular, and many people have used this authorization/data key approach in their applications.

Back in 2018 I didn’t think I could produce a library for people to use, but I have finally found a way to build a library, and it’s called AuthPermissions.AspNetCore (shortened to the AuthP library in these articles). This open-source library implements most of the features described in the “A better way to handle authorization” series, but the library will work with any ASP.NET Core authentication provider and now supports JWT Tokens.

NOTE: At the time this article was written the AuthPermissions.AspNetCore library was in preview (1.0.0-preview) and I am looking for feedback before taking the library to full release. Please look at the roadmap discussion page for what is coming. Also, in the preview some features won’t work on a web app running multiple instances (called scale-out on Azure).

This first article is focused on the improvements the AuthP library provides to using “Roles” to manage what features users can access, but the library contains a number of other useful elements.

TL;DR; – summary

  • The AuthPermissions.AspNetCore library has three main features:
    • Implements an improved Role authorization system (explained in this article).
    • Implements a JWT refresh token for better JWT Token security (see video and docs)
    • Includes an optional multi-tenant database system (see video and docs)
  • The AuthPermissions.AspNetCore library can work with
    • Any ASP.NET Core authentication provider.
    • Either Cookie authentication or JWT Token authentication
  • This article focuses on AuthP’s improved “Roles / Permissions” approach, which allows you to change a Role without having to edit your application. Each user has a list of Roles, each Role containing one or more enum Permissions.
  • When the user logs in, the AuthP library finds all the Permissions in all the Roles that the user has. These Permissions are packed into a string and added as a “Permissions” claim to the Cookie authentication or the JWT Token.
  • The AuthP’s HasPermission attribute / method will only allow access if the current user has the required Permission.
  • See other articles or the AuthP’s documentation for more information.
  • You can see the status of the AuthP library via the Release Notes document in the repo.

Setting the scene – How the AuthP library improves Roles authorization in ASP.NET Core

If you are not familiar with ASP.NET Core’s authentication and authorization features, then I suggest you read the “Setting the Scene” section in the first article in the “A better way to handle authorization” series.

Back in the ‘old days’ with ASP.NET MVC the main way to authorize what a user can access was by using the Authorize attribute with what are called “Roles” – see the code below, which only lets a user with either the “Staff” or “Manager” Role access that method/page.

[Authorize(Roles = "Staff,Manager")]
public ActionResult Index()
{
    return View(MyData);
}

ASP.NET Core kept the Roles approach but added the more powerful policy-based approach. But I found both approaches to have limitations:

  • Limitations of the ASP.NET Core Roles approach
    • The main limitation of the ASP.NET Core Roles is that the authorization rules are hard-coded into your code. So, if you want to change who can access certain pages/Web APIs, you have to edit the appropriate Authorize attributes and redeploy your application.
    • In any reasonably sized application, the Authorize attributes can get long and complicated, e.g. [Authorize(Roles = “Staff, SalesManager, DevManage, Admin, SuperAdmin”)]. These are hard to find and maintain.
  • Limitations of the ASP.NET Core Roles policy-based approach
    • It’s very versatile, but I have to write code for each policy, and they are all slightly different. And in a real application you might end up writing a lot of policy code.

From my point of view “Roles” are a useful concept for users: typically a user (human or machine) has a Role, or a few Roles, like SalesManager, maybe with an additional FirstAider Role. Think of Roles as “Use Cases for users”.

But “Roles” are not a useful concept when defining what ASP.NET Core page/feature the user can access. That’s because for some features you need fine-grained control, for instance a different access level for displaying, creating, and updating sales information. For this I use what I call “Permissions”; for example, I might have SalesRead, SalesSell and SaleReturn Permissions, where the “Sales person” Role only has the SalesRead and SalesSell Permissions, but the “Sales manager” Role also has the SaleReturn Permission.

The AuthP’s approach says that “users have Roles, and your code uses Permissions to secure each page/WebAPI”. The AuthP’s Roles provide a number of benefits:

  1. No more deploying a new version of your application when you want to change what a Role can do – the AuthP’s Role/Permissions mapping is now in a database which admin people can change.
  2. The Permissions are declarative, just like the [Authorize] attribute (e.g. [HasPermission(MyPermissions.SalesSell)]) which makes it easier to maintain.
  3. You can have broad Permissions, e.g. NoticeBroadAccess (covering create, read, update, and delete), or fine-grained Permissions like SalesRead, SalesSell, SalesSellWithDiscount, SaleReturn for individual actions/pages.

NOTE: The AuthP’s Roles are different from the ASP.NET Core Roles – that’s why I always refer to “AuthP’s Roles” so that it’s clear which type of Role I am talking about.

How to use the AuthP library with its Roles and Permissions

So, let’s look at the various code you need in your ASP.NET Core application to use the AuthP’s Roles/Permissions system. Starting at the Permissions and working up, the parts are:

  1. Defining your Permissions using an Enum
  2. Using these Permissions in your Blazor, Razor, MVC or Web API application
  3. Configuring the AuthP library in ASP.NET Core’s ConfigureServices method.
  4. Managing the user’s AuthP Roles and what Permissions are in each Role.

1. Defining your Permissions

The Permissions could be strings (just like ASP.NET Roles are), but in the end I found a C# Enum was best for the following reasons:

  • Using an Enum means IntelliSense can prompt you as to what Enum names you can use. This stops the possibility of typing an incorrect Permission name.
  • It’s easier to find where a specific Permission is used by using Visual Studio’s “Find all references”.
  • You can provide extra information to an Enum entry using attributes. The extra information helps the admin person when looking for a Permission to add to an AuthP Role – see the section on defining your Permission Enum.
  • I can use the Enum value: In the end I defined an enum with a ushort value (giving 65534 values), which can be stored efficiently in a (Unicode) string. This is important because the Permissions need to be held in a claim, and if you use ASP.NET Core’s Cookie Authentication then a cookie has a maximum size of 4096 bytes (see the short sketch after this list).
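To show the idea behind that last point, here is a tiny conceptual sketch (not AuthP’s actual code) of packing ushort Permission values into a string and unpacking them again:

using System.Collections.Generic;
using System.Linq;

public static class PermissionPacker
{
    // Each ushort Permission value becomes a single char, so even 100
    // Permissions only add ~100 chars to the claim's string value
    public static string PackPermissions(IEnumerable<ushort> permissions)
        => new string(permissions.Select(p => (char)p).ToArray());

    public static IEnumerable<ushort> UnpackPermissions(string packed)
        => packed.Select(c => (ushort)c);
}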

So, let’s look at an example from Example4 in the AuthP’s repo, which is an application that manages a shop’s stock and sales. I have only shown a small part of the Permissions in this example, but it gives you an idea of how you can decorate each Permission to provide more info and filtering to the admin user.

public enum Example4Permissions : ushort //Must be ushort to work with AuthP
{
    NotSet = 0, //error condition

    //Here is an example of very detailed control over the selling things
    [Display(GroupName = "Sales", Name = "Read", Description = "Can read any sales")]
    SalesRead = 20,
    [Display(GroupName = "Sales", Name = "Sell", Description = "Can sell items from stock")]
    SalesSell = 21,
    [Display(GroupName = "Sales", Name = "Return", Description = "Can return an item to stock")]
    SalesReturn = 22,

    [Display(GroupName = "Employees", Name = "Read", Description = "Can read company employees")]
    EmployeeRead = 30,

    //other Permissions left out… 
}

2. Using these Permissions in your Blazor, Razor, MVC or Web API application

AuthP can be used with any type of ASP.NET Core application, with three ways to check if the current user has a given permission.

2a. Using AuthP’s [HasPermission] attribute

For an ASP.NET Core MVC or Web API controller you can add the [HasPermission] attribute to an action method in a controller. Here is an example taken from Example2’s WeatherForecastController, which is a Web API controller – see the first line.

[HasPermission(PermissionEnum.ReadWeather)]
[HttpGet]
public IEnumerable<WeatherForecast> Get()
{
    //… other code left out
}

2b. Using AuthP’s HasPermission extension method

If you are using Blazor, or in any Razor file, you can use the HasPermission extension method to check if the current ASP.NET Core User has a specific Permission. Here is an example taken from AuthP’s Example1 Razor Pages application.

public class SalesReadModel : PageModel
{
    public IActionResult OnGet()
    {
        if (!User.HasPermission(Example1Permissions.SalesRead))
            return Challenge();

        return Page();
    }
}

The HasPermission extension method is also useful in any Razor page (e.g. User.HasPermission(Example.SalesRead)) to decide whether a link/button should be displayed. In Blazor the call would be @context.User.HasPermission(Example.SalesRead).

2c. Using the IUsersPermissionsService service

If you are using a front-end library such as React, Angular, Vue and so on, then your front-end needs to know what Permissions the current user has so that the front-end library can create a method similar to option 2b, the HasPermission extension method.

The IUsersPermissionsService service has a method called PermissionsFromUser which returns a list of the Permission names for the current user. You can see the IUsersPermissionsService service in action in Example2’s AuthenticateController.
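As a rough sketch of how you might expose this to a front-end (the controller name and the exact PermissionsFromUser signature are my assumptions, not necessarily Example2’s code):

using System.Collections.Generic;
using Microsoft.AspNetCore.Mvc;

[Route("api/[controller]")]
[ApiController]
public class UserInfoController : ControllerBase
{
    private readonly IUsersPermissionsService _permissionsService;

    public UserInfoController(IUsersPermissionsService permissionsService)
        => _permissionsService = permissionsService;

    // Returns the Permission names held in the current user's claims,
    // so a front-end library can do its own HasPermission-style checks
    [HttpGet("permissions")]
    public ActionResult<List<string>> GetPermissions()
        => _permissionsService.PermissionsFromUser(User);
}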

3. Configuring the AuthP library in ASP.NET Core’s ConfigureServices method.

The AuthP library has a lot of methods and options to set it up. It will work with any ASP.NET Core authentication provider that returns the UserId as a string.  

The code below uses ASP.NET Core’s Individual Accounts authentication provider in an MVC-type application – the highlighted lines contain the configuration code to set up the AuthP library.

public void ConfigureServices(IServiceCollection services)
{
    //… other normal ASP.NET configuration code left out
    
    //This registers the AuthP library with your Permission Enum 
    services.RegisterAuthPermissions<MyPermissions>()
        //This sets up the AuthP’s database 
        .UsingEfCoreSqlServer(Configuration.GetConnectionString("DefaultConnection"))
        //This syncs AuthP to your ASP.NET Core authentication provider, 
        //In this example it’s the Individual Accounts authentication provider
        .RegisterAuthenticationProviderReader<SyncIndividualAccountUsers>()
        //This will ensure the AuthP’s database is created/migrated on startup
        .SetupAuthDatabaseOnStartup();
}

NOTE: There are many configuration parts to the AuthP library, and this shows a simple configuration appropriate for an existing application that is using ASP.NET Core’s Individual Accounts authentication. Please look at the AuthP startup documentation for a more detailed list of all the configuration options.

How the AuthP library works inside

The final part of this article gives you an overview, mainly with diagrams, of how the AuthP library works inside.

When someone logs in, some ASP.NET Core claims are built and stored in the Authentication Cookie or in the JWT Token. The diagram shows the AuthP code in orange, with the blue part provided by ASP.NET Core and its authentication provider.

As you can see, AuthP finds the user’s AuthP Roles and combines the Permissions in all the user’s AuthP Roles into a string (known as packed permissions), which becomes the Permissions claim’s value. This Permissions claim is stored in the Authentication Cookie or in the JWT Token.

This approach is a) very efficient as the Permission data is available as a claim (i.e., no database accesses needed), and b) a user’s AuthP Roles can be changed without needing to re-deploy your application.

NOTE: Normally the Permissions are only calculated when someone logs in, and any change to a user’s AuthP Roles would only take effect when they log out and back in again. However, there are features in AuthP that can periodically re-calculate the user’s Roles/Permissions. I talk about this in the section of the documentation explaining how JWT Token refresh works.

The second part is what happens when a logged-in user wants to access a Page, Web API etc. Because the AuthP library added a Permissions claim to the Authentication Cookie / JWT Token, the ASP.NET Core ClaimsPrincipal User will contain the Permissions claim.

When a [HasPermission(MyPermissions.SalesSell)] attribute is applied to an ASP.NET Core controller or Razor Page, it calls a policy-based service that the AuthP library has registered. This Permission policy allows access to the method/page if the MyPermissions.SalesSell Permission is found in the user’s Permissions claim string. Otherwise, it redirects the user to the login page, or to an Access Denied page if they are already logged in (for a Web API it returns HTTP 401, unauthorized, or HTTP 403, forbidden).
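To make that concrete, here is a conceptual sketch of the kind of check the Permission policy performs – the claim name, requirement and handler shown here are my simplifications, not AuthP’s actual implementation:

using System.Threading.Tasks;
using Microsoft.AspNetCore.Authorization;

public class PermissionRequirement : IAuthorizationRequirement
{
    public PermissionRequirement(ushort permission) => Permission = permission;
    public ushort Permission { get; }
}

public class PermissionHandler : AuthorizationHandler<PermissionRequirement>
{
    protected override Task HandleRequirementAsync(
        AuthorizationHandlerContext context, PermissionRequirement requirement)
    {
        // The packed Permissions claim is a string of chars, one char per Permission
        var packed = context.User.FindFirst("Permissions")?.Value;
        if (packed != null && packed.Contains((char)requirement.Permission))
            context.Succeed(requirement);
        return Task.CompletedTask;
    }
}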

Conclusion

This first article gives you an overview of the AuthP’s most important feature, that is the ability to change what a Role can do via an admin page, instead of the ASP.NET Core Roles approach where a change requires you to edit your code and redeploy your application. In addition, the AuthP’s Permissions allow you to have very fine-grained access rules if you need them. Future articles will explain other features in the AuthP library.

The AuthP library is far from finished, but I think it has the basics in it. The big limitation of the preview is that some features, like bulk loading on startup, only run on a single instance of the web app (i.e. no scale out). My plan is to solve that in the non-preview version of the AuthP library.

I have put out a preview version to get some feedback. I have set up a roadmap discussion so you can see what I am planning and give your suggestions and help. Please have a look at the library and add your comments/suggestions to the roadmap discussion.

Happy coding!

Evolving modular monoliths: 3. Passing data between bounded contexts

Last Updated: May 17, 2021 | Created: May 17, 2021

This article describes the different ways you can pass data between isolated sections of your code, known in DDD as bounded contexts. The first two articles used bounded contexts to modularize our monolith application so, when we implement the communication paths between bounded contexts, we don’t want to compromise this modularization.

DDD has lots to say about the design of communication between bounded contexts, and .NET provides some tools to implement these communication channels in modern applications. In this article I give four different approaches to communicating between bounded contexts with varying levels of isolation.

This article is part of the Evolving Modular Monoliths series, the articles are:

TL;DR – summary

  • DDD says that your application should be broken up into separate parts (DDD terms: bounded contexts or domains) and these bounded contexts should be isolated from each other so that each bounded context can focus on its particular business group.
  • DDD also describes various ways to communicate between bounded contexts based on the business needs but doesn’t talk too much about how they can be implemented.
  • These articles are about .NET monolith applications and describe various ways to implement communication paths between two bounded contexts:
    • Example 1: exchange data via a common database
    • Example 2: exchange data via a method call
    • Example 3: exchange data using a message broker
    • Example 4: communicating from new code to a legacy application
  • At the end of example 1 there is information on how to create EF Core DbContexts for each bounded context with individual database migrations.
    • The conclusion gives the pros, cons and limitations of each communication example.

What DDD says about communicating between bounded contexts

Just as DDD’s bounded contexts helped us when breaking up our monolith into modules, DDD can also help with mapping the communication between bounded contexts. DDD defines seven ways to map data when passing data between bounded contexts (read this article for an explanation of each type).

You might have thought that the mappings between two bounded contexts should always be isolated, but DDD recognises that isolation comes at the cost of more development time, and possibly slower communications. Therefore, the seven DDD mapping approaches run from tight coupling right up to complete isolation between the two ends of the communication. DDD warns us that using a mapping design that tightly links the two bounded contexts can cause problems when you want to refactor/improve one of the bounded contexts.

NOTE: I highly recommend Eric Evans’ 30-minute talk about bounded contexts at DDD Europe 2020.

Later in this article I show a number of examples which use various DDD mapping approaches.

The tools that .NET provides for communicating between bounded contexts

DDD provides an architectural view of communicating between bounded contexts, but .NET provides the tools to implement the mapping and communication. The fact that the application is a monolith makes the implementation very simple and fast. Here are three approaches:

  • Having two bounded contexts map to the same data in the database.
  • Calling a method in another bounded context using dependency injection (DI).
  • Using a message broker to call a method in another bounded context.

I show examples of all three of these approaches and extract the pros, cons and limitations of each. I also add a fourth example that looks at how you can introduce a modular monolith architecture into existing applications whose design is more like “a ball of mud”.

Example 1: Exchanging data via the database

In a monolith you usually have one database used by all the bounded contexts, and this provides a way to exchange data between them. In a modular monolith you want each bounded context to have its own section of the database that it works with, and you can do that with EF Core. The fact that there is one database does allow you to exchange data by sharing tables/columns in the database.

An example of this approach in the BookApp application is that when the Orders bounded context gets a user’s order it only has the book’s SKU (Stock Keeping Unit), but it needs the book price, title etc. Now, the BookApp.Books part of the database has that data in its Books SQL table, so the BookApp.Orders bounded context could also map to that table too. But this tightly links the BookApp.Books and BookApp.Orders bounded contexts.

One way to reduce the tight linking is to have the …Orders bounded context only map to the few columns it needs. That way the …Books side could add more columns and relationships without affecting the …Orders bounded context. Another thing you can do is make the …Orders mapping to the Books table read-only by using EF Core’s ToView configuration command. That makes it completely clear that the …Orders bounded context isn’t in charge of this data. The figure below shows this setup.
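For example, a read-only mapping in the …Orders DbContext might look something like the sketch below (the class and column names are illustrative, not the BookApp’s actual code):

using Microsoft.EntityFrameworkCore;

public class OrderDbContext : DbContext
{
    public OrderDbContext(DbContextOptions<OrderDbContext> options)
        : base(options) { }

    // Read-only view of the few Books columns the Orders code needs
    public DbSet<BookView> BookViews { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<BookView>(entity =>
        {
            entity.ToView("Books");       // maps to the Books table, excluded from migrations
            entity.HasKey(x => x.BookId); // still needs a key so it can be queried
        });
    }
}

public class BookView
{
    public int BookId { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
}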

Pros and cons of this approach

In terms of DDD’s mapping approaches this is a shared kernel, which makes the two bounded contexts tightly linked. The fact that the …Orders bounded context only accesses a few of the columns in the Books table reduces the amount of linking between the two bounded contexts, because the …Books bounded context could add new columns without needing to change the …Orders code.

The positives of this approach are that it’s easy to set up and it works with NuGet packages (see article 2).

EXTRA: How to set up multiple DbContexts using EF Core

Setting up separate DbContexts for each bounded context does make using EF Core’s migration feature a little more complex. Here are the steps you need to do to use migrations:

  1. Create a DbContext just containing the classes/table in your bounded context (examples:  BookDbContext and OrderDbContext)
  2. If any of the DbContexts access the same SQL table, you need to be careful to ensure that two separate migrations don’t try to change the same table. Your options are:
    a. Have only one DbContext that maps to that SQL table, with the other DbContexts mapping to it using EF Core’s ToView configuration command. This is the recommended way because it allows you to select only the columns you need, and the access is read-only.
    b. Choose one DbContext to handle the EF Core configuration of that SQL table and have the other DbContexts use EF Core’s ExcludeFromMigrations configuration command.
  3. Then create an IDesignTimeDbContextFactory<your DbContext> and include the MigrationsHistoryTable option to set a unique name for the migrations history table (example: Orders DesignTimeContextFactory)
  4. When you register each DbContext on startup you need to again add the MigrationsHistoryTable option (example: Startup code in ASP.NET Core)

When you want to create a migration for a DbContext in a bounded context, you need to do that from the project containing the DbContext – see the comments in the BookApp.All OrderDbContext. This approach should be used for each DbContext in a bounded context.
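Below is a rough sketch of steps 3 and 4 (the connection string and history table name are placeholders – see the linked examples for the real code):

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Design;

// Step 3: a design-time factory so "dotnet ef migrations add" can create
// the OrderDbContext, with its own migrations history table
public class OrderDesignTimeContextFactory
    : IDesignTimeDbContextFactory<OrderDbContext>
{
    public OrderDbContext CreateDbContext(string[] args)
    {
        var optionsBuilder = new DbContextOptionsBuilder<OrderDbContext>();
        optionsBuilder.UseSqlServer(
            "Server=(localdb)\\mssqllocaldb;Database=BookApp;Trusted_Connection=True",
            x => x.MigrationsHistoryTable("__OrdersMigrationsHistory"));
        return new OrderDbContext(optionsBuilder.Options);
    }
}

// Step 4: register the DbContext on startup with the same history table name
// services.AddDbContext<OrderDbContext>(options =>
//     options.UseSqlServer(connectionString,
//         x => x.MigrationsHistoryTable("__OrdersMigrationsHistory")));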

EXAMPLE 2: Exchanges data via a method call

One big advantage of a monolith is you can call methods, which are quick and don’t have any communication problems. But we have isolated the bounded contexts from each other, so how can we call a method in another bounded context without breaking the isolation rules? The solution is to use interfaces and dependency injection (DI). The interface provides the isolation and DI provides the correct method.

For this example, let’s say that the address that the Order should be sent to is stored in the Users bounded context. To make this work, and not break the isolation, we do the following:

  1. You place the following items in the BookApp.Common.Domain layer, because only Common layers can be accessed by multiple bounded contexts (see the rules about the Common layers I defined in part 1 of this series)
    a. The interface IUserAddress that defines the service the BookApp.Orders can call to obtain the Address of a specific UserId.
    b. The Address class that the service will return.
  2. You create a class called UserAddress in the BookApp.Users bounded context that implements the IUserAddress interface defined in step 1a. You would most likely put that class in the BookApp.Users.Infrastructure layer.
  3. You arrange for the UserAddress / IUserAddress pair to be registered with the DI provider.
  4. Finally, in your BookApp.Orders code you obtain an instance of the UserAddress service via DI and use it to get the Address you need.

The figure below shows this setup.
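In code, the pieces might look something like the sketch below (simplified, so treat the member names as illustrative rather than the BookApp’s actual code):

// In BookApp.Common.Domain – the shared interface and class (steps 1a and 1b)
public interface IUserAddress
{
    Address GetUserAddress(int userId);
}

public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
    public string PostCode { get; set; }
}

// In BookApp.Users.Infrastructure – the implementation (step 2)
public class UserAddress : IUserAddress
{
    public Address GetUserAddress(int userId)
    {
        // ... look up the user's address inside the Users bounded context
        return new Address { Street = "...", City = "...", PostCode = "..." };
    }
}

// Step 3: registration with the DI provider, e.g. in ConfigureServices
// services.AddTransient<IUserAddress, UserAddress>();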

Pros and cons of this approach

In terms of DDD’s mapping approaches this is a customer / supplier mapping approach – the customer is the BookApp.Orders and the supplier is the BookApp.Users. The interface provides good isolation for the service, but sharing the Address class does link the two bounded contexts.

From the development point of view, you have to organise your code in three places, which is a bit more work. Also, this approach isn’t that good when working with bounded contexts that have been turned into NuGet packages.

Overall, this approach provides better isolation than the exchange via the database but takes more effort.

EXAMPLE 3: Exchange data using a request/reply message broker

The last mapping implementation used the .NET DI provider, which meant any interfaces and classes used had to be in a .NET project that both bounded contexts can access. There is another way to automate this: using a request/reply message broker. This allows you to set up mapping links between two bounded contexts while not breaking the isolation rules.

Let’s implement the same feature as in example 2, that is, getting the address for the Order from the Users bounded context. Here are the steps:

  1. Register a request/reply message broker as a singleton to the DI provider.
  2. In the BookApp.Users bounded context, register a get function with the message broker. The function will return an Address class which is defined in the BookApp.Users bounded context.
  3. In the BookApp.Orders bounded context, call an ask method on the message broker. You will receive the address in an Address class defined in the BookApp.Orders bounded context.

The figure below shows this setup.

The message broker allows you to register a getter function (left side of the figure) that can be called by the AskFor method (right side of the figure). This is equivalent to calling a method as in example 2, but doesn’t need any external interfaces or classes. Also, in this example there are two classes called Address, and the message broker can map between the two Address classes, thus removing the need for the extra common layer we needed in example 2.
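To show the idea, here is a very small sketch of how such a broker might be used – the interface and method names are mine, not those of my simple library or of RabbitMQ:

using System;

// A hypothetical request/reply broker interface
public interface IRequestReplyBroker
{
    // The supplier side registers a function under a key
    void RegisterGetter<TIn, TOut>(string key, Func<TIn, TOut> getter);

    // The customer side asks for data; the broker maps the returned object
    // into the caller's own TOut class
    TOut AskFor<TIn, TOut>(string key, TIn input);
}

// In BookApp.Users (the supplier side), using its own Address class:
//   broker.RegisterGetter<int, Address>("GetUserAddress",
//       userId => userAddressLookup.Find(userId));
//
// In BookApp.Orders (the customer side), receiving its own Address class:
//   var address = broker.AskFor<int, Address>("GetUserAddress", userId);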

Initially I couldn’t find a request/reply message broker, so I built a simple version you can find here, but further research showed RabbitMQ has a remote procedure call feature that does this (my simple version is easier to understand, though).

NOTE: Microservice architectures normally use a publish/listen message broker, where apps register to be informed if certain data changes. This is done to improve performance by sending updates of the data needed by a Microservice app so that it can cache the data locally. However, in a monolith architecture you can access data anywhere in the app in nanoseconds (just an in-memory dictionary lookup and a function call), so a request/reply message broker is more efficient.

Pros and cons of this approach

This is another customer / supplier mapping approach as used in example 2, but with more isolation due to the request/reply message broker being able to copy data from one type to another.

From a development point of view this is easier than example 2, which called a method using DI, because you don’t have to add the BookApp.Common.Domain layer to share the interface and class. The other advantage of this message broker approach is that it works with bounded contexts turned into NuGet packages.

There aren’t any downsides other than learning how to use a request/reply message broker.

Overall, I think this approach is quick to implement, provides excellent isolation and works with bounded contexts turned into NuGet packages.

EXAMPLE 4: Adding new modular code to a legacy application

There are lots of existing applications out there, some of which don’t have a good design and have fallen into “a ball of mud” – we call these legacy applications. So, the challenge is to apply a modular monolith architecture to an existing application without the “ball of mud” code “infecting” your new code.

One solution I defined uses three parts to add new code to legacy applications.

  • Build your new feature in a separate solution: This gives you a much better chance of building your code using modern approaches such as the modular monolith architecture.
  • Install your new feature via a NuGet package: Packaging your new feature into a NuGet package makes it much easier to add your new code to the existing application.
  • Use DDD’s Anticorruption Layer (ACL) mapping approach: The ACL mapping approach builds adapters between the existing application’s code and concepts and the new code you have written to add a new feature.

The figure below shows how this might work.

You already know about separate solutions and NuGet packaging from part 2 of this series, so I will concentrate on the ACL and how it works.

DDD’s ACL mapping approach is designed for interfacing to a legacy system. It assumes that a) the legacy system’s code cannot easily be changed, and b) the legacy system’s design is suboptimal. The ACL mapping approach hides the more difficult parts of the legacy system by using the adapter pattern, which allows you to write your new code against a “cleaned up” interface.

Of all the DDD mapping patterns the ACL mapping provides a very high level of separation between the legacy system and your new code. The downside is that, of all the DDD mapping approaches, the ACL takes the most development effort to create.

While I have described how to add new code to a legacy system, I have to say that it isn’t a simple job. My experience is that fully understanding a legacy system’s code is far harder and takes longer than writing the ACL layer code.

The understanding and unscrambling of a legacy system is a big topic and I’m not going to cover it here, but you might like to look at a few of the links I have listed below:

NOTE: You might also be interested in the strangler pattern if working with existing applications. This pattern provides a way to progressively change your old code to a more modern code design.

Pros and cons of this approach

DDD’s ACL mapping approach provides excellent separation between the two parts but takes a lot of development effort to build. Therefore, you should only use this approach when there is no other way to achieve your goal. However, it’s not building the ACL mapping that is the hard part; the hard part is working out how the legacy system works so that you can add your new feature.

Conclusion

I have described four examples of communicating between DDD bounded contexts. From a DDD point of view I didn’t cover all of DDD’s bounded context mapping approaches in this article, but I did cover the four main ways to implement communicating between bounded contexts in .NET monolith applications.

As you have seen it’s a balance between how much the communication ties the design of two bounded contexts together against the amount of development effort it takes to write the communication link. The list below provides a summary of the pros and cons of each approach I cover in this article.

  • Exchanging data via the database (example 1)
    • Pro: fairly easy to implement
    • Con: some linking between bounded contexts (DDD shared kernel)
    • Limitations: none
  • Exchanges data via a method call (example 2)
    • Pro: good performance, easy to implement
    • Cons: Needs extra common layer to share interfaces/classes
    • Limitations: Doesn’t work with bounded context NuGet packages
  • Exchange data using a request/reply message broker (example 3)
    • Pro: good performance, easy to implement
    • Cons: You need a request/reply message broker
    • Limitations: none
  • Adding new modular code to a legacy application (example 4)
    • Pro: allows you to write new code using a modern design
    • Cons: A LOT of work
    • Limitations: none

NOTE: The “exchanging data via the database” example also contains extra information (link!!!) on how to create individual EF Core DbContexts for each bounded context that has to link to the database.

I hope this article, plus others in the series have been useful to you.

Happy coding!

Evolving modular monoliths: 2. Breaking up your app into multiple solutions

Last Updated: May 10, 2021 | Created: May 10, 2021

This is the second article in a series about building Microsoft .NET applications using a modular monolith architecture. This article covers a way to extract parts of your application into separate solutions which you turn into NuGet packages that are installed in your main application. Each solution is physically separated from each other in a similar way to Microservice architecture, but without the performance and communication failure modes that Microservices can have.

My view is that the Microservice architecture is great for applications that need large development teams and/or have to handle high levels of demand, like Netflix. But for smaller applications the Microservice architecture can be overkill. However, I do like the Microservice idea of having separate solutions, because that makes the application easier to understand, refactor and manage, so I have defined a way to extract parts of an application into their own solutions. The result is an application that is easier to build because it’s using the simpler monolith architecture, but has the benefits of having multiple separate solutions that are combined by using NuGet.

NOTE: If you are planning to build a Microservice architecture, then the approach described here is also a great starting point because it already creates separate solutions. Martin Fowler also suggests that starting with a monolith approach is the best way to build a Microservice application – see his article called “Monolith First”.

This article is part of the Evolving Modular Monoliths series, the articles are:

TL;DR – summary

  • One of the best ways to structure your application is to break the business needs into what DDD calls bounded contexts (see this section in the first article for more on bounded contexts).
  • In a modular monolith some bounded contexts can be large and/or complex. In these cases, giving a large/complex bounded context its own solution makes the code easier to understand, navigate, and manage.
  • I have created a dotnet tool called MultiProjPack that automates the creation of the NuGet package for projects that follow the naming convention described in the first article.
  • I describe a fast local build/test/debug cycle when adding/changing code in a separate solution. This uses a local NuGet package source and some features in the MultiProjPack tool.
  • I describe the options for getting source code information while debugging a NuGet package inside your main application.
  • I finish with a section on building the composite application for deployment to production and suggest ways to store your private NuGet packages.

Breaking up your app into multiple solutions

The idea is to get the separation that the Microservice pattern provides while keeping the performance and reliability of direct method or database access. To do this we extract a DDD bounded context section of your code (see the section on bounded contexts in the first article) into its own solution and then turn it into a NuGet package to install in your main application.

Turning a DDD bounded context into a separate solution/NuGet package requires a bit more work, so I suggest you only apply this approach to large and/or complex parts of your application. The benefits are:

  • The code is easier to understand/navigate because it’s isolated into its own solution.
  • If the development team is large, it’s easier for one team to work on an isolated solution and “publish” a NuGet package for other teams to use.
  • On older applications this approach means you can build new features using a more modern design without the structure of the old application hindering your new design.

I have created two example repos that show this approach in action: another version of the e-commerce web app that sells books, called BookApp. The main application is called BookApp.Main (see https://github.com/JonPSmith/BookApp.Main for the code), which contains the FrontEnd and the Order processing parts (BookApp.Orders…). The part of the application that deals with the querying and updating of the books in the database (referred to as BookApp.Books) is large enough to warrant turning into its own solution (see https://github.com/JonPSmith/BookApp.Books for the code). The BookApp.Books solution is turned into a NuGet package and installed in BookApp.Main. The figure below shows this in action.

This makes it much easier for a development team to work on a part of the application in a separate repo/solution. Also, the team can “publish” a new version via a private NuGet server, with a fallback to the old version provided by simply changing back to the previous NuGet package.

NOTE: One limitation around using a NuGet package is that all the projects in your solution must target the same framework, e.g. net5.0. That’s because NuGet is designed to handle packages that work with multiple frameworks, for instance the Newtonsoft.Json package can work with seven types of frameworks (see its dependencies). But in this usage you want all of your projects to be the same, otherwise it won’t work.

Communication between the front-end and a NuGet package

Turning a bounded context into a NuGet package gives you the benefit of separation that a Microservice has. But while a Microservice architecture has one API front-end per Microservice, in our modular monolith design there is a direct code link, via the NuGet package, to the BookApp.Books code.

This changes the communication channel we use between each bounded context and the user. In a Microservice design the front-end typically accesses a service via an HTTP API, but in our modular monolith design it accesses a service via dependency injection and a method call. So, in a modular monolith the API is defined mainly by interfaces and dependency injection, plus a few key class definitions. And because we are using a monolith architecture there is one front-end in BookApp, which is an ASP.NET Core project.

NOTE: There are other communication paths between two bounded contexts, which I cover in part 3.

The API is mainly defined by the ServiceLayer, which contains many of the services linked to a bounded context (read this section about the ServiceLayer in one of my articles). For instance, in my example BookApp.Main I want to display a list of books with sorting, filtering and paging. This uses a service referred to by the interface IListBooksService in the …ServiceLayer.GoodLinq project. The ServiceLayer will also contain some classes, such as the BookListDto class that provides the data to display the books, and various other classes, interfaces, constants etc.

NOTE: For Separation of Concerns (SoC) reasons I also recommend adding a small project specifically to handle any startup code, e.g. registering services with the dependency injection provider. Also, with an application that loads data from a configuration file (e.g. appsettings.json), I pass in the IConfiguration interface too in case the code needs it – see BookApp.Books.AppSetup as an example.
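As an example of such a startup project, a registration extension method might look like the sketch below (the method and service names are illustrative, not the real BookApp.Books.AppSetup code):

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public static class StartupSetup
{
    // Called from the main application's startup code so the main app
    // doesn't need to know the internals of the BookApp.Books bounded context
    public static IServiceCollection RegisterBookServices(
        this IServiceCollection services, IConfiguration configuration)
    {
        // e.g. services.AddTransient<IListBooksService, ListBooksService>();
        // ... register this bounded context's services with the DI provider here
        return services;
    }
}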

How to create a NuGet package

Now that we have defined how the bounded context will link to the main application, we need to create a NuGet package. It turns out that creating a NuGet package containing multiple .NET projects is doable, but it takes a lot of work to build the .nuspec file properly.

To automate the creation of the NuGet package I built a dotnet tool called MultiProjPack (repo found here) that builds the .nuspec file by scanning your solution for certain projects/namespaces and builds a NuGet package for you. This tool also contains features to make it much quicker to build and test your NuGet package on your development computer by using a local NuGet package source, which I describe later.

The MultiProjPack tool (which is itself a NuGet package) can be installed or updated on your computer with the following command line commands:

dotnet tool install JonPSmith.MultiProjPack --global

dotnet tool update JonPSmith.MultiProjPack --global

Once you have installed MultiProjPack, you call this dotnet tool from the command line in one of the projects in your solution (you can use any project, but I normally use the BookApp.Books.AppSetup project). Here is how you call it, selecting one of its three options:

MultiProjPack <D|R|U>

The three options do the following

  • D(ebug): This creates a NuGet package using the Debug configuration of the code.
  • R(elease): This creates a NuGet package using the Release configuration of the code.
  • U(pdate): This builds a NuGet package using the Debug configuration, but also updates the .dll’s in the NuGet cache (I explain this in this section).

The tool relies on an xml file called MultiProjPack.xml in the folder you run the tool from. This file defines the NuGet data (name, version and so on) and optional tool settings. Here is a typical setup, but there are a lot more settings if you need them (see this example file containing all the settings and the README file for more information).

<?xml version="1.0" encoding="utf-8"?>
<allsettings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<!-- this contains the typical information you should have in your settings -->
  <metadata>
    <id>BookApp.Books</id>
    <version>1.0.0-preview001</version>
    <authors>you must give a list of author(s)</authors>
    <description>you must provide a description of the NuGet</description>
    <releaseNotes>optional: what is changed in this release?</releaseNotes>
  </metadata>
  <toolSettings>
    <!-- This is used to find projects with names starting with this. If null, then uses NuGet id -->
    <NamespacePrefix></NamespacePrefix>
    <!-- excludes named projects (comma separated), e.g. "Test" would exclude a project starting with "BookApp.Books.Test" -->
    <ExcludeProjects>Test</ExcludeProjects>
    <!-- worth filling in with your local NuGet Package Source folder. See docs about using {USERPROFILE} in string -->
    <CopyNuGetTo>{USERPROFILE}\LocalNuGet</CopyNuGetTo> 
  </toolSettings>
</allsettings>

NOTE: Don’t worry, the tool has the command --CreateSettings that will create a MultiProjPack.xml with the typical settings for you to edit.

When you run the tool it:

  1. Scans all the folders for .NET projects whose names start with the toolSettings.NamespacePrefix value, e.g. BookApp.Books, and creates a .nuspec file containing all the .NET projects it found.
  2. It then calls dotnet pack to create the NuGet package using the .nuspec file created in step 1.
  3. If you have set up the correct <toolSettings> it will copy the package to your local NuGet package source. I explain this later.
  4. If you ran it in U(pdate) mode it will replace the data in the NuGet cache. I explain this later.

But before I describe steps 3 and 4 of running the MultiProjPack tool, let’s consider how you might debug an application using a NuGet package.

Debugging an application that uses a modular monolith NuGet package

There is a problem when developing an application using the modular monolith NuGet package approach, and that is the time it takes to upload a NuGet package to nuget.org (or any other web-based NuGet servers). That upload can take a few minutes, which doesn’t make a good development process. But be of good cheer – I have solved this problem and it takes less than a second to upload a NuGet package when testing locally. But before I describe the solution let’s look at the development process first.

For this example, you want to add a new feature, say a wish list where users can tell other people which books they would like for their birthday. Your application is called BookApp.Main, which contains the ASP.NET Core FrontEnd, and you decide to add the wish list feature to the BookApp.Books solution, which is a separate solution installed in the BookApp.Main application via a NuGet package.

The development process would require five things:

  1. Add new pages and commands in the FrontEnd code in your BookApp.Main application.
  2. Add new services in the part of the application that you have in separate BookApp.Books solution and write some unit tests to make sure they work.
  3. Then create a new version of the NuGet package from the BookApp.Books solution.
  4. Then you install the new version of the BookApp.Books NuGet package in the BookApp.Main application.
  5. Then you can try out your new feature by running the BookApp.Main locally with dummy data.

If you are a genius, you might be able to write both parts so it all works first time, but most people like me would need to go around these five steps many times. And if you had to wait a few minutes for each NuGet package upload, it would be a very painful development cycle. Thankfully there are ways around this – the first is using a local NuGet package source on your development computer.

1. Using a local NuGet server to reduce the package upload time to milliseconds

It turns out you can define a folder on your development computer (Windows, Mac or Linux) as a NuGet package source. This means the upload is now just a copy of your new NuGet package into a folder on your development computer, so it’s very fast. That means your development testing will also be fast.

To add a local NuGet package source via Visual Studio you need to get to its NuGet package sources page. The command to get to that page is Tools > NuGet Package Manager > Package Manager Settings > Package Sources. Then you can add a new package source that is linked to a folder on your computer. Below is a screenshot of my NuGet package sources page where I have added a local folder as a possible NuGet package source (see selected source).

Once you have done that, any NuGet packages in the folder you have defined will show up in the NuGet Package Manager display if the local NuGet package source (or All) is selected. See the example below where I have selected my Local NuGet in the package source, which is highlighted in yellow.

NOTE: If you are using VSCode and dotnet commands, then see this article about setting up a folder as a NuGet package source on your local computer.
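If you prefer the command line to Visual Studio, the dotnet CLI can also register a local folder as a package source – the folder path and source name below are just placeholders:

dotnet nuget add source "C:\Users\YourName\LocalNuGet" --name LocalNuGet

dotnet nuget list source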

To make this even easier/quicker the MultiProjPack tool has a feature that will copy the newly created NuGet package directly into your local folder. To turn on that feature you need to fill in the <toolSettings>.<CopyNuGetTo> setting with your local NuGet package folder. The CopyNuGetTo setting supports the string {USERPROFILE} to get your user’s account folder, which means it works for any developer. For instance, the string {USERPROFILE}\LocalNuGet would become C:\Users\JonPSmith\LocalNuGet on my computer. That means your MultiProjPack.xml doesn’t have to change for each developer on the team.

The end result of setting the CopyNuGetTo path is that when you run the command MultiProjPack D (or R), a new NuGet package will appear in your local NuGet package source and you can immediately add or update your main application with that NuGet package.

2. Directly updating the NuGet package cache

The local NuGet package source is great, but you need to increment the NuGet version for each package, because once a package is in the NuGet package cache you can’t override it. I found that a pain, so I looked for a better solution. I found one by directly updating the NuGet package cache.

The NuGet package cache speeds up the performance of builds using NuGet packages. When you install a new NuGet package it adds an unpacked version of the NuGet package in the NuGet package cache. The unpacked version of the NuGet package contains all the files in the NuGet package, e.g. the code (.dll), symbol files (.pdb), documentation files (.xml), and these are copied into your application on certain actions, e.g. Build > Rebuild Solution, Restore NuGet packages, etc.

I take advantage of this feature to speed up the whole write code, install, test cycle (steps 3 to 5 of the development process already described) by directly updating the NuGet package cache at the same time, which is triggered by the MultiProjPack U(pdate) command. This is useful when you have created and installed a NuGet package with new code, but then find it has a bug: instead of you having to create a new version of the NuGet package, the U(pdate) command updates the existing NuGet package, both in the local NuGet package source and the NuGet package cache.

The end result is, after running the MultiProjPack U(pdate) command you use Visual Studio’s command Build > Rebuild Solution, which will rebuild your main application using the files in the NuGet package cache. That cuts out two manual steps (changing the NuGet package version and updating the NuGet package in the main application), but there is a limitation.

The limitation is that if you add, remove or update a .csproj file in the package (say, by adding the NuGet packages it needs), then you can’t use the U(pdate) command. For these cases you have to create a new NuGet package with a higher version number. The MultiProjPack U(pdate) command is useful when you are testing or fixing a bug and need to go around the cycle multiple times.

NOTE: For users that don’t like the idea of changing the cache, another way to go is to create a new NuGet version using the -m: option, e.g. MultiProjPack D -m: version=1.0.0.1-preview002. This overrides the version in the MultiProjPack.xml file, thus saving you having to edit the MultiProjPack.xml every time, but then you need to manually update the NuGet package via the NuGet Package Manager display.

These two features reduce the minutes it takes to upload a NuGet package to nuget.org to a single command that takes seconds to update a NuGet package in your main application.

Tips on how to debug your NuGet package

As part of the build/test/debug cycle you will install or update your NuGet package in the main application to check it works. If something fails in your main application you might want to debug code that is in your NuGet package. Here are some tips on how to do this.

The first thing you need to do is turn off the “Just My Code” setting in the Debugger (see figure below). This allows you to step into the NuGet package’s code, set breakpoints, see the local data and so on inside your NuGet package.

The second thing you need to know is how Visual Studio accesses the source code behind your NuGet package, so that you can see the source code, the values of local variables, stack traces, and so on. There are two ways to add source code information to a NuGet package for debugging: embedding the source code in the .dll file or using symbols files.

1. Embedding the source code in the .dll file

You can embed the source code in the code (.dll) file, which makes it easy for Visual Studio to find the source code related to the .dll file. This works well, but it requires you to add some extra xml settings to the .NET project’s .csproj file.

The downside of this approach is that it makes the .dll file bigger, so I suggest that you only add the source code to the .dll file when creating a Debug configuration build. To do this you need to add the following xml settings to the .csproj file of each .NET project you want this feature in.

  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|AnyCPU'">
    <DebugType>embedded</DebugType>
    <EmbedAllSources>true</EmbedAllSources>
  </PropertyGroup>

The next approach, using symbol files, is more automatic, but has some limitations.

2. Using the symbols files

By default, a .NET project will create a code (.dll) file and a symbols (.pdb) file in both the bin\{framework} and obj\{framework} folders when a project is built, and the MultiProjPack tool will add them to the NuGet package. Visual Studio can use these when debugging, but if you change the code in a project in the NuGet package, Visual Studio will stop you debugging that project.

Visual Studio is very clever about finding debug information. If you open Visual Studio’s Debugger > Windows > Modules window you will see all the .dll files in your application, and some of them will have a symbol status of loaded. By looking at this I can see that Visual Studio can find the folder holding the solution for the NuGet package, and it can access the symbols (.pdb) file of each project (I don’t know how Visual Studio does that, but it does).

This all sounds perfect, but as I said, if you change the code in a project in your NuGet package, then Visual Studio will stop linking the symbols file of the changed code. That means you can’t debug the changed project anymore. The good news is the MultiProjPack U(pdate) command does work, and you can still debug the project via its symbols.

NOTE: There are different formats for symbols files, which is a bit confusing. Also, symbols files and NuGet symbol servers aren’t supported everywhere. The MultiProjPack tool will copy symbols (.pdb) files into the NuGet package if they are found in the project. It can also create a NuGet symbol package if you need it (see this setting).

What to do for production?

MultiProjPack’s D(ebug) and U(pdate) commands create NuGet packages using the Debug configuration of your code, and these typically live in the local NuGet package source. But what should you do when you want to release your code for production?

Firstly, you should build a NuGet package using the Release version of the code with the R(elease) command, i.e. MultiProjPack R. The created NuGet package will be in a folder called .nupkg in the folder where you ran the MultiProjPack tool. Now you need to store this Release NuGet package somewhere so others can use it. How you do that depends on how you work and whether you want your NuGet packages to be private or not.

By default, the git ignore file will ignore NuGet packages, so if you create a package you need to store your Release packages in some form other than your Git repo. If your NuGet package can be public, then it’s easy – use https://www.nuget.org/. For holding NuGet packages privately, here are some possibilities:

  • GitHub: GitHub allows you to publish and consume NuGet packages and they can be private, even for free accounts. See the GitHub NuGet docs and this article by Bruno Hildenbrand.
  • Azure DevOps: Azure provides a way to publish and consume NuGet packages within a DevOps pipeline. See the Azure NuGet docs and this article by Greg Margol.
  • MyGet: MyGet has been around for years and used by a host of companies, but it costs.
  • BaGet: Scott Hanselman’s article recommends BaGet, but at the time of writing it doesn’t have a private feed version.

Conclusion

In my quest to modularize a large monolith architecture I wanted to mimic the Microservice pattern approach of breaking the business problem into separate applications that communicate with each other. To do this I used NuGet packages to implement the “separate” solutions part of the Microservice architecture in a monolith application, using DDD bounded contexts to guide how to break my application into discrete parts.

It does take a bit more work to use NuGet packages, but as a working developer I have strived to make this approach as automated and fast as possible. That’s why I built the MultiProjPack tool, which is a key part of this approach: it both creates the correct NuGet package and has features to make your build/test/debug cycle as quick as possible.

The first article described two ways to modularize a monolith so that its code is less likely to turn into “a ball of mud”. This article takes this to another level of separation by allowing you to build parts of your application as individual solutions and combine them into the main application using NuGet. The missing part is how you can communicate between bounded contexts, including bounded contexts built as NuGet packages, which I cover in the third article.

Evolving modular monoliths: 1. An architecture for .NET

Last Updated: July 6, 2021 | Created: May 3, 2021

This is the first article in a series about building a .NET application using a modular monolith architecture. The aim of this architecture is to keep the simplicity of a monolith design while providing a better structure so that as your application grows it doesn’t turn into what is known as “a big ball of mud” (see the original article which describes a big ball of mud as “haphazardly structured, sprawling, sloppy, duct-tape-and-baling-wire, spaghetti-code jungle”).

The designs I describe will create a clean separation between the main parts of your monolith application by using a modular monolith architecture and Domain-Driven Design’s (DDD) bounded context approach. These designs also provide a way to add a modular monolith section to an old application whose architecture isn’t easy to work with (see part 3).

NOTE: This article doesn’t compare a Monolith against a Microservice or Serverless architecture (read this article for a quick review of the three architectures). The general view is that a monolith architecture is the simplest to start with but can easily become “a ball of mud”. These articles are about improving the basic monolith design so that it keeps a good design for longer.

In Evolving Modular Monoliths series, the articles are:

NOTE: To support this series I created a demo ASP.NET Core e-commerce application that sells books using a modular monolith architecture. For this article the code in the https://github.com/JonPSmith/BookApp.All repo provides an example of the modularized bounded context design described in this article.

Who should read these articles?

These articles are aimed at all .NET software developers and software architects. It gives an overview of a modular monolith architecture approach, but later articles also look at team collaboration, and the nitty-gritty of checking, testing and deploying your modular monolith application.

I assume the reader is already a .NET software developer, is comfortable with writing C# code, and understands .NET terms such as .NET projects (also known as assemblies), NuGet packages and so on.

What software principles do we need to avoid “a big ball of mud”?

Certain software principles and architectures can help organise the code in your application. The more complex the application, the more you need good software principles to guide you to create code that is understandable, refactorable and robust. Here are some software principles that I will apply in this series:

  • Encapsulation: Isolating the code for each business feature from the others, so that a change to the code for one business feature doesn’t break the code for another business feature.
  • Separation of Concerns (SoC): That is, this design provides high cohesion (links only to relevant code) and low dependency (isn’t affected by other parts of the code).
  • Organisation: Makes it easy to find the various parts of code for a specific feature when you want to change things.
  • Collaboration: Allows multiple software developers to work on an application at the same time.
  • Development effort/reward: We want an architecture that balances speed of development against building an application that is easy to enhance and refactor.
  • Testability: Breaking up a feature into different parts often makes it easier to test the code.

Stages in building an ASP.NET Core application

I am going to take you through three levels of modularization: modularization using an n-layered architecture, modularization at the DDD bounded context level, and finally modularization inside a DDD bounded context. Each level builds on the previous level, ending up with a high level of isolation for your code.

1. Basic n-layered application

I start with the well-known approach of breaking up an application into layers, known as an n-layered architecture. The figure below shows the way I break up an application into layers within my modular monolith design.

You might be surprised by the number of layers I use, but they are there to help apply the SoC principle. Later on, I define some architectural rules, enforced by unit tests, that ensure the code implementing a feature has high cohesion and low dependency.

To help you understand what all these layers are for, let’s look at how the creation of an order for a book works in my BookApp application (you can try this feature by downloading the BookApp.All repo and running it on your computer):

  • Front-end: The ASP.NET Core app handles showing the user’s basket and sending their order to the ServiceLayer.
  • ServiceLayer: This contains a class with a method that accepts the order data and sends it to the business layer. If everything is OK, it commits the order to the database and clears the user’s basket.
  • Infrastructure: Not needed for this feature.
  • BizLogic / BizDbAccess: An order is complex, so it has its own business logic code which is isolated from the database via code held in the BizDbAccess layer (this approach is described here).
  • Persistence: This provides the EF Core code to map the Orders classes to the database.
  • Domain: This holds the Order classes mapped to the database.

The figure above shows the seven layers as the n-layer architecture which most developers are aware of. But in fact, I am using the clean architecture layering approach, which is depicted as a series of rings, as shown in the figure to the right. The clean architecture is also known as Hexagonal Architecture and Onion Architecture.

I use the clean architecture layering approach because I like its rules, such as inner layers can’t access outer layers. But any good n-layering approach would work.

I really like the clean architecture’s “layers can’t access outer layers” rule, but I have altered some of the clean architecture’s rules. They are:

  • I added a Persistence layer deep inside the rings to accommodate EF Core. See this section of a previous article which explains why.
  • The clean architecture would place all the interfaces in the inner Domain layer, but I place the interfaces in the same project where the service is defined, for two reasons: a) to me, keeping the interface with the service is a better SoC design and, b) it also means an inner layer can’t call an outer service via DI because it can’t access the interface.

There are two general layering approaches that still apply when we move to a modular monolith architecture. They are:

  • I place as little business code as possible in the Front-end layer; instead the Front-end calls services registered with the dependency injection system to display or input data. I do this because a) the Front-end should only manage the output / input of data, and b) it’s easier to test these services as a method call than to test ASP.NET Core APIs or Pages, which is much harder and slower.
  • The ServiceLayer has a very important role in my applications, as it acts as an adapter between the business classes and the user display/input classes. See the section called “the importance of the Service Layer in the Clean Architecture layers” in a previous article to find out why.

The downsides of only using an n-layered architecture

I have used the n-layered architecture for many years, and it works, but the problem is that the n-layered architecture only applies the SoC principle at the layer level, not within a layer. This has two bad effects on the structure of your application.

  • Layers can get very big, especially the ServiceLayer, which makes it hard to find and change anything.
  • When you do find the code for a business feature, it’s not obvious whether other classes link to this code.

The question is: is there an architecture that would help you to follow the SoC and encapsulation principles? Once I tried a modular monolith architecture (with DDD and clean architecture) I found the whole experience was significantly better than the n-layered architecture on its own. That’s because I knew where the code for a certain business feature was by the name of the .NET project, and I knew that those projects only held code relevant to the business feature I was looking at. See a previous article called “My experience of using modular monolith and DDD architectures” where I reflect on using a modular monolith on a project that was running late.

2. Modularize your code using DDD’s bounded context approach

So, rather than having all your code in an n-layered architecture we want to isolate the code for each feature. One way is to break up your application into specific business groups, and DDD provides an approach called bounded contexts (see this article by Martin Fowler for an overview of bounded contexts).

Bounded contexts are found by looking at the business needs of your application, but identifying bounded contexts can be hard (see the video The Art of Discovering Bounded Contexts by Nick Tune for some tips). Personally, I define some bounded contexts early on, but I am happy to change the bounded contexts’ boundary walls and names as the project progresses and I gain more understanding of the business rules.

NOTE: I use the name bounded context throughout this series, but there are lots of different names around DDD’s bounded context concept, such as domain, subdomain, core domain etc. In many places I should use the DDD term domain, but that clashes with the clean architecture’s usage of the term domain, so I use the term bounded context wherever DDD bounded contexts or DDD domains are used. If you want some more information on all of the DDD terms around the bounded context try this short article by Nick Tune.

DDD’s bounded contexts work at the large scale in your business. For example, in my BookApp, as well as displaying books the user can also order books. From a DDD point of view, the handling of books and the handling of users’ orders are in different bounded contexts: BookApp.Books and BookApp.Orders – see the figure below.

Each layer in each bounded context has a .NET project containing the code for that layer, with the BookApp.Books .NET projects separate from the BookApp.Orders .NET projects. So, in the figure the .NET projects in the Books bounded context are completely separate from the .NET projects in the Orders bounded context, which means the bounded contexts are isolated from each other.

NOTE: Another way to keep the bounded context isolated is to build a separate EF Core DbContext for each bounded context with only the tables that the bounded context needs to access. I cover how to do this in part 3 of this series.
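To illustrate that idea, here is a minimal sketch of what a Books-only DbContext could look like; the class, entity and namespace names are illustrative assumptions based on the BookApp naming used above, not the actual code from part 3.

using Microsoft.EntityFrameworkCore;

// A DbContext that only knows about the tables the Books bounded context needs
public class BooksDbContext : DbContext
{
    public BooksDbContext(DbContextOptions<BooksDbContext> options)
        : base(options) { }

    public DbSet<Book> Books { get; set; }
    public DbSet<Author> Authors { get; set; }
    public DbSet<Tag> Tags { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Only apply the entity configurations defined in the Books persistence project
        modelBuilder.ApplyConfigurationsFromAssembly(
            typeof(BooksDbContext).Assembly);
    }
}
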

Each layer is a .NET project/namespace and must have a unique name, and we want a naming convention that makes it easy for a developer to find the code they want to work on: the figure below shows the naming convention that I found best describes the application’s parts.

Bounded contexts also need to share data with each other, but in a way that doesn’t compromise the isolation of each bounded context. There are known design patterns for sharing data between bounded contexts, and I cover these in part 3.

NOTE: I recommend Kamil Grzybek’s excellent series on the Modular Monolith. He uses the same approach as I have described in this section. Kamil’s articles give more detail on the architectural thinking behind this design, while my series introduces some extra ways to modularize and share your code.

3. Modularize inside a bounded context

The problem with modularizing only at the bounded context level is that many bounded contexts contain a lot of code. I can think of client projects I have worked on where a single bounded context contained over a year’s worth of developer effort. That could mean that a single bounded context becomes “a big ball of mud” all by itself. For this reason, I have developed a way to modularize within a single bounded context.

NOTE: There is a fully working BookApp built using the modularized bounded context approach at https://github.com/JonPSmith/BookApp.All. It contains 23 projects and provides a small but complex application as an example of how the modularized bounded context approach could be applied to a .NET application.

Modularizing at the bounded context level is driven by the high-level business design of your application, while the modularization inside a bounded context is done by grouping all the code for a specific feature and giving it its own .NET project(s). Taking an example from the Book App used in my book, I created lots of different ways to query the database, so I had one .NET project for each query type in the ServiceLayer. These then link down to the lower layers as shown in the figure below, although I don’t show all the references (for instance nearly every outer layer links to the Domain layer) as that would make the figure hard to understand.

NOTE: As you can see there are lots of .NET projects in the ServiceLayer, a few in the Infrastructure layer, none in the BizLogic/BizDbAccess layers, and typically only one .NET project each in the Persistence and Domain layers.

Building a modularized bounded context looks like a lot of work, but for me it was a very natural, and positive, change from what I did when using an n-layer architecture. Previously, when using an n-layer architecture, I grouped my code into folders; with the modular monolith approach the classes that previously sat in each folder are placed in a .NET project instead.

These .NET projects/namespaces must have unique names, so I extend the naming convention I showed in the previous bounded context modularization section by adding an extra name to the end of each .NET project/namespace name when needed, as shown below.

The rules for how the .NET projects in a modularized bounded context can reference each other are pretty simple, but powerful:

  • A .NET project can only reference other .NET projects within its bounded context (see part 3 for how data can be exchanged between bounded contexts).
  • A .NET project in an outer layer can only reference .NET projects in the inner layers.
  • A .NET project can access a .NET project in the same layer, but only if its name contains the word “Common”. This allows your code to be DRY (i.e., no duplicate code) while making it very clear that any .NET project containing “Common” in its name affects multiple features.

NOTE: To ensure these rules are adhered to, I wrote some unit test code that checks your application follows these three rules – see this unit test class in the BookApp.All repo.
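Here is a minimal sketch of the idea behind such a test, using reflection over the assemblies’ references; the project names are assumptions based on the naming convention above, and the real test class in the BookApp.All repo is more thorough.

using System.Linq;
using System.Reflection;
using Xunit;

public class ProjectReferenceRuleTests
{
    [Fact]
    public void DomainProjectShouldNotReferenceOuterLayers()
    {
        // Hypothetical assembly name – adjust to your own naming convention
        var domainAssembly = Assembly.Load("BookApp.Books.Domain");

        // Find any references from the inner Domain layer to outer layers
        var badReferences = domainAssembly.GetReferencedAssemblies()
            .Where(a => a.Name.StartsWith("BookApp.Books.ServiceLayer")
                     || a.Name.StartsWith("BookApp.Books.Infrastructure"))
            .ToList();

        // Rule: an inner layer must not depend on any outer layer
        Assert.Empty(badReferences);
    }
}
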

The positive effects of using this modularization approach and its rules are:

  • The code is isolated from the other feature code (using a folder didn’t do that).
  • I can find the code more quickly via the .NET project’s name.
  • I can create unit test to check that my code is following the modularization rules.

Overall, this modularization approach stops the spaghetti-code jungle part of “a big ball of mud”, because now the relationships are managed by .NET and you can’t get around them easily. In the end, it’s up to the developer to apply the SoC and encapsulation principles, but following this modularization style will help you to write code that is easy to understand and easy to refactor.

Thinking about the downside – what happens with large applications?

I learnt a lot about building .NET applications using my modular monolith modularization approach, but my BookApp is very small compared to the applications I work on for clients. So, I need to consider the downsides of building a large application to ensure this approach will scale, because a really big application could have 1,000 .NET projects.

The first issue to consider is: can the development tools handle an application with, say, 1,000 .NET projects? The recent announcement of the 64-bit Visual Studio 2022, which can handle 1,600 projects, says this won’t be a problem. Even Visual Studio 2019 can handle 1,000 .NET projects (according to a report @ErikEJ found), although another person on Twitter said that 300 .NET projects was too much. In any case, an application with lots of .NET projects could be tiresome to navigate through.

The second issue to consider is: can multiple teams of developers work together on a large application? In my view the bounded context approach is the key to allowing multiple teams to work together, as different teams can work on different bounded contexts. Of course, the teams need to follow the DDD bounded context rules, especially the rules about how bounded contexts communicate with each other, which I cover in part 3 of this series.

The final issue to consider is: how could the modular monolith modularization be applied to an existing application? There are many existing monolith applications, and it would be great if you could add new features using the modular monolith modularization approach. I talk about this in more detail in part 3, but I do see a way to make that work.

An answer to these downsides – break the application into separate packages

While these three downsides could be handled through rules and good team communication, a modular monolith doesn’t have the level of separation that separate solutions provide (as Microservices do). So how can we get that separation when we are dealing with a monolith? My answer is to move any of the larger or more complex bounded contexts into their own solutions, pack each solution into a NuGet package, and then install these NuGet packages into the main application.

This physically separates one or more of your bounded contexts from the main application code while keeping the benefits of the monolith’s quick method/data transfer. Turning a bounded context into a separate solution allows a team to work on that bounded context on its own, with easier navigation and no clashing with other teams’ changes. And for existing applications you can create new features in a separate solution using the modular monolith approach and add these new features via NuGet packages to your existing application.

In part 2 I describe how you can turn a bounded context into a separate solution and package it as a NuGet package that can be installed in the main application, with a special focus on making the development cycle take only a few seconds (not the few minutes that nuget.org takes) to create, upload, and install a NuGet package for local testing.

Conclusion

I have introduced you to the modular monolith architecture and then provided two approaches to applying a modular monolith architecture and DDD principles to .NET applications. The first modularized at DDD’s bounded context level and the second added extra modularization inside a bounded context.

The question is: will the extra work needed to apply a modular monolith architecture to your application really create an application that is easier to extend over time? My first use of a modular monolith architecture was while writing the book “Entity Framework Core in Action”, and it was very positive. Overall, I think it made me slightly faster than using an n-layered architecture because it was easier to find things. But the real benefit came when I added features for performance tuning and added a CQRS architecture, which required a lot of refactoring and moving of code.

NOTE: I recommend you read the sections “Modular Monolith – what was bad?” and “Modular Monolith – how did it fare under time pressure?” for my review of my first use of a modular monolith architecture.

Since my first use of a modular monolith architecture, I have further refined my modular monolith design to handle large application development. In the second article I add a further level of separation for development teams, so that large parts of your application can be worked on in their own solutions. As a software developer myself I made sure that the development process is quick and reliable, as it’s quite possible I will use this approach on a client’s application.

Please do leave comments on this article. I’m happy to discuss the best ways to implement a modular monolith architecture, or to hear about your experiences of using one.

Happy coding!

Five levels of performance tuning for an EF Core query

Last Updated: March 4, 2021 | Created: February 23, 2021

This is a companion article to the EF Core Community Standup called “Performance tuning an EF Core app” where I apply a series of performance enhancements to a demo ASP.NET Core e-commerce book selling site called the Book App. I start with 700 books, then 100,000 books and finally ½ million books.

This article, plus the EF Core Community Standup video, pulls information from chapters 14 to 16 from my book “Entity Framework Core in Action, 2nd edition” and uses code in the associated GitHub repo https://github.com/JonPSmith/EfCoreinAction-SecondEdition.

NOTE: You can download the code and run the application described in this article/video via the https://github.com/JonPSmith/EfCoreinAction-SecondEdition GitHub repo. Select the Part3 branch and run the project called BookApp.UI. The home page of the Book App has information on how to change the Book App’s settings for chapter 15 (four SQL versions) and chapter 16 (Cosmos DB).

Other articles that are relevant to the performance tuning shown in this article

TL;DR – summary

  • The demo e-commerce book-selling site displays books with the various sort, filter and paging features that you might expect to need. One of the hardest queries is to sort the books by their average votes (think Amazon’s star ratings).
  • At 700 books a well-designed LINQ query is all you need.
  • At 100,000 books (and ½ million reviews) LINQ on its own isn’t good enough. I add three new ways to handle the book display, each one improving performance but also taking more development effort.
  • At ½ million books (and 2.7 million reviews) SQL on its own has some serious problems, so I swap to a Command Query Responsibility Segregation (CQRS) architecture, with the read side using a Cosmos DB database (Cosmos DB is a NoSQL database).
  • The use of Cosmos DB with EF Core highlights:
    • How Cosmos DB is different from a relational (SQL) database
    • The limitations in EF Core’s Cosmos DB database provider
  • At the end I give my view of performance gain against development time.

The Book App and its features

The Book App is a demo e-commerce site that sells books. In my book “Entity Framework Core in Action, 2nd edition” I use this Book App as an example of using various EF Core features. It starts out with about 50 books in it, but in Part 3 of the book I spend three chapters on performance tuning and take the number of books up to 100,000 books and then to ½ million books. Here is a screenshot of the Book App running in “Chapter 15” mode, where it shows four different modes of querying a SQL Server database.

The Book App query which I improve has the following Sort, Filter and Page features:

  • Sort: Price, Publication Date, Average votes, and primary key (default)
  • Filter: By Votes (1+, 2+, 3+, 4+), By year published, By tag, (defaults to no filter)
  • Paging: Num books shown (default 100) and page num

Note that a book can be soft deleted, which means there is always an extra filter on the books shown.

The book part of the database (the part of the database that handles orders isn’t shown) looks like this.

First level of performance tuning – Good LINQ

One way to load a Book with its relationships is by using Includes (see code below)

// Eager loading: load every Book with its ordered Authors, plus its Reviews and Tags
var books = context.Books
    .Include(book => book.AuthorsLink
        .OrderBy(bookAuthor => bookAuthor.Order)) 
            .ThenInclude(bookAuthor => bookAuthor.Author)
    .Include(book => book.Reviews)
    .Include(book => book.Tags)
    .ToList();  // this executes the query and returns full entity classes

But that isn’t the best way to load books if you want good performance. That’s because a) you are loading a lot of data that you don’t need, and b) you would need to do the sorting and filtering in software, which is slow. So here are my five rules for building fast, read-only queries.

  1. Don’t load data you don’t need, e.g. use the Select method to pick out only what is needed.
    See lines 18 to 24 of my MapBookToDto class.
  2. Don’t Include relationships but pick out what you need from the relationships.
    See lines 25 to 30 of my MapBookToDto class.
  3. If possible, move calculations into the database.
    See lines 13 to 34 of my MapBookToDto class.
  4. Add SQL indexes to any property you sort or filter on.
    See the configuration of the Book entity.
  5. Add AsNoTracking method to your query (or don’t load any entity classes).
    See line 29 in ListBookService class

NOTE: Rule 3 is the hardest to get right. Just remember that some SQL commands, like Average (SQL AVG), can return null if there are no entries, which needs a cast to a nullable type to make it work.
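To make rule 3 concrete, here is a sketch of the sort of Select projection I’m talking about. The DTO and property names are illustrative rather than the exact MapBookToDto code, but the nullable cast on the average is the key point.

var books = context.Books
    .Select(b => new BookListDto
    {
        BookId = b.BookId,
        Title = b.Title,
        // Cast to double? so the SQL AVG returns null, rather than throwing,
        // when a book has no reviews – this calculation runs in the database
        ReviewsAverageVotes = b.Reviews
            .Select(r => (double?)r.NumStars).Average(),
        ReviewsCount = b.Reviews.Count()
    })
    .ToList();
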

So, combining the Select, Sort, Filter and Paging parts, my code looks like this.

public async Task<IQueryable<BookListDto>> SortFilterPageAsync
    (SortFilterPageOptions options)
{
    var booksQuery = _context.Books 
        .AsNoTracking() 
        .MapBookToDto() 
        .OrderBooksBy(options.OrderByOptions) 
        .FilterBooksBy(options.FilterBy, options.FilterValue); 

    await options.SetupRestOfDtoAsync(booksQuery); 

    return booksQuery.Page(options.PageNum - 1, 
        options.PageSize); 
}
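
The Page method at the end is a small extension method over IQueryable; a minimal version, assuming a zero-based page number, could look like this.

using System.Linq;

public static class PagingExtensions
{
    public static IQueryable<T> Page<T>(this IQueryable<T> query,
        int pageNumZeroStart, int pageSize)
    {
        if (pageNumZeroStart != 0)
            query = query.Skip(pageNumZeroStart * pageSize);

        // Take limits the number of rows returned to one page's worth
        return query.Take(pageSize);
    }
}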

Using these rules will give you a good LINQ query, which is a great starting point. The next sections cover what to do if that doesn’t give you the performance you want.

When the five rules aren’t enough

The query above is going to work well when there aren’t many books, but in chapter 15 I create a database containing 100,000 books with 540,000 reviews. At this point the “five rules” version has some performance problems, so I create three new approaches, each of which a) improves performance and b) takes more development effort. Here is a list of the four approaches, with the Good LINQ version as our base performance version.

  1. Good LINQ: This uses the “five rules” approach. We compare all the other versions to this query.
  2. SQL (+UDFs): This combines LINQ with SQL UDFs (user-defined functions) to move concatenations of Author’s Names and Tags into the database.
  3. SQL (Dapper): This creates the required SQL commands and then uses the Micro-ORM Dapper to execute that SQL to read the data.
  4. SQL (+caching): This pre-calculates some of the costly query parts, like the averages of the Review’s NumStars (referred to as votes).

In the video I describe how I build each of these queries and show the performance for the hardest query, which is sort by review votes.

NOTE: The SQL (+caching) version is very complex, and I skipped over how I built it, but I have an article called “A technique for building high-performance databases with EF Core” which describes how I did this. Chapter 15 of my book “Entity Framework Core in Action, 2nd edition” covers this too.

Here is a chart I showed in the video which provides performance timings for three queries, from the hardest (sort by votes) down to a simple query (sort by date).

The other chart I showed was a breakdown of the parts of the simple query, sort by date. I wanted to show this to point out that Dapper (which is a micro-ORM) is only significantly faster than EF Core if you have better SQL than EF Core produces.

Once you have a performance problem, just taking a few milliseconds off isn’t going to be enough – typically you need to cut its time by at least 33% and often more. Therefore, using Dapper to shave a few milliseconds off EF Core isn’t worth the development time. So, my advice is to study the SQL that EF Core creates, and if you know a way to improve that SQL, then Dapper is a good solution.
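To show what that looks like, here is a minimal sketch of running hand-tuned SQL via Dapper. The SQL, table names and DTO are simplified assumptions, not the actual Book App queries.

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Dapper;                          // adds Query/QueryAsync to IDbConnection
using Microsoft.Data.SqlClient;

public class BookSqlDto
{
    public int BookId { get; set; }
    public string Title { get; set; }
    public double? AverageVotes { get; set; }
}

public static class DapperBookReader
{
    public static async Task<List<BookSqlDto>> LoadTopBooksByVotesAsync(
        string connectionString, int pageSize)
    {
        // Hand-written SQL that you have checked is better than EF Core's SQL
        const string sql = @"
SELECT TOP (@pageSize) b.BookId, b.Title,
       (SELECT AVG(CAST(r.NumStars AS float))
          FROM Review r WHERE r.BookId = b.BookId) AS AverageVotes
FROM Books b
WHERE b.SoftDeleted = 0
ORDER BY AverageVotes DESC";

        using var connection = new SqlConnection(connectionString);
        var rows = await connection.QueryAsync<BookSqlDto>(sql, new { pageSize });
        return rows.ToList();
    }
}
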

Going bigger – how to handle ½ million or more books

In chapter 16 I build what is called a Command Query Responsibility Segregation (CQRS) architecture. The CQRS architecture acknowledges that the read side of an application is different from the write side. Reads are often complicated, drawing in data from multiple places, whereas in many applications (but not all) the write side can be simpler, and less onerous. This is true in the Book App.

To build my CQRS system I decided to make the read side live in a different database from the write side of the CQRS architecture, which allowed me to use Cosmos DB for my read-side database. I did this because Cosmos DB is designed for performance (speed of queries) and scalability (how many requests it can handle). The figure below shows this two-database CQRS system.

The key point is that the data saved in Cosmos DB has as many of the calculations as possible pre-calculated, rather like the SQL (+caching) version – that’s what the projection stage does when a Book or its associated relationships are updated.

If you want to find out how to build a two-database CQRS system using Cosmos DB, my article Building a robust CQRS database with EF Core and Cosmos DB describes one way, while chapter 16 of my book provides another way using events.

Limitations using Cosmos DB with EF Core

It was very interesting to work with Cosmos DB via EF Core, as there were two parts to deal with:

  • Cosmos DB is a NoSQL database and works differently to a SQL database (read this Microsoft article for one view)
  • The EF Core 5 Cosmos DB database provider has many limitations.

I had already looked at these two parts back in 2019 and written an article, which I have updated to EF Core 5 and renamed “An in-depth study of Cosmos DB and the EF Core 3 to 5 database provider”.

Some of the issues I encountered, starting with the issues that made the biggest difference to my Book App, are:

  • EF Core 5 limitation: Counting the number of books in Cosmos DB is SLOW!
  • EF Core 5 limitation: EF Core 5 cannot do subqueries on a Cosmos DB database.
  • EF Core 5 limitation: No relationships or joins.
  • EF Core 5 limitation: Many database functions not implemented.
  • Cosmos difference: Complex queries might need breaking up.
  • Cosmos difference: Skip is slow and expensive.
  • Cosmos difference: By default, all properties are indexed.

I’m not going to go through all of these here – the “An in-depth study of Cosmos DB and the EF Core 3 to 5 database provider” article covers most of them.

Because of the EF Core limitation on counting books, I changed the way that paging works. Instead of picking which page you want, you have a Next/Prev approach, like Amazon uses (see the figure after the list of query approaches). And to allow a balanced performance comparison between the SQL versions and the Cosmos DB version, I added the best two SQL approaches with counting turned off too (SQL is slow at that).

It also turns out that Cosmos DB itself can count very fast, so I built another way to query Cosmos DB using its .NET (pseudo) SQL API. With this the Book App had four query approaches.

  1. Cosmos (EF): This accesses the Cosmos DB database using EF Core (with some parts using the SQL database where EF Core didn’t have the features to implement parts of the query).
  2. Cosmos (Direct): This uses Cosmos DB’s .NET SQL API, where I wrote raw commands – a bit like using Dapper for SQL (a minimal sketch of this follows the list).
  3. SQL (+cacheNC): This uses the SQL cache approach using the 100,000 books version, but with counting turned off to compare with Cosmos (EF).
  4. SQL (DapperNC): This uses Dapper, which has the best SQL performance, but with counting turned off to compare with Cosmos (EF).
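
As an illustration of the Cosmos (Direct) style of access, here is a minimal sketch of counting books via the Microsoft.Azure.Cosmos SDK; the database and container names are assumptions and the real Book App code does rather more than this.

using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class CosmosDirectCount
{
    public static async Task<int> CountBooksAsync(CosmosClient client)
    {
        // Hypothetical database/container names
        var container = client.GetContainer("BookAppDb", "Books");

        // SELECT VALUE returns the scalar count directly
        var query = new QueryDefinition("SELECT VALUE COUNT(1) FROM c");

        var iterator = container.GetItemQueryIterator<int>(query);
        var response = await iterator.ReadNextAsync();
        return response.First();
    }
}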

The following figure shows the Book App in CQRS/Cosmos DB mode with the four query approaches, and the Prev/Next paging approach.

Performance of the CQRS/Cosmos DB version

To test the performance, I used an Azure SQL Server and a Cosmos DB service in the Azure London region. To compare the SQL performance and the Cosmos DB performance I used databases with a similar cost (and low enough that it didn’t cost me too much!). The table below shows what I used.

| Database type | Azure service name | Performance units | Price/month |
|---|---|---|---|
| Azure SQL Server | Standard | 20 DTUs | $37 |
| Cosmos DB | Pay-as-you-go | manual scale, 800 RUs | $47 |

I ran performance tests on the Cosmos DB queries while I was adding books to the database, to see whether the size of the database affected performance. It’s hard to get a good test of this as there is quite a bit of variation in the timings.

The chart below compares EF Core calling Cosmos DB, referred to as Cosmos (EF), against using direct Cosmos DB commands via its NET SQL API – referred to as Cosmos (Direct).

This chart (and other timings I took) tells me two things:

  • The increase in the number of books in the database doesn’t have much effect on performance (the Cosmos (Direct) 250,000 result is well within the variation).
  • Counting the Books costs ~25 ms, which is much better than the SQL count, which added about ~150 ms.

The important performance test was to look at Cosmos DB against the best of our SQL accesses. I picked a cross-section of sorting and filtering queries and ran them on all four query approaches – see the chart below.

From the timings in the figure above, here are some conclusions.

  1. Even the best SQL version, SQL (DapperNC), doesn’t work in this application because any sort or filter on the Reviews took so long that the connection timed out at 30 seconds.
  2. The SQL (+cacheNC) version was at parity or better with Cosmos DB (EF) on the first two queries, but as the query got more complex it fell behind in performance.
  3. The Cosmos DB (direct), with its book count, was ~25% slower than the Cosmos DB (EF) with no count but is twice as fast as the SQL count versions.

Of course, there are some downsides of the CQRS/Cosmos DB approach.

  • The add and update of a book to the Cosmos DB takes a bit longer: this is because the CQRS requires four database accesses (two to update the SQL database and two to update the Cosmos database) – that adds up to about 110 ms, which is more than double the time a single SQL database would take. There are ways around this (see this part of my article about CQRS/Cosmos DB) but it takes more work.
  • Cosmos DB takes longer and costs more if you skip items in its database. This shouldn’t be a problem with the Book App as many people would give up after a few pages, but if your application needs deep skipping through data, then Cosmos DB is not a good fit.

Even with the downsides I still think CQRS/Cosmos DB is a good solution, especially when I add in the fact that implementing this CQRS was easier and quicker than building the original SQL (+cache) version. Also, the Cosmos concurrency handling is easier than the SQL (+cache) version.

NOTE: What I didn’t test is Cosmos DB’s scalability or the ability to have multiple copies of the Cosmos DB database around the world. Mainly because it’s hard to do and it costs (more) money.

Performance against development effort

In the end it’s a trade-off between a) performance gain and b) development time. I have tried to summarise this in the following table, giving a number from 1 to 9 for difficulty (Diff? in the table) and performance (Perf? in the table).

The other thing to consider is how much extra complexity your performance tuning adds to your application. Badly implemented performance tuning can make an application harder to enhance and extend. That is one reason why I like the event approach I used on the SQL (+cache) and CQRS/Cosmos DB approaches: it makes the fewest changes to the existing code.

Conclusion

As a freelance developer/architect I have had to performance tune many queries, and sometimes writes, on real applications. That’s not because EF Core is bad at performance, but because real-world applications have a lot of data and lots of relationships (often hierarchical), and it takes some extra work to get the performance the client needs.

I have already used a variation of the SQL (+cache) approach on a client’s app to improve the performance of their “has the warehouse got all the parts for this job?” query. And I wish Cosmos DB had been around when I built a multi-tenant service that needed to cover the whole of the USA.

Hopefully something in this article and video will be useful if (when!) you need to performance tune your application.

NOTE: You might like to look at the article “My experience of using modular monolith and DDD architectures” and its companion article to look at the architectural approaches I used on the Part3 Book App. I found the Modular Monolith architectural approach really nice.

I am a freelance developer who wrote the book “Entity Framework Core in Action“. If you need help performance tuning an EF Core application I am available for work. If you want to hire me please contact me to discuss your needs.

My experience of using the Clean Architecture with a Modular Monolith

Last Updated: July 6, 2021 | Created: February 11, 2021

In this article I look at my use of a clean architecture with the modular monolith architecture covered in the first article. Like the first article, this isn’t a primer on the Clean Architecture and modular monolith approaches, but is more about how I adapted the Clean Architecture to provide the vertical separation of the features in the modular monolith application.

  1. My experience of using modular monolith and DDD architectures.
  2. My experience of using the Clean Architecture with a Modular Monolith (this article).

Like the first article, I’m going to give you my impression of the good and bad parts of the Clean Architecture, plus a look at whether the time pressure of the project (which was running about 5 weeks late) made me “break” any rules.

UPDATE

See my new series on building modular monoliths, where I take my experience and come up with a better approach to building modular monoliths with .NET.

TL;DR – summary

  • The Clean Architecture is like the traditional layered architecture, but with a series of rules that improve the layering.
  • I built an application using ASP.NET Core and EF Core using the Clean Architecture with the modular monolith approach. After this application was finished, I analysed how each approach had worked under time pressure.
  • I had used the Clean Architecture once before on a client’s project, but not with the modular monolith approach.
  • While the modular monolith approach had the biggest effect on the application’s structure, without the Clean Architecture layers the code would not be as good.
  • I give you my views of the good, bad and possible “cracks under time pressure” for the Clean Architecture.
  • Overall I think the Clean Architecture adds some useful rules to the traditional layered architecture, but I had to break one of those rules to make it work with EF Core.

A summary of the Clean Architecture

NOTE: I don’t describe the modular monolith in this article because I did that in the first article. Here is a link to the modular monolith intro in the first article.

The Clean Architecture approach (also called the Hexagonal Architecture and the Onion Architecture) is a development of the traditional “N-Layer” architecture (shortened to layered architecture). The Clean Architecture approach talks about “onion layers” wrapped around each other and has the following main rules:

  1. The business classes (typically the classes mapped to a database) are in the inner-most layer of the “onion”.
  2. The inner-most layer of the onion should not have any significant external code, e.g. NuGet packages, added to it. This is designed to keep the business logic as clean and simple as possible.
  3. Only the outer layer can access anything outside of the application’s code. That means:
    1. The code that users access, e.g. ASP.NET Core, is in the outer layer.
    2. Any external services, like the database, email sending etc., are in the outer layer.
  4. Code in inner layers can’t reference any outer layers.

The combination of rules 3 and 4 could cause lots of issues, as lower layers will need to access external services. This is handled by adding interfaces to the inner-most layer of the onion and registering the external services using dependency injection (DI).
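As a simple illustration of how rules 3 and 4 work together, here is a sketch using a hypothetical email-sending service: the interface lives in the inner Domain project, the implementation lives in an outer layer, and DI wires them together.

using System.Threading.Tasks;

// In the inner Domain project – only the abstraction lives here
public interface IEmailSender
{
    Task SendAsync(string to, string subject, string body);
}

// In an outer-layer project – the implementation that talks to the outside world
public class SmtpEmailSender : IEmailSender
{
    public Task SendAsync(string to, string subject, string body)
    {
        // ... call the external email service here
        return Task.CompletedTask;
    }
}

// In the ASP.NET Core composition root you register the pairing, e.g.
//   services.AddTransient<IEmailSender, SmtpEmailSender>();
// so inner-layer code can ask for an IEmailSender without referencing the outer layer.
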

The figure below shows how I applied the Clean Architecture to my application, which is an e-commerce web site selling books, called the Book App.

NOTE: I detail the modifications that I made to the Clean Architecture approach around the persistence (database) layer later in the article.

Links to more detailed information on Clean Architecture (unmodified)

Setting the scene – the application and the time pressure

In 2020, while updating my book “Entity Framework Core in Action”, I built an ASP.NET Core application that sells books, called the Book App. In the early chapters it is very simple, as I am describing the basics of EF Core, but in the last section I build a much more complex Book App that I progressively performance tuned, starting with 700 books, then 100,000 books and finally ½ million books. For the Book App to perform well it went through three significant enhancement stages. Here is an example of the Book App’s features and display, with four different ways to display the books so their performance could be compared.

At the same time, I was falling behind on getting the book finished. I had planned to finish all the chapters by the end of November 2020, when EF Core 5 was due out. But I only started the enhanced Book App in August 2020, so with 6 chapters still to write I was NOT going to finish the book in November. So, the push was on to get things done! (In the end I finished writing the book just before Christmas 2020).

My experience of using Clean Architecture with a Modular Monolith

I had used a simpler Clean Architecture on a client’s project I worked on, so I had some ideas of what I would do. The Clean Architecture was useful, but it’s really a layered architecture with more rules, and I had to break one of its key rules to make it work with EF Core. Overall, I think I would use my modified Clean Architecture again in a larger application.

A. What was good about Clean Architecture?

To explain how the Clean Architecture helps, we need to talk about the goals of the main architecture – the modular monolith. The modular monolith focuses on features (Kamil Grzybek calls them modules). One way to work would be to have one project per feature, but that has some problems.

  • The project would be more complex, as it has everything inside it.
  • You could end up with duplicating some code.

The Separation of Concerns (SoC) principle says that breaking up a feature into parts that each focus on one aspect of the feature is a better way to go. So, the combination of a modular monolith and using layers provides a better solution. The figure below shows two modular monolith features running vertically, and the five Clean Architecture layers running horizontally. The figure has a lot in it, but it’s there to show:

  • Reduce complexity: A feature can be split up into projects spread across the Clean Architecture layers, thus making the feature easier to understand, test and refactor.
  • Removing duplication: Breaking up the features into layers stops duplication – feature 1 and 2 share the Domain and Persistence layers.

The importance of the Service Layer in the Clean Architecture layers

Many years ago, I was introduced to the concept of the Service Layer. There are many definitions of the Service Layer (try this definition), but for me it’s a layer that knows about both the lower / inner layer data structures and the front-end data structures, and can adapt between the two (see the LHS of the diagram above). So, the Service Layer isolates the lower layers from having to know how the front-end works.

For me the Service Layer is a very important layer.

  • It holds all the business logic and database access code that the front-end needs, normally provided as services. This makes it much easier to unit test these services.
  • It takes on the job of adapting data to / from the front end. This means it is the layer that has to care about the two different data structures.

NOTE: Some of my libraries, like EfCore.GenericServices and EfCore.GenericBizRunner, are designed to work as Service Layer type services, i.e. both libraries adapt between the lower / inner layer data structures and the front-end data structures.

Thus, the Infrastructure layer, which is just below the Service Layer, contains services that still work with the entity class view of the data. In the Book App these projects contained code to seed the database, handle logging and provide event handling. Services in the Service Layer, by contrast, work with both the lower / inner layer data structures and the front-end data structures.

To end the “good” part of the Clean Architecture, I should say that a traditional layered architecture could also provide the layering that the Clean Architecture defines. It’s just that the Clean Architecture has some more rules, most of which are useful.

B. Clean Architecture – what was bad?

The main problem was fitting the EF Core DbContext into the Clean Architecture. The Clean Architecture says that the database should be in the outer ring, with interfaces for the access. The problem is there is no simple interface that you can use for the application’s DbContext. Even if you use a repository pattern (which I don’t, and here is why), you still have the problem that the application’s DbContext has to be defined deep in the onion.

My solution was to put the EF Core code in a layer right next to the inner circle (named Domain) holding the entity classes – I called that layer Persistence, as that’s what DDD calls it. That breaks one of the key rules of the Clean Architecture, but other than that it works fine. For other external services, such as an external email service, I would follow the Clean Architecture rules and add an interface to the inner (Domain) circle and register the service using DI.

Clean Architecture – how did it fair under time pressure?

Applying the Clean Architecture and Modular Monolith architectures together took a little more time to think through (I covered this in this section of the first article), but the end result was very good (I explain that in this section of the first article). The Clean Architecture layers broke a modular monolith feature into different parts, thus making the feature easier to understand and removing duplicate code.

The one small part of the clean architecture approach I didn’t like, but stuck to, is that the Domain layer shouldn’t have any significant external packages, for instance a NuGet library, added to it. Overall, I applaud this rule as it keeps the Domain entities clean, but it did mean I had to do more work when configuring the EF Core code, e.g. I couldn’t use EF Core’s [Owned] attribute on entity classes. In a larger application I might break that rule.
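For example, instead of marking a value object with the [Owned] attribute in the Domain layer, the same mapping can be done with EF Core’s Fluent API in the Persistence layer’s configuration code; the Order/DeliveryAddress names here are illustrative, not the actual Book App code.

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Metadata.Builders;

// Lives in the Persistence layer, keeping the Domain entity free of EF Core attributes
public class OrderConfig : IEntityTypeConfiguration<Order>
{
    public void Configure(EntityTypeBuilder<Order> builder)
    {
        // Equivalent of putting [Owned] on the DeliveryAddress class,
        // but defined outside the Domain layer
        builder.OwnsOne(order => order.DeliveryAddress);
    }
}
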

So, I didn’t break any Clean Architecture rules because of the time pressure. The only rules I changed were to make it work with EF Core, but I might break the “Domain layer has no significant external packages” rule in the future.

Conclusion

I don’t think the Clean Architecture approach has as big an effect on the structure as the modular monolith did (read the first article), but the Clean Architecture certainly added to the structure by breaking modular monolith features into smaller, focused projects. The combination of the two approaches gave a really good structure.

My question is: does the Clean Architecture provide enough improvement over a traditional layered architecture, especially as I had to break one of its key rules to make it work with EF Core? My answer is that using the Clean Architecture approach has made me a bit more aware of how I organise my layers, for instance I now have an Infrastructure layer that I didn’t have before, and I appreciate that.

Please feel free to comment on what I have written about. I’m sure there are lots of people who have more experience with the Clean Architecture than me, so you can give your experience too.

Happy coding.