Finally, a library that improves role authorization in ASP.NET Core

Last Updated: September 14, 2021 | Created: August 10, 2021

In December 2018 I wrote the first article in the series called “A better way to handle authorization in ASP.NET Core”, which described an approach to improving how authorization (i.e., what pages/features the logged-in user can access) is handled in ASP.NET Core. These articles were very popular, and many people have used the authorization/data key approaches they describe in their applications.

Back in 2018 I didn’t think I could produce a library for people to use, but I have finally found a way to build a library, and it’s called AuthPermissions.AspNetCore (shortened to the AuthP library in these articles). This open-source library implements most of the features described in the “A better way to handle authorization” series, but the library will work with any ASP.NET Core authentication provider and now supports JWT Tokens.

NOTE: At the time this article was written the AuthPermissions.AspNetCore library was in preview (1.0.0-preview) and I am looking for feedback before taking the library to full release. Please look at the roadmap discussion page for what is coming. Also, in the preview some features won’t work on a web app running multiple instances (called scale-out on Azure).

This first article is focused on the improvements the AuthP library provides over using “Roles” to manage what features users can access, but the library contains a number of other useful elements. The full list of articles is shown below:

  • Finally, a library that improves role authorization in ASP.NET Core (this article).
  • Improving security when using JWT Tokens in ASP.NET Core (VIDEO)
  • Managing the roles and permissions of your ASP.NET Core users (coming soon).
  • Simplifying the building of a multi-tenant architecture with ASP.NET Core (coming soon).

TL;DR – summary

  • The AuthPermissions.AspNetCore library has three main features:
    • Implements an improved Role authorization system (explained in this article).
    • Implements a JWT refresh token for better JWT Token security (article 2 coming soon).
    • Includes an optional multi-tenant database system (article 4 coming soon).
  • The AuthPermissions.AspNetCore library can work with
    • Any ASP.NET Core authentication provider.
    • Either Cookie authentication or JWT Token authentication
  • This article focuses on AuthP’s improved “Roles / Permissions” approach, which allows you to change a Role without having to edit your application. Each user has a list of Roles, with each Role containing one or more enum Permissions.
  • When the user logs in, the AuthP library finds all the Permissions in all the Roles that the user has. These Permissions are packed into a string and added as a “Permissions” claim to the Authentication Cookie or the JWT Token.
  • The AuthP’s HasPermission attribute / method will only allow access if the current user has the required Permission.
  • See other articles or the AuthP’s documentation for more information.
  • You can see the status of the AuthP library via the Release Notes document in the repo.

Setting the scene – How the AuthP library improves Roles authorization in ASP.NET Core

If you are not familiar with ASP.NET Core’s authentication and authorization features, then I suggest you read the “Setting the Scene” section in the first article in the “A better way to handle authorization” series.

Back in the ‘old days’ with ASP.NET MVC the main way to authorize what a user could access was the Authorize attribute with what are called “Roles” – see the code below, which only lets a user with either the “Staff” or “Manager” Role access that method/page.

[Authorize(Roles = "Staff,Manager")]
public ActionResult Index()
{
    return View(MyData);
}

ASP.NET Core kept the Roles approach but added the more powerful policy-based approach. But I found both approaches to have limitations:

  • Limitations of the ASP.NET Core Roles approach
    • The main limitation of the ASP.NET Core Roles is that the authorization rules are hard-coded into your code. So, if you want to change who can access certain pages/Web APIs, you have to edit the appropriate Authorize attributes and redeploy your application.
    • In any reasonably sized application, the Authorize attributes can get long and complicated, e.g. [Authorize(Roles = “Staff, SalesManager, DevManage, Admin, SuperAdmin”)]. These are hard to find and maintain.
  • Limitations of the ASP.NET Core Roles policy-based approach
    • It’s very versatile, but you have to write code for each policy, and each policy is slightly different. In a real application you might end up writing a lot of policy code.

From my point of view “Roles” are a useful concept for users: typically a user (human or machine) has a Role, or a few Roles, like SalesManager, maybe with an additional FirstAider Role. Think of Roles as “Use Cases for users”.

But “Roles” are not a useful concept when defining what ASP.NET Core page/feature the user can access. That’s because for some features you need fine-grained control, for instance different access levels for displaying, creating, and updating sales information. For this I use what I call “Permissions”; for example, I might have SalesRead, SalesSell and SaleReturn Permissions, where the “Sales person” Role only has the SalesRead and SalesSell Permissions, but the “Sales manager” Role also has the SaleReturn Permission.

The AuthP approach says that “users have Roles, and your code uses Permissions to secure each page/Web API”. AuthP’s Roles give a number of benefits:

  1. No more deploying a new version of your application when you want to change what a Role can do – the AuthP’s Role/Permissions mapping is now in a database which admin people can change.
  2. The Permissions are declarative, just like the [Authorize] attribute (e.g. [HasPermission(MyPermissions.SalesSell)]) which makes it easier to maintain.
  3. You can have broad Permissions, e.g. NoticeBroadAccess (covering create, read, update, and delete), or fine-grained Permissions like SalesRead, SalesSell, SalesSellWithDiscount, SaleReturn for individual actions/pages.

NOTE: The AuthP Roles are different from the ASP.NET Core Roles – that’s why I always refer to “AuthP’s Roles” so that it’s clear which type of Role I am talking about.

How to use the AuthP library with its Roles and Permissions

So, let’s look at the various pieces of code you need in your ASP.NET Core application to use the AuthP Roles/Permissions system. Starting at the Permissions and working up, the parts are:

  1. Defining your Permissions using an Enum
  2. Using these Permissions in your Blazor, Razor, MVC or Web API application
  3. Configuring the AuthP library in ASP.NET Core’s ConfigureServices method.
  4. Managing the user’s AuthP Roles and what Permissions are in each Role.

1. Defining your Permissions

The Permissions could be strings (just like ASP.NET Roles are), but in the end I found a C# enum was best, for the following reasons:

  • Using an Enum means IntelliSense can prompt you as to what Enum names you can use. This stops the possibility of typing an incorrect Permission name.
  • It’s easier to find where a specific Permission is used via Visual Studio’s “Find all references”.
  • You can provide extra information on an enum entry using attributes. The extra information helps the admin person when looking for a Permission to add to an AuthP Role – see the section on defining your Permission enum.
  • I can use the enum value: in the end I defined an enum with a ushort value (giving 65,534 values), which can be stored efficiently in a (Unicode) string. This is important because the Permissions need to be held in a claim, and if you use ASP.NET Core’s Cookie Authentication then the cookie has a maximum size of 4096 bytes.

So, let’s look at an example from Example4 in the AuthP repo, which is an application that manages a shop’s stock and sales. I have only shown a small part of the Permissions in this example, but it gives you an idea of how you can decorate each Permission to provide more info and filtering to the admin user.

public enum Example4Permissions : ushort //Must be ushort to work with AuthP
{
    NotSet = 0, //error condition

    //Here is an example of very detailed control over the selling things
    [Display(GroupName = "Sales", Name = "Read", Description = "Can read any sales")]
    SalesRead = 20,
    [Display(GroupName = "Sales", Name = "Sell", Description = "Can sell items from stock")]
    SalesSell = 21,
    [Display(GroupName = "Sales", Name = "Return", Description = "Can return an item to stock")]
    SalesReturn = 22,

    [Display(GroupName = "Employees", Name = "Read", Description = "Can read company employees")]
    EmployeeRead = 30,

    //other Permissions left out… 
}

2. Using these Permissions in your Blazor, Razor, MVC or Web API application

AuthP can be used with any type of ASP.NET Core application, with three ways to check if the current user has a given permission.

2a. Using AuthP’s [HasPermission] attribute

For an ASP.NET Core MVC or Web API controller you can add the [HasPermission] attribute to an action method in a controller. Here is an example taken from Example2’s WeatherForecastController, which is a Web API controller – see the first line.

[HasPermission(PermissionEnum.ReadWeather)]
[HttpGet]
public IEnumerable<WeatherForecast> Get()
{
    //… other code left out
}

2b. Using AuthP’s HasPermission extension method

If you are using Blazor, or are in any Razor file, you can use the HasPermission extension method to check if the current ASP.NET Core User has a specific Permission. Here is an example taken from AuthP’s Example1 Razor Pages application.

public class SalesReadModel : PageModel
{
    public IActionResult OnGet()
    {
        if (!User.HasPermission(Example1Permissions.SalesRead))
            return Challenge();

        return Page();
    }
}

The HasPermission extension method is also useful in any Razor page (e.g. User.HasPermission(Example.SalesRead)) to decide whether a link/button should be displayed. In Blazor the call would be @context.User.HasPermission(Example.SalesRead).
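For instance, here is a hedged Razor Pages sketch (the page name is an illustration, not part of the examples) that only shows a “sell” link when the user has the SalesSell Permission:

@if (User.HasPermission(Example4Permissions.SalesSell))
{
    <a asp-page="/SellBook">Sell a book</a>
}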

2c. Using the IUsersPermissionsService service

If you are using a front-end library such as React, Angular, Vue and so on, then your front-end needs to know what Permissions the current user has, so that the front-end code can implement something similar to option 2b, the HasPermission extension method.

The IUsersPermissionsService service has a method called PermissionsFromUser which returns a list of the Permission names for the current user. You can see the IUsersPermissionsService service in action in Example2’s AuthenticateController.
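Here is a hedged sketch of how such an endpoint might look (this is not the Example2 code; the controller name and the exact PermissionsFromUser signature, assumed here to take the current ClaimsPrincipal, may differ):

[ApiController]
[Route("api/[controller]")]
public class UserPermissionsController : ControllerBase
{
    private readonly IUsersPermissionsService _permissionsService;

    public UserPermissionsController(IUsersPermissionsService permissionsService)
    {
        _permissionsService = permissionsService;
    }

    //Returns the names of the Permissions held in the current user's "Permissions" claim
    [HttpGet]
    public ActionResult<List<string>> GetUsersPermissions()
    {
        return _permissionsService.PermissionsFromUser(User);
    }
}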

3. Configuring the AuthP library in ASP.NET Core’s ConfigureServices method.

The AuthP library has a lot of methods and options to set it up. It will work with any ASP.NET Core authentication provider that returns the UserId as a string.  

The code below uses ASP.NET Core’s Individual Accounts authentication provider in an MVC-type application – the highlighted lines contain the configuration code that sets up the AuthP library.

public void ConfigureServices(IServiceCollection services)
{
    //… other normal ASP.NET configuration code left out
    
    //This registers the AuthP library with your Permission Enum 
    services.RegisterAuthPermissions<MyPermissions>()
        //This sets up the AuthP’s database 
        .UsingEfCoreSqlServer(Configuration.GetConnectionString("DefaultConnection"))
        //This syncs AuthP to your ASP.NET Core authentication provider, 
        //In this example it’s the Individual Accounts authentication provider
        .RegisterAuthenticationProviderReader<SyncIndividualAccountUsers>()
        //This will ensure the AuthP’s database is created/migrated on startup
        .SetupAuthDatabaseOnStartup();
}

NOTE: There are many configuration parts to the AuthP library, and this shows a simple configuration appropriate for an existing application that is using ASP.NET Core’s Individual Accounts authentication. Please look at the AuthP startup documentation for a more detailed list of all the configuration options.

How the AuthP library works inside

The final part of this article gives you an overview of how the AuthP library works internally, mainly with diagrams, which should help you understand what the library is doing for you.

When someone logs in, some ASP.NET Core claims are built and stored in the Authentication Cookie or in the JWT Token. The diagram shows the AuthP code in orange, with the blue parts provided by ASP.NET Core and its authentication provider.

As you can see, AuthP finds the user’s AuthP Roles and combines the Permissions in all the user’s AuthP Roles into a string (known as packed permissions), which becomes the Permissions claim’s value. This Permissions claim is stored in the Authentication Cookie or in the JWT Token.

This approach is a) very efficient as the Permission data is available as a claim (i.e., no database accesses needed), and b) a user’s AuthP Roles can be changed without needing to re-deploy your application.

NOTE: Normally the Permissions are only calculated when someone logs in, and any change to a user’s AuthP Roles would only take effect by logging out and back in again. However, there are features in AuthP that can periodically re-calculate the user’s Roles/Permissions. I talk about this in the section of the documentation explaining how JWT Token refresh works.

The second part is what happens when a logged-in user wants to access a page, Web API etc. Because the AuthP library added a Permissions claim to the Authentication Cookie / JWT Token, the ASP.NET Core ClaimsPrincipal User will contain the Permissions claim.

When a [HasPermission(MyPermissions.SalesSell)] attribute is applied to an ASP.NET Core controller or Razor Page, it calls a policy-based service that the AuthP library has registered. This Permission policy allows access to the method/page if the “MyPermissions.SalesSell” Permission is found in the user’s Permissions claim string. Otherwise, it redirects the user to either the login page or, if already logged in, an Access Denied page (or for a Web API it returns HTTP 401, unauthorized, or HTTP 403, forbidden).
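As a minimal sketch of the idea behind that check (this is not the library’s actual code, which does more): each Permission’s ushort value is held as one character in the packed Permissions string, so the test is just a character lookup.

//Simplified illustration of a packed-permissions check
public static bool HasThisPermission(string packedPermissions, ushort permissionToCheck)
{
    return packedPermissions.Contains((char)permissionToCheck);
}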

Conclusion

This first article gives you an overview of the AuthP library’s most important feature, that is, the ability to change what a Role can do via an admin page instead of the ASP.NET Core Roles approach, where a change requires you to edit your code and redeploy your application. In addition, AuthP’s Permissions allow you to have very fine-grained access rules if you need them. Future articles will explain other features in the AuthP library.

The AuthP library is far from finished, but I think it has the basics in it. The big limitation of the preview is that some features, like bulk loading on startup, only run on a single instance of the web app (i.e. no scale-out). My plan is to solve that in the non-preview version of the AuthP library.

I have put out a preview version to get some feedback. I have set up a roadmap discussion for you to see what I am planning and to get your suggestions and help. Please have a look at the library and add your comments and suggestions to the roadmap discussion.

Happy coding!

Evolving modular monoliths: 3. Passing data between bounded contexts

Last Updated: May 17, 2021 | Created: May 17, 2021

This article describes the different ways you can pass data between isolated sections of your code, known in DDD as bounded contexts. The first two articles used bounded contexts to modularize our monolith application so, when we implement the communication paths between bounded contexts, we don’t want to compromise this modularization.

DDD has lots to say about the design of communication between bounded contexts, and .NET provides some tools to implement these communication channels in modern applications. In this article I describe four different approaches to communicating between bounded contexts, with varying levels of isolation.

This article is part of the Evolving Modular Monoliths series, the articles are:

TL;DR – summary

  • DDD says that your application should be broken up into separate parts (DDD term: bounded contexts, or domains) and these bounded contexts should be isolated from each other so that each bounded context can focus on its particular business group.
  • DDD also describes various ways to communicate between bounded contexts based on the business needs but doesn’t talk too much about how they can be implemented.
  • These articles are about .NET monolith applications and describe various ways to implement communication paths between two bounded contexts:
    • Example 1: exchange data via a common database
    • Example 2: exchange data via a method call
    • Example 3: exchange data using a message broker
    • Example 4: communicating from new code to a legacy application
  • At the end of example 1 there is information on how to create EF Core DbContexts for each bounded context with individual database migrations.
  • The conclusion gives the pros, cons and limitations of each communication example.

What DDD says about communicating between bounded contexts

Just as DDD’s bounded contexts helped us when breaking up our monolith into modules, DDD can also help with mapping the communication between bounded contexts. DDD defines seven ways to map data when passing it between bounded contexts (read this article for an explanation of each type).

You might have thought that the mappings between two bounded contexts should always be isolated, but DDD recognises that isolation comes at the cost of more development time, and possibly slower communications. Therefore, the seven DDD mapping approaches run from tight coupling right up to complete isolation between the two ends of the communication. DDD warns us that using a mapping design that tightly links the two bounded contexts can cause problems when you want to refactor / improve one of the bounded contexts.

NOTE: I highly recommend Eric Evans’ 30-minute talk about bounded contexts at DDD Europe 2020.

Later in this article I show a number of examples which use various DDD mapping approaches.

The tools that .NET provides for communicating between bounded contexts

DDD provides an architectural view of communicating between bounded contexts, but .NET provides the tools to implement the mapping and communication. The fact that the application is a monolith makes the implementation very simple and fast. Here are three approaches:

  • Having two bounded contexts map to the same data in the database.
  • Calling a method in another bounded context using dependency injection (DI).
  • Using a message broker to call a method in another bounded context.

I show examples of all three of these approaches and extract the pros, cons and limitations of each. I also add a fourth example that looks at how you can introduce a modular monolith architecture into existing applications whose design is more like “a ball of mud”.

Example 1: Exchanging data via the database

In a monolith you usually have one database used by all the bounded contexts, and this provides a way to exchange data between bounded contexts. In a modular monolith you want each bounded context to have its own section of the database that it works with, and you can do that with EF Core. The fact that there is one database does allow you to exchange data by sharing tables/columns in the database.

An example of this approach in the BookApp application is that when the Orders bounded context gets a user’s order it only has the book’s SKU (Stock Keeping Unit), but it needs the book’s price, title etc. Now, the BookApp.Books part of the database has that data in its Books SQL table, so the BookApp.Orders bounded context could also map to that table. But this tightly links the BookApp.Books and BookApp.Orders bounded contexts.

One way to reduce the tight linking is to have the …Orders context only map to the few columns it needs. That way the …Books context could add more columns and relationships without affecting the …Orders bounded context. Another thing you can do is make the …Orders mapping to the Books table read-only by using EF Core’s ToView configuration command. That makes it completely clear that the …Orders bounded context isn’t in charge of this data. The figure below shows this setup.
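As a hedged illustration of that read-only mapping (the BookView class and its properties are assumptions for this sketch, not the actual BookApp code), the …Orders DbContext might be configured like this:

using Microsoft.EntityFrameworkCore;

//Read-only class holding just the Books columns the Orders code needs
public class BookView
{
    public string Sku { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
}

public class OrderDbContext : DbContext
{
    public OrderDbContext(DbContextOptions<OrderDbContext> options)
        : base(options) { }

    public DbSet<BookView> BookViews { get; set; }
    //... the Orders bounded context's own entities go here

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<BookView>(b =>
        {
            b.HasKey(x => x.Sku);
            //ToView maps the class to the existing Books table as read-only,
            //so migrations from this DbContext won't try to create or alter that table
            b.ToView("Books");
        });
    }
}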

Pros and cons of this approach

In terms of DDD’s mapping approaches this is a shared kernel, which makes the two bounded contexts tightly linked. The fact that the …Orders bounded context only accesses a few of the columns in the Books table reduces the amount of linking between the two bounded contexts, because the …Books bounded context could add new columns without needing to change the …Orders code.

The positives of this approach are that it’s easy to set up and it works with NuGet packages (see article 2).

EXTRA: How to set up multiple DbContexts using EF Core

Setting up separate DbContexts for each bounded context does make using EF Core’s migration feature a little more complex. Here are the steps you need to follow to use migrations:

  1. Create a DbContext just containing the classes/table in your bounded context (examples:  BookDbContext and OrderDbContext)
  2. If any of the DbContexts access the same SQL table, you need to be careful to ensure that two separate migrations don’t try to change the same table. Your options are:
    a. Have only one DbContext that maps to that SQL table, with the other DbContexts mapping to that table using EF Core’s ToView configuration command. This is the recommended way because it allows you to select only the columns you need, and you only have read-only access.
    b. Choose one DbContext to handle the EF Core configuration of that SQL table and have the other DbContexts use EF Core’s ExcludeFromMigrations configuration command.
  3. Then create an IDesignTimeDbContextFactory<your DbContext> and include the MigrationsHistoryTable option to set a unique name for the migration history table (example: Orders DesignTimeContextFactory)
  4. When you register each DbContext on startup you need to again add the MigrationsHistoryTable option (example: Startup code in ASP.NET Core)

When you want to create a migration for a DbContext in a bounded context, you need to do that from the project containing the DbContext: see the comments in the BookApp.All OrderDbContext. This approach should be used for each DbContext in a bounded context.
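To illustrate steps 3 and 4 above, here is a hedged sketch of registering the OrderDbContext on startup with its own migration history table (the connection string name and history table name are just examples):

services.AddDbContext<OrderDbContext>(options =>
    options.UseSqlServer(
        Configuration.GetConnectionString("DefaultConnection"),
        //MigrationsHistoryTable gives this DbContext its own migration history table,
        //so its migrations don't clash with the other bounded contexts' migrations
        dbOptions => dbOptions.MigrationsHistoryTable("OrdersMigrationHistory")));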

EXAMPLE 2: Exchanging data via a method call

One big advantage of a monolith is you can call methods, which are quick and don’t have any communication problems. But we have isolated the bounded contexts from each other, so how can we call a method in another bounded context without breaking the isolation rules? The solution is to use interfaces and dependency injection (DI). The interface provides the isolation and DI provides the correct method.

For an example of this approach, let’s say that the address that the Order should be sent to is stored in the Users bounded context. To make this work, and not break the isolation, we do the following (a code sketch follows the figure below):

  1. You place the following items in the BookApp.Common.Domain layer, because only Common layers can be accessed by multiple bounded contexts (see the rules about the Common layers I defined in part 1 of this series):
    a. The interface IUserAddress, which defines the service that BookApp.Orders can call to obtain an Address class for a specific UserId.
    b. The Address class that the service will return.
  2. You create a class called UserAddress in the BookApp.Users bounded context that implements the IUserAddress interface defined in step 1a. You would most likely put that class in the BookApp.Users.Infrastructure layer.
  3. You arrange for the UserAddress / IUserAddress pair to be registered with the DI provider.
  4. Finally, in your BookApp.Orders code you obtain an instance of the UserAddress service via DI and use it to get the Address you need.

The figure below shows this setup.
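Here is a hedged code sketch of those steps (the class members, method names and registration lifetime are illustrative, not the exact BookApp code):

//In BookApp.Common.Domain - the shared interface and class
public interface IUserAddress
{
    Address GetAddressFor(string userId);
}

public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
    public string PostCode { get; set; }
}

//In BookApp.Users.Infrastructure - the implementation
public class UserAddress : IUserAddress
{
    public Address GetAddressFor(string userId)
    {
        //in the real code this would query the Users bounded context's part of the database
        return new Address { Street = "...", City = "...", PostCode = "..." };
    }
}

//Registered with the DI provider on startup, e.g.
//services.AddTransient<IUserAddress, UserAddress>();

//In BookApp.Orders - the service is injected and used to get the delivery address
public class OrderDeliveryService
{
    private readonly IUserAddress _userAddress;

    public OrderDeliveryService(IUserAddress userAddress)
    {
        _userAddress = userAddress;
    }

    public Address GetDeliveryAddress(string userId)
        => _userAddress.GetAddressFor(userId);
}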

Pros and cons of this approach

In terms of DDD’s mapping approaches this is a customer / supplier mapping approach – the customer is BookApp.Orders and the supplier is BookApp.Users. The interface provides good isolation for the service, but sharing the Address class does link the two bounded contexts.

From the development point of view, you have to organise your code in three places, which is a bit more work. Also, this approach isn’t that good when working with bounded contexts that have been turned into NuGet packages.

Overall, this approach provides better isolation than the exchange via the database but takes more effort.

EXAMPLE 3: Exchange data using a request/reply message broker

The last mapping implementation used the .NET DI provider, which meant any interfaces and classes used had to be in a .NET project that both bounded contexts can access. There is another way to automate this, using a request/reply message broker. This allows you to set up mapping links between two bounded contexts while not breaking the isolation rules.

Let’s implement the same feature as in example 2, that is, getting the address for the Order from the Users bounded context. Here are the steps:

  1. Register a request/reply message broker as a singleton with the DI provider.
  2. In the BookApp.Users bounded context, register a get function with the message broker. This function returns an Address class that is defined in the BookApp.Users bounded context.
  3. In the BookApp.Orders bounded context, call an ask method on the message broker. You will receive the address in an Address class defined in the BookApp.Orders bounded context.

The figure below shows this setup.

The message broker allows you to register a getter function (left side of the figure) that can be called by the AskFor method (right side of the figure). This is equivalent to calling a method as in example 2 but doesn’t need any external interfaces or classes. Also, in this example there are two classes called Address, and the message broker can map between the two Address classes, thus removing the need for the extra common layer we needed in example 2.
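Here is a toy sketch of the request/reply idea behind the steps above (this is not my actual broker code; the class and method names are illustrative). The broker copies the supplier’s result onto the consumer’s own class via serialization, so neither bounded context has to share a type:

using System;
using System.Collections.Generic;
using System.Text.Json;

public class RequestReplyBroker
{
    private readonly Dictionary<string, Func<string, object>> _getters = new();

    //Called by the supplier bounded context (e.g. BookApp.Users) at startup
    public void RegisterGetter<TOut>(string key, Func<string, TOut> getter)
        => _getters[key] = input => getter(input);

    //Called by the consumer bounded context (e.g. BookApp.Orders)
    public TAsked AskFor<TAsked>(string key, string input)
    {
        var result = _getters[key](input);
        //serialize/deserialize copies the data onto the consumer's own class
        return JsonSerializer.Deserialize<TAsked>(JsonSerializer.Serialize(result));
    }
}

//Usage (illustrative):
//  broker.RegisterGetter("GetUserAddress", userId => usersService.GetAddressFor(userId));
//  var address = broker.AskFor<Orders.Address>("GetUserAddress", userId);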

Initially I couldn’t find a request/reply message broker, so I built a simple version you can find here, but further research showed RabbitMQ has a remote procedure call feature that does this (my simple version is easier to understand, though).

NOTE: Microservice architectures normally use a publish/listen message broker, where apps register to be informed if certain data changes. This is done to improve performance, by sending updates of data needed by a Microservice app so that it can cache the data locally. However, in a monolith architecture you can access data anywhere in the app in nanoseconds (just an in-memory dictionary lookup and a function call), so a request/reply message broker is more efficient.

Pros and cons of this approach

This is another customer / supplier mapping approach, as used in example 2, but with more isolation due to the request/reply message broker being able to copy data from one type to another.

From a development point of view this is easier than example 2 which called a method using DI, because you don’t have to add the BookApp.Common.Domain layer to share the interface and class. The other advantage of this message broker approach is it works with bounded contexts turned into NuGet packages.

There aren’t any downsides other than learning how to use a request/reply message broker.

Overall, I think this approach is quick to implement, provides excellent isolation and works with bounded contexts turned into NuGet packages.

EXAMPLE 4: Adding new modular code to a legacy application

There are lots of existing applications out there, some of which don’t have a good design and have fallen into “a ball of mud” – we call these legacy applications. So, the challenge is to apply a modular monolith architecture to an existing application without the “ball of mud” code “infecting” your new code.

One solution I defined uses three parts to add new code to legacy applications.

  • Build your new feature in a separate solution: This gives you a much better chance of building your code using modern approaches such as a modular monolith architecture.
  • Install your new feature via a NuGet package: Packaging your new feature into a NuGet package makes it much easier to add your new code to the existing application.
  • Use DDD’s Anticorruption Layer (ACL) mapping approach: The ACL mapping approach builds adapters between the existing application’s code and concepts and the new code you have written to add a new feature.

The figure below shows how this might work.

You already know about separate solutions and NuGet packaging from part 2 of this series, so I will concentrate on the ACL and how it works.

DDD’s ACL mapping approach is designed for interfacing to a legacy system. It assumes that a) the legacy system’s code cannot be easily changed, and b) the legacy system has a suboptimal design. The ACL mapping approach hides the more difficult parts of the legacy system by using the adapter pattern, which allows you to write your new code against a “cleaned up” interface.

Of all the DDD mapping patterns, the ACL mapping provides a very high level of separation between the legacy system and your new code. The downside is that, of all the DDD mapping approaches, the ACL takes the most development effort to create.

While I have described how to add new code to a legacy system, I have to say that it isn’t a simple job. My experience is that fully understanding a legacy system’s code is far harder and takes longer than writing the ACL layer code.

The understanding and unscrambling of a legacy system is a big topic and I’m not going to cover it here, but you might like to look at a few of the links I have listed below:

NOTE: You might also be interested in the strangler pattern if working with existing applications. This pattern provides a way to progressively change your old code to a more modern code design.

Pros and cons of this approach

DDD’s ACL mapping approach provides excellent separation between the two parts but takes a lot of development effort to build. Therefore, you should only use it when there is no other way to achieve the required separation. However, it’s not building the ACL mapping that is the hard part; the hard part is working out how the legacy system works so that you can add your new feature.

Conclusion

I have described four examples of communicating between DDD bounded contexts. From a DDD point of view I didn’t cover all of DDD’s bounded context mapping approaches in this article, but I did cover the four main ways to implement communication between bounded contexts in .NET monolith applications.

As you have seen, it’s a balance between how much the communication ties the design of two bounded contexts together and the amount of development effort it takes to write the communication link. The list below provides a summary of the pros and cons of each approach covered in this article.

  • Exchanging data via the database (example 1)
    • Pro: fairly easy to implement
    • Con: some linking between bounded contexts (DDD shared kernel)
    • Limitations: none
  • Exchanging data via a method call (example 2)
    • Pro: good performance, easy to implement
    • Cons: Needs extra common layer to share interfaces/classes
    • Limitations: Doesn’t work with bounded context NuGet packages
  • Exchange data using a request/reply message broker (example 3)
    • Pro: good performance, easy to implement
    • Cons: You need a request/reply message broker
    • Limitations: none
  • Adding new modular code to a legacy application (example 4)
    • Pro: allows you to write new code using a modern design
    • Cons: A LOT of work
    • Limitations: none

NOTE: The “exchanging data via the database” example also contains extra information (link!!!) on how to create individual EF Core DbContexts for each bounded context that links to the same database.

I hope this article, plus the others in the series, has been useful to you.

Happy coding!

Evolving modular monoliths: 2. Breaking up your app into multiple solutions

Last Updated: May 10, 2021 | Created: May 10, 2021

This is the second article in a series about building Microsoft .NET applications using a modular monolith architecture. This article covers a way to extract parts of your application into separate solutions which you turn into NuGet packages that are installed in your main application. Each solution is physically separated from each other in a similar way to Microservice architecture, but without the performance and communication failure modes that Microservices can have.

My view is that the Microservice architecture is great for applications that need large development teams and/or have to handle high levels of demand, like Netflix. But for smaller applications the Microservice architecture can be overkill. However, I do like the Microservice idea of having separate solutions, because that makes the application easier to understand, refactor and manage, so I have defined a way to extract parts of an application into their own solutions. The result is an application that is easier to build because it’s using the simpler monolith architecture, but has the benefits of having multiple separate solutions that are combined by using NuGet.

NOTE: If you are planning to build a Microservice architecture, then the approach described here is also a great starting point because it already creates separate solutions. Martin Fowler also suggests that starting with a monolith approach is the best way to build a Microservice application – see his article called “Monolith First”.

This article is part of the Evolving Modular Monoliths series, the articles are:

TL;DR – summary

  • One of the best ways to structure your application is to break the business needs into what DDD calls bounded contexts (see this section in the first article for more on bounded contexts).
  • In a modular monolith some bounded contexts can be large and/or complex. In these cases, giving a large/complex bounded context its own solution makes the code easier to understand, navigate, and manage.
  • I have created a dotnet tool called MultiProjPack that automates the creation of the NuGet package for projects that follow the naming convention described in the first article.
  • I describe a fast local build/test/debug cycle when adding/changing code in a separate solution. This uses a local NuGet package source and some features in the MultiProjPack tool.
  • I describe the options for getting source code information while debugging a NuGet package inside your main application.
  • I finish with a section on building the composite application for deployment to production and suggest ways to store your private NuGet packages.

Breaking up your app into multiple solutions

The idea is to get the separation that a Microservice pattern provides while keeping the performance and reliability of direct method or database access. To do this we extract a DDD bounded context (see the section on bounded contexts in the first article) section of your code into its own solution and then turn it into a NuGet package to install in your main application.

Turning a DDD bounded context into a separate solution/NuGet package requires a bit more work, so I suggest you only apply this approach to large and/or complex parts of your application. The benefits are:

  • The code is easier to understand/navigate because it’s isolated into its own solution.
  • If the development team is large, it’s easier for one team to work on an isolated solution and “publish” a NuGet package for other teams to use.
  • On older applications this approach means you can build new features using a more modern design without the structure of the old application hindering your new design.

I have created two example repos that show this approach in action. I have created another version of the e-commerce web app that sells books, called BookApp. The main application is called BookApp.Main (see https://github.com/JonPSmith/BookApp.Main for the code example), which contains the FrontEnd and the Order processing parts (BookApp.Orders…). The part of the application that deals with the querying and updating of the books in the database (referred to as BookApp.Books) is large enough to warrant turning into its own solution (see https://github.com/JonPSmith/BookApp.Books for the code example). The BookApp.Books solution is turned into a NuGet package and installed in BookApp.Main. The figure below shows this in action.

This makes it much easier for a development team to work on a part of the application in a separate repo/solution. Also, the team can “publish” a new version via a private NuGet server, with a fallback to the old version provided by simply changing back to the previous NuGet package.

NOTE: One limitation around using a NuGet package is that all your projects in the solution must target the same framework, e.g. net5.0. That’s because NuGet is designed to handle packages that work with multiple frameworks; for instance, the Newtonsoft.Json package can work with seven types of frameworks (see its dependencies). But in this usage you want all of your projects to be the same, otherwise it won’t work.

Communication between the front-end and a NuGet package

Turning a bounded context into a NuGet package gives you the benefit of separation that a Microservice has. But while a Microservice architecture has one API front-end per Microservice, in our modular monolith design there is a direct code link, via the NuGet package, to the BookApp.Books code.

This changes the communication channel we use between each bounded context and the user. In a Microservice design the front-end typically accesses a service via an HTTP API, but in our modular monolith design it accesses a service via dependency injection to call a method. So, in a modular monolith the API is defined mainly by interfaces and dependency injection, plus a few key class definitions. And because we are using a monolith architecture there is one front-end in BookApp, which is an ASP.NET Core project.

NOTE: There are other communication paths between two bounded contexts, which I cover in part 3.

The API is mainly defined by the ServiceLayer, which contains many of the services linked to a bounded context (read this section about the ServiceLayer in one of my articles).  For instance, in my example BookApp.Main I want to display a list of books with sorting, filtering and paging. This uses a service referred to by the interface IListBooksService in the …ServiceLayer.GoodLinq project. The ServiceLayer will also contain some classes too, such as the BookListDto class to provide the data to display the books, and various other classes, interfaces, constants etc.

NOTE: For Separation of Concerns (SoC) reasons I also recommend adding a small project specifically to handle any startup code, e.g., registering services with the dependency injection provider. Also, for an application that loads data from a configuration file (e.g. appsettings.json), I pass in the IConfiguration interface too in case the code needs it – see BookApp.Books.AppSetup as an example.
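As a hedged sketch of the kind of startup code such a project might hold (the extension method name and the ListBooksService implementation class are assumptions; only the IListBooksService interface is mentioned above):

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public static class BooksStartupExtensions
{
    public static IServiceCollection RegisterBooksServices(
        this IServiceCollection services, IConfiguration configuration)
    {
        //register the bounded context's services against their interfaces
        services.AddTransient<IListBooksService, ListBooksService>();
        //...other registrations, which can use the configuration if they need appsettings values
        return services;
    }
}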

How to create a NuGet packages

Now that we have defined how the bounded context will link to the main application, we need to create a NuGet package. It turns out that creating a NuGet package containing multiple .NET projects is doable, but it takes a lot of work to build the .nuspec file properly.

To automate the creation of the NuGet package I built a dotnet tool called MultiProjPack (repo found here) that builds the .nuspec file by scanning your solution for certain projects/namespaces and builds a NuGet package for you. This tool also contains features to make it much quicker to build and test your NuGet package on your development computer by using a local NuGet package source, which I describe later.

The MultiProjPack tool (which is itself a NuGet package) can be installed or updated on your computer using the following command line commands (the first installs it, the second updates it):

dotnet tool install JonPSmith.MultiProjPack --global

dotnet tool update JonPSmith.MultiProjPack --global

Once you have installed MultiProjPack you can call this dotnet tool from the command line in one of the projects in your solution (you can use any project, but I normally use the BookApp.Books.AppSetup project). Here is how you call it, selecting one of its three options:

MultiProjPack <D|R|U>

The three options do the following

  • D(ebug): This creates a NuGet package using the Debug configuration of the code.
  • R(elease): This creates a NuGet package using the Release configuration of the code.
  • U(pdate): This builds a NuGet package using the Debug configuration, but also updates the .dll’s in the NuGet cache (I explain this in this section).

The tool relies on an xml file called MultiProjPack.xml in the folder you run the tool from. This file defines the NuGet data (name, version and so on) and optional tool settings. Here is a typical setup, but there are a lot more settings if you need them (see this example file containing all the settings and the README file for more information).

<?xml version="1.0" encoding="utf-8"?>
<allsettings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<!-- this contains the typical information you should have in your settings -->
  <metadata>
    <id>BookApp.Books</id>
    <version>1.0.0-preview001</version>
    <authors>you must give a list of author(s)</authors>
    <description>you must provide a description of the NuGet</description>
    <releaseNotes>optional: what is changed in this release?</releaseNotes>
  </metadata>
  <toolSettings>
    <!-- This is used to find projects with names starting with this. If null, then uses NuGet id -->
    <NamespacePrefix></NamespacePrefix>
    <!-- excludes named projects (comma separated), e.g. "Test" would exclude a project starting with "BookApp.Books.Test" -->
    <ExcludeProjects>Test</ExcludeProjects>
    <!-- worth filling in with your local NuGet Package Source folder. See docs about using {USERPROFILE} in string -->
    <CopyNuGetTo>{USERPROFILE}\LocalNuGet</CopyNuGetTo> 
  </toolSettings>
</allsettings>

NOTE: Don’t worry, the tool has the command --CreateSettings that will create the MultiProjPack.xml with the typical settings for you to edit.

When you run the tool it:

  1. Scans all the folders for .NET projects whose names start with the toolSettings.NamespacePrefix value, e.g. BookApp.Books, and creates a .nuspec file containing all the .NET projects it found.
  2. It then calls dotnet pack to create the NuGet package using the .nuspec file created in step 1.
  3. If you have set up the correct <toolSettings> it will update your local NuGet package source. I explain this later.
  4. If you ran it in U(pdate) mode it will replace the data in the NuGet cache. I explain this later.

But before I describe steps 3 and 4 of running the MultiProjPack tool, let’s consider how you might debug an application that uses a NuGet package.

Debugging an application that uses a modular monolith NuGet package

There is a problem when developing an application using the modular monolith NuGet package approach, and that is the time it takes to upload a NuGet package to nuget.org (or any other web-based NuGet server). That upload can take a few minutes, which doesn’t make for a good development process. But be of good cheer – I have solved this problem, and it takes less than a second to upload a NuGet package when testing locally. But before I describe the solution let’s look at the development process first.

For this example, you want to add a new feature, say adding a wish list where users can tell other people which books they would like for their birthday. Your application is called BookApp.Main, which contains the ASP.NET Core FrontEnd, and you decide to add the wish list feature to the BookApp.Books solution, which is a separate solution installed in the BookApp.Main application via a NuGet package.

The development process would require five things:

  1. Add new pages and commands in the FrontEnd code in your BookApp.Main application.
  2. Add new services in the part of the application that you have in separate BookApp.Books solution and write some unit tests to make sure they work.
  3. Then create a new version of the NuGet package from the BookApp.Books solution.
  4. Then you install the new version BookApp.Books NuGet package in the BookApp.Main application.
  5. Then you can try out your new feature by running the BookApp.Main locally with dummy data.

If you are a genius, you might be able to write both parts so that it all works first time, but most people like me would need to go around these five steps many times. And if you had to wait a few minutes for each NuGet package upload, it would be a very painful development cycle. Thankfully there are ways around this – the first is using a local NuGet server on your development computer.

1. Using a local NuGet server to reduce the package upload time to milliseconds

It turns out you can define a folder on your development computer (Windows, Mac or Linux) as a NuGet package source. This means the upload is now just a copy of your new NuGet package into a folder on your development computer, so it’s very fast. That means your development testing will also be fast.

To add a local NuGet package source via Visual Studio you need to get to its NuGet package sources page. The command to get to that page is Tools > NuGet Package Manager > Package Manager Settings > Package Sources. Then you can add a new package source that is linked to a folder on your computer. Below is a screenshot of my NuGet package sources page where I have added a local folder as a possible NuGet package source (see selected source).

Once you have done that, any NuGet packages in the folder you have defined will show up in the NuGet Package Manager display if the local NuGet package source (or All) is selected. See the example below where I have selected my Local NuGet in the package source, which is highlighted in yellow.

NOTE: If you are using VSCode and dotnet commands then see this article about setting up a folder on your local computer.
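For example, the following command (the folder path and source name are just examples) registers a local folder as a NuGet package source from the command line:

dotnet nuget add source "C:\Users\YourName\LocalNuGet" --name LocalNuGet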

To make this even easier/quicker, the MultiProjPack tool has a feature that will copy the newly created NuGet package directly into your local folder. To turn on that feature you need to fill in the <toolSettings>.<CopyNuGetTo> setting with your local NuGet package folder. The CopyNuGetTo setting supports the string {USERPROFILE} to get your user’s account folder, which means it works for any developer. For instance, the string {USERPROFILE}\LocalNuGet would become C:\Users\JonPSmith\LocalNuGet on my computer. That means your MultiProjPack.xml doesn’t have to change for each developer on the team.

The end result of setting the CopyNuGetTo path is that the command MultiProjPack D (or R) makes the new NuGet package appear in your local NuGet package source, and you can immediately add or update your main application with that NuGet package.

2. Directly updating the NuGet package cache

The local NuGet package source is great, but you need to increment the NuGet version for each package, as once a package is in the NuGet package cache you can’t override it. I found that a pain, so I looked for a better solution. I found one by accessing the NuGet package cache directly.

The NuGet package cache speeds up the performance of builds using NuGet packages. When you install a new NuGet package it adds an unpacked version of the NuGet package in the NuGet package cache. The unpacked version of the NuGet package contains all the files in the NuGet package, e.g. the code (.dll), symbol files (.pdb), documentation files (.xml), and these are copied into your application on certain actions, e.g. Build > Rebuild Solution, Restore NuGet packages, etc.
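If you want to see where the NuGet package cache (NuGet calls it the global-packages folder) lives on your computer, the following command will list it:

dotnet nuget locals global-packages --list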

I take advantage of this feature to speed up the whole write code, install, test cycle (steps 3 to 5 of the development process already described) by directly updating the NuGet package cache at the same time, which is triggered by the MultiProjPack U(pdate) command. This helps when you have created and installed a NuGet package with new code, but then find it has a bug. Instead of you having to create a new version of the NuGet package, the U(pdate) command updates the existing NuGet package, both in the local NuGet package source and in the NuGet package cache.

The end result is, after running the MultiProjPack U(pdate) command you use Visual Studio’s command Build > Rebuild Solution, which will rebuild your main application using the files in the NuGet package cache. That cuts out two manual steps (changing the NuGet package version and updating the NuGet package in the main application), but there is a limitation.

The limitation is that if you add, remove or update a .csproj file (say, by adding a NuGet package it needs), then you can’t use the U(pdate) command. For these cases you have to create a new NuGet package with a higher version number. The MultiProjPack U(pdate) command is useful when you are testing or fixing a bug and need to go around the cycle multiple times.

NOTE: For users that don’t like the idea of changing the cache, another way to go is to add a new NuGet version using the -m: option, e.g. MultiProjPack D -m: version=1.0.0.1-preview002. This overrides the version in the MultiProjPack.xml file, thus saving you from having to edit the MultiProjPack.xml every time, but then you need to manually update the NuGet package via the NuGet Package Manager display.

These two features reduce the minutes of uploading a NuGet package to nuget.org to a single command that takes seconds to update a NuGet package in your main application.

Tips on how to debug your NuGet package

As part of the build/test/debug cycle you will install or update your NuGet package in the main application to check it works. If something fails in your main application you might want to debug code that is in your NuGet package. Here are some tips on how to do this.

The first thing you need to do is turn off the “Just My Code” setting in the Debugger options (see figure below). This allows you to step into the NuGet package, set breakpoints, see the local data and so on inside your NuGet package.

The second thing you need to know is how Visual Studio accesses the code introduced by your NuGet package. There are various ways to add the source code, names of local variables, stack traces, and so on to your NuGet package. The two ways to add source code information to a NuGet package for debugging are: embedding the source code in the .dll file or using symbol files.

1. Embedding the source code in the .dll file

You can embed the source code in the code (.dll) file, which makes it easy for Visual Studio to find the source code related to the .dll file. This works well, but requires you to add some extra xml commands to the .NET project’s .csproj files.

The downside of this approach is that it makes the .dll file bigger, so I suggest you only embed the source code in the .dll file for Debug configuration builds. To do this you need to add the following xml commands to the .csproj file of each .NET project where you want this feature.

  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|AnyCPU'">
    <DebugType>embedded</DebugType>
    <EmbedAllSources>true</EmbedAllSources>
  </PropertyGroup>

The next approach, using symbol files, is more automatic, but has some limitations.

2. Using the symbols files

By default, a .NET project will create a code (.dll) file and a symbols (.pdb) file in both the bin\{framework} and obj\{framework} folders when a project is built, and the MultiProjPack tool will add them to the NuGet package. Visual Studio can use these when debugging, but if you change the code in a project in the NuGet package Visual Studio will stop you debugging that project.

Visual Studio is very clever about finding debug information. If you open Visual Studio’s Debugger > Windows > Modules window you will see all the .dll files in your application, and some of them will have a symbol status of loaded. By looking at this I can see that Visual Studio can find the folder holding the solution for the NuGet package, and it can access the symbols (.pdb) file of each project (I don’t know how Visual Studio does that, but it does).

This all sounds perfect, but as I said, if you change the code in a project in your NuGet package, then Visual Studio will stop linking the symbols file of the changed code. That means you can’t debug the changed project anymore. The good news is the MultiProjPack U(pdate) command does work, and you can still debug the project via its symbols.

NOTE: There are different formats for symbols files, which is a bit confusing. Also symbols files and NuGet symbol servers aren’t supported everywhere. The MultiProjPack tool will copy symbols (.pdb) files into the NuGet package if they are found in the project. It can also create a NuGet symbol package if you need it (see this setting).

What to do for production?

MultiProjPack’s D(ebug) and U(pdate) commands create NuGet packages using the Debug configuration of your code, and typically live in the local NuGet project source. But what should you do when you want to release your code for production?

Firstly, you should build a NuGet package using the Release version of the code with the R(elease) command, i.e., MultiProjPack R. The created NuGet package will be in a folder called .nupkg in the folder where you ran the MultiProjPack tool. Now you need to make this release NuGet package available so others can use it. How you do that depends on how you work and whether you want your NuGet packages to be private or not.

By default, the git ignore file will ignore NuGet packages, so if you create a package you need to store your Release packages somewhere other than your Git repo. If your NuGet package can be public, then it’s easy – use https://www.nuget.org/. For holding NuGet packages privately, here are some possibilities:

  • GitHub: GitHub allows you to publish and consume NuGet packages and they can be private, even for free accounts. See the GitHub NuGet docs and this article by Bruno Hildenbrand.
  • Azure DevOps: Azure provides a way to publish and consume NuGet packages within a DevOps pipeline. See the Azure NuGet docs and this article by Greg Margol.
  • MyGet: MyGet has been around for years and is used by a host of companies, but it costs.
  • BaGet: Scott Hanselman’s article recommends BaGet, but at the time of writing it doesn’t have a private feed version.

Conclusion

In my quest to modularize a large monolith architecture I wanted to mimic the Microservice pattern approach of breaking the business problem into separate applications that communicate with each other. To do this I used NuGet packages to implement the “separate” solutions part of the Microservice architecture in a monolith application, using DDD bounded contexts to guide how to break my application into discrete parts.

It does take a bit more work to use NuGet packages, but as a working developer I have strived to make this approach as automated and fast as possible. That’s why I built the MultiProjPack tool, which is a key part of this approach: it both creates the correct NuGet package and has features to make your build/test/debug cycle as quick as possible.

The first article described two ways to modularize a monolith so that its code is less likely to turn into “a ball of mud”. This article takes this to another level of separation by allowing you to build parts of your application as individual solutions and combine them into the main application using NuGet. The missing part is how you communicate between bounded contexts, including bounded contexts built as NuGet packages, which I cover in the third article.

Evolving modular monoliths: 1. An architecture for .NET

Last Updated: July 6, 2021 | Created: May 3, 2021

This is the first article in a series about building a .NET application using a modular monolith architecture. The aim of this architecture is to keep the simplicity of a monolith design while providing a better structure so that as your application grows it doesn’t turn into what is known as “a big ball of mud” (see the original article which describes a big ball of mud as “haphazardly structured, sprawling, sloppy, duct-tape-and-baling-wire, spaghetti-code jungle”).

The designs I describe will create a clean separation between the main parts of your monolith application by using a modular monolith architecture and Domain-Driven Design’s (DDD) bounded context approach. These designs also provide a way to add a modular monolith section to an old application whose architecture isn’t easy to work with (see part 3).

NOTE: This article doesn’t compare a Monolith against a Microservice or Serverless architecture (read this article for a quick review of the three architectures). The general view is that a monolith architecture is the simplest to start with but can easily become “a ball of mud”. These articles are about improving the basic monolith design so that it keeps a good design for longer.

In Evolving Modular Monoliths series, the articles are:

NOTE: To support this series I created a demo ASP.NET Core e-commerce application that sells books using a modular monolith architecture. For this article the code in the https://github.com/JonPSmith/BookApp.All repo provides an example of the modularized bounded context design described in this article.

Who should read these articles?

These articles are aimed at all .NET software developers and software architects. They give an overview of a modular monolith architecture approach, but later articles also look at team collaboration, and the nitty-gritty of checking, testing and deploying your modular monolith application.

I assume the reader is already a .NET software developer and is comfortable with writing C# code and understands .NET terms, such as .NET projects (also known as assemblies), NuGet packages and so on.

What software principles do we need to avoid “a big ball of mud”?

Certain software principles and architectures can help organise the code in your application. The more complex the application, the more you need good software principles to guide you towards code that is understandable, refactorable and robust. Here are some software principles that I will apply in this series:

  • Encapsulation: Isolating the code for each business feature from the others, so that a change to the code for one business feature doesn’t break code for another business feature.
  • Separation of Concerns (SoC): That is, this design provides high cohesion (links only to relevant code) and low dependency (isn’t affected by other parts of the code).
  • Organisation: Makes it easy to find the various parts of code for a specific feature when you want to change things.
  • Collaboration: Allows multiple software developers to work on an application at the same time.
  • Development effort/reward: We want an architecture that balances speed of development against building an application that is easy to enhance and refactor.
  • Testability: Breaking up a feature into different parts often makes it easier to test the code.

Stages in building an ASP.NET Core application

I am going to take you through three levels of modularization: modularization using an n-layered architecture, modularization at the DDD bounded context level, and finally modularization inside a DDD bounded context. Each level builds on the previous level, ending up with a high level of isolation for your code.

1. Basic n-layered application

I start with the well-known approach of breaking up an application into layers, known as an n-layered architecture. The figure below shows the way I break up an application into layers within my modular monolith design.

You might be surprised by the number of layers I use, but they are there to help apply the SoC principle. Later on, I define some architectural rules, enforced by unit tests, that ensure the code implementing a feature has high cohesion and low dependency.

To help you understand what all these layers are for, let’s look at how an order for a book is created in my BookApp application (you can try this feature by downloading the BookApp.All repo and running it on your computer):

  • Front-end: The ASP.NET Core app handles showing the user’s basket and sending their order to the ServiceLayer.
  • ServiceLayer: This contains a class with a method that accepts the order data and sends it to the business layer. If everything is OK, it commits the order to the database and clears the user’s basket.
  • Infrastructure: Not needed for this feature.
  • BizLogic / BizDbAccess: An order is complex, so it has its own business logic code which is isolated from the database via code held in the BizDbAccess layer (this approach is described here).
  • Persistence: This provides the EF Core code to map the Orders classes to the database.
  • Domain: This holds the Order classes mapped to the database.

The figure above shows the seven layers as an n-layer architecture, which most developers are aware of. But in fact, I am using the clean architecture layering approach, which is depicted as a series of rings, as shown in the figure to the right. The clean architecture is also known as the Hexagonal Architecture and the Onion Architecture.

I use the clean architecture layering approach because I like its rules, such as inner layers can’t access outer layers. But any good n-layering approach would work.

I really like the clean architecture’s “inner layers can’t access outer layers” rule, but I have altered some of the clean architecture’s other rules. They are:

  • I added a Persistence layer deep inside the rings to accommodate EF Core. See this section of a previous article which explains why.
  • The clean architecture would place all the interfaces in the inner Domain layer, but I place the interfaces in the same project where the service is defined, for two reasons: a) to me, keeping the interface with the service is a better SoC design, and b) it also means an inner layer can’t call an outer service via DI because it can’t access the interface.

There are two general layering approaches that still apply when we move to a modular monolith architecture. They are:

  • I place as little business code as possible in the Front-end layer; instead the Front-end calls services registered with the dependency injection system to display or input data. I do this because a) the Front-end should only manage the output / input of data, and b) it’s easier to test these services as a method call than to test ASP.NET Core APIs or Pages, which is much harder and slower.
  • The ServiceLayer has a very important role in my applications, as it acts as an adapter between the business classes and the user display/input classes. See the section called “the importance of the Service Layer in the Clean Architecture layers” in a previous article to find out why.

The downsides of only using an n-layered architecture

I have used the n-layered architecture for many years, and it works, but the problem is that it only applies the SoC principle at the layer level, not within a layer. This has two bad effects on the structure of your application.

  • Layers can get very big, especially the ServiceLayer, making it hard to find and change anything.
  • When you do find the code for a business feature, it’s not obvious whether other classes link to this code.

The question is, is there an architecture that would help you to follow the SoC and encapsulation principles? Once I tried a modular monolith architecture (with DDD and clean architecture) I found the whole experience was significantly better than the n-layered architecture on its own. That’s because I knew where the code for a certain business feature was by the name of the .NET project, and I knew that those projects only held code relevant to the business feature I was looking at. See a previous article called “My experience of using modular monolith and DDD architectures” where I reflect on using a modular monolith on a project that was running late.

2. Modularize your code using DDD’s bounded context approach

So, rather than having all your code in an n-layered architecture, we want to isolate the code for each feature. One way is to break up your application into specific business groups, and DDD provides an approach called bounded contexts (see this article by Martin Fowler for an overview of bounded contexts).

Bounded contexts are found by looking at the business needs in your application, but identifying bounded contexts can be hard (see the video The Art of Discovering Bounded Contexts by Nick Tune for some tips). Personally, I define some bounded contexts early on, but I am happy to change a bounded context’s boundary walls and names as the project progresses and I gain more understanding of the business rules.

NOTE: I use the name bounded context throughout this series, but there are lots of different names around DDD’s bounded context concept, such as domain, subdomain, core domain etc. In many places I should use the DDD term domain, but that clashes with the clean architecture’s usage of the term domain, so I use the term bounded context wherever DDD bounded contexts or DDD domains are used. If you want some more information on all of the DDD terms around the bounded context, try this short article by Nick Tune.

DDD’s bounded contexts work at a large scale in your business; for example, in my BookApp, as well as displaying books the user can also order books. From a DDD point of view the handling of books and the handling of users’ orders are in different bounded contexts: BookApp.Books and BookApp.Orders – see the figure below.

Each layer in each bounded context has a .NET project containing the code for that layer, with the BookApp.Books .NET projects separate from the BookApp.Orders .NET projects. So, in the figure the .NET projects in the Books bounded context are completely separate from the .NET projects in the Orders bounded context, which means each bounded context is isolated from the other.

NOTE: Another way to keep the bounded contexts isolated is to build a separate EF Core DbContext for each bounded context, with only the tables that the bounded context needs to access. I cover how to do this in part 3 of this series.
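
As a rough illustration, here is a minimal sketch of what an Orders-only DbContext could look like (the Order class and its properties are hypothetical, not the BookApp’s actual code):

using System;
using Microsoft.EntityFrameworkCore;

public class Order
{
    public int OrderId { get; set; }
    public DateTime DateOrderedUtc { get; set; }
}

public class OrderDbContext : DbContext
{
    public OrderDbContext(DbContextOptions<OrderDbContext> options)
        : base(options) { }

    // Only the tables the Orders bounded context needs - no Books, Authors, Tags, etc.
    public DbSet<Order> Orders { get; set; }
}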

Each layer is a .NET project/namespace and must have a unique name, and we want a naming convention that makes it easy for the developer to find the code they want to work on: the figure below shows the naming convention that I found best described the application’s parts.

Bounded contexts also need to share data with each other, but in a way that doesn’t compromise the isolation of each bounded context. There are known design patterns for sharing data between bounded contexts, and I cover these in part 3.

NOTE: I recommend Kamil Grzybek’s excellent series on the Modular Monolith. He uses the same approach as I have described in this section. Kamil’s articles give more detail on the architectural thinking behind this design, while my series introduces some extra ways to modularize and share your code.

3. Modularize inside a bounded context

The problem with modularizing only at the bounded context level is that many bounded contexts contain a lot of code. I can think of client projects I have worked on where a single bounded context contained over a year’s worth of developer effort. That could mean a single bounded context becomes “a big ball of mud” all by itself. For this reason, I have developed a way to modularize within a single bounded context.

NOTE: There is a fully working BookApp built using the modularized bounded context approach at https://github.com/JonPSmith/BookApp.All. It contains 23 projects and provides a small but complex application as an example of how the modularized bounded context approach can be applied to a .NET application.

Modularizing at the bounded context level is driven by the high-level business design of your application, while modularization inside a bounded context is done by grouping all the code for a specific feature and giving it its own .NET project(s). Taking an example from the Book App used in my book: I created lots of different ways to query the database, so I had one .NET project for each query type in the ServiceLayer. These then link down to the lower layers as shown in the figure below, although I don’t show all the references (for instance nearly every outer layer links to the Domain layer) as it would make the figure hard to understand.

NOTE: As you can see, there are lots of .NET projects in the ServiceLayer, a few in the Infrastructure layer, none in the BizLogic/BizDbAccess layers, and often only one .NET project in the Persistence and Domain layers. Typically, the ServiceLayer has the most projects, with the Persistence and Domain layers containing only one project each.

Building a modularized bounded context looks like a lot of work, but for me it was a very natural, and positive, change from what I did when using an n-layer architecture. Previously, when using an n-layer architecture, I grouped my code into folders; with the modular monolith approach the classes etc. in each folder are placed in a .NET project instead.

These .NET projects/namespaces must have unique names, so I extend the naming convention I showed in the previous bounded context modularization section by adding an extra name on the end of each .NET project/namespace name when needed, as shown below.

The rules for how the .NET projects in a modularized bounded context can reference each other are pretty simple, but powerful:

  • A .NET project can only reference other .NET projects within its bounded context (see part 3 for how data can be exchanged between bounded contexts).
  • A .NET project in an outer layer can only reference .NET projects in inner layers.
  • A .NET project can access another .NET project in the same layer, but only if that project’s name contains the word “Common”. This keeps your code DRY (i.e., no duplicate code) while making it very clear that any .NET project containing “Common” in its name affects multiple features.

NOTE: To ensure these rules are adhered to, I wrote some unit test code that checks your application follows these three rules – see this unit test class in the BookApp.All repo.
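
The sketch below shows one way such a test could be written. It is not the BookApp.All test class: it assumes the csproj files live under a src folder, that every project follows the BookApp naming convention described above, and it only covers the layer and “Common” rules.

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using Xunit;

public class ProjectReferenceRulesTests
{
    // Layers ordered from inner (0) to outer; an outer project may only reference inner ones
    private static readonly string[] LayerOrder =
        { "Domain", "Persistence", "Infrastructure", "BizDbAccess", "BizLogic", "ServiceLayer", "UI" };

    [Fact]
    public void ProjectsOnlyReferenceInnerLayersOrSameLayerCommon()
    {
        var srcDir = Path.Combine(AppContext.BaseDirectory, "..", "..", "..", "..", "src"); // hypothetical path
        foreach (var csprojPath in Directory.GetFiles(srcDir, "*.csproj", SearchOption.AllDirectories))
        {
            var projectName = Path.GetFileNameWithoutExtension(csprojPath);
            var projectLayer = GetLayerIndex(projectName);

            // Read the ProjectReference entries out of the csproj file
            var referencedNames = XDocument.Load(csprojPath)
                .Descendants("ProjectReference")
                .Select(r => ((string)r.Attribute("Include")).Replace('\\', '/'))
                .Select(Path.GetFileNameWithoutExtension);

            foreach (var refName in referencedNames)
            {
                var refLayer = GetLayerIndex(refName);
                var sameLayerCommon = refLayer == projectLayer && refName.Contains("Common");
                Assert.True(refLayer < projectLayer || sameLayerCommon,
                    $"{projectName} should not reference {refName}");
            }
        }
    }

    private static int GetLayerIndex(string projectName) =>
        Array.FindIndex(LayerOrder, layer => projectName.Contains("." + layer));
}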

The positive effects of using this modularization approach and its rules are:

  • The code is isolated from the other feature code (using a folder didn’t do that).
  • I can find the code more quickly via the .NET project’s name.
  • I can create unit tests to check that my code is following the modularization rules.

Overall, this modularization approach stops the spaghetti-code-jungle part of “a big ball of mud” because the relationships are now managed by .NET and you can’t get around them easily. In the end, it’s up to the developer to apply the SoC and encapsulation principles, but following this modularization style will help you to write code that is easy to understand and easy to refactor.

Thinking about the downside – what happens with large applications?

I learnt a lot about building .NET applications using my modular monolith modularization approach, but the BookApp is very small compared to the applications I work on for clients. So, I need to consider the downsides of building a large application to ensure this approach will scale, because a really big application could have 1,000 .NET projects.

The first issue to consider is: can the development tools handle an application with, say, 1,000 .NET projects? The recent announcement of the 64-bit Visual Studio 2022, which can handle 1,600 projects, suggests this won’t be a problem. Even Visual Studio 2019 can handle 1,000 .NET projects (according to a report @ErikEJ found), although another person on Twitter said that 300 .NET projects was too much. However, an application with lots of .NET projects could be tiresome to navigate through.

The second issue to consider is: can multiple teams of developers work together on a large application? In my view the bounded context approach is the key to allowing multiple teams to work together, as different teams can work on different bounded contexts. Of course, the teams need to follow the DDD bounded context rules, especially the rules about how bounded contexts communicate with each other, which I cover in part 3 of this series.

The final issue to consider is: how could the modular monolith modularization be applied to an existing application? There are many existing monolith applications, and it would be great if you could add new features using the modular monolith modularization approach. I talk about this in more detail in part 3, but I do see a way to make that work.

An answer to these downsides – break the application into separate packages

While these three downsides could be handled through rules and good team communication, a modular monolith doesn’t have the level of separation that separate solutions provide (as Microservices do). But how can we get that separation when we are dealing with a monolith? My answer is to move any of the larger or more complex bounded contexts into their own solutions, pack each solution into a NuGet package, and then install these NuGet packages into the main application.

This physically separates one or more of your bounded contexts from the main application code while keeping the benefits of the monolith’s quick method/data transfer. Turning a bounded context into a separate solution allows a team to work on that bounded context on its own, with easier navigation and no clashing with other teams’ changes. And for existing applications you can create new features in a separate solution using the modular monolith approach and add these new features via NuGet packages to your existing application.

In part 2 I describe how you can turn a bounded context into a separate solution and package it as a NuGet package that can be installed in the main application, with a special focus on the development cycle so it only takes a few seconds (not the few minutes that nuget.org takes) to create, upload, and install a NuGet package for local testing.

Conclusion

I have introduced you to the modular monolith architecture and then provided two approaches to applying a modular monolith architecture and DDD principles to .NET applications. The first modularizes at DDD’s bounded context level and the second adds extra modularization inside a bounded context.

The question is: will the extra work needed to apply a modular monolith architecture to your application really create an application that is easier to extend over time? My first use of a modular monolith architecture was while writing the book “Entity Framework Core in Action”, and it was very positive. Overall, I think it made me slightly faster than using an n-layered architecture because it was easier to find things. But the real benefit came when I added features for performance tuning and added a CQRS architecture, which required a lot of refactoring and moving of code.

NOTE: I recommend you read the sections “Modular Monolith – what was bad?” and “Modular Monolith – how did it fare under time pressure?” for my review of my first use of a modular monolith architecture.

Since my first use of a modular monolith architecture, I have further refined my modular monolith design to handle large application development. In the second article I add a further level of separation for development teams so that large parts of your application can be worked on in their own solutions. As a software developer myself, I have made sure the development process is quick and reliable, as it’s quite possible I will use this approach on a client’s application.

Please do leave comments on this article. I’m happy to discuss the best ways to implement a modular monolith architecture, or to hear of any experiences people have had using one.

Happy coding!

Five levels of performance tuning for an EF Core query

Last Updated: March 4, 2021 | Created: February 23, 2021

This is a companion article to the EF Core Community Standup called “Performance tuning an EF Core app” where I apply a series of performance enhancements to a demo ASP.NET Core e-commerce book selling site called the Book App. I start with 700 books, then 100,000 books and finally ½ million books.

This article, plus the EF Core Community Standup video, pulls information from chapters 14 to 16 from my book “Entity Framework Core in Action, 2nd edition” and uses code in the associated GitHub repo https://github.com/JonPSmith/EfCoreinAction-SecondEdition.

NOTE: You can download the code and run the application described in this article/video via the https://github.com/JonPSmith/EfCoreinAction-SecondEdition GitHub repo. Select the Part3 branch and run the project called BookApp.UI. The home page of the Book App has information on how to change the Book App’s settings for chapter 15 (four SQL versions) and chapter 16 (Cosmos DB).

Other articles that are relevant to the performance tuning shown in this article

TL;DR – summary

  • The demo e-commerce book selling site displays books with the various sort, filter and paging features that you might expect. One of the hardest queries is to sort the books by their average votes (think Amazon’s star ratings).
  • At 700 books a well-designed LINQ query is all you need.
  • At 100,000 books (and ½ million reviews) LINQ on its own isn’t good enough. I add three new ways to handle the book display, each one improving performance but also taking more development effort.
  • At ½ million books (and 2.7 million reviews) SQL on its own has some serious problems, so I swap to a Command Query Responsibility Segregation (CQRS) architecture, with the read-side using a Cosmos DB database (Cosmos DB is a NoSQL database).
  • The use of Cosmos DB with EF Core highlights
    • How Cosmos DB is different from a relational (SQL) database
    • The limitations in EF Core’s Cosmos DB database provider
  • At the end I give my view of performance gain against development time.

The Book App and its features

The Book App is a demo e-commerce site that sells books. In my book “Entity Framework Core in Action, 2nd edition” I use this Book App as an example of using various EF Core features. It starts out with about 50 books in it, but in Part 3 of the book I spend three chapters on performance tuning and take the number of books up to 100,000 books and then to ½ million books. Here is a screenshot of the Book App running in “Chapter 15” mode, where it shows four different modes of querying a SQL Server database.

The Book App query which I improve has the following Sort, Filter and Page features:

  • Sort: Price, Publication Date, Average votes, and primary key (default)
  • Filter: By Votes (1+, 2+, 3+, 4+), By year published, By tag (defaults to no filter)
  • Paging: Num books shown (default 100) and page num

Note: a book can be soft deleted, which means there is always an extra filter on the books shown.

The book part of the database (the part of the database that handles orders isn’t shown) looks like this.

First level of performance tuning – Good LINQ

One way to load a Book with its relationships is by using Includes (see code below)

var books = context.Books
    .Include(book => book.AuthorsLink
        .OrderBy(bookAuthor => bookAuthor.Order)) 
            .ThenInclude(bookAuthor => bookAuthor.Author)
    .Include(book => book.Reviews)
    .Include(book => book.Tags)
    .ToList();

But that isn’t the best way to load books if you want good performance. That’s because a) you are loading a lot of data that you don’t need and b) you would need to do the sorting and filtering in software, which is slow. So here are my five rules for building fast, read-only queries.

  1. Don’t load data you don’t need, e.g. use the Select method to pick out only what is needed.
    See lines 18 to 24 of my MapBookToDto class.
  2. Don’t Include relationships, but pick out what you need from the relationships.
    See lines 25 to 30 of my MapBookToDto class.
  3. If possible, move calculations into the database.
    See lines 13 to 34 of my MapBookToDto class.
  4. Add SQL indexes to any property you sort or filter on.
    See the configuration of the Book entity.
  5. Add the AsNoTracking method to your query (or don’t load any entity classes).
    See line 29 in the ListBookService class.

NOTE: Rule 3 is the hardest to get right. Just remember that some SQL commands, like Average (SQL AVG), can return null if there are no entries, which needs a cast to a nullable type to make it work.
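
Here is a minimal sketch of rules 1 to 3 in a single query. It assumes the Book entity has a Reviews collection with a NumStars property; the anonymous result type is illustrative rather than the book’s actual BookListDto.

var bookDtos = context.Books
    .Select(book => new          // rules 1 & 2: pick out only the data you need
    {
        book.BookId,
        book.Title,
        // rule 3: the average is calculated in the database; the cast to double?
        // stops the query throwing when a book has no reviews (SQL AVG returns null)
        ReviewsAverageVotes = book.Reviews
            .Select(review => (double?)review.NumStars)
            .Average()
    })
    .ToList();                   // rule 5 is covered too: no entity classes are tracked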

So, combining the Select, Sort, Filter and paging my code looks like this.

public async Task<IQueryable<BookListDto>> SortFilterPageAsync
    (SortFilterPageOptions options)
{
    var booksQuery = _context.Books 
        .AsNoTracking() 
        .MapBookToDto() 
        .OrderBooksBy(options.OrderByOptions) 
        .FilterBooksBy(options.FilterBy, options.FilterValue); 

    await options.SetupRestOfDtoAsync(booksQuery); 

    return booksQuery.Page(options.PageNum - 1, 
        options.PageSize); 
}
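
The Page method in the code above is an extension method. Here is a minimal sketch of what such a method could look like (standard Skip/Take paging; not necessarily the book’s exact implementation).

using System.Linq;

public static class PagingExtensions
{
    public static IQueryable<T> Page<T>(this IQueryable<T> query,
        int pageNumZeroStart, int pageSize)
    {
        if (pageNumZeroStart != 0)
            query = query.Skip(pageNumZeroStart * pageSize); // skip the earlier pages
        return query.Take(pageSize);                         // then take one page's worth
    }
}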

Using these rules will start you off with a good LINQ query, which is a great starting point. The next sections cover what to do if that doesn’t give you the performance you want.

When the five rules aren’t enough

The query above is going to work well when there aren’t many books, but in chapter 15 I create a database containing 100,000 books with 540,000 reviews. At this point the “five rules” version has some performance problems, so I created three new approaches, each of which a) improves performance and b) takes more development effort. Here is a list of the four approaches, with the Good LINQ version as our base performance version.

  1. Good LINQ: This uses the “five rules” approach. We compare all the other versions to this query.
  2. SQL (+UDFs): This combines LINQ with SQL UDFs (user-defined functions) to move the concatenations of the authors’ names and tags into the database (see the sketch after this list).
  3. SQL (Dapper): This creates the required SQL commands and then uses the micro-ORM Dapper to execute that SQL to read the data.
  4. SQL (+caching): This pre-calculates some of the costly query parts, like the averages of the Reviews’ NumStars (referred to as votes).
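
As a hedged sketch of how approach 2 can work: a static method is mapped to a scalar SQL user-defined function so the concatenation runs inside the database. The UDF name and signature below are illustrative, and the UDF itself must already exist in the database.

using System;

public static class UdfDefinitions
{
    // The body never runs; EF Core translates calls to this method into the SQL UDF
    public static string AuthorsStringUdf(int bookId)
        => throw new NotSupportedException("Only for use in LINQ-to-database queries.");
}

// Registered in the DbContext's OnModelCreating, e.g.:
//   modelBuilder.HasDbFunction(
//       typeof(UdfDefinitions).GetMethod(nameof(UdfDefinitions.AuthorsStringUdf)));
//
// The LINQ projection can then call the UDF and the authors' names are concatenated
// inside the database:
//   .Select(b => new { b.Title, AuthorsString = UdfDefinitions.AuthorsStringUdf(b.BookId) })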

In the video I describe how I built each of these queries and the performance for the hardest query, which is sort by review votes.

NOTE: The SQL (+caching) version is very complex, and I skipped over how I built it, but I have an article called “A technique for building high-performance databases with EF Core” which describes how I did this. Chapter 15 of my book “Entity Framework Core in Action, 2nd edition” covers this too.

Here is a chart I showed in the video which provides performance timings for three queries, from the hardest (sort by votes) down to a simple query (sort by date).

The other chart I showed was a breakdown of the parts of the simple query, sort by date. I wanted to show this to point out that Dapper (which is a micro-ORM) is only significantly faster than EF Core if you write better SQL than EF Core produces.

Once you have a performance problem, just taking a few milliseconds off isn’t going to be enough – typically you need to cut its time by at least 33% and often more. Therefore, using Dapper to shave a few milliseconds off EF Core isn’t worth the development time. So, my advice is to study the SQL that EF Core creates, and if you know a way to improve the SQL, then Dapper is a good solution.

Going bigger – how to handle ½ million or more books

In chapter 16 I build what is called a Command Query Responsibility Segregation (CQRS) architecture. The CQRS architecture acknowledges that the read side of an application is different from the write side. Reads are often complicated, drawing in data from multiple places, whereas in many applications (but not all) the write side can be simpler, and less onerous. This is true in the Book App.

To build my CQRS system I decided to make the read-side live in a different database to the write-side of the CQRS architecture, which allowed me to use Cosmos DB for my read-side database. I did this because Cosmos DB is designed for performance (speed of queries) and scalability (how many requests it can handle). The figure below shows this two-database CQRS system.

The key point is that the data saved in Cosmos DB has as many of the calculations as possible pre-calculated, rather like the SQL (+cache) version – that’s what the projection stage does when a Book or its associated relationships are updated.

If you want to find out how to build a two-database CQRS system using Cosmos DB, my article Building a robust CQRS database with EF Core and Cosmos DB describes one way, while chapter 16 of my book provides another way using events.

Limitations using Cosmos DB with EF Core

It was very interesting to work with Cosmos DB via EF Core, as there were two parts to deal with:

  • Cosmos DB is a NoSQL database and works differently to a SQL database (read this Microsoft article for one view)
  • The EF Core 5 Cosmos DB database provider has many limitations.

I had already looked at these two parts back in 2019 and written an article, which I have updated to EF Core 5 and renamed to “An in-depth study of Cosmos DB and the EF Core 3 to 5 database provider”.

Some of the issues I encountered, starting with the issues that made the biggest change to my Book App, are:

  • EF Core 5 limitation: Counting the number of books in Cosmos DB is SLOW!
  • EF Core 5 limitation: EF Core 5 cannot do subqueries on a Cosmos DB database.
  • EF Core 5 limitation: No relationships or joins.
  • Cosmos difference: Complex queries might need breaking up
  • EF Core 5 limitation: Many database functions not implemented.
  • Cosmos difference: Skip is slow and expensive.
  • Cosmos difference: By default, all properties are indexed.

I’m not going to go through all of these here – the “An in-depth study of Cosmos DB and the EF Core 3 to 5 database provider” article covers most of them.

Because of the EF Core limitation on counting books, I changed the way that paging works. Instead of picking the page you want, you have a Next/Prev approach, like Amazon uses (see the figure after the list of query approaches). And to allow a balanced performance comparison between the SQL versions and the Cosmos DB version, I added the best two SQL approaches but turned off counting too (SQL is slow at that).

It also turns out that Cosmos DB can count very fast, so I built another way to query Cosmos DB using its .NET (pseudo) SQL API. With this the Book App had four query approaches.

  1. Cosmos (EF): This accesses the Cosmos DB database using EF Core (with some parts using the SQL database where EF Core didn’t have the features to implement parts of the query).
  2. Cosmos (Direct): This uses Cosmos DB’s .NET SQL API, where I wrote raw commands – a bit like using Dapper for SQL.
  3. SQL (+cacheNC): This uses the SQL cache approach from the 100,000 books version, but with counting turned off to compare with Cosmos (EF).
  4. SQL (DapperNC): This uses Dapper, which has the best SQL performance, but with counting turned off to compare with Cosmos (EF).

The following figure shows the Book App in CQRS/Cosmos DB mode with the four query approaches, and the Prev/Next paging approach.

Performance of the CQRS/Cosmos DB version

To test the performance, I used an Azure SQL Server and a Cosmos DB service from a local Azure region (London). To compare the SQL performance and the Cosmos DB performance I used databases with a similar cost (and low enough that it didn’t cost me too much!). The table below shows what I used.

Database type | Azure service name | Performance units | Price/month
Azure SQL Server | Standard | 20 DTUs | $37
Cosmos DB | Pay-as-you-go | manual scale, 800 RUs | $47

I did performance tests on the Cosmos DB queries while I was adding books to the database to see if the size of the database affected performance. It’s hard to get a good test of this as there is quite a bit of variation in the timings.

The chart below compares EF Core calling Cosmos DB, referred to as Cosmos (EF), against using direct Cosmos DB commands via its .NET SQL API – referred to as Cosmos (Direct).

This chart (and other timings I took) tells me two things:

  • The increase in the number of books in the database doesn’t have much effect on the performance (the Cosmos (Direct) 250,000 result is well within the variation).
  • Counting the Books costs ~25 ms, which is much better than the SQL count, which added about ~150 ms.

The important performance test was to look at Cosmos DB against the best of our SQL approaches. I picked a cross-section of sorting and filtering queries and ran them on all four query approaches – see the chart below.

From the timings in the figure above, here are some conclusions.

  1. Even the best SQL version, SQL (DapperNC), doesn’t work in this application because any sort or filter on the Reviews took so long that the connection timed out at 30 seconds.
  2. The SQL (+cacheNC) version was at parity or better with Cosmos DB (EF) on the first two queries, but as the query got more complex it fell behind in performance.
  3. The Cosmos DB (direct) version, with its book count, was ~25% slower than the Cosmos DB (EF) version with no count, but is twice as fast as the SQL count versions.

Of course, there are some downsides of the CQRS/Cosmos DB approach.

  • The add and update of a book in the Cosmos DB version take a bit longer: this is because CQRS requires four database accesses (two to update the SQL database and two to update the Cosmos database) – that adds up to about 110 ms, which is more than double the time a single SQL database would take. There are ways around this (see this part of my article about CQRS/Cosmos DB) but it takes more work.
  • Cosmos DB takes longer and costs more if you skip items in its database. This shouldn’t be a problem with the Book App as many people would give up after a few pages, but if your application needs deep skipping through data, then Cosmos DB is not a good fit.

Even with the downsides I still think CQRS/Cosmos DB is a good solution, especially when I add in the fact that implementing this CQRS was easier and quicker than building the original SQL (+cache) version. Also, the Cosmos concurrency handling is easier than the SQL (+cache) version.

NOTE: What I didn’t test is Cosmos DB’s scalability or the ability to have multiple versions of the Cosmos DB around the world. Mainly because it’s hard to do, and it costs (more) money.

Performance against development effort

In the end it’s a trade-off between a) performance gain and b) development time. I have tried to summarise this in the following table, giving a number from 1 to 9 for difficulty (Diff? in the table) and performance (Perf? in the table).

The other thing to consider is how much more complexity your performance tuning adds to your application. Badly implemented performance tuning can make an application harder to enhance and extend. That is one reason why I like the event approach I used on the SQL (+cache) and CQRS/Cosmos DB approaches: it makes the least changes to the existing code.

Conclusion

As a freelance developer/architect I have had to performance tune many queries, and sometimes writes, on real applications. That’s not because EF Core is bad at performance, but because real-world applications have a lot of data and lots of relationships (often hierarchical), and it takes some extra work to get the performance the client needs.

I have already used a variation of the SQL (+cache) approach on a client’s app to improve the performance of their “has the warehouse got all the parts for this job?” query. And I wish Cosmos DB had been around when I built a multi-tenant service that needed to cover the whole of the USA.

Hopefully something in this article and video will be useful if (when!) you need to performance tune your application.

NOTE: You might like to look at the article “My experience of using modular monolith and DDD architectures” and its companion article to look at the architectural approaches I used on the Part3 Book App. I found the Modular Monolith architectural approach really nice.

I am a freelance developer who wrote the book “Entity Framework Core in Action“. If you need help performance tuning an EF Core application, I am available for work. If you want to hire me, please contact me to discuss your needs.

My experience of using the Clean Architecture with a Modular Monolith

Last Updated: July 6, 2021 | Created: February 11, 2021

In this article I look at my use of the Clean Architecture with the modular monolith architecture covered in the first article. Like the first article, this isn’t a primer on the Clean Architecture or the modular monolith, but is more about how I adapted the Clean Architecture to provide the vertical separation of the features in the modular monolith application.

  1. My experience of using modular monolith and DDD architectures.
  2. My experience of using the Clean Architecture with a Modular Monolith (this article).

Like the first article, I’m going to give you my impression of the good and bad parts of the Clean Architecture, plus a look at whether the time pressure of the project (which was about 5 weeks late) made me “break” any rules.

UPDATE

See my new series on building modular monoliths, where I take my experience and come up with a better approach to building modular monoliths with .NET.

TL;DR – summary

  • The Clean Architecture is like the traditional layered architecture, but with a series of rules that improve the layering.
  • I built an application using ASP.NET Core and EF Core using the Clean Architecture with the modular monolith approach. After this application was finished, I analysed how each approach had worked under time pressure.
  • I had used the Clean Architecture once before on a client’s project, but not with the modular monolith approach.
  • While the modular monolith approach had the biggest effect on the application’s structure, without the Clean Architecture layers the code would not be as good.
  • I give you my views of the good, bad and possible “cracks under time pressure” for the Clean Architecture.
  • Overall I think the Clean Architecture adds some useful rules to the traditional layered architecture, but I had to break one of those rules to make it work with EF Core.

A summary of the Clean Architecture

NOTE: I don’t describe the modular monolith in this article because I did that in the first article. Here is a link to the modular monolith intro in the first article.

The Clean Architecture approach (also called the Hexagonal Architecture or Onion Architecture) is a development of the traditional “N-Layer” architecture (shortened to layered architecture). The Clean Architecture approach talks about “onion layers” wrapped around each other and has the following main rules:

  1. The business classes (typically the classes mapped to a database) are in the inner-most layer of the “onion”.
  2. The inner-most layer of the onion should not have any significant external code, e.g. NuGet packages, added to it. This is designed to keep the business logic as clean and simple as possible.
  3. Only the outer layer can access anything outside of the code. That means:
    1. The code that users access, e.g. ASP.NET Core, is in the outer layer.
    2. Any external services, like the database, email sending etc., are in the outer layer.
  4. Code in inner layers can’t reference any outer layers.

The combination of rules 3 and 4 could cause lots of issues, as lower layers will need to access external services. This is handled by adding interfaces to the inner-most layer of the onion and registering the external services using dependency injection (DI).
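
Here is a minimal sketch of that interface-plus-DI approach, using a hypothetical email-sending service (the names are illustrative, not taken from the Book App’s code):

using System.Threading.Tasks;

// In the inner-most (Domain) project - no external packages needed
public interface IEmailSender
{
    Task SendAsync(string toAddress, string subject, string body);
}

// In an outer-layer project that is allowed to use external packages
public class SmtpEmailSender : IEmailSender
{
    public Task SendAsync(string toAddress, string subject, string body)
    {
        // ... call your email provider's SDK here ...
        return Task.CompletedTask;
    }
}

// In the ASP.NET Core composition root, register the service via DI so inner-layer
// code can depend on the interface without ever referencing the outer layer:
//   services.AddTransient<IEmailSender, SmtpEmailSender>();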

The figure below shows how I applied the Clean Architecture to my application, which is an e-commerce web site selling books, called the Book App.

NOTE: I detail the modifications that I made to the Clean Architecture approach around the persistence (database) layer later in the article.

Links to more detailed information on Clean Architecture (unmodified)

Setting the scene – the application and the time pressure

In 2020 I was updating my book “Entity Framework Core in Action” and I built an ASP.NET Core application that sells books, called the Book App. In the early chapters it is very simple, as I am describing the basics of EF Core, but in the last section I build a much more complex Book App that I progressively performance tuned, starting with 700 books, then 100,000 books and finally ½ million books. For the Book App to perform well it went through three significant enhancement stages. Here is an example of the Book App’s features and display, with four different ways to display the books to compare their performance.

At the same time, I was falling behind on getting the book finished. I had planned to finish all the chapters by the end of November 2020 when EF Core 5 was due out. But I only started the enhanced Book App in August 2020 so with 6 chapters to write I was NOT going to finish the book in November. So, the push was on to get things done! (In the end I finished writing the book just before Christmas 2020).

My experience of using Clean Architecture with a Modular Monolith

I had used a simpler Clean Architecture on a client’s project I worked on, so I had some ideas of what I would do. The Clean Architecture was useful, but it’s just another layered architecture with more rules, and I had to break one of its key rules to make it work with EF Core. Overall, I think I would use my modified Clean Architecture again in a larger application.

A. What was good about Clean Architecture?

To explain how the Clean Architecture helps, we need to talk about the main architecture’s goals – the modular monolith’s goals. The modular monolith focuses on features (Kamil Grzybek calls them modules). One way to work would be to have one project per feature, but that has some problems.

  • The project would be more complex, as it has everything inside it.
  • You could end up duplicating some code.

The Separation of Concerns (SoC) principle says breaking up a feature into parts that each focus on one aspect of the feature is a better way to go. So, the combination of the modular monolith and layers provides a better solution. The figure below shows two modular monolith features running vertically, and the five Clean Architecture layers running horizontally. The figure has a lot in it, but it’s there to show:

  • Reduce complexity: A feature can be split up into projects spread across the Clean Architecture layers, thus making the feature easier to understand, test and refactor.
  • Removing duplication: Breaking up the features into layers stops duplication – feature 1 and 2 share the Domain and Persistence layers.

The importance of the Service Layer in the Clean Architecture layers

Many years ago, I was introduced to the concept of the Service Layer. There are many definitions of the Service Layer (try this definition), but for me it’s a layer that knows about the lower / inner layer data structures and about the front-end data structures, and it can adapt between the two (see the left-hand side of the diagram above). So, the Service Layer isolates the lower layers from having to know how the front-end works.

For me the Service Layer is a very important layer.

  • It holds all the business logic or database access code that the front-end needs, normally provided as services. This makes it much easier to unit test these services.
  • It takes on the job of adapting data to / from the front end, which means it is the layer that has to care about the two different data structures.

NOTE: Some of my libraries, like EfCore.GenericServices and EfCore.GenericBizRunner, are designed to work as Service Layer type services, i.e., both libraries adapt between the lower / inner layer data structures and the front-end data structures.

Thus, the Infrastructure layer, which is just below the Service Layer, contains services that still work with the entity class view; in the Book App these projects contained code to seed the database, handle logging and provide event handling. Services in the Service Layer, on the other hand, work with both the lower / inner layer data structures and the front-end data structures.

To end the “good” part of the Clean Architecture I should say that a layered architecture could also provide the layering that the Clean Architecture defines. It’s just that the Clean Architecture version has some more rules, most of which are useful.

B. Clean Architecture – what was bad?

The main problem was fitting the EF Core DbContext into the Clean Architecture. The Clean Architecture says that the database should be on the outer ring, with interfaces for the access. The problem is there is no simple interface that you can use for the application’s DbContext. Even if you use a repository pattern (which I don’t, and here is why) you still have the problem that the application’s DbContext has to be defined deep in the onion.

My solution was to put the EF Core code right next to the inner circle (named Domain) holding the entity classes – I called that layer Persistence, as that’s what DDD calls it. That breaks one of the key rules of the Clean Architecture, but other than that it works fine. For other external services, such as an external email service, I would follow the Clean Architecture rules and add an interface in the inner (Domain) circle and register the service using DI.

Clean Architecture – how did it fare under time pressure?

Applying the Clean Architecture and modular monolith architectures together took a little more time to think through (I covered this in this section of the first article), but the end result was very good (I explain that in this section of the first article). The Clean Architecture layers broke a modular monolith feature into different parts, thus making the feature easier to understand and removing duplicate code.

The one small part of the clean architecture approach I didn’t like, but I stuck to, is that the Domain layer shouldn’t have any significant external packages, for instance a NuGet library, added to it. Overall, I applaud this rule as it keeps the Domain entities clean, but it did mean I had to do more work when configuring the EF Core code, e.g. I couldn’t use EF Core’s [Owned] attribute on entity classes. In a larger application I might break that rule.

So, I didn’t break any Clean Architecture rules because of the time pressure. The only rule I changed was to make the design work with EF Core, but I might break the “no significant external packages in the Domain layer” rule in the future.

Conclusion

I don’t think the Clean Architecture approach had as big an effect on the structure as the modular monolith did (read the first article), but the Clean Architecture certainly added to the structure by breaking modular monolith features into smaller, focused projects. The combination of the two approaches gave a really good structure.

My question is: does the Clean Architecture provide good improvements over a traditional layered architecture, especially as I had to break one of its key rules to work with EF Core? My answer is that using the Clean Architecture approach has made me a bit more aware of how I organise my layers, for instance I now have an Infrastructure layer that I didn’t have before, and I appreciate that.

Please feel free to comment on what I have written about. I’m sure there are lots of people who have more experience with the Clean Architecture than me, so you can give your experience too.

Happy coding.

My experience of using modular monolith and DDD architectures

Last Updated: July 6, 2021 | Created: February 8, 2021

This article is about my experience of using a Modular Monolith architecture and Domain-Driven Design (DDD) approach on a small, but complex, application using ASP.NET Core and EF Core. This isn’t a primer on modular monoliths or DDD (but there is a good summary of each with links); rather it gives my views of the good and bad aspects of each approach.

  1. My experience of using modular monolith and DDD architectures (this article).
  2. My experience of using the Clean Architecture with a Modular Monolith.

I’m also interested in architecture approaches that help me to build applications with a good structure, even when I’m under some time pressure to finish – I want rules and patterns that help me to “do the right thing” even when I am under pressure. I was able to explore this aspect because I was privileged (?!) to be working on a project that was late 😊.

UPDATE – new series

Having finished my book, I have spent some time improving the initial modular monolith design described in this article. Please see the new Evolving Modular Monoliths series listed below.

TL;DR – summary

  • The Modular Monolith architecture breaks up the code into independent modules (C# projects) for each of the features needed in your application. Each module only links to other modules that specifically provide services it needs.
  • Domain-Driven Design (DDD) is a holistic approach to understanding, designing and building software applications.
  • I built an application using ASP.NET Core and EF Core with both of these architectural approaches. After this application was finished, I analysed how each approach had worked under time pressure.
  • It was the first time I had used the Modular Monolith architecture, but it came out with flying colours. The code structure consists of 22 projects, and each (other than the ASP.NET Core front-end) is focused on one specific part of the application’s features.
  • I have used DDD a lot and, as I expected, it worked really well in the application. The classes (called entities by DDD) have meaningfully named methods to update the data, so the code is easy to understand.
  • I also give you my views of the good, bad and possible “cracks” that each approach has.

The architectural approaches covered in this article

At the start of building my application I knew the application would go through three major changes. I therefore wanted architectural approaches that make it easier to enhance the application. Here are the main approaches I will be talking about:

  1. Modular Monolith: building independent modules for each feature.
  2. DDD: Better separation and control of business logic

NOTE: I used a number of other architectures and patterns that I am not going to describe. They are layered architecture, domain events / integration events, and CQRS database pattern to name but a few.

1. Modular Monolith

A Modular Monolith is an approach where you build and deploy a single application (that’s the “Monolith” part), but you build the application in a way that breaks up the code into independent modules for each of the features needed in your application. This approach reduces the dependencies of a module in such a way that you can enhance/change a module without it affecting other modules. The figure below shows you 9 projects that implement two different features in the system. Notice they have only one common (small) project.

The benefits of a Modular Monolith approach over a normal Monolith are:

  • Reduced complexity: Each module only links to code that it specifically needs.
  • Easier to refactor: Changing a module has less or no effect on other modules.
  • Better for teams: Easier for developers to work on different parts of the code.

Links to more detailed information on modular monoliths

NOTE: An alternative to a monolith is to go to a Microservice architecture. Microservice architectures allow lots of developers to work in parallel, but there are lots of issues around communication between each Microservice – read this Microsoft document comparing a Monolith vs. Microservice architecture. The good news is a Modular Monolith architecture is much easier to convert to a Microservice architecture because it is already modular.

2. Domain-Driven Design (DDD)

DDD is a holistic approach to understanding, designing and building software applications. It comes from the book Domain-Driven Design by Eric Evans. Because DDD covers so many areas, I’m only going to talk about the DDD-styled classes mapped to the database (referred to as entity classes in this article) and give a quick coverage of bounded contexts.

DDD says your entity classes must be in control of the data they contain; therefore, all the properties are read-only, with constructors / methods used to create or change the data in an entity class. That way the entity class’s constructors / methods can ensure the create / update follows the business rules for that entity.

NOTE: The above paragraph is a super-simplification of what DDD says about entity classes and there is so much more to say. If you are new to DDD then google “DDD entity” and your software language, e.g., C#.

Another DDD term is bounded contexts, which is about separating your application into separate parts with very controlled interaction between the different bounded contexts. For example, my application is an e-commerce web site selling books, and I could see that the display of books was different from the customer ordering some books. There is shared data (like the product number and the price the book is sold for) but other than that they are separate.

The figure below shows how I separated the displaying of books and the ordering of books at the database level.

Using DDD’s bounded context technique, the Books bounded context can change its data (other than the product code and sale price) without it affecting the Orders bounded context, and the Orders code can’t change anything in the Books bounded context.

The benefits of DDD over a non-DDD approach are:

  • Protected business rules: The entity classes methods contain most of the business logic.
  • More obvious: entity classes contain methods with meaningful names to call.
  • Reduces complexity: Bounded context breaks an app into separate parts.

Links to more detailed information on DDD

Setting the scene – the application and the time pressure

In 2020 I was updating my book “Entity Framework Core in Action” and my example application was an ASP.NET Core application that sells books, called the Book App. In the early chapters the Book App is very simple, as I am describing the basics of EF Core. But in the last section I build a much more complex Book App that I progressively performance tuned, starting with 700 books, then 100,000 books and finally ½ million books. For the Book App to perform well it went through three significant enhancement stages. Here is an example of the Book App’s features and display, with four different ways to display the books to compare their performance.

At the same time, I was falling behind on getting the book finished. I had planned to finish all the chapters by the end of November 2020 when EF Core 5 was due out. But I only started the enhanced Book App in August 2020 so with 6 chapters to write I was NOT going to finish the book in November. So, the push was on to get things done! (In the end I finished writing the book just before Christmas 2020).

NOTE: The ASP.NET Core application I talk about in this article is available on GitHub. It is in branch Part3 of the repo https://github.com/JonPSmith/EfCoreinAction-SecondEdition and can be run locally.

My experience of each architectural approach

As well as experiencing each architectural approach while upgrading the application (what Neal Ford calls evolutionary architecture), I also had the extra experience of building the Book App under serious time pressure. This type of situation is great for learning whether the approaches worked or not, which is what I am now going to describe. But first here is a summary for you:

  1. Modular Monolith = Brilliant. First time I had used it and I will use it again!
  2. DDD = Love it: Used it for years and it’s really good.

Here are more details on each of these approaches, where I point out:

  1. What was good?
  2. What was bad?
  3. How did it fare under time pressure?

1. Modular Monolith

I was already aware of the modular monolith approach, but I hadn’t used it in an application before. My experience was that it worked really well: in fact, it was much better than I thought it would be. I would certainly use this approach again on any medium to large application.

1a. Modular Monolith – what was good?

The first good thing is how the modular monolith compares with the traditional “N-Layer” architecture (shortened to layered architecture). I have used the layered architecture approach many times, usually with four projects, but the modular monolith application has 22 projects, see the figure to the right.

This means I know the code in each project is doing one job, and there are only links to other projects that are relevant to a project. That makes it easier to find and understand the code.

Also, I’m much more inclined to refactor the old code as I’m less likely to break something else. In contrast, the layered architecture on its own has lots of different features in one project and I can’t be sure what it’s linked to, which makes me more reluctant to refactor code as it might affect other code.

1b. Modular Monolith – what was bad?

Nothing was really wrong with the modular monolith, but working out the project naming convention took some time, and it was super important (see the list of projects in the figure above). With the right naming convention, the name tells me a) where in the layered architecture the project sits and b) what it does (from the end of the name). If you are going to try using a modular monolith approach, I recommend you think carefully about your naming convention.

I didn’t change names too often because of a development tool issue: in Visual Studio you could rename the project, but the underlying folder name wasn’t changed, which makes the GitHub/folder display look wrong. The few I did change required me to rename the project, then rename the folder outside Visual Studio, and then hand-edit the solution file, which is horrible!

NOTE: I have since found a really nice tool that will do a project/folder rename and updates the solution/csproj files on applications that use Git for source control.

Also, I learnt not to end a project name with the name of a class, e.g. the Book class, as that caused problems if you referred to the Book class in that project. That’s why you see projects ending with “s”, e.g. “…Books” and “…Orders”.

1c. Modular Monolith – how did it fare under time pressure?

It certainly didn’t slow me down (other than the time deliberating over the project naming convention!) and it might have made me a bit faster than the layered architecture because I knew where to go to find the code. If I came back to work on the app after a few months, it would be much quicker to find the code I am looking for, and it would be easier to change without affecting other features.

I did find myself breaking the rules at one point because I was racing to finish the book. The code ended up with 6 different ways to query the database for the book display, and there were a few common parts, like some of the constants used in the sort/filter display and one DTO. Instead of creating a BookApp.ServiceLayer.Common.Books project I just referred to the first project. That’s my bad, but it shows that while the modular monolith approach really helps separate the code, it does rely on the developers following the rules.

NOTE: I had to go back to the Book App to add a feature to two of the book display queries, so I took the opportunity to create a project called BookApp.ServiceLayer.DisplayCommon.Books which holds all the common code. That has removed the linking between each query feature and made it clear what code is shared.

2. Domain-Driven Design

I have used DDD for many years and I find it an excellent approach which focuses on the business (domain) issues rather than the technical aspects of the application. Because I have used DDD so much it is second nature to me, but I will try to define what worked, what didn’t work, etc. in the Book App, plus some feedback from working on clients’ applications.

2a. Domain-Driven Design – what was good?

DDD is so massive I am only going to talk about one of the key aspects I use every day – the entity class. DDD says your entity class should be in complete control of the data inside it, and its direct relationships. I therefore make all the business classes and the classes mapped to the database have read-only properties and use constructors and methods to create or update the data inside.

Making the entity’s properties read-only means your business logic/validation must be in the entity too. This means:

  • You know exactly where to look for the business logic/validation – it’s in the entity with its data.
  • You change an entity by calling an appropriately named method in the entity, e.g. AddReview(…). That makes it crystal clear what you are doing and what parameters it needs.
  • The read-only aspect means you can ONLY update the data via the entity’s methods. This is DRY, but more importantly it’s obvious where to find the code.
  • Your entity can never be in an invalid state because the constructors / methods will check the create/update and return an error if it would make the entity’s data invalid.

Overall, this means a DDD entity class is much clearer to create and update than changing a few properties in a normal class. I love the clarity that the named methods provide – it’s obvious what I am doing.
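
To make that concrete, here is a minimal sketch of the sort of DDD-styled entity class I am describing. This is a simplified, made-up Book class (the Review class, not shown, would follow the same pattern) and not the actual Book App code.

public class Book
{
    private readonly List<Review> _reviews = new List<Review>();

    //Read-only properties: only the constructor/methods can change them
    public int BookId { get; private set; }
    public string Title { get; private set; }
    public decimal Price { get; private set; }
    public IReadOnlyCollection<Review> Reviews => _reviews.AsReadOnly();

    private Book() { } //needed by EF Core

    public Book(string title, decimal price)
    {
        if (string.IsNullOrEmpty(title))
            throw new ArgumentException("A book must have a title.", nameof(title));
        Title = title;
        Price = price;
    }

    //The only way to add a review, so the rules stay with the data
    public void AddReview(int numStars, string comment, string voterName)
    {
        if (numStars < 0 || numStars > 5)
            throw new ArgumentOutOfRangeException(nameof(numStars), "Stars must be 0 to 5.");
        _reviews.Add(new Review(numStars, comment, voterName));
    }
}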

I already gave you an example of the Books and Orders bounded contexts. That worked really well and was easy to implement once I understood how to use EF Core to map a class to a table as if it was a SQL view.

2b. Domain-Driven Design – what was bad?

One of the downsides of DDD is you have to write more code. The figure below shows this by  comparing a normal (non-DDD) approach on the left against a DDD approach on the right in an ASP.NET Core/EF Core application.

It’s not hard to see that you need to write more code, as the non-DDD version on the left is shorter. It might only be five or ten lines, but that mounts up pretty quickly when you have a real application with lots of entity classes and methods.

Having used DDD for a long time I have built a library called EfCore.GenericServices, which reduces the amount of code you have to write. It does this by replacing the repository and the method call with a service. You still have to write the method in the entity class, but the library reduces the rest of the code to a DTO / ViewModel and a service call. The code below shows how you would use the EfCore.GenericServices library in an ASP.NET Core action method.

[HttpPost]
[ValidateAntiForgeryToken]
public async Task<IActionResult> AddBookReview(AddReviewDto dto, 
    [FromServices] ICrudServicesAsync<BookDbContext> service)
{
    if (!ModelState.IsValid)
    {
        return View(dto);
    }
    await service.UpdateAndSaveAsync(dto);

    if (service.IsValid)
        //Success path
        return View("BookUpdated", service.Message);

    //Error path
    service.CopyErrorsToModelState(ModelState, dto);
    return View(dto);
}

Over the years the EfCore.GenericServices library has saved me a LOT of development time. It can’t do everything, but it’s great at handling all the simple Create, Update and Delete operations (known as CUD), leaving me to work on the more complex parts of the application.

2c. Domain-Driven Design – how did it fare under time pressure?

Yes, DDD does take a bit more time, but it’s (almost) impossible to bypass the way DDD works because of the design. In the Book App it worked really well, but I did use an extra approach known as domain events (see this article about this approach) which made some of the business logic easier to implement.

I didn’t break any DDD rules in the Book App, but on two projects I worked on the clients found that calling DDD methods for every update didn’t work for their front-end. For instance, one client wanted to use JSON Patch to speed up the front-end (Angular) development.

To handle this I came up with the hybrid DDD style, where non-business properties are read-write, but data with business logic/validation has to go through method calls. The hybrid DDD style is a bit of a compromise over DDD, but it certainly sped up the development for my clients. In retrospect both projects worked well, but I do worry that the hybrid DDD style allows a developer in a hurry to make a property read-write when they shouldn’t. If every entity class is locked down, then no one can break the rules.

Conclusion

I always like to analyse any application I work on after the project is finished. That’s because when I am still working on a project there is often pressure (often from myself) to “get it done”. By analysing a project after it’s finished, or a key milestone is met, I can look back more dispassionately and see if there is anything to learn from the project.

My main take-away from building the Book App is that the Modular Monolith approach was very good, and I would use it again. The modular monolith approach provides small, focused projects. That means I know all the code in the project is doing one job, and there are only links to other projects that are relevant to a project. That makes it easier to understand, and I’m much more inclined to refactor the old code as I’m less likely to break something else.

I would say that the hardest part of using the Modular Monolith approach was working out the naming convention of the projects, which only goes to prove the quote “There are only two hard things in Computer Science: cache invalidation and naming things” is still true😊.

DDD is an old friend to me, and while it needed a few more lines of code to be written, the result is a rock-solid design where every entity makes sure its data is in a valid state. Quite a lot of the validation and simple business logic can live inside the entity class, but business logic that cuts across multiple entities or bounded contexts can be a challenge. I have an approach to handle that, but I also used a new feature I learnt from a client’s project about a year ago and that helped too.

I hope this article helps you consider these two architectural approaches, or if you are using these approaches, they might spark some ideas. I’m sure many people have their own ideas or experiences so please do leave comments as I’m sure there is more to say on this subject.

Happy coding.

How to update a database’s schema without using EF Core’s migrate feature

Last Updated: January 27, 2021 | Created: January 26, 2021

This article is aimed at developers that want to use EF Core to access the database but want complete control over their database schema. I decided to write this article after seeing the EF Core Community standup covering the EF Core 6.0 Survey Results. In that video there was a page looking at the ways people deploy changes to production (link to video at that point), and quite a few respondents said they use SQL scripts to update their production database.

It’s not clear whether people create those SQL scripts themselves or use EF Core’s Migrate SQL scripts, but Marcus Christensen commented during the Community standup (link to video at that point): “A lot of projects that I have worked on during the years, has held the db as the one truth for everything, so the switch to code first is not that easy to sell.”

To put that in context he was saying that some developers want to retain control of the database’s schema and have EF Core match the given database. EF Core can definitely do that, but in practice it gets a bit more complex (I know because I have done this on real-world applications).

TL;DR – summary

  • If you have a database that isn’t going to change much, then EF Core’s reverse engineering tool can create classes to map to the database and the correct DbContext and configurations.
  • If you are changing the database’s schema as the project progresses, then the reverse engineering on its own isn’t such a good idea. I cover three approaches to handle this:
    • Option 0: Have a look at EF Core 5’s improved migration feature to check whether it works for you – it will save you time if it can work for your project.
    • Option 1: Use the Visual Studio extension called EF Core Power Tools. This is reverse engineering on steroids and is designed for repeated database schema changes.
    • Option 2: Use the EfCore.SchemaCompare library. This lets you write EF Core code and update the database schema manually, and it tells you where they differ.

Setting the scene – what are the issues of updating your database schema yourself?

If you have a database that isn’t changing, then EF Core’s reverse engineering tool is a great fit. This reads your SQL database and creates the classes to map to the database (I call these classes entity classes) and a class you use to access the database, with EF Core configurations/attributes to make EF Core match your database.

That’s fine for a fixed database as you can take the code the reverse engineering tool outputs and edit it to work the way you want it to. You can (carefully) alter the code that the reverse engineering tool produces to get the best out of EF Core’s features, like Value Converters, Owned types, Query Filters and so on.

The problems come if you are enhancing the database as the project progresses, because the EF Core’s reverse engineering works, but some things aren’t so good:

  1. The reverse engineering tool has no way to detect useful EF Core features, like Owned Types, Value Converters, Query Filters, Table-per-Hierarchy, Table-per-Type, table splitting, and concurrency tokens, which means you need to edit the entity classes and the EF Core configurations.
  2. You can’t edit the entity classes or the DbContext class because they will be replaced the next time you reverse engineer your database. One way around this is to add to the entity classes with another class of the same name – that works because the entity classes are marked as partial (see the sketch after this list).
  3. The entity classes have all the possible navigational relationships added, which can be confusing if some navigational relationships would typically not be added because of certain business rules. Also, you can’t change the entity classes to follow a Domain-Driven Design approach.
  4. A minor point: you need to type in the reverse engineering command, which can be long, every time. I only mention it because Option 1 will solve that for you.
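
As an example of point 2, the sketch below shows the partial class trick – the class and property names are made up, but the pattern is what the partial keyword makes possible.

//File generated by the reverse engineering tool – replaced on every scaffold
public partial class Book
{
    public int BookId { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
}

//Your own file, in the same namespace – it survives re-scaffolding
//because the class is partial
public partial class Book
{
    public string DisplayTitle => $"{Title} (book {BookId})";
}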

So, if you want to have complete control over your database you have a few options, one of which I created (see Option 2). I start with a non-obvious approach considering the title of this article – using EF Core to create a migration and tweaking it. I think it’s worth a quick look at this to make sure you’re not taking on more work than you need to – simply skip Option 0 if you are sure you don’t want to use EF Core migrations.

Option 0: Use EF Core’s Migrate feature

I have just finished updating my book Entity Framework Core in Action to EF Core 5 and I completely rewrote many of the chapters from scratch because there was so much change in EF Core (or my understanding of EF Core) – one of those complete rewrites was the chapter on handling database migrations.

I have to say I wasn’t a fan of EF Core’s migration feature, but after writing the migration chapter I’m coming around to using it. Partly that is because I have more experience of real-world EF Core applications, but also some of the new features, like MigrationBuilder.Sql(), give me more control over what a migration does.
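
For example, the sketch below shows the sort of hand-edited migration I mean – the WordCount column and the SQL are made up, but it shows where MigrationBuilder.Sql fits in alongside the scaffolded operations.

public partial class AddBookWordCount : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.AddColumn<int>(
            name: "WordCount",
            table: "Books",
            nullable: false,
            defaultValue: 0);

        //Hand-written SQL to back-fill the new column in a way
        //the migration scaffolder couldn't work out for itself
        migrationBuilder.Sql(
            "UPDATE Books SET WordCount = LEN(Description) WHERE Description IS NOT NULL");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropColumn(name: "WordCount", table: "Books");
    }
}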

The EF Core team want you to at least review, and possibly alter, a migration. Their approach is that the migration is rather like a scaffolded Razor Page (ASP.NET Core example), where it’s a good start, but you might want to alter it. There is a great video on EF Core 5 updated migrations and there was a discussion about this (link to video at the start of that discussion).

NOTE: If you decide to use the migration feature and manually alter the migration you might find Option 2 useful to double check your migration changes still matches EF Core’s Model of the database.

So, you might like to have a look at EF Core’s migration feature to see if it might work for you. You don’t have to change much in the way you apply the SQL migration scripts, as the EF Core team recommends applying migrations via scripts anyway.

Option 1: Use Visual Studio’s extension called EF Core Power Tools

In my opinion, if you want to reverse engineer multiple times, then you should use Erik Ejlskov Jensen’s (known as @ErikEJ on GitHub and Twitter) Visual Studio extension called EF Core Power Tools. This allows you to run EF Core’s reverse engineering service via a friendly user interface, but that’s just the start. It provides a ton of options, some not even in EF Core’s reverse engineering, like reverse engineering SQL stored procedures. All your options are stored, which makes subsequent reverse engineering just a select and click. The extension also has lots of other features, such as creating a diagram of your database based on your entity classes and EF Core configuration.

I’m not going to detail all the features in the EF Core Power Tools extension because Erik has done that already. Here are two videos as a starting point, plus a link to the EF Core Power Tools documentation.

So, if you are happy with the general type of output that reverse engineering produces, then the EF Core Power Tools extension is a very helpful tool with extra features over the EF Core reverse engineering tool. EF Core Power Tools is also specifically designed for continuous changes to the database, and Erik used it that way in the company he was working for.

NOTE: I talked to Erik and he said they use a SQL Server database project (.sqlproj) to keep the SQL Server schema under source control, and the resulting SQL Server .dacpac files to update the database and EF Core Power Tools to update the code. See this article for how Erik does this.

OPTION 2: Use the EfCore.SchemaCompare library

The problem with any reverse engineering approach is that you aren’t fully in control of the entity classes and the EF Core features. Just as developers want complete control over the database, I also want complete control of my entity classes and what EF Core features I can use. As Jeremy Likness said on the EF Core 6.0 survey video when database-first etc. were being talked about: “I want to model the domain properly and model the database properly and then use (EF Core) fluent API to map the two together in the right way” (link to video at that point).

I feel the same, and I built a feature I refer to as EfSchemaCompare – the latest version of this (I have versions going back to EF6!) is in the repo https://github.com/JonPSmith/EfCore.SchemaCompare. This library compares EF Core’s view of the database, based on the entity classes and the EF Core configuration, against any relational database that EF Core supports. That’s possible because, like EF Core Power Tools, I use EF Core’s reverse engineering service to read the database, so there is no extra coding for me to do.

This library allows me (and you) to create my own SQL scripts to update the database while using any EF Core feature I need in my code. I can then run the EfSchemaCompare tool and it tells me if my EF Core code matches the database. If they don’t match, it gives me detailed errors so that I can fix either the database or the EF Core code. Here is a simplified diagram of how EfSchemaCompare works.

The plus side of this is I can write my entity classes any way I like, and normally I use a DDD pattern for my entity classes. I can also use many of the great EF Core features like Owned Types, Value Converters, Query Filters, Table-per-Hierarchy, table splitting, and concurrency tokens in the way I want to. Also, I control the database schema – in the past I have created SQL scripts and applied them to the database using DbUp.

The downside is I have to do more work. The reverse engineering tool or the EF Core migrate feature could do part of the work, but I have decided I want complete control over the entity classes and the EF Core features I use. As I said before, I think the migration feature (and documentation) in EF Core 5 is really good now, but for complex applications, say working with a database that has non-EF Core applications accessing it, then the EfSchemaCompare tool is my go-to solution.

The README file in the EfCore.SchemaCompare repo contains all the documentation on what the library checks and how to call it. I typically create a unit test to check a database – there are lots of options that allow you to provide the connection string of the database you want to test against the entity classes and EF Core configuration provided by your application’s DbContext class.
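
To give you a feel for it, such a test is only a few lines – something along the lines of the sketch below (see the README for the exact API and the various options; the connection string is yours to fill in):

[Fact]
public void CompareEfCoreCodeToDatabase()
{
    //SETUP – the options must point at the database you want to check
    var options = new DbContextOptionsBuilder<BookContext>()
        .UseSqlServer("Server=MyServer;Database=MyDatabase;...")
        .Options;
    using var context = new BookContext(options);
    var comparer = new CompareEfSql();

    //ATTEMPT – compare EF Core's Model of the database with the actual database schema
    var hasErrors = comparer.CompareEfWithDb(context);

    //VERIFY – if they differ, GetAllErrors holds one error per line
    Assert.False(hasErrors, comparer.GetAllErrors);
}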

NOTE: The EfCore.SchemaCompare library only works with EF Core 5. There is a version in the EfCore.TestSupport library version 3.2.0 that works with EF Core 2.1 and EF Core 3.? and the documentation for that can be found in the Old-Docs page. This older version has more limitations than the latest EfCore.SchemaCompare version.

Conclusion

So, if you, like Marcus Christensen, consider “the database as the one truth for everything”, then I have described two (maybe three) options you could use. Taking charge of your database schema/update is a good idea, but it does mean you have to do more work.

Using EF Core’s migration tool, with the ability to alter the migration, is the simplest, but some people don’t like that. The reverse engineering/EF Core Power Tools route is the next easiest, as it will write the EF Core code for you. But if you want to really tap into EF Core’s features and/or DDD, then these approaches don’t cut it. That’s why I created the many versions of the EfSchemaCompare library.

I have used the EfSchemaCompare library on real-world applications, and I have also worked on client projects that used EF Core’s migration feature. The migration feature is much simpler, but sometimes it’s too easy, which means you don’t think enough about what the best schema for your database would be. But that’s not the fault of the migration feature, it’s our desire/need to quickly move on – you can change any migration EF Core produces if you want to.

I hope this article was useful to you on your usage of EF Core. Let me know your thoughts on this in the comments.

Happy coding.

Using ValueTask to create methods that can work as sync or async

Last Updated: January 25, 2021 | Created: January 23, 2021

In this article I delve into C#’s ValueTask struct, which provides a subset of the Task class features, and use its features to solve the problem of building libraries that need both sync and async versions of the library’s methods. Along the way I learnt something about ValueTask and how it works with sync code.

NOTE: Thanks to Stephen Toub, who works on the Microsoft .NET platform and wrote the article “Understanding the Whys, Whats, and Whens of ValueTask”, for confirming this is a valid approach that is used inside Microsoft’s code. His feedback, plus amoerie’s comment, helped me to improve the code to return the correct stack trace.

TL;DR – summary

  • Many of my libraries provide a sync and an async version of each method. This can cause me to duplicate code, one method for the sync call and one for the async call, with just a few different calls, e.g. SaveChanges and SaveChangesAsync.
  • This article tells you how ValueTask (and ValueTask<TResult>) works when it returns without running an async method, and what its properties mean. I also have some unit tests to check this.
  • Using this information, I found a way to use C#’s ValueTask to build a single method that works as a sync or async method, selected by a bool parameter. This removes a lot of duplicate code.
  • I have built some extension methods that check that the returned ValueTask a) didn’t use an async call, and b) if an exception was thrown in the method (which won’t bubble up on its own) they throw it so that it does bubble up.

Setting the scene – why I needed methods to work sync or async

I have built quite a few libraries, NuGet says I have 15 packages, and most are designed to work with EF Core (a few are for EF6.x). Five of these have both sync and async versions of the methods to allow the developer to use it whatever way they want to. This means I have to build some methods twice: one for sync and one for async, and of course that leads to duplication of code.

Normally I can minimise the duplication by building internal methods that return IQueryable<T>, but when I developed the EfCore.GenericEventRunner library I wasn’t querying the database but running sync or async code provided by the developer. The internal methods normally have lots of code with one or two methods that could be sync or async, e.g. SaveChanges and SaveChangesAsync.

Ideally, I wanted internal methods that I could call either sync or async, where a parameter told them whether to run sync or async, e.g.

  • SYNC:   var result = MyMethod(useAsync: false)
  • ASYNC: var result = await MyMethod(useAsync: true)
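
In other words, something with the shape of this hypothetical sketch (the real examples from my libraries are shown later in this article):

//Hypothetical example: one method body, the useAsync parameter picks sync or async
private static async ValueTask<int> CountEntities<TEntity>(
    DbContext context, bool useAsync)
    where TEntity : class
{
    return useAsync
        ? await context.Set<TEntity>().CountAsync()
        : context.Set<TEntity>().Count();
}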

I found the amazingly good article by Stephen Toub called “Understanding the Whys, Whats, and Whens of ValueTask”, which explained ValueTask<TResult> and synchronous completion, and this got me thinking – can I use ValueTask to make a method that could work sync and async? And I could! Read on to see how I did this.

What happens when a ValueTask has synchronous completion?

The ValueTask (and ValueTask<TResult>) code is complex and linked to the Task class, and the documentation is rather short on explaining what a “failed operation” means. But from lots of unit tests and inspecting the internal data I worked out what happens with a sync return.

The ValueTask (and ValueTask<TResult>) types have four bool properties. They are:

  • IsCompleted: This is true if the ValueTask has completed. So, if I capture the ValueTask and this is true, then it has finished, which means I don’t have to await it.
  • IsCompletedSuccessfully: This is true if no error happened. In a sync return it means no exception has been thrown.
  • IsFaulted: This is true if there was an error, which for a sync return means an exception was thrown.
  • IsCancelled: This is true if the CancellationToken cancelled the async method. This is not used in a sync return.

From this information I decided I could check that a method had completed synchronously if the IsCompleted property is true.
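
Here is a small xUnit test (the SyncMethod local function is made up for the test) that shows what those properties look like after a purely synchronous return:

[Fact]
public void TestValueTaskSyncCompletion()
{
    //A method that returns a ValueTask<int> without awaiting anything
    static ValueTask<int> SyncMethod() => new ValueTask<int>(42);

    var valueTask = SyncMethod();

    Assert.True(valueTask.IsCompleted);              //finished – no need to await it
    Assert.True(valueTask.IsCompletedSuccessfully);  //and no exception was thrown
    Assert.False(valueTask.IsFaulted);
    Assert.Equal(42, valueTask.GetAwaiter().GetResult());
}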

The next problem was what to do when a method using ValueTask throws an exception. The exception isn’t bubbled up but is held inside the ValueTask, so I needed to extract that exception and throw it. A bit more unit testing and inspecting the ValueTask internals showed me how to extract the exception and throw it, and information provided by Stephen Toub showed a better way to throw the exception with the correct stacktrace.

NOTE: You can see the unit tests I did to detect what ValueTask and ValueTask<TResult> do here.

So, I could call var valueTask = MyMethod(useAsync: false), inspect the returned valueTask to check it didn’t call any async methods inside it, and then call GetResult, which will throw an exception if there is one. The code below does this for a ValueTask (the ValueTaskSyncCheckers class also contains a similar method for ValueTask<TResult>).

This approach comes from Microsoft code where it is used (look at this code and search for “useAsync: false”). Stephen Toub told me that valueTask.GetAwaiter().GetResult() is the best way to end a ValueTask, even for the version that doesn’t return a result. That’s because:

  • If there was an exception, then that call will throw the exception inside the method with the correct stacktrace.
  • Stephen Toub said that it should call GetResult even in the version with no result because, if your method is used in a pooled resource, that call is typically used to tell the pooled resource it is no longer needed.

The listing below shows the two versions of the CheckSyncValueTaskWorked methods – the first is for ValueTask and the second for ValueTask<TResult>.

public static void CheckSyncValueTaskWorked(
    this ValueTask valueTask)
{
    if (!valueTask.IsCompleted)
        throw new InvalidOperationException(
            "Expected a sync task, but got an async task");
    valueTask.GetAwaiter().GetResult();
}

public static TResult CheckSyncValueTaskWorkedAndReturnResult
    <TResult>(this ValueTask<TResult> valueTask)
{
    if (!valueTask.IsCompleted)
        throw new InvalidOperationException(
             "Expected a sync task, but got an async task");
    return valueTask.GetAwaiter().GetResult();
}

NOTE: You can access these extension methods via this link.

How I used this feature in my libraries

I first used this in my EfCore.GenericEventRunner library, but those examples are complex, so I will show a simpler example from my EfCore.SoftDeleteServices library. Here is a method that uses the useAsync parameter – see the return statement at the end of the code.

public static async ValueTask<TEntity> LoadEntityViaPrimaryKeys<TEntity>(this DbContext context,
    Dictionary<Type, Expression<Func<object, bool>>> otherFilters, 
    bool useAsync,
    params object[] keyValues)
    where TEntity : class
{
    // Lots of checks/exceptions left out 

    var entityType = context.Model.FindEntityType(typeof(TEntity));
    var keyProps = context.Model.FindEntityType(typeof(TEntity))
        .FindPrimaryKey().Properties
        .Select(x => x.PropertyInfo).ToList();

    var filterOutInvalidEntities = otherFilters
          .FormOtherFiltersOnly<TEntity>();
    var query = filterOutInvalidEntities == null
        ? context.Set<TEntity>().IgnoreQueryFilters()
        : context.Set<TEntity>().IgnoreQueryFilters()
            .Where(filterOutInvalidEntities);

    return useAsync
        ? await query.SingleOrDefaultAsync(
              CreateFilter<TEntity>(keyProps, keyValues))
        : query.SingleOrDefault(
              CreateFilter<TEntity>(keyProps, keyValues));
}

The following shows the two versions of the calling code – notice the sync version takes the ValueTask and then calls the CheckSyncValueTaskWorkedAndReturnResult method, while the async version uses the normal async/await approach.

SYNC VERSION

var entity= _context.LoadEntityViaPrimaryKeys<TEntity>(
    _config.OtherFilters, false, keyValues)
    .CheckSyncValueTaskWorkedAndReturnResult();
if (entity == null)
{
    //… rest of code left out

ASYNC VERSION

var entity = await _context.LoadEntityViaPrimaryKeys<TEntity>(
     _config.OtherFilters, true, keyValues);
if (entity == null) 
{
    //… rest of code left out

NOTE: I generally create the sync version of a library first, as it’s much easier to debug because async exception stacktraces are hard to read and the debug data can be harder to read. Once I have the sync version working, with its unit tests, then I build the async side of the library.

Conclusion

So, I used this sync/async approach in my EfCore.GenericEventRunner library, where the code is very complex, and it really made the job much easier. I then used the same approach in the EfCore.SoftDeleteServices library – again there was a complex class, called CascadeWalker, that “walks” the dependent navigational properties. In both libraries this approach removed a significant amount of duplicated code.

You might not be building a library, but you have now learnt what a ValueTask does when it returns a sync result to an async call. ValueTask is there to make the sync return faster and, especially, to reduce memory usage. Also, you now have another approach if you have a similar sync/async need.

NOTE: ValueTask has a number of limitations, so I only use ValueTask in the internal parts of my libraries and provide a Task version to the users of my libraries.

In case you missed it, do read the excellent article “Understanding the Whys, Whats, and Whens of ValueTask”. And thanks again to Stephen Toub and amoerie’s comment for improving the solution.

Happy coding.

New features for unit testing your Entity Framework Core 5 code

Last Updated: January 21, 2021 | Created: January 21, 2021

This article is about unit testing applications that use Entity Framework Core (EF Core), with the focus on the new approaches you can use in EF Core 5. I start out with an overview of the different ways of unit testing in EF Core and then highlight some improvements brought in EF Core 5, plus some improvements I added to the EfCore.TestSupport version 5 library that is designed to help unit testing of EF Core 5.

NOTE: I am using xUnit in my unit tests. xUnit is well supported in NET and widely used (EF Core uses xUnit for its testing).

TL;DR – summary

NOTE: In this article I use the term entity class (or entity instance) to refer to a class that is mapped to the database by EF Core.

Setting the scene – why, and how, I unit test my EF Core code

I have used unit testing since I came back to being a developer (I had a long stint as a tech manager) and I consider unit testing one of the most useful tools for producing correct code. When it comes to database code, which I write a lot of, I started with a repository pattern, which is easy to unit test, but I soon moved away from the repository pattern to using Query Objects. At that point I had to work out how to unit test code that uses the database. This was critical to me as I wanted my tests to cover as much of my code as possible.

In EF6.x I used a library called Effort which mocks the database, but when EF Core came out I had to find another way. I tried EF Core’s In-Memory database provider, but EF Core’s SQLite database provider using an in-memory database was MUCH better. The SQLite in-memory database (I cover this later) is very easy to use for unit tests, but it has some limitations, so sometimes I have to use a real database.

The reason I unit test my EF Core code is to make sure it works as I expect. Typical things that I am trying to catch:

  • Bad LINQ code: EF Core throws an exception if it can’t translate my LINQ query into database code. That will make the unit test fail.
  • Database writes that didn’t work: Sometimes my EF Core create, update or delete didn’t work the way I expected. Maybe I left out an Include or forgot to call SaveChanges. By testing the database after the code has run, these problems show up straight away.

While some people don’t like using unit tests on a database, my approach has caught many, many errors which would have been hard to catch in the application. Also, a unit test gives me immediate feedback when I’m writing code and continues to check that code as I extend and refactor it.

The three ways to unit test your EF Core code

Before I start on the new features, here is a quick look at the three ways you can unit test your EF Core code. The figure (which comes from chapter 17 of my book “Entity Framework Core in Action, 2nd edition”) compares three ways to unit test your EF Core code, with the pros and cons of each.

As the figure says, using the same type of database in your unit test as your application uses is the safest approach – the unit test database accesses will respond just the same as your production system. In that case, why would I also list using an SQLite in-memory database in unit tests? While there are some limitations/differences from other databases, it has many positives:

  1. The database schema is always up to date.
  2. The database is empty, which is a good starting point for a unit test.
  3. Running your unit tests in parallel works because each database is held locally in each test.
  4. Your unit tests will run successfully in the Test part of a DevOps pipeline without any other settings.
  5. Your unit tests are faster.

NOTE: Item 3, “running your unit tests in parallel”, is an xUnit feature which makes running unit tests much quicker. It does mean you need a separate database for each unit test class if you are using a non in-memory database. The EfCore.TestSupport library has features to create unique database names for SQL Server databases.

The last option, mocking the database, really relies on you using some form of repository pattern. I use this for really complex business logic, where I build a specific repository pattern for the business logic, which allows me to intercept and mock the database access – see this article for that approach.
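
As a rough illustration of that, the interface and class names below are hypothetical, but they show the idea: the business logic only sees a small interface, so a unit test can swap in a stub with no database involved.

//Hypothetical mini-repository used by one piece of business logic
//(Book and Order are assumed to be entity classes)
public interface IPlaceOrderDbAccess
{
    IEnumerable<Book> FindBooksByIds(IEnumerable<int> bookIds);
    void Add(Order newOrder);
}

//Stub used in unit tests – no database involved
public class StubPlaceOrderDbAccess : IPlaceOrderDbAccess
{
    public List<Book> Books { get; } = new List<Book>();
    public Order AddedOrder { get; private set; }

    public IEnumerable<Book> FindBooksByIds(IEnumerable<int> bookIds)
        => Books.Where(x => bookIds.Contains(x.BookId));

    public void Add(Order newOrder) => AddedOrder = newOrder;
}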

The new features in EF Core 5 that help with unit testing

I am going to cover four features that have changes in the EfCore.TestSupport library, either because of new features in EF Core 5, or improvements that have been added to the library.

  • Creating a unique, empty test database with the correct schema
  • How to make sure your EF Core accesses match the real-world usage
  • Improved SQLite in-memory options to dispose the database
  • How to check the SQL created by your EF Core code

Creating a unique, empty test database with the correct schema

To use a database in a xUnit unit test it needs to be:

  • Unique to the test class: that’s needed to allow for parallel running of your unit tests
  • Its schema must match the current Model of your application’s DbContext: if the database schema is different from what EF Core thinks it is, then your unit tests aren’t working on the correct database.
  • The database’s data should be in a known state, otherwise your unit tests won’t know what to expect when reading the database. An empty database is the best choice.

Here are three ways to make sure your database fulfils these three requirements.

  1. Use an SQLite in-memory which is created every time.
  2. Use a unique database name, plus calling EnsureDeleted, and then EnsureCreated.
  3. Use unique database name, plus call the EnsureClean method (only works on SQL Server).

1. Use an SQLite in-memory which is created every time.

The SQLite database has an in-memory mode, which is applied by setting the connection string to “Filename=:memory:”. The database is then held within the connection, which makes it unique to its unit test, and the database hasn’t been created yet. This is quick and easy, but if your production application uses another type of database then it might not work for you.
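
For reference, the hand-rolled version looks roughly like the sketch below (using Microsoft.Data.Sqlite) – the key point is that the connection must stay open for as long as you want the in-memory database to exist:

var connection = new SqliteConnection("Filename=:memory:");
connection.Open(); //the in-memory database lives as long as this connection is open

var options = new DbContextOptionsBuilder<BookContext>()
    .UseSqlite(connection)
    .Options;

using var context = new BookContext(options);
context.Database.EnsureCreated();
//Rest of unit test is left out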

The EF Core documentation on unit testing shows one way to set up an in-memory SQLite database, but I use the EfCore.TestSupport library’s static method called SqliteInMemory.CreateOptions<TContext>, which sets up the options for creating an SQLite in-memory database, as shown in the code below.

[Fact]
public void TestSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using var context = new BookContext(options);

    context.Database.EnsureCreated();

    //Rest of unit test is left out
}

The database isn’t created at the start, so you need to call EF Core’s EnsureCreated method at the start. This means you get a database that matches the current Model of your application’s DbContext.

2. Unique database name, plus Calling EnsureDeleted, then EnsureCreated

If you are using a normal (non in-memory) database, then you need to make sure the database has a unique name for each test class (xUnit runs test classes in parallel, but methods in a test class are run serially). To get a unique database name the EfCore.TestSupport library has methods that take a base SQL Server connection string from an appsetting.json file and add the class name to the end of the database name (see the code after the next paragraph, and the docs).

That solves the unique database name, and we solve the “matching schema” and “empty database” requirements by calling the EnsureDeleted method, then the EnsureCreated method. These two methods delete the existing database and create a new database whose schema matches EF Core’s current Model of the database. The EnsureDeleted / EnsureCreated approach works for all databases but is shown with SQL Server here.

[Fact]
public void TestEnsureDeletedEnsureCreatedOk()
{
    //SETUP
    var options = this.CreateUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);
    
    context.Database.EnsureDeleted();
    context.Database.EnsureCreated();

    //Rest of unit test is left out
}

The EnsureDeleted / EnsureCreated approach used to be very slow (~10 seconds) for a SQL Server database, but since the new SqlClient came out in .NET 5 this is much quicker (~1.5 seconds), which makes a big difference to how long a unit test takes to run when using this EnsureDeleted + EnsureCreated version.

3. Unique database name, plus call the EnsureClean method (only works on SQL Server).

While I was asking some questions on the EF Core GitHub, Arthur Vickers described a method that could wipe the schema of an SQL Server database. This clever method removes the current schema of the database by deleting all the SQL indexes, constraints, tables, sequences, UDFs and so on in the database. It then, by default, calls the EnsureCreated method to return a database with the correct schema and empty of data.

The EnsureClean method is deep inside EF Core’s unit tests, but I extracted that code and built the other parts needed to make it useful, and it is available in version 5 of the EfCore.TestSupport library. The following listing shows how you use this method in your unit test.

[Fact]
public void TestSqlDatabaseEnsureCleanOk()
{
    //SETUP
    var options = this.CreateUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);
    
    context.Database.EnsureClean();

    //Rest of unit test is left out
}

The EnsureClean approach is faster, maybe twice as fast as the EnsureDeleted + EnsureCreated version, which could make a big difference to how long your unit tests take to run. It is also better in situations where your database server won’t allow you to delete or create new databases but does allow you to read/write a database, for instance if your test databases are on a SQL Server where you don’t have admin privileges.

How to make sure your EF Core accesses match the real-world usage

Each unit test is a single method that has to a) set up the database ready for testing, b) run the code you are testing, and c) check that the results of the code you are testing are correct. The middle part, running the code, must reproduce the situation in which the code you are testing is normally used. But because all three parts are in one method it can be difficult to create the same state that the test code is normally used in.

The issue of “reproducing the same state the test code is normally used in” is common to unit testing, but when testing EF Core code this is made more complicated by the EF Core feature called Identity Resolution. Identity Resolution is critical in your normal code as it makes sure you only have one entity instance of a class type that has a specific primary key (see this example). The problem is that Identity Resolution can make your unit test pass even when there is a bug in your code.

Here is a unit test that passes because of Identity Resolution. The test of the Price at the end of the unit test should fail, because SaveChanges wasn’t called (see the commented-out SaveChanges call in the ATTEMPT stage). The reason it passed is that the entity instance in the variable called verifyBook should have been read from the database, but because a tracked entity instance with the same primary key was found inside the DbContext, that instance was returned instead of reading from the database.

[Fact]
public void ExampleIdentityResolutionBad()
{
    //SETUP
    var options = SqliteInMemory
        .CreateOptions<EfCoreContext>();
    using var context = new EfCoreContext(options);

    context.Database.EnsureCreated();
    context.SeedDatabaseFourBooks();

    //ATTEMPT
    var book = context.Books.First();
    book.Price = 123;
    // Should call context.SaveChanges()

    //VERIFY
    var verifyBook = context.Books.First();
    //!!! THIS IS WRONG !!! THIS IS WRONG
    verifyBook.Price.ShouldEqual(123);
}

In the past we fixed this with multiple instances of the DbContext, as shown in the following code

public void UsingThreeInstancesOfTheDbcontext()
{
    //SETUP
    var options = SqliteInMemory         
        .CreateOptions<EfCoreContext>(); 
    options.StopNextDispose();
    using (var context = new EfCoreContext(options)) 
    {
        //SETUP instance
    }
    options.StopNextDispose();   
    using (var context = new EfCoreContext(options)) 
    {
        //ATTEMPT instance
    }
    using (var context = new EfCoreContext(options)) 
    {
        //VERIFY instance
    }
}

But there is a better way to do this with EF Core 5’s ChangeTracker.Clear method. This method quickly removes all entity instances the DbContext is currently tracking. This means you can use one instance of the DbContext, but each stage, SETUP, ATTEMPT and VERIFY, is isolated, which stops Identity Resolution from giving you data from another stage. In the code below there are two potential errors that would slip through if you didn’t add calls to the ChangeTracker.Clear method (or used multiple DbContexts).

  • If the Include in the ATTEMPT stage was left out, the unit test would still pass (because the Reviews collection was set up in the SETUP stage).
  • If the SaveChanges was left out, the unit test would still pass (because the VERIFY read of the database would have been given the book entity from the ATTEMPT stage).

public void UsingChangeTrackerClear()
{
    //SETUP
    var options = SqliteInMemory
        .CreateOptions<EfCoreContext>();
    using var context = new EfCoreContext(options);

    context.Database.EnsureCreated();             
    var setupBooks = context.SeedDatabaseFourBooks();              

    context.ChangeTracker.Clear();                

    //ATTEMPT
    var book = context.Books                      
        .Include(b => b.Reviews)
        .Single(b => b.BookId == setupBooks.Last().BookId);
    book.Reviews.Add(new Review { NumStars = 5 });

    context.SaveChanges();                        

    //VERIFY
    context.ChangeTracker.Clear();                

    context.Books.Include(b => b.Reviews)         
        .Single(b => b.BookId == setupBooks.Last().BookId)
        .Reviews.Count.ShouldEqual(3);            
}

This is much better than the three separate DbContext instances because

  1. You don’t have to create the three DbContext’s scopes (saves typing. Shorter unit test)
  2. You can use using var context = …, so no indents (nicer to write. Easier to read)
  3. You can still refer to previous parts, say to get an entity’s primary key (see the use of setupBooks in the ATTEMPT and VERIFY stages)
  4. It works better with the improved SQLite in-memory disposable options (see next section)

Improved SQLite in-memory options to dispose the database

You have already seen the SqliteInMemory.CreateOptions<TContext> method earlier. In version 5 of the EfCore.TestSupport library I have updated it to dispose of the SQLite connection when the DbContext is disposed. The SQLite connection holds the in-memory database, so disposing it makes sure that the memory used to hold the database is released.

NOTE: In previous versions of the EfCore.TestSupport library I didn’t do that, and I haven’t had any memory problems. But the EF Core docs say you should dispose the connection, so I updated the SqliteInMemory options methods to implement the IDisposable interface.

It turns out that disposing the DbContext instance disposes the options instance, which in turn disposes the SQLite connection. See the comments at the end of the code.

[Fact]
public void TestSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using var context = new BookContext(options);

    //Rest of unit test is left out
} // context is disposed at end of the using var scope,
  // which disposes the options that was used to create it, 
  // which in turn disposes the SQLite connection

NOTE: If you use multiple instances of the DbContext based on the same options instance, then you need to use one of these approaches to delay the dispose of the options until the end of the unit test.

How to check the SQL created by your EF Core code

If I’m interested in the performance of some part of the code I am working on, it is often easier to look at the SQL commands in a unit test than in the actual application. EF Core 5 provides two ways to do this:

  1. The ToQueryString method to use on database queries
  2. Capturing EF Core’s logging output using the LogTo method

1. The ToQueryString method to use on database queries

The ToQueryString method will turn an IQueryable variable built using EF Core’s DbContext into a string containing the database commands for that query. Typically, I output this string to xUnit’s output window so I can check it (see the _output.WriteLine call in the VERIFY stage). The code below also contains a test of the created SQL – I don’t normally do that, but I put it in so you could see the type of output you get.

public class TestToQueryString
{
    private readonly ITestOutputHelper _output;

    public TestToQueryString(ITestOutputHelper output)
    {
        _output = output;
    }

    [Fact]
    public void TestToQueryStringOnLinqQuery()
    {
        //SETUP
        var options = SqliteInMemory.CreateOptions<BookDbContext>();
        using var context = new BookDbContext(options);
        context.Database.EnsureCreated();
        context.SeedDatabaseFourBooks();

        //ATTEMPT 
        var query = context.Books.Select(x => x.BookId); 
        var bookIds = query.ToArray();                   

        //VERIFY
        _output.WriteLine(query.ToQueryString());        
        query.ToQueryString().ShouldEqual(               
            "SELECT \"b\".\"BookId\"\r\n" +              
            "FROM \"Books\" AS \"b\"\r\n" +              
            "WHERE NOT (\"b\".\"SoftDeleted\")");        
        bookIds.ShouldEqual(new []{1,2,3,4});            
    }
}

2. Capturing EF Core’s logging output using the LogTo method

EF Core 5 makes it much easier to capture the logging that EF Core outputs (before, you needed to create an ILoggerProvider class and register it with EF Core). Now you can add the LogTo method to your options builder and it will return a string output for every log. The code below shows how to do this, with the logs output to xUnit’s output window.

[Fact]
public void TestLogToDemoToConsole()
{
    //SETUP
    var connectionString = 
        this.GetUniqueDatabaseConnectionString();
    var builder =                                   
        new DbContextOptionsBuilder<BookDbContext>()
        .UseSqlServer(connectionString)             
        .EnableSensitiveDataLogging()
        .LogTo(_output.WriteLine);


    // Rest of unit test is left out

The LogTo method has lots of different ways to filter and format the output (see the EF Core docs here). I created versions of the EfCore.TestSupport’s SQLite in-memory and SQL Server option builders that use LogTo, and I decided to use a class, called LogToOptions, to manage all the filters/formats for LogTo (LogTo requires calls to different methods). This allowed me to define better defaults for logging output (defaults to: a) Information, not Debug, log level, b) does not include datetime in the output) and make it easier for the filters/formats to be changed.

I also added a feature that I use a lot: the ability to turn the log output on or off. The code below shows the SQLite in-memory option builder with the LogTo output, plus using the ShowLog feature. This only starts outputting logs once the database has been created and seeded – see the logToOptions.ShowLog = true line in the ATTEMPT stage.

[Fact]
public void TestEfCoreLoggingCheckSqlOutputShowLog()
{
    //SETUP
    var logToOptions = new LogToOptions
    {
        ShowLog = false
    };
    var options = SqliteInMemory
         .CreateOptionsWithLogTo<BookContext>(
             _output.WriteLine, logToOptions);
    using var context = new BookContext(options);
    context.Database.EnsureCreated();
    context.SeedDatabaseFourBooks();

    //ATTEMPT 
    logToOptions.ShowLog = true;
    var book = context.Books.Single(x => x.Reviews.Count() > 1);

    //Rest of unit test left out
}

Information for existing users of the EfCore.TestSupport library

It’s no longer possible to detect the EF Core version via netstandard, so now it is done via the first number in the library’s version. For instance, EfCore.TestSupport version 5.?.? works with EF Core 5.?.?. At the same time the library was getting hard to keep up to date, especially with EfSchemaCompare in it, so I took the opportunity to clean up the library.

BUT that clean up includes BREAKING CHANGES, mainly around the SQLite in-memory option builders. If you use SqliteInMemory.CreateOptions you MUST read this document to decide whether you want to upgrade or not.

NOTE: You may not be aware, but the NuGet packages in your test projects override the same NuGet packages installed by the EfCore.TestSupport library. So, as long as you add the newest versions of the EF Core NuGet libraries, the EfCore.TestSupport library will use those. The only part that won’t run is EfSchemaCompare, but that has now got its own library so you can use it directly.

Conclusion

I just read a great article on the Stack Overflow blog which said:

“Which of these bugs would be easier for you to find, understand, repro, and fix: a bug in the code you know you wrote earlier today or a bug in the code someone on your team probably wrote last year? It’s not even close! You will never, ever again debug or understand this code as well as you do right now, with your original intent fresh in your brain, your understanding of the problem and its solution space rich and fresh.”

I totally agree and that’s why I love unit tests – I get feedback as I am developing (I also like that unit tests will tell me if I broke some old code too). So, I couldn’t work without unit tests, but at the same time I know that unit tests can take a lot of development time. How do I solve that dilemma?

My answer to the extra development time isn’t to write fewer unit tests, but to build a library and develop patterns that make me really quick at unit testing. I also try to make my unit tests fast to run – that’s why I worry about how long it takes to set up the database, and why I like xUnit’s parallel running of unit tests.

The changes to the EfCore.TestSupport library and my unit test patterns due to EF Core 5 are fairly small, but each one reduces the number of lines I have to write for each unit test or makes the unit test easier to read. I think the ChangeTracker.Clear method is the best improvement because it does both – it’s easier to write and easier to read my unit tests.

Happy coding.