Six ways to build better Entity Framework (Core and EF6) applications

In this article I describe six different approaches to building applications that use Microsoft’s database access framework, Entity Framework (EF). All six approaches are based on software principles and patterns that many of you will be familiar with. This article shows how I applied these principles and patterns in real-world applications that use EF. The six principles and patterns are:

  1. Separation of concerns – building the right architecture.
  2. The Service Layer – separating data actions from presentation actions.
  3. Repositories – picking the right sort of database access pattern.
  4. Dependency injection – turning your database code into services.
  5. Building business logic – using Domain-Driven Design with EF.
  6. Performance tuning EF – get it working, but be ready to make it faster if you need to.

I have an interesting perspective when it comes to software principles and patterns. I started out as a programmer, but then moved into technical management in 1988. I came back to programming in 2009, a 21-year gap, and the whole software world had changed. Thankfully over those years some seriously clever people had been thinking about the science of programming, and I avidly read their articles and books.

While it was great to learn things from these software giants, I found it took some time to find the right ways to apply their ideas in the applications I developed. This article brings together the combination of ideas from some of the great software thinkers with my years of learning as I have applied and refined their approaches in building applications that used EF for database access.

All the software and figures in this article come from a book I am writing for Manning Publications called Entity Framework Core in Action, which is now on early-access release, i.e. if you buy now you will get the chapters as I write them, and the full book when it’s finished. The code is therefore based on the new .NET Core frameworks: EF Core and ASP.NET Core. However, these software principles and patterns apply equally well to the older Entity Framework version 6 (EF6.x) and ASP.NET MVC5.

Note: I’m going to assume you know about Entity Framework already. If you don’t then I recommend you read the first chapter of my book, which is free, or look at Microsoft’s EF Core documentation which includes a number of example applications.

Before I start I need to say that the descriptions I show are my way of implementing the principles and patterns I talk about. There are plenty of other ways to implement each of the six topics; mine is just one of them. Have fun developing your own techniques and libraries that will improve and accelerate your development.

1. Principle: Separation of concerns – building on the right architecture

There is a software principle called Separation of Concerns (SoC), which says that you should:

  • Group together code that has similar, or strongly-related, functions – in .NET terms, put it in separate projects. This is called cohesion.
  • Make each group/project as self-contained as possible. Each piece of code should have a clear interface and scope of work that is unlikely to change because other callers change what they do. This is called low coupling.

NOTE: Here is a more in-depth article on Separation of Concerns if you want to look into SoC some more.

For simplicity most examples of EF code on the web tend to show EF Core database commands being called directly from whatever application type they are using. This doesn’t follow SoC, nor does it really represent how real applications are written. I took the decision in my book to use a more representative software architecture for my example code, and I used a layered architecture. Using this does make it a little more difficult for the reader, but I build up the application structure over a number of chapters. Here is the final architecture of my example book selling site.

I could have used a number of different architectures, see this link for a good list, but the layered approach works well for small to medium applications. A layered architecture is also a good fit for cloud hosting, where cloud providers can spin up more instances of the web application if it is under a heavy load, that is, they run multiple copies of the web application and place a load balancer in front to spread the load over all the copies. This is known as scale out on Microsoft’s Azure and auto scaling on Amazon’s AWS.

The figure below shows how I apply SoC to my database access code. It shows the same software architecture, but with all my EF database access code highlighted in bubbles. The size of the bubbles relates to the amount of EF code you will find in each layer. Notice that the ASP.NET Core project and the pure business logic (BizLogic) project have no EF Core query/update code in them at all.

As I go through this article I will refer back to SoC, as it is an integral part of how I approach database applications.

2. Pattern: The Service Layer – separating data actions from presentation actions

One of the influential books I read when I came back to programming was Dino Esposito and Andrea Saltarello’s book Microsoft .NET: Architecting Applications for the Enterprise, published in 2009. Some of the technology they covered then is now superseded, but the patterns are still relevant today (Dino has written a number of newer books, see this link). This book introduced me to the use of a Service Layer, which Martin Fowler previously described in his classic book in 2002.

Dino Esposito says in his book that the Service Layer “sets a boundary between two interfacing layers” and Martin Fowler’s site says the Service Layer “Defines an application’s boundary with a layer of services that establishes a set of available operations and coordinates the application’s response in each operation”. That all sounds great, but how does that help my applications? Quite a bit actually.

I’m going to describe how the Service Layer acts as an adapter in this section. Later, in the section on business logic, I will cover the second way the Service Layer can serve me, by being in command of running my business code.

The Service Layer as an adapter

In a layered architecture there is often a data mismatch between the database/business logic and the presentation layer. The Domain-Driven Design (DDD) approach, which I describe later, says that the database and the business logic should be focused on the business rules, while the Presentation Layer is about giving the user a great user experience, or in the case of a web service, providing a standard and simple API.

For this reason, the Service Layer becomes a crucial layer, as it can be the layer that understands both sides and can transform the data between the two worlds. This keeps the business logic and the database uncluttered by presentation needs, like drop-down lists and JSON AJAX calls. Similarly, having the Service Layer deliver pre-formatted data in exactly the form the presentation layer needs makes it much simpler for the presentation layer to show that data.

When dealing with database accesses via EF, the Service Layer uses an adapter pattern to transform data between the data layer/business logic layers and the presentation layer. Databases tend to minimise duplication of data and maximise the relational links between data, while the presentation layer is about showing the data in a form that the user finds useful.

The figure below shows an example of this difference in approach. The image comes from the list of books produced by the example book selling site that I create in my book. You can see how I have to pick data from lots of different tables in the database, and do some calculations, to form a summary of a book in my book list display.

Note: You can see this book list in action on the live site that hosts the example book selling site, at http://efcoreinaction.com/

EF provides a way of building queries, called select loading, that can ‘pick out’ the relevant columns from each table and combine them into a DTO/ViewModel class that exactly fits the user view. I apply this transform in the Service Layer, along with other sorting, filtering and paging features. The listing below is the select query using EF Core to build the book summary you just saw in the figure above.

public static IQueryable<BookListDto> 
    MapBookToDto(this IQueryable<Book> books)   
{
    return books.Select(p => new BookListDto
    {
        BookId = p.BookId,                      
        Title = p.Title,                        
        Price = p.Price,                        
        PublishedOn = p.PublishedOn,            
        ActualPrice = p.Promotion == null       
                ? p.Price : p.Promotion.NewPrice,         
        PromotionPromotionalText =              
                p.Promotion == null             
                  ? null : p.Promotion.PromotionalText,
        AuthorsOrdered = string.Join(", ",      
                p.AuthorsLink                   
                .OrderBy(q => q.Order)          
                .Select(q => q.Author.Name)),   
        ReviewsCount = p.Reviews.Count,         
        ReviewsAverageVotes =                   
                p.Reviews.Count == 0            
                ? null                          
                : (decimal?)p.Reviews           
                    .Select(q => q.NumStars).Average()
    });
} 

Note: The code above wouldn’t work with EF6.x because it includes the command string.Join, which cannot be converted into SQL by EF6.x. EF Core has a feature called Client vs. Server Evaluation, which allows methods that cannot be translated to SQL to be included. They are run after the data has been returned from the database.

Yes, this code is complex, but to build the summary we need to pull data from lots of different places and do some calculations at the same time, so that’s what you get. I have built a library called GenericServices (currently only available for EF6.x) which automates the building of EF select loading commands like this by using a LINQ Mapper called AutoMapper. This significantly improves the speed of development of these complex queries.
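The sorting, filtering and paging that the Service Layer adds on top of the select query can be written as further IQueryable<T> extension methods and chained together. The following is only a minimal sketch, not the code from the book: the cut-down BookListDto and the method names FilterByMinVotes, OrderByActualPrice and Page are assumptions made for illustration, and it runs against an in-memory list via AsQueryable so that it needs no database. With EF the same chain would be applied to the output of MapBookToDto.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down DTO for this sketch only (assumption: the real class has more properties).
public class BookListDto
{
    public string Title { get; set; }
    public decimal ActualPrice { get; set; }
    public decimal? ReviewsAverageVotes { get; set; }
}

public static class BookListDtoQueryObjects
{
    // Filter: keep books whose average review score is at least minVotes.
    // Books with no reviews (null average) fail the comparison and are excluded.
    public static IQueryable<BookListDto> FilterByMinVotes(
        this IQueryable<BookListDto> books, decimal minVotes)
    {
        return books.Where(b => b.ReviewsAverageVotes >= minVotes);
    }

    // Sort: cheapest first.
    public static IQueryable<BookListDto> OrderByActualPrice(
        this IQueryable<BookListDto> books)
    {
        return books.OrderBy(b => b.ActualPrice);
    }

    // Page: pageNum is zero-based.
    public static IQueryable<T> Page<T>(
        this IQueryable<T> query, int pageNum, int pageSize)
    {
        if (pageNum < 0) throw new ArgumentOutOfRangeException(nameof(pageNum));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return query.Skip(pageNum * pageSize).Take(pageSize);
    }
}

public static class Program
{
    public static void Main()
    {
        IQueryable<BookListDto> books = new List<BookListDto>
        {
            new BookListDto { Title = "A", ActualPrice = 30m, ReviewsAverageVotes = 4.5m },
            new BookListDto { Title = "B", ActualPrice = 10m, ReviewsAverageVotes = null },
            new BookListDto { Title = "C", ActualPrice = 20m, ReviewsAverageVotes = 4.0m },
            new BookListDto { Title = "D", ActualPrice = 15m, ReviewsAverageVotes = 3.0m },
        }.AsQueryable();

        // Each query object takes and returns IQueryable<T>, so they chain.
        var firstPage = books
            .FilterByMinVotes(4m)
            .OrderByActualPrice()
            .Page(0, 10)
            .ToList();

        foreach (var book in firstPage)
            Console.WriteLine($"{book.Title}: {book.ActualPrice}");
    }
}
```

With the sample data above the page contains book C followed by book A: B has no reviews and D scores below the threshold. Because every step stays an IQueryable<T>, nothing is executed until ToList is called.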

3. Pattern: Picking the right sort of database access pattern

There are a number of different ways you can form your EF database access inside an application, with different levels of hiding the EF access code from the rest of the application. In the figure below I show four different data access patterns.

The four types of database access patterns are:

  1. Repository + Unit of Work (Repo+UOW). This hides all the EF Core code behind code that provides a different interface to EF. The idea is that you could replace EF with another database access framework with no change to the methods that call the Repo+UOW.
  2. EF repository. This is a repository pattern that doesn’t try to hide the EF code like the Repo+UOW pattern does. EF repositories assume that you as developer know the rules of EF, such as using tracked entities and calling SaveChanges for updates, and that you will abide by them.
  3. Query Object. Query objects encapsulate the code for a database query, that is, a database read. A query object holds the whole code for a query or, for complex queries, just part of a query. Query objects are normally built as extension methods with IQueryable<T> inputs and outputs so that they can be chained together to build more complex queries.
  4. Direct calls to EF. This represents the case where you simply place the EF code you need in the method that needs it. For instance, all the EF code to build a list of books would be in the ASP.NET action method that shows that list.

NOTE: As I said earlier, I have created a library called GenericServices for EF6.x (and EF Core in the future). This is a form of EF repository.

I used the Repo+UOW pattern, which was the recommended approach at the time, in a big project in 2014 – and I found it was really hard work. I and many others realised Repo+UOW wasn’t the way to go – see my article ‘Is the Repository pattern useful with Entity Framework?’. The Repo+UOW can be a valid pattern in some cases where certain parts of the data need to be hidden, but I think there are better ways to do this with some of the new EF Core features, such as backing fields.

At the other end of the spectrum are direct calls to EF in the method that needs them. This fails the separation of concerns principle because the database code is mixed in with other code not directly involved in database issues.

So, having ruled out the two extremes I would recommend:

  • Query Objects for building queries, often breaking down large queries into a series of query objects. The method called MapBookToDto, in the previous listing in this article, is a query object. I cover query objects in chapter 2 of my book.
  • For Create, Update and Delete (and business logic, which I cover later) I use an EF repository pattern, that is, I create a method that encapsulates the EF database access. This isolates the EF code and makes it easier to refactor or performance tune that code.

The listing below shows a class with two EF repository methods for changing the publication date of a book in my example book selling site. I cover this in chapter 3 of my book.

public class ChangePubDateService : IChangePubDateService
{
    private readonly EfCoreContext _context;

    public ChangePubDateService(EfCoreContext context)
    {
        _context = context;
    }

    public ChangePubDateDto GetOriginal(int id)    
    {
        return _context.Books
            .Select(p => new ChangePubDateDto      
            {                                      
                BookId = p.BookId,                 
                Title = p.Title,                   
                PublishedOn = p.PublishedOn        
            })                                     
            .Single(k => k.BookId == id);          
    }

    public Book UpdateBook(ChangePubDateDto dto)   
    {
        var book = _context.Books.Find(dto.BookId);
        book.PublishedOn = dto.PublishedOn;        
        _context.SaveChanges();                    
        return book;                               
    }
}

4. Pattern: Turning your database code into services

I have used dependency injection (DI) for years and I think it’s a really useful approach. I want to show you a way you can inject your database access code into an ASP.NET Core application.

Note: If you haven’t used DI before have a look at this article for an introduction, or this longer article from another of the great thinkers, Martin Fowler.

The benefits of doing this are twofold. Firstly, DI will dynamically link your database access code into the parts of the presentation/web API code that need it. Secondly, because I am using interfaces, it is very easy to replace the calls to the database access code with mocks for unit testing.

I haven’t got the space in this article to give you all the details (it takes five pages in chapter 5 of my book to cover this), but here are the main steps, with links to online documentation if you want to follow it up:

  1. You need to turn each piece of your database access code into a thin repository, that is, a class containing a method, or methods, that the front-end code needs to call. See the ChangePubDateService class listed above.
  2. You need to add an interface to each EF repository class. You can see the IChangePubDateService interface applied to the ChangePubDateService class listed above.
  3. You need to register your EF repository class against its interface in the DI provider. This will depend on your application. For ASP.NET Core see this article.
  4. Then you need to inject it into the front-end method that needs it. In ASP.NET Core you can inject into an action method using the [FromServices] attribute. Note: I use this DI parameter injection rather than the more normal constructor injection because it means I only create the class when I really need it, i.e. it is more efficient this way.
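Steps 2 to 4 can be tried outside ASP.NET Core with a small console sketch. It assumes the Microsoft.Extensions.DependencyInjection NuGet package, which is the DI container ASP.NET Core uses. To keep it free of a database, the interface is cut down to a single hypothetical Describe method and the class is a stand-in for the real ChangePubDateService, so treat the names and shapes here as illustration only.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Step 2: an interface for the EF repository class.
// Cut down to one hypothetical method so the sketch needs no database.
public interface IChangePubDateService
{
    string Describe();
}

// Stand-in for the real ChangePubDateService, which would take an EfCoreContext.
public class ChangePubDateService : IChangePubDateService
{
    public string Describe() => "real ChangePubDateService";
}

public static class Program
{
    public static void Main()
    {
        // Step 3: register the class against its interface.
        var services = new ServiceCollection();
        services.AddTransient<IChangePubDateService, ChangePubDateService>();

        // Step 4: the DI provider creates the class when something asks for the interface.
        var provider = services.BuildServiceProvider();
        var service = provider.GetRequiredService<IChangePubDateService>();
        Console.WriteLine(service.Describe());
    }
}
```

In an ASP.NET Core application the AddTransient line would typically go in the ConfigureServices method of the Startup class, and the framework, rather than your own code, asks the provider for the service when it sees a [FromServices] parameter.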

Note: I realise that is a lot to take in. If you need more information you can look at the GitHub repo associated with my book.

At the end of this you have a method you can call to access the database. The listing below shows an ASP.NET Core action method that calls the UpdateBook method of the ChangePubDateService class that I listed previously. Line 4 has the [FromServices] attribute that tells the DI provider to inject the ChangePubDateService class into the parameter called service.


[HttpPost]
[ValidateAntiForgeryToken]
public IActionResult ChangePubDate(ChangePubDateDto dto,
    [FromServices]IChangePubDateService service)
{
    service.UpdateBook(dto);
    return View("BookUpdated",
        "Successfully changed publication date");
}

NOTE: There is a way to do parameter injection into an ASP.NET MVC action method, but it involves overriding the default Binder. See the section “How DI is used in SampleMvcWebApp” at the bottom of this page, and my DiModelBinder in the associated GitHub repo.

The result of all this is that database code is nicely separated into its own class/method and your front-end code just has to call the method, not knowing what it contains. And unit testing is easy, as you can check the database access code on its own, and replace the same code in your front-end call with a mocking class that implements the interface.
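Here is a minimal sketch of that last point, using a hand-written fake rather than a mocking library. Several things are assumptions for the sake of a self-contained example: the cut-down Book and ChangePubDateDto classes, the idea that IChangePubDateService exposes the two public methods of the class shown earlier, and the FrontEnd.ChangePubDate method, which is a plain method standing in for the ASP.NET Core action.

```csharp
using System;

// Cut-down versions of the classes used in the earlier listing (assumed shapes).
public class Book
{
    public int BookId { get; set; }
    public DateTime PublishedOn { get; set; }
}

public class ChangePubDateDto
{
    public int BookId { get; set; }
    public string Title { get; set; }
    public DateTime PublishedOn { get; set; }
}

// Assumption: the interface exposes the two public methods of ChangePubDateService.
public interface IChangePubDateService
{
    ChangePubDateDto GetOriginal(int id);
    Book UpdateBook(ChangePubDateDto dto);
}

// Hand-written fake: no database, it just records what it was asked to do.
public class FakeChangePubDateService : IChangePubDateService
{
    public ChangePubDateDto LastUpdate { get; private set; }

    public ChangePubDateDto GetOriginal(int id)
    {
        return new ChangePubDateDto { BookId = id, Title = "Fake book" };
    }

    public Book UpdateBook(ChangePubDateDto dto)
    {
        LastUpdate = dto;
        return new Book { BookId = dto.BookId, PublishedOn = dto.PublishedOn };
    }
}

// Hypothetical stand-in for the front-end action method.
public static class FrontEnd
{
    public static string ChangePubDate(ChangePubDateDto dto, IChangePubDateService service)
    {
        service.UpdateBook(dto);
        return "Successfully changed publication date";
    }
}

public static class Program
{
    public static void Main()
    {
        var fake = new FakeChangePubDateService();
        var dto = new ChangePubDateDto { BookId = 1, PublishedOn = new DateTime(2017, 1, 1) };

        var message = FrontEnd.ChangePubDate(dto, fake);

        // The front-end code is checked without touching a database.
        if (!ReferenceEquals(fake.LastUpdate, dto))
            throw new Exception("UpdateBook was not called with the dto");
        Console.WriteLine(message);
    }
}
```

The real ChangePubDateService would be tested separately against a test database, so each test only has one thing to go wrong.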

5. Pattern: Building business logic – using Domain-Driven Design

Real-world applications are built to supply some sort of service, ranging from holding a simple list of things on your computer through to managing a nuclear reactor. Every different real-world problem has a set of rules, often referred to as business rules, or by the more generic name, domain rules.

Another book that I read some time ago that had a big impact on me was “Domain-Driven Design” by Eric Evans. The Domain-Driven Design (DDD) approach says that the business problem you are trying to solve must drive the whole of the development. Eric then goes on to explain how the business logic should be isolated from everything else other than the database classes so that you can give all your attention to what Eric Evans calls the “difficult task” of writing business logic.

There are lots of debates about whether EF Core is suitable for a DDD approach, because the business logic code is normally separate from the EF entity classes which it maps to the database. However, Eric Evans is pragmatic on this point and says in the section entitled “Working within your (database access) Frameworks” that, and I quote:

“In general, don’t fight your framework. Seek ways to keep the fundamentals of domain-driven design and let go of the specifics when your framework is antagonistic”
Page 157, Domain-Driven Design, by Eric Evans, 2004.

Note: I had to look up the word antagonistic: it means “showing or feeling active opposition or hostility towards someone or something”.

Over the years I have developed a DDD approach that works with EF and I have dedicated the whole of chapter 4 of my book to the topic of business logic because it is so important to get right. Here is a summary of the guidelines in that chapter:

  1. The business logic has first call on how the database structure is defined

Because the problem I am trying to solve, called the “Domain Model” by Eric Evans, is the heart of the application, it should define the way the whole application is designed. Therefore, I try to make the database structure, and the entity classes, match my business logic data needs as much as I can.

  2. The business logic should have no distractions

Writing the business logic is difficult enough in itself, so I isolate it from all the other application layers, other than the entity classes. That means when I write the business logic I only have to think about the business problem I am trying to fix. I leave the task of adapting the data for presentation to the Service Layer in my application.

  3. Business logic should think it is working on in-memory data

This is something Eric Evans taught me – write your business logic as if the data was in-memory. Of course there need to be some ‘load’ and ‘save’ parts, but for the core of my business logic I treat, as much as is practical, the data as if it is a normal, in-memory class or collection.

  4. Isolate the database access code into a separate project

This fairly new rule came out of writing an e-commerce application with some complex pricing and delivery rules. Before this I used EF directly in my business logic, but I found that it was hard to maintain, and difficult to performance tune. Now I have another project, which is a companion to the business logic, and holds all the database access code.

  5. The business logic should not call EF Core’s SaveChanges directly

The business logic does not call EF Core’s SaveChanges method directly. I have a class in the Service Layer whose job it is to run the business logic – this is a case of the Service Layer implementing the command pattern – and, if there are no errors, it calls SaveChanges. The main reason is to have control of whether to write the data out, but there are other benefits that I describe in the book.
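Guidelines 3 and 5 can be sketched in a few lines. This is not the code from the book: the IBizAction interface, the PlaceOrderAction class and the SaveIfValidRunner class are hypothetical names, the list of orders stands in for tracked EF entities, and the save step is passed in as a delegate (with EF it could be something like () => context.SaveChanges()) so that the example runs without a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface for a piece of business logic (not taken from the book).
public interface IBizAction<TIn, TOut>
{
    IReadOnlyList<string> Errors { get; }
    TOut Action(TIn input);
}

public class Order
{
    public int NumBooks { get; set; }
}

// Business logic: works on plain classes and collections, never calls SaveChanges.
public class PlaceOrderAction : IBizAction<int, Order>
{
    private readonly List<string> _errors = new List<string>();
    private readonly ICollection<Order> _orders; // stands in for tracked EF entities

    public PlaceOrderAction(ICollection<Order> orders)
    {
        _orders = orders;
    }

    public IReadOnlyList<string> Errors => _errors;

    public Order Action(int numBooks)
    {
        if (numBooks <= 0)
        {
            _errors.Add("You must order at least one book.");
            return null;
        }
        var order = new Order { NumBooks = numBooks };
        _orders.Add(order);
        return order;
    }
}

// Service Layer runner: runs the business logic and saves only if there were no errors.
public class SaveIfValidRunner<TIn, TOut>
{
    private readonly IBizAction<TIn, TOut> _action;
    private readonly Action _saveChanges;

    public SaveIfValidRunner(IBizAction<TIn, TOut> action, Action saveChanges)
    {
        _action = action;
        _saveChanges = saveChanges;
    }

    public IReadOnlyList<string> Errors => _action.Errors;
    public bool HasErrors => _action.Errors.Count > 0;

    public TOut RunAction(TIn input)
    {
        var result = _action.Action(input);
        if (!HasErrors)
            _saveChanges();
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>();
        var saveCount = 0;

        var good = new SaveIfValidRunner<int, Order>(
            new PlaceOrderAction(orders), () => saveCount++);
        good.RunAction(2);

        var bad = new SaveIfValidRunner<int, Order>(
            new PlaceOrderAction(orders), () => saveCount++);
        bad.RunAction(0);

        Console.WriteLine($"saves: {saveCount}, errors on bad order: {bad.Errors.Count}");
    }
}
```

Running this prints "saves: 1, errors on bad order: 1": the valid order is saved, while the invalid one reports an error and nothing is written.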

The figure below shows the original software architecture, but with the focus on how the business logic is handled. The five numbers, with comments, match the numbered guidelines above.

In my book I use the processing of an order for books as an example of a piece of business logic. You can see this business logic in action by going to the companion live site, http://efcoreinaction.com/, where you can ‘buy’ a book. The site uses an HTTP cookie to hold your basket and your identity (saves you having to log in). No money needed – as the terms and conditions say, you aren’t actually going to buy a book.

The code is too long to add to this article, but I have written another article called Architecture of Business Layer working with Entity Framework (Core and v6) which covers the same area in more detail and contains plenty of code examples.

6. Principle: Get your EF code working, but be ready to make it faster if you need to.

The recommended approach to developing software is to get it to work, and then worry about making it faster. Another more nuanced approach, attributed to Kent Beck, is Make it Work. Make it Right. Make it Fast. Either way, these principles say we should leave performance tuning to the end. I would add a second part: you should only performance tune if you need to.

In this article I am talking about database accesses via EF. I can develop pretty complex database accesses in EF really quickly – at least five times faster than using ADO.NET or Dapper. That covers the “get it working part”. The down side is that EF doesn’t always produce the best performing SQL commands: sometimes it’s because EF didn’t come up with a good SQL translation, and sometimes it’s because the LINQ code I wrote isn’t as efficient as I thought it was. The question is: does it matter?

For example, I developed a small e-commerce site (the code took me 10 months) which had a little over 100 different database accesses and about 20 tables. More than 60% of the database accesses were on the admin side, with maybe 10% of accesses that really mattered to the paying user.

To show this graphically I have picked out three features from my example book selling site and then graded them by two scales:

  • Vertically, what the user expects in terms of performance.
  • Horizontally, how difficult is the database access.

This gives you the figure below, with the top right highlighted as the area where we really need to think about performance tuning.

My analysis says that only the book search query needs work to improve it. The user is used to fast searches thanks to Google etc. and will get frustrated if my application is too slow. Looking at the complexity of the book search, which includes filtering on things like average user votes, I can see that it produces some rather complex database access commands.

It turns out that the EF Core code for my book search performs badly, but there is plenty I can do about it. In fact, I have mapped out a whole section towards the end of my book where I show how I can improve the book search in a series of stages, each getting more complex and taking more development time. They are:

  1. Can I improve the basic EF commands by rearranging or refining my EF code?
  2. Can I convert some or all of the EF code into direct SQL commands, calculated columns, stored procedures etc.?
  3. Can I change the database structure, such as de-normalising the database, to improve the search performance?

Improving the book search will take quite a bit of time, but it’s worth it in this application. Quite honestly the other features aren’t worth the effort, as they are fast enough using standard EF Core commands.

Planning for possible performance tuning

While I totally agree with the idea that you shouldn’t try to performance tune too early, it is sensible to plan that you might have to performance tune. All the other approaches I have described, especially the encapsulation of the database accesses, mean that my database code is a) clearly isolated, and b) open for performance tuning.

So, my practice is to develop database accesses quickly with EF, but organise the code so that I encapsulate the database code cleanly. Then, if I have a performance problem, I can tune the database access code with minimal impact on other parts of my application.

Conclusion

When I came back to programming in 2009 some of my early code ended up with long methods which were intertwined with each other – they worked, but the code was hard to understand, debug and maintain. Since then I have taken the principles and practices from the software giants and worked out how to apply them to real applications. With these I can write code that is robust, easy to test and easy to improve – and I write code much faster too.

However, for me, there is a gap between seeing a great idea and working out how it could help me. Some books I instantly loved, like Eric Evans’ “Domain-Driven Design”, but it still took two or three projects before I had an implementation that works really well for me. I don’t think I’m unusual in this, as learning any new technique takes some time for it to become smooth and natural.

There are many great software thinkers out there and some great principles and practices. So the next time you think “there must be a better way of doing this” then go and have a look for other people’s thoughts. It’s a journey, but the reward is becoming a better developer.

Happy coding!

Architecture of Business Layer – Calling multiple business methods in one HTTP request

In this article I explore the issues and problems of calling business logic in a web application. The aim is to supplement my last article, Architecture of Business Layer working with Entity Framework, which looks at the design of the business layer, by describing how a web application user interface or RESTful service might call the business logic to provide some useful function.

In this article I look at a specific issue around calling multiple business methods, some of which write to the database using Microsoft’s Entity Framework (EF) data access technology. My aim is to both show you what issues we need to consider and describe a solution that overcomes these issues. At the end I give you some further reading for the more advanced situations.

Calling business logic

In most web applications we have some data access methods and we have some business logic. Sometimes they are closely linked and sometimes they are separated. When building an application with EF as your database access technology, your choices are constrained. You can go the pure Domain-Driven Design (DDD) route and beef up your EF classes, or you can separate your EF data Entity classes into a Data Layer and place your Business Logic in a separate Layer (in this article I use ‘Layer’ to mean a separate .NET assembly).

I don’t want to get into the whole argument about what is the best way, but I have chosen what is a fairly normal arrangement where I keep my business logic in a separate layer to my data classes. This is not a pure DDD design, but I do design my business classes to treat the database, as far as possible, as an in-memory collection. Because of this I architect my business logic in a certain way and use a private library, called GenericActions, for handling calls (see the Architecture of Business Layer working with Entity Framework article for more on this).

Everything was going fine until I had to call multiple business methods in sequence to achieve the result I needed. It is something I have just come across in a project I am working on, and I have developed a solution, so it’s a hot topic for me. This is the story of why I needed to do this, what the problems were and how I implemented my final solution.

Calling multiple business methods to achieve one business goal

I am working on an e-commerce site and I came across one business process that needed two stages. The first stage checked the data input by the user and wrote the top level data out to the database. The second stage then imported some files and used the primary keys of the data written by the first stage as a reference name for the files. That meant that the first stage needed to complete, and EF’s .SaveChanges() needed to be called, so that the primary keys were in place before the second stage started.

Couldn’t I combine the two stages into one business method? Well, if you read my other article on Business Layer Architecture you will know that I don’t let my business logic call .SaveChanges(). I could break my own rule, but I have found this level of isolation really useful (read the other article for the full story). Anyway, even if I did that I would still have the problem that if the second half failed I would be left with a bad, half-completed database entry.

Now there are other ways round this particular issue (like use a GUID), but the idea of having to run multiple business methods in series is one that I thought might happen. When the first case occurred I hand-coded a solution, but when I came across another case it was time to create a proper solution.

Solution design

I implemented a solution inside my GenericActions library. While this is private I can outline my approach so you can implement your own version if you need to.

Because my solution uses EF’s transaction control, I called the key internal class ‘Transactional Runner’, referred to as TR from now on. The primary feature of TR is of course its use of EF version 6’s transaction control methods. When running multiple business methods GenericActions creates a DbContextTransaction using the context.Database.BeginTransaction() method. This means that all writing to the database is controlled and not committed until the DbContextTransaction.Commit() method is called. If a problem is found then the DbContextTransaction.Rollback() method can be called and all database writes done within the transaction scope are undone.

The diagram below shows it running three business methods in turn. The first two write to the database, but the third fails. The runner then rolls back the changes using EF’s Rollback(). If they had all succeeded then it would have called EF’s Commit() to make the changes permanent and available to other applications/threads.

[Figure: GenericActions Transactional Runner]

The simplified code for this would be:

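using (var context = new MyDbContext())
{
    using (var dbContextTransaction = context.Database.BeginTransaction())
    {
        try
        {
            // Each business method receives the previous method's result.
            var biz1Result = CallBizMethod1(input);
            context.SaveChanges();
            var biz2Result = CallBizMethod2(biz1Result);
            context.SaveChanges();
            var biz3Result = CallBizMethod3(biz2Result);
            context.SaveChanges();

            // All three succeeded, so make the changes permanent.
            dbContextTransaction.Commit();
            return biz3Result;
        }
        catch (Exception)
        {
            // Undo every write made inside the transaction.
            dbContextTransaction.Rollback();
            // Assumption: rethrow so the caller sees the failure;
            // returning an error result would work too.
            throw;
        }
    }
}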

If you look at the code above you will see that I pass data between the business methods. This allows each method to pass information on to the next, and so on. If you think back to the e-commerce problem that started this off, the first business method creates the database entities, which are written to the database. However the second method needs to know which entity to use, which is why the first method passes the entity, with its newly created primary key, on to the second method.

As you would expect my generic solution is quite a bit more complicated than that, but the code above should give you the idea. If you are writing a solution you might like to consider that some business methods are really simple, with no database writing and maybe just a few lines of code, while some do a lot, with async writes to the database and maybe async outside requests. It is worth a bit of thought to optimise for each type. This leads to two things I do.

  • I have a normal synchronous TR and also an async TR, which is what I mainly use. However I have written the async TR so that it can take a mix of sync and async business methods and only runs the async ones in async mode. This means small, code-only business methods are not burdened with unnecessary Tasking, which makes writing/debugging them simpler, and they run more quickly.
  • Not all of my business methods write to the database, so I have an attribute on the business classes that access the database and only call SaveChanges() on those methods.
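
The two points above can be sketched without EF at all. What follows is a minimal illustration and not the GenericActions code: IUnitOfWork, FakeUnitOfWork, BizStep and MiniRunner are hypothetical names, the fake stands in for a DbContext plus its DbContextTransaction, and a simple WritesToDb flag is swapped in for the attribute I described.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a DbContext plus its transaction.
public interface IUnitOfWork
{
    void SaveChanges();
    void Commit();
    void Rollback();
}

// Fake that only records calls, so the flow can be seen without a database.
public class FakeUnitOfWork : IUnitOfWork
{
    public List<string> Log { get; } = new List<string>();
    public void SaveChanges() => Log.Add("SaveChanges");
    public void Commit() => Log.Add("Commit");
    public void Rollback() => Log.Add("Rollback");
}

// One step in the series: a sync OR an async delegate, plus a flag
// (an assumption replacing the attribute) saying whether it writes to the database.
public class BizStep
{
    public Func<object, object> Sync { get; set; }
    public Func<object, Task<object>> Async { get; set; }
    public bool WritesToDb { get; set; }
}

public static class MiniRunner
{
    public static async Task<object> RunAsync(IUnitOfWork uow, object input, params BizStep[] steps)
    {
        try
        {
            var data = input;
            foreach (var step in steps)
            {
                // Only async steps are awaited; sync steps are plain method calls.
                data = step.Async != null ? await step.Async(data) : step.Sync(data);
                // SaveChanges is only called for steps flagged as writing to the database.
                if (step.WritesToDb) uow.SaveChanges();
            }
            uow.Commit();
            return data;
        }
        catch
        {
            uow.Rollback();
            throw;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var uow = new FakeUnitOfWork();
        var result = MiniRunner.RunAsync(uow, 1,
            new BizStep { Sync = x => (int)x + 1 },
            new BizStep { Async = async x => { await Task.Delay(1); return (object)((int)x * 10); }, WritesToDb = true }
        ).GetAwaiter().GetResult();
        Console.WriteLine(result);                     // prints 20
        Console.WriteLine(string.Join(",", uow.Log));  // prints SaveChanges,Commit
    }
}
```

In this sketch the first, code-only step never touches a Task or SaveChanges(), while the second step is awaited and saved. If any step throws, the only call recorded on the fake is Rollback.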

Warning: use of DbContext needs careful thought

In looking at this I had to think long and hard about whether my final solution would work properly in a web application. It will, but only because of specific constraints I have placed on my libraries. It is worth spelling out why, so that you understand the issues if you come to write your own version.

The sample code I gave you will work fine in all cases, but in the real world you would not normally create the DbContext in that way. I use Autofac Dependency Injection to create my DbContext and it has an .InstancePerLifetimeScope() on it. This means that a single instance of DbContext is created for each HTTP request. This is important for the following reasons:

  • If two business methods look at the same DbContext then any changes done by one method will be seen by the second method. If they each had their own DbContext then, depending on when the DbContexts accessed the database, they might be out of step.
  • If a method writes to the database, but then fails and .SaveChanges() is not called, then the writes are discarded before the next HTTP request. Because I control when .SaveChanges() is called, by using either my open-source GenericServices (database access) or private GenericActions (business logic calling), I know a stray .SaveChanges() will not happen once a method has returned an error. This means any data a failed method has written to the DbContext will not be persisted.

However in another application I ran a long-running task, and there I had to create a separate DbContext. Why was that? The primary reason is that DbContext is not thread-safe, so you would get an error if you used the same instance in multiple threads. Getting an error is a good thing because it would cause all sorts of problems elsewhere if it didn’t.

So, if I am running multiple methods, some of which are async, why does my solution work with a single instance of DbContext? It is because the methods run one after another, never at the same time. Awaiting an async method simply releases the thread while it is waiting. It does not run things in parallel, so all is fine.
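
Here is a small sketch of that point, assuming nothing beyond the .NET base class library. SequentialAsyncDemo and its Trace list are hypothetical: the list plays the part of a shared object that is not thread-safe, such as a DbContext, and Task.Delay stands in for a database or network call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialAsyncDemo
{
    // Stand-in for a shared, non-thread-safe object such as a DbContext.
    private static readonly List<string> Trace = new List<string>();

    private static async Task StepAsync(string name)
    {
        Trace.Add(name + " start");
        await Task.Delay(10);   // the thread is released here while we wait
        Trace.Add(name + " end");
    }

    public static async Task<List<string>> RunAsync()
    {
        Trace.Clear();
        await StepAsync("biz1");  // biz2 cannot start until biz1 has completed
        await StepAsync("biz2");
        return new List<string>(Trace);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunAsync().GetAwaiter().GetResult()));
        // prints: biz1 start, biz1 end, biz2 start, biz2 end
    }
}
```

Because each step is awaited before the next one begins, the two steps never touch the shared list at the same time. Starting both tasks first and only then awaiting them would be a different matter, and that is the case a single DbContext cannot handle.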

One of my readers, khalil, recommended a great article on this whole subject called ‘Managing DbContext the right way with Entity Framework 6: an in-depth guide’. I found this very helpful and did consider using the library that goes with it. However I am clear that, with the limits I have placed on my libraries, my current design will work reliably for now. Maybe I will have to look at this later if I need to extend to parallel running, but that is a fight for another day.

Conclusion

I have shown that there are cases where multiple business methods, one or more of which write to the database, need to be run in series. I have then described an approach using Entity Framework where you can ensure that the database is only updated if the whole series of business methods finishes successfully. While the solution is fairly simple there are both performance and tasking issues to consider.

Hopefully this article will have given you some pointers on what the problem is, what sort of solution you could use and what to watch out for. If anything is unclear then please leave a comment and I will try to improve the information.

Happy coding!

What makes a good software library?

If you are a professional software developer then you will have used lots of software libraries/frameworks to help you build an application. In fact nowadays one of the key skills a developer needs is picking the right software libraries to use in a given situation. I also expect most development teams have built their own software libraries to help them with some specific aspect of the work they do.

In this article I explore what I think makes a good, even great, software library. I would appreciate your thoughts on this so do add comments at the end.

NOTE: By software libraries I mean a software subsystem with some sort of Application Programming Interface (API). In the Microsoft/.NET approach software libraries are normally a Class Library with a known API, which is downloaded as a .dll file or, more likely, via NuGet. JavaScript libraries are normally a series of JavaScript files, possibly provided by Bower or Node.js npm, that again have some sort of API.

What motivated me to write this article?

I am building a number of .NET software libraries, one of which is an open-source project called GenericServices. Recently I built a small, but fairly complex test application to check out GenericServices and various other supporting libraries. This ‘stress test’ was really interesting in ways I did not expect (you can read more about what I did in this article).

In writing the code I found some really complex uses ‘just worked’. In fact in a couple of cases I wrote Unit Tests that I thought should fail but they passed first time, which really threw me for a while. In other parts I had a problem in someone else’s software library and even when looking at their code I hadn’t a clue how to fix the problem.

All of this got me thinking about “what makes a good software library?” This is especially important to me as I want my libraries to be easy to use by anybody. I have therefore been thinking about this for some months and I thought I would write down my thoughts.

Categorising software libraries

To help me in considering this issue I came up with categories for the types of software libraries I have come across. This helped me work out what I thought worked well and what didn’t, and more importantly why the library was like that.

Over a few months I came up with six categorisations for software libraries. They are:

1. Normal software libraries

These are the normal, run of the mill software libraries. Their APIs are predictable and maybe a bit boring, but they get the job done. However they are great for simple problems and can save you time.

I would say that some loggers, like Log4Net, are like this. Log4Net is a fine package and it does a great job. I understand the problem of logging and I could write it myself, but this library has already implemented a great solution so I use that. Job done.

2. Verbose software libraries

These are run-of-the-mill software libraries. They work, but the APIs might be a bit long-winded or verbose.

One example of a library I use that has some aspects of this is Kendo UI. I pay good money to use this library because of what it provides, but the API can be hard work. It uses a jQuery-type approach where everything is configurable. That can be useful, but in Kendo UI there can be lots of things that need to be configured. For instance, for Charts there are over a thousand configuration options, which makes finding the right one quite hard work. In contrast NVD3.js charting has separate methods for each chart type, each of which exposes only the configuration needed by that chart.

3. Clever software libraries

These are software libraries that do really clever things, but as a developer you really need to know what you are doing. You get comments like “the trick is…”

While not technically a software library I think SQL, T-SQL in this case, is an example of a ‘clever’ library/API. T-SQL is the language for talking to a SQL server, which is a form of API, but called a Domain Specific Language. Using SQL I have built some amazing queries which perform complex modelling really quickly, but it is NOT simple. That is because the problem isn’t simple and the SQL API doesn’t try to hide it from you.

4. ‘Magic’ software libraries

These are the software libraries that do things that are useful, but they are so intricate that it is hard to understand what they are doing – I call these ‘magic’ software libraries. If they are written well they can be great, but if the underlying problem is complex then they are hard to write.

I think Entity Framework (EF) is a really good example of a good ‘magic’ software library. EF is a database front-end library, known as an ORM, which tries to hide the complexities of the T-SQL API mentioned above to make it easier to work with databases. It is a great library, but it does take a while to really understand EF because there is so much ‘magic’ going on inside it.

5. ‘Mysterious’ software libraries

These are software libraries that are really hard to understand. This could simply be because the API is very complex. However ‘magic’ software libraries can often become ‘mysterious’ when their standard ‘magic’ doesn’t work.

Entity Framework (EF) can suffer from this in places. For instance if you have a many-to-many relationship in a database then EF inserts an extra database table, which it looks after itself to make the relationship really easy to use. The problem is this ‘magic’ has some limitations and when it doesn’t work you need to know how to fix it. In actual fact my post ‘Updating a many to many relationship in entity framework’ gets the highest number of hits of all of my posts.

6. Elegant software libraries

These are software libraries that are a joy to use. They may do simple or clever things, but they are a) short and to the point and b) you are clear on what is happening. This is what all of us strive to achieve when we write a software library.

One example of this is Microsoft’s LINQ. LINQ is a data query language which I think is really elegant. Yes, there are a few complicated bits in it, like Grouping, but most of the time it’s easy to string together a series of commands to achieve quite complex things. It is clean, clear and only mentions the things you need.
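
To give a flavour of what I mean by stringing commands together, here is a minimal sketch using LINQ to Objects. The Book class, the sample data and the CheapTitlesByAuthor method are made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqDemo
{
    // Hypothetical data class, used only for this example.
    public class Book
    {
        public string Title { get; set; }
        public string Author { get; set; }
        public decimal Price { get; set; }
    }

    // Filter, sort and project in one readable chain of LINQ commands.
    public static List<string> CheapTitlesByAuthor(IEnumerable<Book> books, string author, decimal maxPrice)
    {
        return books
            .Where(b => b.Author == author && b.Price <= maxPrice) // keep only the matching books
            .OrderBy(b => b.Price)                                 // cheapest first
            .Select(b => b.Title)                                  // only the part we need
            .ToList();
    }

    public static void Main()
    {
        var books = new List<Book>
        {
            new Book { Title = "A", Author = "Ann", Price = 30m },
            new Book { Title = "B", Author = "Bob", Price = 10m },
            new Book { Title = "C", Author = "Ann", Price = 12m },
            new Book { Title = "D", Author = "Ann", Price = 50m }
        };
        Console.WriteLine(string.Join(", ", CheapTitlesByAuthor(books, "Ann", 40m)));
        // prints: C, A
    }
}
```

Each command in the chain mentions only the one thing it does, which is what makes the whole query easy to read.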

NOTE: I haven’t included a ‘must work’ category as I take that as a prerequisite. However some libraries, especially ones that have not been used much, can be suspect. Caveat emptor.

Having finished these software categories let us look at some particular issues that we should consider when building software libraries.

Think about how the developer will use your library

Here are some ideas to make you as the library writer think about the experience the end-user developer has when using your library.

1. Think from the user’s perspective

As the developer of a library we tend to be focused on the implementation, not on the user. That makes the library writer think about what he/she wants to achieve, not how the user wants to use it. Let me give you an example from my own experience of developing my GenericServices library.

When I started development of GenericServices I saw that the access to the database and the calling of business logic had some similarities. I therefore ‘saved time’ by writing code that handled both. Problem is the code was a little bit complex and a little bit opaque to the end user.

This became apparent when I started to use it. Even I might hesitate for a second or two about a certain term, so what would another developer feel like? I therefore split it into two libraries, with GenericServices focused on the core problem of database access. Yes, there is some duplication, but it is much clearer to the user/developer and the two libraries are much simpler.

2. Focus on the most important use-cases

A Use Case is a term to define how a user, in this case a software developer, will use your tools, in this case the API of your software library, to achieve a specific action. I found it very useful to order the use cases.

The most obvious way to order is on how often a particular use case is likely to be used. Sometimes there isn’t this sort of split, but often there is, and it is sometimes called the 80/20 rule. If you do have this split then you need to make the API for the ‘normal’ (80%) cases short and easy. The ‘special’ (20%) use cases still need to be supported, but the API can take a few more lines of code to achieve them so that the ‘normal’ use cases stay simple. This can make a big difference to how many lines of code a developer will need to write in a complete project.
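
One way to get this split, sketched below, is a short overload with sensible defaults for the ‘normal’ case and a second overload taking an options object for the ‘special’ case. Lister and JoinOptions are hypothetical names chosen for this illustration and are not taken from GenericServices.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical options object: only the 'special' cases need to touch it.
public class JoinOptions
{
    public string Separator { get; set; } = ", ";
    public bool SkipEmpty { get; set; } = true;
}

public static class Lister
{
    // The 'normal' (80%) case: one short call with sensible defaults.
    public static string Join(IEnumerable<string> items)
    {
        return Join(items, new JoinOptions());
    }

    // The 'special' (20%) case: still supported, but it costs the caller a little more code.
    public static string Join(IEnumerable<string> items, JoinOptions options)
    {
        var kept = options.SkipEmpty ? items.Where(s => !string.IsNullOrEmpty(s)) : items;
        return string.Join(options.Separator, kept);
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new[] { "a", "", "b" };
        Console.WriteLine(Lister.Join(items));   // prints: a, b
        Console.WriteLine(Lister.Join(items, new JoinOptions { Separator = "|", SkipEmpty = false }));
        // prints: a||b
    }
}
```

The common call stays a one-liner, and the extra configuration only appears in the code of the developers who actually need it.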

In the case of my GenericServices library I had another ordering based on the architecture used in the data layer: a CRUD design or a Domain-Driven Design (DDD) design. These two design approaches have very different needs when it comes to adding or changing data. DDD is more difficult to support because the data layer imposes some rules that GenericServices has to abide by. I managed to support DDD while having no impact on the simpler CRUD usages, which was great. Maybe you have another perspective/ordering that will help you focus your API.

3. Principle of least astonishment

I came across the term ‘Principle of least astonishment’ in one of Addy Osmani’s blogs, although he didn’t coin the term. Addy explains this as follows:

Try not to violate the principle of least astonishment, i.e. the users of your component API shouldn’t be surprised by behaviour. Hold this true and you’ll have happier users and a happier team.

I really like this term. It sums up the need for the API to be a) obvious, b) consistent and c) in tune with what developers expect from a library.

To that end, if you are developing for the Microsoft .NET arena I would recommend the book ‘Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries (Microsoft .NET Development)’. This book is packed with well-researched guidelines which will make your .NET library look more like the standard .NET libraries.

NOTE: This book was written in 2008 so it does not include LINQ Lambda and the newer async/await, both of which I think change some of the guidelines. However I still recommend this book.

Complex problems demand a lot of thought

If your problem is complex then you have to be really careful when contemplating writing a good library. What you are normally doing in a library is providing a set of commands (API) that help you do a job more quickly than if you wrote the actual code yourself. This often requires you to ‘abstract’ the underlying problem into a set of commands.

With small problems like a logger the amount of abstraction isn’t that great, but the more complex the problem, the more you are abstracting it. The trouble comes if your abstraction is not perfect, or as it is called, ‘a leaky abstraction’. In the post ‘The Law of Leaky Abstractions’ Joel Spolsky talks about abstractions. He says:

All non-trivial abstractions, to some degree, are leaky.

Abstractions fail. Sometimes a little, sometimes a lot. There’s leakage. Things go wrong. It happens all over the place when you have abstractions.

Get the abstraction right and you have a ‘magic’, possibly Elegant design. Get it wrong and you can have a ‘Mysterious’ software library that people will give up on. Here are a few ideas I have on helping with this problem:

  • Include ‘get out of jail’ access points
    By this I mean the library tries to abstract things but it still provides direct access to the underlying system. Entity Framework does this by including access to direct SQL commands for the times when its abstraction gets in the way.
  • Don’t bite off more than you can chew
    Sometimes a complex problem can be broken down into sub-problems. This allows you to build a few smaller, focused libraries rather than one big complex library. I have used this a few times and found it has worked well for me.
  • If all else fails, don’t abstract too much
    The T-SQL mentioned earlier does this. It does not abstract the problem much at all, which leaves it up to the developer to learn the complexities of the problem space. That way it minimises the leakiness of the abstraction.

Striving for Elegance

Many years ago I read a brilliant book by Eric Evans called Domain-Driven Design. In this book Eric devotes a whole section to “Refactoring towards deeper insight”. His message is that we are used to refactoring to make the software better, but what about refactoring to improve what the software does? My experience is that making a good software library comes from a series of “deeper insights” followed by some key refactoring.

The development of my GenericServices library proceeded with a mixture of hard work and “deeper insights”. Below is a list of the “deeper insight” moments that I think improved the library.

  1. The service layer has a lot of nearly duplicate code. I could build a library to do that.
  2. Generics would be a really good approach to use.
  3. The class signatures are hard to use, especially with dependency injection. I need to make them simpler.
  4. Having database and business logic together is confusing. I should split them.
  5. The stress test (see next section below) has shown up some issues I need to improve.
  6. I need computed parameters. Let’s use DelegateDecompiler to do that.

I already have my eyes on a few more changes:

  1. I could simplify even further the class signatures.
  2. Domain-Driven Design databases are important to me. I could improve the support for that.

Of course I would love to have been able to go immediately to the final design, but I just don’t think I could have done that. You have to live with a library and see it being used to find out what is wrong. In fact my eldest son Ben, who is a good programmer in his own right, says “Build one to throw away”. It does seem like you need to build something to see what is wrong before you can make it better. Well, I do at least.

Stress your library to find the pinch points

I built hundreds of Unit Tests and even built a whole demo web site to show GenericServices in action. This weeded out lots of issues, but because I wrote the code it was subconsciously biased towards the GenericServices way of doing things.

Normally the ‘stress test’ is on a real project of some size. However I didn’t have a big project on the go at that stage so I decided to stress test GenericServices myself by using someone else’s database. I used the Microsoft AdventureWorks2012Lite database which is a) not designed by me and b) has some database features I had not come across before. That really pushed my library in ways I had not done before, and revealed some areas that needed more work. I recommend that anyone who is trying to perfect a software library ‘stress tests’ it.

The users of the library will test your library, but not as well as you can and they aren’t likely to feed back if they just find it too hard. For the investment of only two weeks I could really dig into what worked/did not work myself and I think it radically improved my library. Maybe you can stress your library on a real project you are working on, but allow time to tweak the library.

You can read about some of the things I found in the article called ‘Using Entity Framework with an Existing Database: User Interface’ on the Simple-Talk web site.

The end

I hope you have enjoyed reading this article and I expect you have your own views/experiences. I would love to hear them, as it is only by sharing ideas that we get better at what we do.