Is the Repository pattern useful with Entity Framework?

05/10/2014
Quick Summary
This is, hopefully, a critical review of whether the repository and UnitOfWork pattern is still useful with the modern implementations of Microsoft’s Entity Framework ORM. I look at the reasons why people are suggesting the repository pattern is not useful and compare this with my own experience of using the repository and UnitOfWork over many years.

I have just finished the first release of Spatial Modeller™ , a medium sized ASP.NET MVC web application. I am now undertaking a critical review of its design and implementation to see what I could improve in V2. The design of Spatial Modeller™ is a fairly standard four layer architecture.

Four layer design drawing

Software design of Spatial Modeller Web app

I have use the repository pattern and UnitOfWork pattern over many years, even back in the ADO.NET days. In Spatial Modeller™ I think I have perfected the design and use of these patterns and I am quite pleased with how it helps the overall design.

However we don’t learn unless we hear from others that have a counter view, so let me try and summarise the arguments I have seen against the repository/Unit of Work pattern.

What people are saying against the repository pattern

In researching as part of my review of the current Spatial Modeller™ design I found some blog posts that make a compelling case for ditching the repository. The most cogent and well thought-out post of this kind is ‘Repositories On Top UnitOfWork Are Not a Good Idea’. Rob Conery’s main point is that the repository & UnitOfWork just duplicates what Entity Framework (EF) DbContext give you anyway, so why hide a perfectly good framework behind a façade that adds no value. What Rob calls ‘this over-abstraction silliness’.

Another blog is ‘Why Entity Framework renders the Repository pattern obsolete’. In this Isaac Abraham adds that repository doesn’t make testing any easier, which is one thing it was supposed to do.

I should also say I found another blog, ‘Say No to the Repository Pattern in your DAL’ which, says that using a repository removes access to Linq querying, ability to include/prefetch or aync support (EF 6). However this not true if your repository passes IQueryable items as this allows all these features.

So, are they right?

Reviewing my use of repository and UnitOfWork patterns

I have been using repositories for a while now and my reflection is that when the ORMs weren’t so good then I really needed repositories.

I build a geographic modelling application for a project to improve HIV/AIDS testing in South Africa. I used LINQ SQL, but because it didn’t support spatial parts I had to write a lot of T-SQL stored procedures and use ADO.NET to code access the spatial values. The database code was therefore a pretty complicated mix of technologies and a repository pattern acted as a good Façade to make it look seamless. I definitely think the software design was helped by the repository pattern.

However EF has come a long way since then. Spatial Modeller™ used EF 5, which was the first version to support spatial types (and enums, which is nice). I used the repository pattern and they have worked well, but I don’t think it added as much value as in the South Africa project as the EF code was pretty clean. Therefore I sympathise with the sentiments of Rob etc.

My views on the pros and cons of repository and UnitOfWork patterns

Let me try and review the pros/cons of the repository/UnitOfWork patterns in as even-handed way as I can. Here are my views.

Benefits of repository and UnitOfWork patterns (with best first)

  1. Domain-Driven Design: I find well designed repositories can present a much more domain specific view of the database. Commands like GetAllMembersInLayer make a lot more sense than a complex linq command. I think this is a significant advantage. However Rob Conery’s post suggests another solution using Command/Query Objects and references an excellent post by Jimmy Bogard called ‘Favor query objects over repositories’.
  2. Aggregation: I have some quite complex geographic data consisting of six or seven closely interrelated classes. These are treated as one entity, with one repository accessing them as a group. This hides complexity and I really like this.
  3. Security of data: one problem with EF is if you load data normally and then change it by accident it will update the database on the next SaveChanges. My repositories have are very specific what it returns, with GetTracked and GetUntracked versions of commands. And for things like audit trails I only ever return them as untracked. (see box below right if you not clear on what EF tracking is).
  4. Hiding complex T- SQL commands: As I described before sometimes database accesses need a more sophisticated command that needs T-SQL. My view these should be only in one place to help maintenance. Again I should point out that Rob Conery’s post Command/Query Objects (see item 1) could also handle this.
Not sure what EF Tracked is?
A normal EF database command using DbContext returns ‘attached’ classes. When SaveChanges() is called it checks these attached classes and writes any changed data back to the database. You need to append AsNoTracking() to the linq command if you want a read-only copy. Untracked classes are also slightly faster to load.

Non-benefits of repository and UnitOfWork patterns are:

  1. More code: Repositories and UnitOfWork is more code that needs developing, maintaining and testing.
  2. Testing: I totally agree with Isaac Abraham. Repositories are no easier to mock than IDbSet. I have tried on many occasions to mock the database, but with complex, interrelated classes it is really hard to get right. I now use EF and a test database in my unit tests for most of the complex models. It’s a bit slower but all my mocks could not handle the relational fixup that EF does.

Conclusion

I have to say I think Command/Query Objects mentioned by Rob Conery and described in detail in Jimmy Bogard’s post called ‘Favor query objects over repositories’ have a lot going for them. While the repository/UnitOfWork pattern has served me well maybe EF has progressed enough to use it directly, with the help of Command/Query Objects to make access more Domain-Driven in nature.

What I do in a case like this is build a small app to try out a new approach. I ensure that the app has a few complex things that I know have proved a problem in the past, but small enough to not take too long. This fits in well with me reviewing the design of Spatial Modeller™ before starting on version 2. I already have a small app looking at using t4 code generation to produce some of the boiler plate code and I will extend the app to try a non-repository approach. Should be interesting.

Update after 4 months

Well, I have pursued the non-repository approach and got somewhere nice. I didn’t go the T4 route in the end but built a library called GenericServices with a live example web site at samplemvcwebapp.net that you can try.  Both the GenericServices library and the code for the web site (see SampleMvcWebApp) are open-source projects and you are free to look at them and gain ideas from them.

The result of all this is that I am converted and fully agree with Rob Conery, and others, assertion that the repository pattern is not helpful in EF. To unpack this a bit more here are the points I have learnt:

Improvements

  • Creating a good repository pattern is hard and I used to find myself fiddling with the repository implementation during the project as I hit new issues. That is all gone.
  • Using EF directly allows me to use the full range of commands, like load to load relationships that are not currently loaded. I now realise when I used a repository pattern I was having to tweak the code to get round things that EF had commands to solve.

Negatives (and how I got around them)

  • Because you are using EF then you can put bits of database code anywhere you want and its not very obvious what is changing the database. Therefore I have focused the EF core code into one place, with clear extension points. That is what  the GenericServices library is all about.
  • You do need to have some form of ‘query objects’ to make it more obvious what is going on – a long string of EF commands isn’t as readable as ‘GetAllUnpaidInvoices’. In the GenericServices library I ended up using DTOs (Data Transfer Objects) because they could pull together other things needed in the real world.

On this last point I would also recommend ‘s article ‘Giving Clarity to LINQ Queries by Extending Expressions‘ which has some excellent ideas and source code for building ‘query objects’. Below I have listed one of the many examples he gives, which I think are really readable. The only negative is that Edward uses a repository pattern, but he does that because he is looking at IQueryable expressions in general rather than EF in particular.

public IQueryable<Post> GetVisiblePosts (DateTime today)
{
   return postRepository   //I would use 'db.Posts' here instead of 'postRepository'
          .GetAll()
          .ArePublished()
          .PostedOnOrBefore(today);
}

I hope that is helpful to people. Happy coding!