GenericServices Masterclass: a deeper look at deleting entries

Last Updated: June 24, 2015 | Created: June 23, 2015

I have been building a new application using GenericServices and I thought it would be useful to others to reflect on the approaches I have used when working with more complex data. In this second master class article I am looking at deleting entries, i.e. data rows, in a SQL Server database using Entity Framework. The first master class article looked at creating and updating entries and can be found here.

I wrote this article because the more I used GenericServices’ IDeleteService in real applications, the more I ended up using the more complex DeleteWithRelationships version of the command. This article explains why, and suggests some ways of tackling things, including not deleting at all!

What is GenericServices?

For those of you who are not familiar with GenericServices it is an open-source .NET library designed to simplify the interface between an ASP.NET MVC application and a database handled by Microsoft’s Entity Framework (EF) data access technology. You can read more about it at https://github.com/JonPSmith/GenericServices where the Readme file has links to two example web sites. The GenericServices’ Wiki also has lots more information about the library.

While this article talks about GenericServices I think anyone who uses EF might find it useful as it deals with the basics behind deleting entities.

An introduction to relationships and deletion

Note: If you already understand the different types of referential relationships in a relational database and how they affect deletion then you can skip this part and go to the section titled Deleting with GenericServices.

On the face of it deleting rows of data, known in EF as entities, seems like a simple action. In practice, in relational databases like SQL Server, it can have far-reaching consequences, some of which might not be obvious at first glance when using EF. Let me give you a simple example to show some different aspects of deletion.

Below is a database diagram of the data used in the example application SampleMvcWebApp, which contains:

  1. A selection of Tags, e.g. Programming, My Opinion, Entertainment, etc., that can be applied to a post.
  2. A list of Posts, where each is an article on some specific topic.
  3. A list of Bloggers, which are the authors of each of the Posts.

[Database diagram: the Blog, Post and Tag tables and their relationships]

Now this example shows the two most used types of relationships we can have in a relational database. They are:

  • One-to-many (1-*): A Post has one Blogger, and a Blogger can have from zero to many Posts.
  • Many-to-many (*-*): A Post can have zero to many Tags, and a Tag may be used in zero to many Posts.

The other type of basic relationship is the one-to-one relationship, where one entity/table is connected to exactly one other. You don’t see this type of relationship so much because, if both ends are required (see below), the two tables can simply be combined. However you do see one-to-one relationships where one end is optional.

The other aspect of the ‘One’ part of a relationship is whether it is required or optional (required: 1, optional: 0..1). An example of a required relationship is that a Post must have a Blogger, as that defines the author. An example of an optional relationship would be allowing the Blogger to add an optional ‘more about the author’ entry in another table. The author can choose to set up that data, or not bother.

How EF models these relationships

EF has multiple ways of setting up relationships and I would refer you to its documentation. Here is my super simple explanation to help you relate the section above to Entity Framework classes (a code sketch follows the list):

  • The ‘One’ end of a relationship is represented by a foreign key property that holds the key (or multiple properties if it is a composite key).
    • If the key property(ies) is/are nullable, e.g. int? or string without [Required] attribute, then the relationship is optional.
    • If the key property(ies) is/are not nullable, e.g. int or string with [Required] attribute, then the relationship is required. See the BlogId property in the Post class of SampleMvcWebApp.
  • The ‘Many’ end of a relationship is represented by a collection, normally ICollection<T> where T is the class of the other end of the relationship. See the Posts property in the Blog class of SampleMvcWebApp.
  • Many-to-Many relationships have Collections at each end, see the Tags property in the Post class and the Posts property in the Tag Class of SampleMvcWebApp.
    EF is very clever on many-to-many relationships and automatically creates a new table that links the two classes. See my article Updating a many to many relationship in entity framework for a detailed look at how that works.
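To make the points above concrete, here is a minimal sketch of what such classes might look like. It roughly follows the Blog/Post/Tag example, but the property names are illustrative rather than the actual SampleMvcWebApp code.

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }

    //The 'Many' end of the one-to-many: a Blog can have many Posts
    public ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }

    //Required one-to-many: the foreign key is non-nullable, so a Post must have a Blogger
    public int BlogId { get; set; }
    public Blog Blogger { get; set; }

    //Many-to-many: collections at both ends; EF creates the hidden link table for you
    public ICollection<Tag> Tags { get; set; }
}

public class Tag
{
    public int TagId { get; set; }
    public string Name { get; set; }

    public ICollection<Post> Posts { get; set; }
}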

How do these relationships affect deletion?

If you delete something in a relational database that has some sort of relationship then it is going to affect the other parts of the relationship. Sometimes the consequences are so small that they don’t cause a problem. However, especially in one-to-one or one-to-many relationships, a delete has consequences that you need to think about. Again, let me give you two examples you can actually try yourself on the SampleMvcWebApp web site.

  1. Delete a Many-to-Many relationship. If you go to the Tags Page of SampleMvcWebApp and delete a Tag, then when you look at the list of Posts you will see that that tag has been removed from all Posts that used it (now press Reset Blogs Data to get it back).
  2. Delete a One-to-Many relationship. However, if we go to the list of Bloggers on SampleMvcWebApp and delete one of the Bloggers, then when you look at the list of Posts you will see that all Posts by that author have gone (now press Reset Blogs Data to get them all back).

So, what has happened in the second case? Well, here the database could not keep the relationship intact without doing something, because the Post’s Blog link wasn’t optional. There were two possibilities: either it could delete all the Posts for that Blogger, or it could refuse to carry out the delete.

By default EF sets up what are called ‘cascade deletes‘, which is how SampleMvcWebApp is configured. That is what deleted the related Posts for that Blogger. If we turned off cascade deletes then we would get a ‘DELETE statement conflicted with COLUMN REFERENCE’ error (547) and the delete would fail.

The simplified rule is if entity A is involved in a required relationship with entity B then when A is deleted something has to happen: either B is deleted or the delete fails. Which happens depends on how you configure EF.
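As a hedged sketch (assuming entity classes like the Post/Blog ones above), this is how you could turn cascade deletes off in EF 6’s Fluent API so the delete fails with error 547 instead of removing the Posts:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    //Stop a delete of a Blog cascading down to its Posts
    modelBuilder.Entity<Post>()
        .HasRequired(p => p.Blogger)
        .WithMany(b => b.Posts)
        .HasForeignKey(p => p.BlogId)
        .WillCascadeOnDelete(false);

    //Or turn cascade deletes off for every one-to-many relationship:
    //modelBuilder.Conventions.Remove<OneToManyCascadeDeleteConvention>();
}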


Deleting with GenericServices

GenericServices has a service called IDeleteService which has two delete methods:

  1. Delete<T>(key1, key2…) (sync and async) which deletes the row with the given key, or keys if it has a composite key, from the EF entity referred to by class T, e.g. Delete<Post>(1) would delete the Post with the key of 1.
  2. DeleteWithRelationships<T>(removeRelationshipsFunc, key1, key2…) (sync and async) which does the same, but calls the removeRelationshipsFunc as part of the delete.

I am not going to detail how to use them as the GenericServices Wiki has a good description. You can also find an example of the use of Delete<T> at line 121 in PostsController and an example of the use of DeleteWithRelationships<T> at line 218 in CustomerController.
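Below is a hedged sketch of a typical MVC delete action using IDeleteService. It assumes the returned status object exposes an IsValid flag (like the ISuccessOrErrors type used elsewhere in this series); check the Wiki and the linked controllers for the exact code.

public ActionResult Delete(int id, IDeleteService service)
{
    //Deletes the Post with the primary key 'id'
    var status = service.Delete<Post>(id);

    TempData["message"] = status.IsValid
        ? "Successfully deleted Post"
        : "Delete failed";   //the status also carries the detailed error messages

    return RedirectToAction("Index");
}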

The only other thing I would say is that deleting entries with composite keys is straightforward – just supply the keys in the order in which they occur in the database. Note: some programmers don’t like composite keys, but I do find them useful. There are places where composite keys are good at segregating data into groups: the primary key can be the group name, the secondary key the name of the item itself.


When I first wrote GenericServices I just had a Delete<T> method. I very soon found that wasn’t enough, so I added DeleteWithRelationships<T>. Now I find I am using DeleteWithRelationships 80% of the time.

The rest of the article is about why I use DeleteWithRelationships so much, and a few pointers on alternatives to using Delete at all.

Why I use DeleteWithRelationships so much

I have used DeleteWithRelationships in three situations that I will describe:

  1. To provide better error messages when a delete would fail.
  2. To delete other associated entities that would not be caught by cascade deletes.
  3. To delete associated files etc. outside the relational database.

1. To provide better error messages when a delete would fail

I have many instances where I don’t want cascade deletes to work, so if I tried a delete I would get SQL error 547. While GenericServices catches this and provides a fairly helpful error, it isn’t that informative. I therefore often (actually, nearly always) use DeleteWithRelationships to provide a better error message. Let me give you an example.

In a web site I was building, designers could set up designs with text fields. Each field had a ColourSpec, which is a database entity. I allowed a designer to delete a ColourSpec as long as that colour isn’t used in any of the text fields. If it is used then I output a list of designs where it is used, so that the designer can decide if they want to remove those references and try the delete again.
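The exact signature of the removeRelationshipsFunc delegate is documented in the GenericServices Wiki, so treat the sketch below as illustrative only: it just shows the kind of query I run to build a friendly error message (TextField and Design are hypothetical stand-ins for my real classes).

private static string CheckColourSpecNotUsed(DbContext db, int colourSpecId)
{
    //Find the names of any designs that still reference this colour
    var designsUsingColour = db.Set<TextField>()
        .Where(tf => tf.ColourSpecId == colourSpecId)
        .Select(tf => tf.Design.Name)
        .Distinct()
        .ToList();

    return designsUsingColour.Any()
        ? "This colour is used in the following designs: " + string.Join(", ", designsUsingColour)
          + ". Please remove those references and try the delete again."
        : null;   //null means the delete can go ahead
}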

2. To delete other associated entities that would not be caught by cascade deletes

This happens rarely, but sometimes I have a complex relationship that needs more work. I found one in the AdventureWorks database that I use in my example application complex.samplemvcwebapp.net. In this database a customer address consists of two parts: the CustomerAddress, which has a relationship to an Address. Now, if we want to delete one of the customer’s addresses, say because they have closed that office, we want to try and delete the associated Address part too, which isn’t linked by cascade deletes.

By using DeleteWithRelationships I can pick up the associated Address relationship and delete that too. In fact I might need to do a bit more work to check whether I can delete it, as it might be referenced in an old order. By calling a method that I write specifically for this case I can insert special logic into the delete service.

NOTE: the delete done by GenericServices and any extra deletes done in the DeleteWithRelationships method are all executed in one commit. That means if either part of the delete fails then both are rolled back, which is exactly what you need.

3. To delete associated files etc. outside the relational database

In a recent web application the system included image files. I chose to store them in Azure blob storage, which I have to say works extremely well. The database stored all the information about the image, including a URL to access the image from blob storage, while the images themselves, which can be very large, were stored as blobs.

Now, when I delete the database entity about the image I will end up with an orphan image in the blob storage. I could have a WebJob that runs in the background to delete orphan images, but that is complicated. What I did do was add code to the DeleteWithRelationships to delete the images as part of the delete process.

There is obviously a danger here. Because the database and the blob storage are not connected there is no combined ‘commit’, i.e. I might delete the images and the SQL database delete might then fail. Then my URL links don’t point to a valid image. In this case I have made a pragmatic choice: I check that the delete should work by checking all the SQL relationships before I delete the blob images. I could still get a failure at the SQL end which would make things go wrong, but other parts of the system are designed to be resilient to a missing image, so I accept that small possibility in exchange for a simpler system. Note that if I fail to delete the images from the blob storage I don’t stop the SQL delete – I just end up with orphan images.

An alternative to Delete

In working web applications I find it is sometimes better not to delete important entities, especially if I want to keep a history of orders or usage. In these cases, which happen a lot in live systems, I am now adding an enum ‘Status’ to my key entities which has an ‘Archive’ setting. So, instead of deleting something I set the Status to Archive.

All I have to do to make this useful is filter which entities are shown based on this Status flag. That way entries with a Status setting of ‘Archive’ are not shown on the normal displays, but can be seen in audit trails or admin screens.

I needed a Status flag anyway, so there is little extra overhead in providing this on my key entities. Also, in EF 6 you can now give a property an index, so the query is fairly quick. You need to think about your application, but maybe this would work for you too.
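A minimal sketch of the approach, using a hypothetical Design entity: the Status property is indexed (EF 6.1+ [Index] attribute) and normal queries simply filter out archived entries.

public enum EntryStatus { Normal, Archived }

public class Design
{
    public int DesignId { get; set; }
    public string Name { get; set; }

    [Index]   //EF 6.1+ attribute: index the column so the filtered queries stay quick
    public EntryStatus Status { get; set; }
}

public static class DesignQueries
{
    //Normal displays only show the non-archived entries
    public static IQueryable<Design> WhereNotArchived(this IQueryable<Design> query)
    {
        return query.Where(d => d.Status != EntryStatus.Archived);
    }

    //'Deleting' becomes setting the status and saving
    public static void Archive(DbContext db, Design design)
    {
        design.Status = EntryStatus.Archived;
        db.SaveChanges();
    }
}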

Note: See the comment below by Anders Baumann, with a link to an article called ‘Don’t Delete – Just Don’t‘ for a well formed argument for not using delete.

Conclusion

Deleting looks so simple, but my experience is that it is often more complicated than you think. Hopefully this article gives you both the background on why and the detail on how to use GenericServices to delete entities. I hope it’s helpful.

Happy coding!

User impersonation in MVC using ASP.NET Identity 2

Last Updated: June 22, 2015 | Created: June 19, 2015
Quick Summary
This post is about how to impersonate another user in a modern ASP.NET MVC web application that uses Identity 2. It describes a system that takes advantage of MVC 5’s new AuthenticationFilter to add or change the claims on a user so that they can ‘semi-impersonate’ another user without giving them too much access.

ASP.NET Identity 2 (referred to as Identity2 from now on) is a great system for handling site membership, i.e. who can log in, referred to as authentication, and what they are allowed to do, called authorisation (UK spelling, authorization in .NET code). Identity2 has a host of new features, such as allowing social logins, e.g. login with your Google or Facebook account, and it also works well with modern, scaled applications because it uses a cookie for holding session information.

The problem I had was that I wanted to allow a suitably authorised person to ‘impersonate’ another user, i.e. gain access to certain data that was normally read-only to anyone other than the author. Identity2 changes how you do that, so I went hunting for solutions.

I found a great solution by Max Vasilyev, see his article User impersonation with ASP.Net Identity 2, which I nearly went with. However my need was slightly different to Max’s, so I tackled it a different way, which I call ‘semi-impersonation’ for reasons you will see later. Like Max, having got something to work I thought I would write a blog post to share the solution.

My Use-Case

Before I describe the solution I wanted to say why I needed this impersonation method, then the code should make more sense.

I am working on a web application where graphic designers can set up designs for others to buy. They have their own ‘key’ which protects their designs from being changed by anyone other than them. The problem is, sometimes they need some help setting up a design and it’s often easier for the site’s chief designer, called SuperDesigner from now on, to sort it out for them.

There are all sorts of ways of solving this, from the downright dangerous (i.e. sharing passwords) to the not so nice (i.e. allowing the SuperDesigner to have write access to all designs all of the time). In the end I decided to use a ‘semi-impersonate’ method, i.e. the SuperDesigner could just change their ‘key’ to the ‘key’ of the designer that needed help.

Technically the ‘key’ is a Claim, which is what Identity2 uses for authorisation. If you are not familiar with claims, they are Key/Value pairs that each user has. Some are fixed and some you can add yourself.

My semi-impersonation solution

The solution uses a semi-impersonation approach, i.e. I change a small part of the user information, which allows access to the Designer’s work. The solution consists of four main parts:

  1. A session cookie to control when we are in ‘impersonate’ mode.
  2. Use of MVC 5’s new AuthenticationFilter OnAuthentication method, which changes the claims on the current user.
  3. Some changes to the application code that provides the ‘key’ for each designer.
  4. Feedback to the SuperDesigner that they are in ‘impersonate’ mode and a button to drop out of impersonation.

The big picture of the design is as follows:

  1. The SuperDesigner looks through a list of designers and clicks ‘Impersonate’ on the designer that needs help.
  2. That creates an Impersonate session cookie which holds the ‘key’ of that designer, encrypted of course!
  3. My Impersonate AuthenticationFilter is registered at startup and runs on every request. If it finds an Impersonate session cookie it adds a new, temporary Impersonate claim with the ‘key’ from the cookie to the current ClaimsPrincipal.
  4. The code that provides the unique ‘key’ to each designer was changed to check for an Impersonate claim first, and if present it uses that key instead.
  5. The _LoginPartial.cshtml code is changed to visually show the SuperDesigner is in ‘impersonate’ mode and offers a ‘Revoke’ button, which deletes the cookie and hence returns to normal.

1. The Session Cookie

In MVC cookies are found by looking in the HttpRequestBase Cookies collection, and added or deleted via the HttpResponseBase Cookies collection. I didn’t find the MSDN documentation on cookies very useful, so here are some code snippets to help. Note that in the code snippets the ‘_request’ variable holds the MVC Request property and the ‘_response’ variable holds the MVC Response property.

Adding a Cookie

public HttpCookie AddCookie(string value)
{
    var cookie = new HttpCookie(ImpersonationCookieName, Encrypt(value));
    _response.Cookies.Add(cookie);
    return cookie;
}

This code adds a Cookie to the HttpResponseBase cookie collection. A few things to note:

  • Because I don’t set an expiry date it is a Session Cookie, i.e. it only lasts until the browser is closed, or until I delete it by setting a negative expiry date. That way it can’t hang around.
  • I encrypt the value using the MachineKey.Protect method to keep it safe (see the sketch after this list).
  • I have a setting in my Web.Config to set all cookies to HTTPOnly, and Secure when running on Production systems.
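The Encrypt/Decrypt helpers aren’t shown in this post, but a possible implementation based on MachineKey.Protect/Unprotect would look like the sketch below (the purpose string is arbitrary, and UrlTokenEncode makes the protected bytes cookie-safe):

private const string ProtectionPurpose = "ImpersonationCookie";

private static string Encrypt(string value)
{
    var protectedBytes = MachineKey.Protect(Encoding.UTF8.GetBytes(value), ProtectionPurpose);
    return HttpServerUtility.UrlTokenEncode(protectedBytes);
}

private static string Decrypt(string protectedValue)
{
    var unprotectedBytes = MachineKey.Unprotect(
        HttpServerUtility.UrlTokenDecode(protectedValue), ProtectionPurpose);
    return unprotectedBytes == null ? null : Encoding.UTF8.GetString(unprotectedBytes);
}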

Cookie Exists and GetValue

public bool Exists()
{
    return _request.Cookies[ImpersonationCookieName] != null;
}

public string GetValue()
{
    var cookie = _request.Cookies[ImpersonationCookieName];
    return cookie == null ? null : Decrypt(cookie.Value);
}

Fairly obvious. Decrypt is the opposite of Encrypt.

Deleting the Cookie

public void DeleteCookie()
{
    if (!Exists()) return;
    var cookie = new HttpCookie(ImpersonationCookieName) { Expires = DateTime.Now.AddYears(-1) };
    _response.Cookies.Add(cookie);
}

Deleting a cookie proved to be a bit troublesome and I found some misleading information on Stack Overflow. The best article is the MSDN one here. Watch out: the cookie used to do the delete is a new cookie – don’t reuse the one from the request – and it must have exactly the same attributes as the original cookie. I had a .Path constraint on the added cookie but not on the delete cookie and it didn’t work.

2. AuthenticationFilter OnAuthentication

The AuthenticationFilter is a great addition that came in with MVC version 5. It allows the Current Principal, both Thread.CurrentPrincipal and HttpContext.User, to be changed. The key to doing this is the OnAuthentication method inside the AuthenticationFilter, which I have listed in full below:

/// <summary>
/// This looks for a Impersonation Cookie. If not found then it simply returns.
/// If found it changes the filterContext.Principal to include a
/// ImpersonateClaimType with the Designer Key value taken from the cookie.
/// </summary>
/// <param name="filterContext"></param>
public void OnAuthentication(AuthenticationContext filterContext)
{
    var impersonateCookie = new ImpersonationCookie(
        filterContext.HttpContext.Request,
        filterContext.HttpContext.Response);

    if (!impersonateCookie.Exists()) return;

    //There is an impersonate cookie, so we build a new
    //ClaimsPrincipal to replace the current Principal
    var existingCp = filterContext.Principal as ClaimsPrincipal;
    if (existingCp == null)
        throw new NullReferenceException("Should be claims principal");

    var claims = new List<Claim>();
    claims.AddRange(existingCp.Claims);
    claims.Add(new Claim(ImpersonateClaimType,
         impersonateCookie.GetValue()));

    filterContext.Principal = new ClaimsPrincipal(
         new ClaimsIdentity(claims, "ClaimAuthentication"));
}

Hopefully the code is fairly obvious. The main thing to know is that if you change the filterContext.Principal then MVC will change Thread.CurrentPrincipal too, so your change gets propagated throughout your application. So, by creating a new ClaimsPrincipal with all the original claims, plus the new ImpersonateClaimType claim, the impersonation can be picked up anywhere by reading Thread.CurrentPrincipal and casting it to a ClaimsPrincipal.

Note that the ‘new ImpersonationCookie(…’ is just my wrapper round my Cookie code shown earlier.

3. Picking up the key

This bit of code is specific to my application, but worth showing for completeness:

public static string GetDesignerKey()
{
   var claimsPrincipal = Thread.CurrentPrincipal as ClaimsPrincipal;
   if (claimsPrincipal == null) return null;  //system failure

   var impersonateClaim = claimsPrincipal.Claims
       .SingleOrDefault(x => x.Type == ImpersonateClaimType);
   if (impersonateClaim != null) return impersonateClaim.Value;

   var designerKeyClaim = claimsPrincipal
       .Claims.SingleOrDefault(x => x.Type == DesignerKeyClaimType);
   if (designerKeyClaim != null) return designerKeyClaim.Value;

   return null;   //system failure
}

You may wonder why I did not just change the DesignerKeyClaimType? I could have, but then it would have been hard for other parts of the system to find out if we were in impersonate mode, as we will see in the next section.

4. Feedback to user and ‘Revoke impersonation’

The final part is the feedback to the user that they are in ‘impersonate’ mode, plus a ‘Revoke impersonation’ button. I did this in an Html helper method which went into the ‘if (Request.IsAuthenticated)’ path of _LoginPartial.cshtml (a sketch follows the list below). This helper did the following:

  1. It looked for the ImpersonateClaimType. If not there then output normal html.
  2. If ImpersonateClaimType there then:
    1. Say we are in impersonation mode
    2. Offer a ‘Revoke’ button that calls an MVC action which deletes the Impersonation cookie.
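As a hedged sketch (the names are illustrative, not my actual code), the helper could look something like this:

public static class ImpersonationHelper
{
    public static MvcHtmlString ImpersonationStatus(this HtmlHelper helper, UrlHelper url)
    {
        //ImpersonateClaimType is the same constant used in the AuthenticationFilter above
        var claimsPrincipal = helper.ViewContext.HttpContext.User as ClaimsPrincipal;
        var impersonateClaim = claimsPrincipal == null
            ? null
            : claimsPrincipal.Claims.SingleOrDefault(x => x.Type == ImpersonateClaimType);

        if (impersonateClaim == null)
            return MvcHtmlString.Empty;   //not impersonating, so output the normal html elsewhere

        //Show we are in impersonation mode and offer a 'Revoke' link that deletes the cookie
        var html = "<span>Impersonating a designer</span> " +
                   "<a class=\"btn\" href=\"" + url.Action("RevokeImpersonation", "Account") + "\">Revoke</a>";
        return new MvcHtmlString(html);
    }
}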

Why/how is my implementation different to Max Vasilyev’s solution?

It is always good to compare solutions so you know why they are different. Here are my thoughts on that (Max, please do tell me anything else you would like added).

  • Max Vasilyev’s solution is great if you want to take over all the user’s roles and claims – you become the user in total. In my case I just wanted to change one attribute, the designer key, hence the name ‘semi-impersonate’.
  • If you just need to get impersonation going quickly then use Max’s solution – it just works. My method needs more design work, as you need to target specific attributes/features for it to work. Therefore it is only worth considering my method if you want more control over how the impersonation works, see the points below:
    • My method only adds/changes the specific claims that are needed, so limiting the power of the impersonator.
    • My method would work with different levels/types of impersonation by having different cookies. The OnAuthentication method could then obey different rules for each cookie type.
    • My method can use cookie attributes like Domain and Path to limit what part of my site supports impersonation. In my application I use the .Path attribute to make impersonation only work in the Designer setup area, thus restricting the power of the SuperDesigner to, say, edit the designer’s details or payments.

Note: Max makes a good point in his feedback that using .Path in that way ‘is not future- or refactoring- proof’, i.e. if the routing is changed by someone who does not know about the .PATH on the cookie then it could lead to errors that are hard to find at testing time. You may want to take his advice on that and use a more obvious authentication filter attribute that checks for the cookie.

Other, more minor differences are:

  • My method allows audit trails to work properly, as only the key is changed and the rest of the user identity is left alone. It also means the user who is impersonating, SuperDesigner in my case, keeps their enhanced Authorisation Roles.
  • My method will be very slightly slower than Max’s solution as it runs the OnAuthentication method on every HTTP request.
  • My method is limited by what information you can put in a cookie (4093 bytes). Also, because I encrypt my content, it can take up a lot of room, so you might only get four or five big strings into an encrypted cookie.

Conclusion

This article has described a way for one user to gain access to the facilities of another user in a secure and controlled way. Hopefully I have given you enough code and links to see how it works. I have also pointed you to Max Vasilyev’s solution, which is very good but uses a different approach, so now you have two ways to impersonate another user.

Do have a look at both and take your choice based on what you need to do, but between Max’s solution and mine I hope they help you with your web application development.

Happy coding!

Architecture of Business Layer – Calling multiple business methods in one HTTP request

Last Updated: November 30, 2017 | Created: April 25, 2015

In this article I explore the issues and problems of calling business logic in a web application. The aim is to supplement my last article, Architecture of Business Layer working with Entity Framework, which looked at the design of the business layer, by describing how a web application user interface or RESTful service might call the business logic to provide some useful function.

In this article I look at a specific issue around calling multiple business methods, some of which write to the database using Microsoft’s Entity Framework (EF) data access technology. My aim is both to show you what issues we need to consider and to describe a solution that overcomes them. At the end I give you some further reading for the more advanced situations.

Calling business logic

In most web applications we have some data access methods and we have some business logic. Sometimes they are closely linked and sometimes they are separated. When building an application with EF as your database access technology your choices are constrained. You can go the pure Domain-Driven Design (DDD) route and beef up your EF classes, or you can separate your EF data Entity classes into a Data Layer and place your Business Logic in a separate Layer (in this article I use ‘Layer’ to mean a separate .NET assembly).

I don’t want to get into the whole argument about which is the best way, but I have chosen what is a fairly normal arrangement where I keep my business logic in a separate layer from my data classes. This is not a pure DDD design, but I do design my Business Classes to treat the database as far as possible as an in-memory collection. Because of this I architect my business logic in a certain way and use a private library, called GenericActions, for handling calls (see the Architecture of Business Layer working with Entity Framework article for more on this).

Everything was going fine until I had to call multiple business methods in sequence to achieve the result I needed. It is something I have just come across in a project I am working on and have developed a solution for, so it’s a hot topic for me. This is the story of why I needed to do this, what the problems were and how I implemented my final solution.

Calling multiple business methods to achieve one business goal

I am working on an e-commerce site and I came across one business process that needed two parts: the first stage checked the data input by the user and wrote the top-level data out to the database. The second stage then imported some files and used the primary keys of the data written by the first stage as a reference name for the files. That meant that the first stage needed to complete, and EF .SaveChanges() needed to be called, so that the primary keys were in place before the second stage started.

Couldn’t I combine the two stages into one business method? Well, if you read my other article on Business Layer Architecture you will know that I don’t let my business logic call .SaveChanges(). I could break my own rule, but I have found this level of isolation really useful (read the other article for the full story). Anyway, even if I did that I would still have the problem that if the second half failed I would be left with a bad, half-completed database entry.

Now there are other ways round this particular issue (like using a GUID), but the need to run multiple business methods in series was one I expected to come across again. When the first case occurred I hand-coded a solution, but when I came across another case it was time to create a proper solution.

Solution design

I implemented a solution inside my GenericActions library. While this is private I can outline my approach so you can implement your own version if you need to.

Because my solution uses EF’s transaction control, I called the key internal class the ‘Transactional Runner’, referred to as TR from now on. The primary feature of TR is of course EF 6’s transaction control methods. When running multiple business methods GenericActions creates a DbContextTransaction using the context.Database.BeginTransaction() method. This means that all writing to the database is controlled and not committed until the DbContextTransaction.Commit() method is called. If a problem is found then the DbContextTransaction.Rollback() method can be called and all database writes done within the transaction scope are undone.

The diagram below shows it running three business methods in turn. The first two write to the database, but the third fails. The runner then rolls back the changes using EF Rollback(). If they had all succeeded then it would have called EF’s Commit() to make the changes permanent and available to other application/threads.

[Diagram: the GenericActions Transactional Runner executing three business methods inside one transaction, calling Rollback when the third one fails]

The simplified code for this would be:

using(var context = new MyDbContext())
{
    using (var dbContextTransaction = context.Database.BeginTransaction())
    {
        try
        {
            //Each business method is given the output of the previous one,
            //and its writes are saved, but not committed, before the next runs
            var biz1Result = CallBizMethod1(input);
            context.SaveChanges();
            var biz2Result = CallBizMethod2(biz1Result);
            context.SaveChanges();
            var biz3Result = CallBizMethod3(biz2Result);
            context.SaveChanges();

            //All three succeeded, so make the writes permanent
            dbContextTransaction.Commit();
            return biz3Result;
        }
        catch (Exception)
        {
            //Something failed, so undo every database write done inside the transaction
            dbContextTransaction.Rollback();
            throw;
        }
    }
}

If you look at the code above you see that I pass data between each business method. This allows the previous method to pass information on to the next, and so on. If you think back to the e-commerce problem that started it all off, the first business method creates the database entities, which are written to the database. However the second method needs to know which entity to use, which is why the first method passes on the class, with its newly created primary key, to the second method.

As you would expect my generic solution is quite a bit more complicated than that, but the code above should give you the idea. If you are writing a solution you might like to consider that some business methods are really simple, with no database writing and maybe just a few lines of code, while some do a lot, with async writes to the database and maybe async outside requests. It is worth a bit of thought to optimise for each type. This leads to two things I do.

  • I have normal synchronous TR and also an async TR, which is what I mainly use. However I have written the Async TR version so that it can take a mix of sync and async business methods and only runs the async ones in async mode. This means small, code-only business methods are not burdened with unnecessary Tasking which makes writing/debugging them simpler and they run more quickly.
  • Not all of my business methods write to the database, so I have an attribute on the business classes that access the database and only call SaveChanges() on those methods.

Warning: use of DbContext needs careful thought

In looking at this I had to think long and hard if my final solution would work properly in a web application. It will, but only because of specific constraints I have placed on my libraries. It is worth spelling out why this is so that you understand the issues if you come to write your own version.

The sample code I gave you will work fine in all cases, but in the real world you would not normally create the DbContext in that way. I use Autofac Dependency Injection to create my DbContext and it has an .InstancePerLifetimeScope() on it (see the registration sketch after the list below). This means that a single instance of DbContext is created for each HTTP request. This is important for the following reasons:

  • If two business methods look at the same DbContext then any changes done by one method will be seen by the second method. If they each had their own DbContext then, depending on when the DbContexts accessed the database, they might be out of step.
  • If a method writes to the database, but then fails and .SaveChanges is not called, then the writes are discarded before the next HTTP request. Because I control when .SaveChanges() is called, by using either my open-source GenericServices (database access) or private GenericActions (business logic calling) library, I know a stray .SaveChanges() will not happen once a method has returned an error. This means any data written to the database by a failed method will not be persisted.
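For completeness, a sketch of the Autofac registration I am describing might look like this (the registration against GenericServices’ IGenericServicesDbContext interface is an assumption, just to show the idea):

var builder = new ContainerBuilder();

builder.RegisterType<MyDbContext>()
    .AsSelf()
    .As<IGenericServicesDbContext>()      //assumption: the context is also exposed via this interface
    .InstancePerLifetimeScope();          //one instance per lifetime scope, i.e. per HTTP request in MVC

var container = builder.Build();
DependencyResolver.SetResolver(new AutofacDependencyResolver(container));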

However in another application I ran a long-running task, and there I had to create a separate DbContext. Why was that? The primary reason is that DbContext is not thread safe, so you would get an error if you used the same instance in multiple threads. Getting an error is a good thing, because it would cause all sorts of problems elsewhere if it didn’t.

So, if I am running multiple methods, some of which are async, why does my solution work with a single instance of DbContext? It is because they all run in the same thread. Async simply releases the thread while waiting – it does not run things in parallel – so all is fine.

I was recommended a great article called ‘Managing DbContext the right way with Entity Framework 6: an in-depth guide‘ on this whole subject by one of my readers, khalil. I found it very helpful and did consider using his library. However I am clear that, with the limits I have placed on my libraries, everything will work reliably for now. Maybe I will have to look at this later if I need to extend to parallel running, but that is a fight for another day.

Conclusion

I have shown that there are cases where multiple business methods, one or more of which write to the database, need to be run in series. I have then described an approach using Entity Framework where you can ensure that the database is only updated if the whole series of business methods finishes successfully. While the solution is fairly simple there are both performance and tasking issues to consider.

Hopefully this article will have given you some pointers on what the problem is, what sort of solution you could use and what to watch out for. If anything is unclear then please leave a comment and I will try to improve the information.

Happy coding!

GenericServices Masterclass: a deeper look at Create and Update

Last Updated: April 21, 2015 | Created: April 21, 2015

I have been building a new application using GenericServices and I thought it would be useful to others to reflect on the approaches I have used when working with more complex data. In this first master class article I am looking at Create and Update of data, especially via DTOs (Data Transfer Objects).

What is GenericServices?

For those of you who are not familiar with GenericServices it is an open-source .NET library designed to simplify the interface between an ASP.NET MVC application and a database handled by Microsoft’s Entity Framework (EF) data access technology. You can read more about it at https://github.com/JonPSmith/GenericServices where the Readme file has links to two example web sites. The GenericServices’ Wiki also has lots more information about the library.

If you are not familiar or interested in GenericServices then save yourself time and skip this post.

Introduction to create and update with DTOs

GenericServices has five primary data access methods: List, Detail, Create, Update and Delete. Create and Update are the most complicated as they write data back to the database. If you are working directly with the database classes then it is fairly straightforward: in this case GenericServices is executing the EF commands for you through a standardised interface that adds some extra code to check and return any validation or selected SQL errors.

Create and update become more complex when you are using DTOs. However, if you have read any of my articles, such as Why DTOs, then you will know that 95% of my accesses to the database go through DTOs. Therefore handling DTOs is critical and GenericServices is built primarily for DTO mode database accesses.

The most common use of DTO is in reading data, but for Create or Update the DTOs need to be transformed back to the database entity before that entity is written out to the database. There are four levels of complexity, starting with the simplest and working up to the more complex. I will list them here and then go into more detail on what to do later on.

  1. Simple one-to-one mapping: Each property is a simple data type, i.e. not another user class or relationship, and has an equivalent DTO property. In this case GenericServices ‘just works’.
  2. One-to-one mapping, but some properties are read-only. This case is like the first, but maybe you have properties you want to show the user but don’t want them to update. An obvious example might be the price of something they can buy. In this case you will use the [DoNotCopyBackToDatabase] attribute – see later for an example.
  3. Contains properties that are developer-defined classes or collections. In this case you might have a relationship class, a collection or an EF complex type. AutoMapper will read these properties using its conventions, but by default will not write them back. You need to add extra code to handle this, which I will cover later.
  4. Contains properties that must be calculated on write. This happens when the value to be written depends on the user’s input, or we need to do some drastic reformatting of the data before writing back. Again, I will show you some examples later.

Before I describe these four cases I would like to point out that AutoMapper and the DelegateDecompiler’s [Computed] attribute (see the GenericServices Wiki page on Calculated Properties) are all designed to work when READing from the database. When writing to the database GenericServices does use AutoMapper, but it has to add extra commands because, by default, AutoMapper is not meant to be used for writing to the database. That is why this article and the various examples exist – to show you how to handle this write operation successfully.

So, with that background let me go through all the four cases above.

1. Simple one-to-one mapping

To be honest, if you have a simple one-to-one mapping between the data class and the DTO then you most likely should be using GenericServices with the data class itself, as it is quicker and more transparent. I can’t think of any case where I have had a ‘simple’ mapping and used a DTO. However it will work if you do.

2. One-to-one mapping, but some properties are read-only

[DoNotCopyBackToDatabase] is GenericServices’ version of MVC’s [Bind(Exclude = “SomeProperty”)]. Both allow a property to be marked as not being overwritten by user input. You can get more about this in the GenericServices Wiki DoNotCopyBackToDatabase attribute page.

A good example is given in the Wiki page, i.e. you want a user to see the price of their purchase, but they should not be able to change the price. I use this myself in one of my applications to hold a status flag that can be seen by the user, but can only be changed by the business logic. Therefore I mark the property in the DTO with [DoNotCopyBackToDatabase] to ensure it cannot be changed either by accident or maliciously.

In SampleMvcWebApp I apply [DoNotCopyBackToDatabase] in the DetailPostDto.cs class to stop the LastUpdated property being copied back. However that is really there as documentation, as any value would be overwritten anyway by the code inside SampleMvcWebApp’s .SaveChanges() method, which sets that property at save time.
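As a minimal sketch (Purchase/PurchaseDto are hypothetical, and the other EfGenericDto overrides are omitted), the attribute is simply placed on the property you want to protect:

public class PurchaseDto
{
    public int PurchaseId { get; set; }
    public string ProductName { get; set; }

    [DoNotCopyBackToDatabase]   //shown to the user, but ignored when copying back on Create/Update
    public decimal Price { get; set; }
}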

3. Contains properties that are developer defined classes or collections

This is where it starts to get more complicated. There are a few possible scenarios and I will deal with each in turn:

a. A relationship class

This is where you have a class that is a relationship. An example from SampleMvcWebAppComplex is a CustomerAddress class, which has a one-to-one relationship with the Address class, which holds the address details. In SampleMvcWebAppComplex I use a top level DTO called CrudCustomerAddressDto to hold all the customer address data. This type of situation doesn’t happen often, but when they do you need a way to handle them.

Firstly, let me explain why the normal mappings wouldn’t work. If we used AutoMapper’s flattening convention it could read properties in the associated Address class by concatenating the names, e.g. AddressCity would read the City property in the Address part of CustomerAddress. The problem is it can read, but it won’t write, as AutoMapper doesn’t ‘un-flatten’ things, i.e. setting the property AddressCity in the DTO will not be mapped back to Address.City in the CustomerAddress class.

There are a few ways round this. One is to hand-code the copy back, but we can also use AutoMapper’s nested mappings feature, which is what I actually did in this case. To use this I need to create a sub-DTO, called CrudAddressDto. I then need to do two extra things in the CrudCustomerAddressDto class.

Firstly, I have to tell GenericServices to make sure that AutoMapper knows how to map the CrudAddressDto, otherwise I will get an error from AutoMapper if I try to do the mapping. That is achieved by overriding the AssociatedDtoMapping method in CrudCustomerAddressDto to add the extra mapping, as shown below (note: there is an AssociatedDtoMappings version if you need multiple extra mappings):

protected override Type AssociatedDtoMapping
{
   get { return typeof(CrudAddressDto); }
}

Secondly, for the update to work I need to make sure I have loaded the CustomerAddress and the associated Address class. This is done by overriding the FindItemTrackedForUpdate method (see below). If I don’t do this then updating the CustomerAddress will create a NEW Address row rather than update the existing one. As that Address may be used elsewhere, such as in the delivery address, it’s important to get that right.

 protected override CustomerAddress FindItemTrackedForUpdate(IGenericServicesDbContext context)
 {
    return context.Set<CustomerAddress>()
                             .Where(x => x.CustomerID == CustomerID && x.AddressID == AddressID)
                             .Include(x => x.Address).SingleOrDefault();
 }

b. A Complex Type

A ‘complex type’ is the term EF uses for a reference to a class which has no key – see the section entitled ‘Complex Types Convention’ in the Entity Framework documentation. This is often used to simplify or separate parts of a class.

If you want to do this then you can use the nested mapping approach shown above to do the to/from mapping of the complex types. However, as complex types are simply a way of arranging things, I tend to ‘flatten at design time’, i.e. I don’t use complex types. That means that AutoMapper will do both the read and the write.

If that simplification does not work for you and you want to use Complex Types then you need to create a new dto for each complex type and override the AssociatedDtoMapping or AssociatedDtoMappings to allow GenericServices to set up the AutoMapper mappings. You don’t need to override FindItemTrackedForUpdate, as the class is in fact one single row in the database.

c. A Collection

This is the most complex case because a collection refers to multiple relationships (EF does not support lists or arrays of simple data like bytes or strings). As you can imagine, updating multiple relationships is complex and has to be hand-coded by overriding CreateDataFromDto and UpdateDataFromDto, which I will explain in the next section.

I should say that updating multiple relationships isn’t a simple problem anyway. This normally falls under the next section, Contains properties that must be calculated on write, as it is complex enough that GenericServices cannot make a guess as to your intentions.

4. Contains properties that must be calculated on write

There are times where a property in a class is either set by the user via some other means, like a dropdownlist, or the data in the DTO needs a more complicated reformatting/parsing to get the required value for the property. In these cases the developer needs to intercept the CreateDataFromDto and UpdateDataFromDto methods and write code to set the properties.

Let us start by looking at the anatomy of the two methods. The first thing to note is that they are very similar, but have slightly different signatures. The signatures differ because the update is handed a database entity to update, while the create returns a new instance of the database entity class (this was done because, in databases using a Domain-Driven Design (DDD) approach, the creation of an entity may be done via a factory). This means that the handling of the return status is slightly different between the two methods.

However when overriding these methods the code inside has a common approach. Let us take UpdateDataFromDto as an example and look at the three stages the update or create goes through.

protected override ISuccessOrErrors UpdateDataFromDto(
    IGenericServicesDbContext context, TDto source, TEntity destination)
{
   //1) Before copy/create: If you put your code here then it should
   //set properties in the source, which AutoMapper will then copy over

   //2) Call the base method which uses AutoMapper to create/copy the data
   var status = base.UpdateDataFromDto(context, source, destination);

   //3) After copy/create: If you put your code here then it should
   //set properties in the destination, ready to be written out

   return status;
}

So, your options are:

  1. Change properties in the DTO and then let AutoMapper copy them over for you.
  2. Wait until Automapper has run and change the properties in the database entity class.
  3. Don’t call the base method and handle the whole copy/conversion yourself (this is what you would need to do for DDD databases).

I have used options one and two depending on the need. Option one is good if the property needs to be in the DTO, while option two is useful when you want to exclude a property from the DTO and only set it via your code. In both cases I try to do any validation of the values early, so that if it fails it returns immediately and saves the redundant call to AutoMapper.

SampleMvcWebApp has a classic example of option one in DetailPostDto. DetailPostDto is complex because it allows the user to select the post’s author from a dropdownlist and the post’s tags from a multi-select list. If you look at the code you will see the overrides of CreateDataFromDto and UpdateDataFromDto around line 140 or so. To save you swapping pages here is a copy of the overridden methods from DetailPostDto:

protected override ISuccessOrErrors<Post> CreateDataFromDto(IGenericServicesDbContext context,
    DetailPostDto source)
{
    var status = SetupRestOfDto(context);
    return status.IsValid
        ? base.CreateDataFromDto(context, this)
        : SuccessOrErrors<Post>.ConvertNonResultStatus(status);
}

protected override ISuccessOrErrors UpdateDataFromDto(IGenericServicesDbContext context,
    DetailPostDto source, Post destination)
{
    var status = SetupRestOfDto(context, destination);
    return status.IsValid
        ? base.UpdateDataFromDto(context, this, destination)
        : status;
}

As you can see, both methods call a private method called SetupRestOfDto, which handles the setting up of the BloggerId and the Tags properties in the DTO. The SetupRestOfDto method needs to know whether it is a create or an update, which it knows because on a create no Post is given, so that parameter is null. On return, if the status from the SetupRestOfDto method is OK, the code then calls the base method to do the create/update. This is a typical way of handling user selections from a dropdown list.

I used option two, setting the value in the database entity class, in a recent project when I wanted to make the user input much simpler. In this case the user needed to input a CMYK colour and an RGB colour. The colours were stored as a series of numbers, but it was simpler for the user to enter the CMYK and the RGB as two strings of digits, with different options on hex/decimal etc. I therefore overrode the create/update methods and parsed the two strings into the right format directly into the database entity class.

I am sure your project has other requirements that GenericServices does not handle as it stands. Overriding CreateDataFromDto and UpdateDataFromDto is the go-to place to add your code and still benefit from all of GenericServices’ features.

Note: Writing this section made me realise I could have simplified the method signatures of CreateDataFromDto and UpdateDataFromDto, as I don’t (now) need to pass in the DTO. However that would be a breaking change to the interface, so I’ll leave that for the next big change of GenericServices.

Conclusion

In this article I have covered in detail what happens in a Create or Update inside GenericServices. It is complex, especially around handling relationships. Hopefully this article will help anyone who needs to implement any of the more complex uses of Create and Update.

Do ask any questions or make comments here, or raise an issue on the GenericServices GitHub site if you think there is a bug. Happy to answer any questions as it helps increase the documentation for the project.

Happy coding.

 

Using .NET Generics with a type derived at runtime

Last Updated: November 21, 2015 | Created: February 7, 2015

I have used .NET Generic types a lot and found them really useful. They form a core part of the solution in my open-source GenericServices library and my private GenericActions library. This article explores why and how I use Generics and some of the techniques I use to hide the more complex type definitions.

Even if you are not interested in my particular use of Generics this article has some useful information on:

  1. How to create an instance of a generic type where the type is defined at runtime.
  2. The two approaches to calling methods or read/write properties of the created generic instance.
  3. The performance issues around the two approaches to calling methods or read/write properties.

Only interested in how to use Generics with a runtime-derived type? Just skip to here in this article.

What are Generic Types

Generics are everywhere – if you use List<T> etc. you are using Generics. If you are not familiar with Generics then look at this introduction. The primary thing about Generics is it allows you to write one code implementation that can then work on a range of types. In particular they produce efficient code because they are ‘type safe’, i.e. the type is known, so the code can access what it needs.

Without Generics we would either:

  • Have to write the same code for each type, which is bad practice.
  • Write one method but use Reflection to access the items we want, which is not type-safe.

The down side of Generics is you (normally) have to know the type at compile time. However I will show you later how to overcome this.

Where I have found Generic types really useful

I developed a web application called Spatial Modeller that does mathematical predictive modelling for healthcare. The modelling is complex and I used a Domain-Driven Design (DDD) approach, which said I should keep the database/business classes focused just on the modelling problem. That worked really well, but it meant that the database/business classes weren’t right for display to the user.

I therefore had a Service Layer which had classes more attuned to the display and visualisation of the data. These are typically called Data Transfer Objects (DTOs), or in ASP.NET MVC they are also called ViewModels. The Service Layer code transformed the data passing between the Data/Business Layers and the Presentation/WebApi Layer.

The problem was I ended up writing very similar code, but with different data types, for the transformation of the database/business classes to DTOs. That is bad software practice, time consuming and boring!

After that project I was determined to create a solution for this. It took a bit of experimenting, but Generics was the answer. However the final solution wasn’t quite as straightforward as you might think.

The problem of complex Generic type definitions

Let me give you a real example of where my presentation layer wants to run a piece of business logic. In Spatial Modeller to model a scenario the following happens:

  1. The user picks the potential hospital locations from dropdown lists and fills in various selections.
  2. These are handed to the business layer modeller as primary keys for the hospital locations and various enums/parameters.
  3. The modeller does its work, writing some data to the database.
  4. The modeller returns the results as primary keys.
  5. These need to be looked up to show to the user.

So, in that process we have five classes:

  1. The presentation input DTO, called DtoIn.
  2. The business modeller data input, called BizIn.
  3. The class/method to call in Business Layer, called IBizAction.
  4. The output of the business modeller, called BizOut
  5. The presentation output DTO, called DtoOut.

So, if we define a Generic class to handle this its class definition would look like this:

var service = new ActionService<DtoIn, BizIn, IBizAction, BizOut, DtoOut>
     (db, new BizAction(db));
var dataIn = new DtoIn(some data here );
var result = service.RunAction(dataIn);

That is a) quite hard to write, b) not easy to use with Dependency Injection and c) downright ugly! In fact in both of my libraries it’s even more complex than that, with different types of input. Even worse, the code would be the same but with different transform parts. In GenericActions I calculated there were sixteen versions, all of which would need separate implementations.

GenericServices was much easier, but still produced ‘ugly’ generic type definitions, which needed special handling at dependency injection time.

Getting rid of the ‘Ugly’ type definitions

OK, so Generics got me so far but I needed to do something else to combine and simplify the type definition of the Generics. I did two things:

  1. For both libraries I hide the complex type definition via an outer, non-generic helper class, where the called method decodes the types and creates the correct Generic class instance with the right types.
  2. In the case of GenericActions, which has so many versions, I did a further decode of the in/out types inside the Service Layer code, which reduced the sixteen possible versions down to six. I could have gone further, but this was the right level to keep performance up.

Before I describe the solution I will show you the improved code for the example in the last section (see below). I think you will agree that is much nicer, and the creation of the service is now very Dependency Injection friendly.

var service = new ActionService<IBizAction>(db, new BizAction(db));
var dataIn = new DtoIn(some data here);
var result = service.RunAction<DtoOut>(dataIn);

So, let us go through the steps to achieve this.

Part 1: Decode the underlying types

I’m going to use an example from GenericServices, as it is open-source and I can therefore link to the actual code. So the example for GenericServices is:

var service = new UpdateService(db);
var dataIn = new DetailPostDto(some data here);
var response = service.Update(dataIn);

The steps are:

  1. service.Update<T>(dataIn) is a top-level method (see first class) that can take either a database class, like Post, or a dto.
  2. If it is a dto then it is derived from abstract EfGenericDto<TEntity, TDto>. This forces the dto definition to include the database class. See the DetailPostDto.cs as an example.
  3. The Update method calls DecodeToService<UpdateService>.CreateCorrectService<T> (see this class, first method). It gets a bit complicated in this class because of the sync/async versions of the DTOs, but the bottom line is it:
    1. Finds whether it is an EfGenericDto or not. If not then it assumes it is direct access to a database class.
    2. If it inherits from EfGenericDto it finds the database class and the Dto class.
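
This is not the actual GenericServices code, just a minimal sketch (with hypothetical stand-in names) of the kind of decode step described in point 3: walk up the inheritance chain looking for the generic DTO base class and, if it is found, pull out its type arguments.

using System;

// Stand-in for GenericServices' abstract EfGenericDto<TEntity, TDto> base class
public abstract class EfGenericDtoStandIn<TEntity, TDto> { }

public class Post { }
public class DetailPostDtoStandIn : EfGenericDtoStandIn<Post, DetailPostDtoStandIn> { }

public static class ServiceDecoder
{
    // Returns [TEntity, TDto] if classType inherits from the generic DTO base,
    // or null if it is a direct-access database class
    public static Type[] TryGetDtoTypes(Type classType)
    {
        var current = classType.BaseType;
        while (current != null)
        {
            if (current.IsGenericType &&
                current.GetGenericTypeDefinition() == typeof(EfGenericDtoStandIn<,>))
                return current.GetGenericArguments();
            current = current.BaseType;
        }
        return null;
    }
}

// usage: var types = ServiceDecoder.TryGetDtoTypes(typeof(DetailPostDtoStandIn));
// types is [typeof(Post), typeof(DetailPostDtoStandIn)]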


Part 2: Create the Generic Class from the runtime types

OK, the last process found whether it was a database class or a dto class. There are different classes to handle these two cases. The direct update has one Generic Type parameter, the database class. The dto version has two Generic Type parameters: the database class and the dto class. However in the example below I look at a very simple case to make it easier to understand. GenericServices is a bit more complicated, but follows the same approach.

The code to create the instance of the Generic class is pretty straightforward. For example if I wanted to create a List<string> at runtime I would:

  1. Produce an array of the type(s) needed to form the generic type, in this example ‘string’
  2. Get the generic type, in this example ‘List’
  3. Combine them using the ‘.MakeGenericType’ method
  4. Create an instance of that type using ‘Activator.CreateInstance’

The code below shows an example of creating ‘List<string>’ at runtime.

var dataType = new Type [] { typeof(string)};
var genericBase = typeof(List<>);
var combinedType = genericBase.MakeGenericType(dataType);
var listStringInstance = Activator.CreateInstance(combinedType);

Part 3: Calling methods from the created instance

You should be aware that the ‘Activator.CreateInstance’ method returns an object, so you can’t just call ‘listStringInstance.Add(“hello world”)’ as the compiler will give an error something like “‘object’ does not contain a definition for ‘Add’“. You have two choices:

1. Use Dynamic Type

You can place the output of ‘Activator.CreateInstance’ into a dynamic type (see the start of line 4 in the code below). This turns off compile-time type checking, which allows you to call any method, or access any property, with the type checking done at runtime. So in our List<string> case it would look like this:

var dataType = new Type [] { typeof(string)};
var genericBase = typeof(List<>);
var combinedType = genericBase.MakeGenericType(dataType);
dynamic listStringInstance = Activator.CreateInstance(combinedType);
listStringInstance.Add("Hello World");

Dynamic is easy to use, and gives you a lot of freedom. However the dynamic runtime takes a lot of time on the first call of the method. See the Performance section below for a more detailed analysis.

2. Use Reflection

Reflection allows you to find methods and properties by name. You can then call the method or access the property via different Reflection methods. In our example we find the simple method ‘Add’ on the instance’s type (GetMethod would return null if no method of that name existed) and then ‘Invoke’ that method, i.e.

var dataType = new Type [] { typeof(string)};
var genericBase = typeof(List<>);
var combinedType = genericBase.MakeGenericType(dataType);
var listStringInstance = Activator.CreateInstance(combinedType);
var addMethod = listStringInstance.GetType().GetMethod("Add");
addMethod.Invoke(listStringInstance, new object[]{"Hello World"});

The reflection approach does have some complications, such as method or property accesses returning an object, which can be a problem if you need a typed result. Also if the method itself has a generic type parameter, for example for its output, you need to build the closed method using ‘MakeGenericMethod’. All in all reflection is harder to handle than dynamic.
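
As an illustration of that last point, here is a minimal sketch (not code from my libraries) of using GetMethod plus MakeGenericMethod to call a generic method whose type parameter is only known at runtime:

using System;
using System.Reflection;

public static class Caller
{
    // A simple generic method we want to call via reflection
    public static string Describe<T>(T value)
    {
        return typeof(T).Name + ": " + value;
    }

    public static void Main()
    {
        var runtimeType = typeof(DateTime);   // type only known at runtime

        // Find the open generic method, then close it over the runtime type
        var openMethod = typeof(Caller).GetMethod("Describe",
            BindingFlags.Public | BindingFlags.Static);
        var closedMethod = openMethod.MakeGenericMethod(runtimeType);

        // Invoke returns object, so a cast is needed to get a typed result
        var result = (string)closedMethod.Invoke(null, new object[] { DateTime.Now });
        Console.WriteLine(result);
    }
}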

However the reflection approach has a tiny first-use cost compared to the dynamic approach, while for lots of subsequent calls to the method dynamic is quicker. See the Performance section below for a more detailed analysis.

Performance issues – simple types

The way you access the method or property does have a big effect on the performance of the command, as we are doing more work. I care about performance so I have studied this in detail. There are two costs: a one-off first-use compute time and a per-instance compute time.

Let me start with a simple table giving the times for my simple List<string> example. What I did was run the same test in three ways:

  1. Normal code, i.e. var list = new List<string>(); list.Add(“Hello World”);
  2. Using reflection, i.e. listStringInstance.GetType().GetMethod(“Add”);
  3. Using dynamic, i.e. listStringDynamicInstance.Add(“Hello World”);
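
My actual timings were taken with the ANTS Profiler, but to make the three approaches concrete here is a rough Stopwatch-based sketch of the sort of comparison involved (illustrative only, not my test harness):

using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class TimingSketch
{
    public static void Main()
    {
        var combinedType = typeof(List<>).MakeGenericType(typeof(string));

        // 1. Normal, compiled code
        var sw = Stopwatch.StartNew();
        var normalList = new List<string>();
        normalList.Add("Hello World");
        sw.Stop();
        Console.WriteLine("compiled:   {0:F3} ms", sw.Elapsed.TotalMilliseconds);

        // 2. Reflection
        sw.Restart();
        var reflectionList = Activator.CreateInstance(combinedType);
        reflectionList.GetType().GetMethod("Add")
            .Invoke(reflectionList, new object[] { "Hello World" });
        sw.Stop();
        Console.WriteLine("reflection: {0:F3} ms", sw.Elapsed.TotalMilliseconds);

        // 3. dynamic
        sw.Restart();
        dynamic dynamicList = Activator.CreateInstance(combinedType);
        dynamicList.Add("Hello World");
        sw.Stop();
        Console.WriteLine("dynamic:    {0:F3} ms", sw.Elapsed.TotalMilliseconds);
    }
}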

To take out any first-use costs I ran it on List<string> three times, followed by List<DateTime> twice, as the timings did seem to change between runs. The List<DateTime> runs are there to check whether building a different type has the same first-use cost. Here are the results, which were measured using the ANTS Profiler.

BE WARNED: This table is a bit misleading as it implies reflection is always faster than dynamic – it isn’t. See the ‘Performance issues – GenericAction library’ section where a real-life example shows that dynamic wins in the end.

| Type           | Order                         | Compiled   | Reflection | dynamic             |
|----------------|-------------------------------|------------|------------|---------------------|
| List<string>   | 1. First                      | 0.8 ms     | 37.08 ms   | 5.0 ms (was 600 ms) |
| List<string>   | 2. Second                     | < 0.001 ms | 17.05 ms   | 1.0 ms              |
| List<string>   | 2. Third                      | < 0.001 ms | 0.01 ms    | 0.6 ms              |
| List<DateTime> | 3. First (after List<string>) | < 0.001 ms | 0.03 ms    | 2.7 ms              |
| List<DateTime> | 4. Subsequent                 | < 0.001 ms | 0.03 ms    | 0.5 ms              |

When I first started testing I was getting 600 ms for the first-use cost of the first dynamic method call, of which about 400 ms came from the Dynamic Language Runtime. However once I installed .NET 4.5.2 on my Windows system this dropped to 5 ms. I cannot confirm that 4.5.2 is the cause, as the release notes do not mention a change there – maybe it was just reinstalling the .NET library. Anyway, be warned that things might have changed.

The way that .NET handles dynamic types is complex and I don’t claim to understand it properly. I would refer you to Simon Cooper’s series on the DLR (Dynamic Language Runtime), starting with ‘Inside the DLR – Callsites‘ and then ‘Inside DLR – Invoking methods’, which describes the caching of methods.

Performance issues – GenericAction library

In GenericServices and GenericActions I have used the dynamic approach. Seeing the figures above made me wonder if that was the best way. I therefore forked a version of GenericActions and changed it to a Reflection-based approach with no use of dynamic anywhere. Here are my findings:

  1. When comparing the dynamic-based and the Reflection-based versions of GenericActions the first-use costs of dynamic are much less than the table above suggests. The figures are:
    1. Reflection-based: 300 ms first-use cost, due to AutoMapper first-use costs.
    2. dynamic-based: 700 ms first-use cost, which, if you exclude the AutoMapper part, means that the dynamic part is only 400 ms. I haven’t re-measured since installing .NET 4.5.2, but it now seems quicker.
  2. If a created instance is used many times then dynamic wins in the end. As an example, in my test of 1000 accesses dynamic was between 120% and 200% faster than reflection (on my tests anyway).
  3. Dynamic is much easier to use. I found it really hard to do everything with reflection, mainly because you cannot cast to a type that is only known at runtime. My altered GenericActions version worked, but some of the error checking on incorrect types did not work any more. I might have fixed that, but it wasn’t worth it.

So, let’s look at the different parts of the performance problem.

1. One-off first-use compute time costs

With dynamic there is a cost on the first decode and call of a method in my libraries which use dynamic – about 0.2 seconds or more on my system. As I explained earlier this is because the first time you call a method on a dynamic type its call site needs to be created, which takes some time. However the method is then cached so later calls are very fast.

Any first-use performance cost is a pity, but I have other first-use time costs, like MVC, Entity Framework, AutoMapper etc., so I accepted it, mainly because I can mitigate it (see next section) and overall the dynamic approach is faster.

Mitigating first-use costs

One thing that really helps is keeping the application alive so that the first-use cost only happens when you start/restart your application. I mainly develop web applications and one great feature introduced in .NET 4.5.1 was ‘Suspend‘ on an ASP.NET web site.

On shared hosting (maybe other hosting as well, I’m not sure) when a web site has no traffic, i.e. all user sessions have timed out, then it stops. This means that for low-traffic sites the web application would stop and the next user coming to it would have to wait for it to start. The ‘Suspend’ feature keeps the application alive so that this does not happen.

Suspend support does depend on your hosting provider setting up the feature in IIS, so please read this post on it. My non-Azure shared hosting provider WebWiz supports this. Azure web sites don’t support ‘Suspend’ but address this with an ‘Always On’ feature which polls the site regularly enough that it stays up.

2. Per instance compute time

The whole decode and create takes about 0.0024 ms on my system, excluding first-use compute time. The actual creation of an instance of the generic type isn’t that costly (I haven’t measured it separately); it is the decoding of the types etc. that takes the time.

In the case of GenericServices the database call is so large, > 1 millisecond in all measured cases, that it’s not worth trying to improve the decode/create time.

However in GenericActions, which may call a very simple business method that could return very quickly, it is worth looking at. I therefore implemented a caching system using a ConcurrentDictionary in my GenericActions library. It took quite a bit of tuning but the effect was worthwhile, as it brought the worst case, which has multiple decode calls, down from 0.037 ms to 0.018 ms. These are small figures, but I think it was worth the effort to make calls to the business logic have a low performance cost.
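
The sketch below is not the GenericActions code, but it shows the caching idea using the earlier List<> example: ConcurrentDictionary.GetOrAdd means the MakeGenericType/decode work is only paid once per type.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ListTypeCache
{
    private static readonly ConcurrentDictionary<Type, Type> ClosedTypes =
        new ConcurrentDictionary<Type, Type>();

    // Creates a List<T> where T is only known at runtime, caching the closed type
    public static object CreateListOf(Type itemType)
    {
        // MakeGenericType is only run on the first call for each itemType
        var closedType = ClosedTypes.GetOrAdd(
            itemType,
            t => typeof(List<>).MakeGenericType(t));

        // Creating the instance itself is comparatively cheap
        return Activator.CreateInstance(closedType);
    }
}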

Conclusion

I have described not only a way of creating a Generic Type from a runtime type, but a whole approach to using complex Generic Types while still having type definitions that are simple to use and Dependency Injection friendly. I have also looked at how you call the methods or access the properties of the runtime-created type, focusing on ease of use and detailed performance figures. I hope that is useful to people.

Happy coding!

Architecture of Business Layer working with Entity Framework

Last Updated: March 1, 2017 | Created: January 27, 2015

Jonathan Crosby asked a question on one of my GenericServices GitHub projects about how to handle a Business Layer. I thought it was a good question so I have written an article to detail my approach to building business layers that use Entity Framework.

UPDATE 2016: See new, improved approach here

I have revisited this topic and refined my approach to Business Logic using EF. I suggest you read the new article called Architecture of Business Layer working with Entity Framework (Core and v6) – revisited.

UPDATE 2017: Book Entity Framework Core in Action

I have been commissioned by Manning Publications to write the book Entity Framework Core in Action, in which chapter 4 is all about Business Logic in an Entity Framework Core environment (but the ideas are applicable to EF 6 too).

This original article is old, but I have kept it because of all the comments.

What is the Business Layer?

GenericServices four layer design

The diagram on the left shows the type of layered application I normally build. This article is talking about the Business Layer (orange in the diagram), which is called the “Domain Model” in Martin Fowler’s Service Layer diagram above.

The Business Layer is the place where all the business/domain logic lives, i.e. the rules that are particular to the problem that the application has been built to handle. This might be salary calculations, data analysis modelling, or workflow such as passing an order through different stages.

You can get a more in-depth coverage of the overall architecture in this Simple-Talk article.

Aside: While having all the business rules in the Business Layer is something we should always strive for, I find that in practice some issues go outside the Business Layer for good reasons. For instance validation of data often flows up to the Presentation Layer so that the user gets early feedback. It may also flow down to the database, which checks data written to the database, to ensure database integrity. Other business logic appears in the Presentation Layer, like not allowing users to buy unless they provide a credit card, but the status should be controlled by the Business Layer. Part of the skill is to decide whether what you are doing is justifiable or will come back and bite you later! I often get that wrong, but hopefully I learn from my mistakes.

My philosophy on the Business Layer

I have written numerous applications for predictive modelling of various healthcare issues, all of which have some quite complex business logic. Over the years I have tried different ways of organising these applications and I have come up with one key philosophy – that “The Business Layer is King“, i.e. its design and needs drive everything else.

I am a big fan of Domain-Driven Design (DDD) which has the same philosophy and provides some approaches to help keep the business logic at the forefront of any design. Here are some of the approaches from DDD that drive my designs.

1. The Business Layer defines the data structures

The problem we are trying to solve, often called the “Domain Model”, is the heart of the application. Domain models can be complex, so the core data structures should be defined by, and solely focused on, the business problem. How the data is stored and how it is viewed are secondary issues.

2. The Business Layer should find persistence of data simple

While DDD accepts that the way data is stored/persisted will have an effect on the design (see Eric Evans’ book, page 159) the aim is to allow Business Layer code to treat the database as almost an in-memory collection. Entity Framework really helps with this.

The architecture of my Business Layer

Based on the philosophy listed above the implementation I use has the following characteristics.

1. I use Adapters widely

As I said above, the Business Layer is in control of the data structures – what makes the most sense for the Business Layer is what it uses. This means it normally deals in data classes and/or entity keys (the DDD term for the primary keys of data/entity classes).

This means the data needed/produced by the business logic is often not in the right format for communicating to the user and/or other external APIs. I therefore use Adapters, i.e. something that transforms the data to/from what the business logic needs. The Service Layer does this for the Business to Presentation layer communication, while external services are adapted inside the Data Layer.

Further reading:

  • See Alistair Cockburn’s Hexagonal/Adapter-Port architecture. My applications are mainly in this style. I use Dependency Injection to link layers.
  • DDD calls this an anti-corruption layer – see Eric Evans’ talk on DDD in which he talks about different ways to adapt across Bounded Contexts (the whole talk is great; skip to 20 mins in for the part on linking bounded contexts).

2. The Business Layer uses EF directly

My early applications used the Repository and UnitOfWork patterns. However as EF has improved, and I have learnt more, I have moved over to using EF directly in the Business Layer.

I found that the repository pattern in the end got in the way and made life much harder. Going for direct EF access allows the developer to use all the features of EF and stops some of the rather tortuous approaches I used to use with repositories.

You can read more about this in my two blog posts:

  1. Is the Repository pattern useful with Entity Framework?
  2. Is the Repository pattern useful with Entity Framework? – part 2

3. The Business Layer does not do the final data save.

This is subtle, but I have found it very helpful. I don’t want the Business Layer to know about saving data. I can’t totally ignore the data access code, in my case Entity Framework (EF), in the Business Layer, but I do minimise it.

Therefore Business Layer methods add/insert new data into the in-memory data classes or simply change any data loaded from the database. However the Business Layer never calls EF’s SaveChanges. That is done in the Service Layer that called it.

Why do I find this useful? There are a number of reasons:

  • The Business Layer does not have to handle database validation errors, which can occur when ‘SaveChanges’ is called. The Service Layer, which has more information available to it about what is going on, handles those.
  • I know that all the data will be saved in one transaction. It is either all there or it’s not.
  • If I need to chain multiple business methods I can use a transaction scope to rollback changes if one of the later methods fails.
  • The Service Layer can check other things to decide whether the data should be saved. I use this to allow users to save/not save on warnings.
  • It makes the Unit Testing of the Business Logic much easier as the database has not been changed. Most of my Business Logic has a property holding the data structures about to be written, so I can just check that.

Note: The discarding of data by not calling ‘SaveChanges’ only works in situations where each call has its own DbContext. This is the case in a web application, as each HTTP request gets a new DbContext.
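
Below is a minimal sketch of that split, using hypothetical class names (MyDbContext, Order, PlaceOrderAction and PlaceOrderService) rather than code from any real project: the Business Layer method adds the new entity to the context, but only the Service Layer decides when to call SaveChanges.

using System.Data.Entity;   // EF6

public class Order
{
    public int OrderId { get; set; }
    public string Status { get; set; }
}

public class MyDbContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

// Business Layer: adds/changes entities but never calls SaveChanges
public class PlaceOrderAction
{
    private readonly MyDbContext _db;
    public PlaceOrderAction(MyDbContext db) { _db = db; }

    public Order DoAction(string initialStatus)
    {
        var order = new Order { Status = initialStatus };
        _db.Orders.Add(order);      // in the context, but not yet saved to the database
        return order;
    }
}

// Service Layer: runs the business method, then decides whether to commit
public class PlaceOrderService
{
    private readonly MyDbContext _db;
    private readonly PlaceOrderAction _action;
    public PlaceOrderService(MyDbContext db, PlaceOrderAction action)
    {
        _db = db;
        _action = action;
    }

    public Order Run(string initialStatus)
    {
        var order = _action.DoAction(initialStatus);
        _db.SaveChanges();          // the one place where the changes are committed
        return order;
    }
}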

My implementation: GenericActions library

I have a library called GenericActions which has a very similar interface to the GenericServices library. Note: This isn’t an open-source library, but is part of my proprietary Rapid-Build™ library set that I use when building applications for clients.

Like GenericServices, GenericActions can either work directly with the business class or, more usually, via a Data Transfer Object (DTO) – see Why DTO? in the GenericServices documentation for more on DTOs. It uses DI to pull in a Business Layer class, hence making sure the Business Class can pick up any services it needs through constructor injection.

A typical call of a business method at the MVC controller level looks very much like a GenericServices call.

[HttpPost]
[ValidateAntiForgeryToken]
public ActionResult SetupGrader(GraderDto dto,
            IActionService<IBizGraderSetup> service)
{
    if (!ModelState.IsValid)
        //model errors so return immediately
        return View(dto);

    var response = service.RunAction(dto);
    if (response.IsValid)
    {
        TempData["message"] = response.SuccessMessage;
        return RedirectToAction("Index");
    }

    //else errors, so copy the errors over to the ModelState and return to view
    response.CopyErrorsToModelState(ModelState, dto);
    return View(dto);
}

The main difference is the definition of the service (see line 4 of the code above). The service definition contains the interface of the Business Layer class that holds the method it needs to call. DI is used to inject the right class into the service.

In the case above the action only returns a status to say whether it was successful or not. However other Business Layer methods may need to return data, so there is another type of call that returns a class. This class can either be the result class produced by the business logic or another DTO that does the conversion.

You can read about the way I implemented the GenericActions (and GenericServices) library in my new post, Using .NET Generics with a type derived at runtime.

Conclusions

Because GenericServices is focused on the database to/from Presentation Layer communication I have not previously described my view of the Business Layer. Hopefully this article fills that gap by describing the whys and wherefores of the Business Layer in my architectural view of applications.

Happy coding!

Announcing NuGet release of GenericServices

Last Updated: January 24, 2015 | Created: January 22, 2015

I am pleased to say that I have finally released a version of the GenericServices on NuGet!

For those who are not aware of GenericServices, it is an open-source .NET library designed to help developers build a service layer, i.e. a layer that acts as a facade/adapter between your database in the data layer and your User Interface or HTTP service.

GenericServices makes heavy use of Entity Framework 6 (EF6) and .NET 4.5’s async/await commands. Its aim is to make the creation of the service layer simple while providing robust implementations of standard database access commands. You can get a summary of its features from the Why GenericServices? page on the GenericServices documentation site or the README page on the GenericServices’ GitHub home page.

Link to NuGet Version

The library is known as GenericServices on NuGet, so simply search with that name in the ‘Manage NuGet Packages’ in Visual Studio. You can also find it on NuGet at https://www.nuget.org/packages/GenericServices/ .

Why have you released on NuGet now?

I did not feel confident to release GenericServices onto NuGet, where it is likely to get much more use, until I had good documentation and examples. In the past two weeks I have:

  1. I have written quite a bit of documentation and added a useful index in the sidebar. Do have a look at the documentation in the GenericServices’ Wiki.
  2. I have made available a much more advanced example called SampleMvcWebAppComplex. SampleMvcWebAppComplex is an ASP.NET MVC5 web site which was built as a ‘stress test’ of how well the GenericServices library and Microsoft’s Entity Framework V6 could cope with an existing SQL database, AdventureWorksLT2012.
    1. The example web site is at http://complex.samplemvcwebapp.net/.
    2. The source code is available as a reference source. Note: this project is not fully open-source because it uses a paid-for library, Kendo UI MVC.
  3. There are two articles I wrote on the Simple-Talk web site that describe in detail the process I went through to build SampleMvcWebAppComplex. They are:

I still have the original, more basic example web site called SampleMvcWebApp. You can access this via:

  1. The live example web site is at http://samplemvcwebapp.net/
  2. The source code, which is fully open-source, is available too.

Happy coding!

What makes a good software library?

Last Updated: June 1, 2016 | Created: January 5, 2015

If you are a professional software developer then you will have used lots of software libraries/frameworks to help you build an application. In fact nowadays one of the key skills a developer needs is picking the right software libraries to use in a given situation. I also expect most development teams have built their own software libraries to help them with some specific aspect of the work they do.

In this article I explore what I think makes a good, even great, software library. I would appreciate your thoughts on this so do add comments at the end.

NOTE: By software libraries I mean a software subsystem with some sort of Application Programming Interface (API). In the Microsoft/.NET approach software libraries are normally a Class Library with a known API, which are downloaded as a .dll file or most likely using NuGet. JavaScript libraries are normally a series of JavaScript files, possibly provided by Bower or Node.js npm, that again have some sort of API.

What motivated me to write this article?

I am building a number of .NET software libraries, one of which is an open-source project called GenericServices. Recently I built a small, but fairly complex test application to check out GenericServices and various other supporting libraries. This ‘stress test’ was really interesting in ways I did not expect (you can read more about what I did in this article).

In writing the code I found some really complex uses ‘just worked’ – in fact in a couple of cases I wrote Unit Tests that I thought should fail but they passed first time, which really threw me for a while. In other parts I had a problem in someone else’s software library and even when looking at their code I hadn’t a clue how to fix the problem.

All of this got me thinking about “what makes a good software library?” This is especially important to me as I want my libraries to be easy to use by anybody. I have therefore been thinking about this for some months and I thought I would write down my thoughts.

Categorising software libraries

To help me in considering this issue I came up with categories for the types of software libraries I have come across. This helped me work out what I thought worked well and what didn’t, and more importantly why the library was like that.

Over a few months I came up with six categorisations for software libraries. They are:

1. Normal software libraries

These are the normal, run of the mill software libraries. Their APIs are predictable and maybe a bit boring, but they get the job done. However they are great for simple problems and can save you time.

I would say that some loggers, like Log4Net, are like this. Log4Net is a fine package and it does a great job. I understand the problem of logging and I could write it myself, but this library has already implemented a great solution so I use that. Job done.

2. Verbose software libraries

These are run-of-the-mill software libraries. They work, but the APIs might be a bit long-winded or verbose.

One example of a library I use that has some aspects of this is Kendo UI. I pay good money to use this library because of what it provides, but the API can be hard work. It uses a jQuery-type approach where everything is configurable. That can be useful, but in Kendo UI there can be lots of things that need to be configured. For instance for Charts there are over a thousand configuration options, which makes finding the right one quite hard work. In contrast NVD3.js charting has separate methods for each chart type, which exposes only the configurations needed by that chart.

3. Clever software libraries

These are software libraries that do really clever things, but as a developer you really need to know what you are doing. You get comments like “the trick is…”

While not technically a software library I think SQL, T-SQL in this case, is an example of a ‘clever’ library/API. T-SQL is the language for talking to a SQL server, which is a form of API, but called a Domain Specific Language. Using SQL I have built some amazing queries which perform complex modelling really quickly, but it is NOT simple. That is because the problem isn’t simple and the SQL API doesn’t try to hide it from you.

4. ‘Magic’ software libraries

These are the software libraries that do things that are useful, but they are so intricate that it is hard to understand what they are doing – I call these ‘magic’ software libraries. If they are written well they can be great, but if the underlying problem is complex then they are hard to write.

I think Entity Framework (EF) is a really good example of a good ‘magic’ software library. EF is a database front-end library, known as an ORM, which tries to hide the complexities of the T-SQL API mentioned above to make it easier to work with databases. It is a great library, but it does take a while to really understand EF because there is so much ‘magic’ going on inside it.

5. ‘Mysterious’ software libraries

These are software libraries that are really hard to understand. This could simply be because it’s a very complex API. However ‘magic’ software libraries can often become ‘mysterious’ when their standard ‘magic’ doesn’t work.

Entity Framework (EF) can suffer from this in places. For instance if you have a many-to-many relationship in a database then EF inserts an extra link table which it looks after, to make the relationship really easy to use. The problem is this ‘magic’ has some limitations and when it doesn’t work you need to know how to fix it. In actual fact the link to Updating a many to many relationship in entity framework on my site gets the highest number of hits of all of my posts.
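
To show the kind of ‘magic’ I mean, here is a minimal EF6 sketch (illustrative class names, not code from any real project). With a collection navigation property on both sides, and no join class in sight, EF creates and maintains a hidden link table for you:

using System.Collections.Generic;
using System.Data.Entity;   // EF6

public class Article
{
    public int ArticleId { get; set; }
    public string Title { get; set; }
    public virtual ICollection<Label> Labels { get; set; }
}

public class Label
{
    public int LabelId { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Article> Articles { get; set; }
}

public class BlogContext : DbContext
{
    // EF will create a hidden join table to hold the many-to-many links
    public DbSet<Article> Articles { get; set; }
    public DbSet<Label> Labels { get; set; }
}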

6. Elegant software libraries

These are software libraries that are a joy to use. They may do simple or clever things, but they are a) short and to the point and b) clear about what is happening. This is what all of us strive to achieve when we write a software library.

One example of this is Microsoft’s LINQ. LINQ is a data query language which I think is really elegant. Yes, there are a few complicated bits in it, like Grouping, but most of the time it’s easy to string together a series of commands to achieve quite complex things. It is clean, clear and only mentions the things you need.

NOTE: I haven’t included a ‘must work’ category as I take that as a prerequisite. However some libraries, especially ones that have not been used much, can be suspect. Caveat emptor.

Having finished these software categories let us look at some particular issues that we should consider when building software libraries.

Think about how the developer will use your library

Here are some ideas to make you as the library writer think about the experience the end-user developer has when using your library.

1. Think from the user’s perspective

As library developers we tend to be focused on the implementation, not on the user. That makes us think about what we want to achieve, not how the user wants to use it. Let me give you an example from my own experience of developing my GenericServices library.

When I started development of GenericServices I saw that the access to the database and the calling of business logic had some similarities. I therefore ‘saved time’ by writing code that handled both. The problem was that the code was a little bit complex and a little bit opaque to the end user.

This became apparent when I started to use it. Even I might hesitate for a second or two about a certain term, so what would another developer feel like? I therefore split the two libraries, with GenericServices focused on the core problem of database access. Yes, there is some duplication, but it is much clearer to the user/developer and the two libraries are much simpler.

2. Focus on the most important use-cases

A Use Case is a term to define how a user, in this case a software developer, will use your tools, in this case the API of your software library, to achieve a specific action. I found it very useful to order the use cases.

The most obvious way to order them is by how often a particular use case is likely to be used. Sometimes there isn’t this sort of split, but often there is – sometimes called the 80/20 rule. If you do have this split then you need to make the API for the ‘normal’ (80%) cases short and easy. The ‘special’ (20%) use cases still need to be supported, but their API can have a few more lines of code to achieve the same thing, so that the ‘normal’ use cases stay simple. This can make a big difference to how many lines of code a developer will need to write in a complete project.

In the case of my GenericServices library I had another ordering based on the architecture used in the data layer: a CRUD design or a Domain-Driven Design (DDD) design. These two design approaches have very different needs when it comes to adding or changing data. DDD is more difficult to support because the data layer imposes some rules that GenericServices has to abide by. I managed to support DDD while having no impact on the simpler CRUD usages, which was great. Maybe you have another perspective/ordering that will help you focus your API.

3. Principle of least astonishment

I came across the term ‘Principle of least astonishment’ in one of Addy Osmani’s blogs, although he didn’t coin the term. Addy explains this as follows:

Try not to violate the principle of least astonishment, i.e. the users of your component API shouldn’t be surprised by behaviour. Hold this true and you’ll have happier users and a happier team.

I really like this term. It sums up the need for the API to be a) obvious, b) consistent and c) in tune with what developers expect from a library.

To that end, if you are developing for the Microsoft .NET arena I would recommend the book ‘Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries (Microsoft .Net Development)’. This book is packed with well researched guidelines which will make your .NET library look more like the standard .NET libraries.

NOTE: This book was written in 2008 so it does not include LINQ Lambda and the newer async/await, both of which I think change some of the guidelines. However I still recommend this book.

Complex problems demand a lot of thought

If your problem is complex then you have to be really careful when contemplating writing a good library. What you are normally doing in a library is providing a set of commands (API) that help you do a job more quickly than if you wrote the actual code yourself. This often requires you to ‘abstract’ the underlying problem into a set of commands.

With small problems like a logger the abstraction isn’t that great, but the more complex the problem the more you are abstracting it. The problems come if your abstraction is not perfect, or as it is called, ‘a leaky abstraction’. In the post ‘The Law of Leaky Abstractions‘ Joel Spolsky talks about abstractions. He says:

All non-trivial abstractions, to some degree, are leaky.

Abstractions fail. Sometimes a little, sometimes a lot. There’s leakage. Things go wrong. It happens all over the place when you have abstractions.

Get the abstraction right and you have a ‘magic’, possibly Elegant design. Get it wrong and you can have a ‘Mysterious’ software library that people will give up on. Here are a few ideas I have on helping with this problem:

  • Include ‘get out of jail’ access points
    By this I mean the library tries to abstract things but it still provides direct access to the underlying system. Entity Framework does this by including access to direct SQL commands for the times when its abstraction gets in the way (see the sketch after this list).
  • Don’t bite off more than you can chew
    Sometimes a complex problem can be broken down into sub-problems. This allows you to build a few smaller, focused libraries rather than one big complex library. I have used this a few times and found it has worked well for me.
  • If all else fails, don’t abstract too much. The T-SQL mentioned earlier does this. It does not abstract the problem much at all, which leaves it up to the developer to learn the complexities of the problem space. That way it minimises the leakiness of the abstraction.
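
As an example of the ‘get out of jail’ access point mentioned in the first bullet, EF6 lets you drop down to raw SQL via Database.SqlQuery. This is only a sketch – the DTO, table and column names are illustrative:

using System.Collections.Generic;
using System.Data.Entity;   // EF6
using System.Linq;

public class CustomerNameDto
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
}

public static class RawSqlExample
{
    // Drop down to raw SQL when the LINQ abstraction gets in the way
    public static List<CustomerNameDto> CustomersLike(DbContext db, string partOfName)
    {
        return db.Database
            .SqlQuery<CustomerNameDto>(
                "SELECT CustomerID, CompanyName FROM SalesLT.Customer WHERE CompanyName LIKE @p0",
                "%" + partOfName + "%")
            .ToList();
    }
}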

Striving for Elegance

Many years ago I read a brilliant book by Eric Evans called Domain-Driven Design. In this book Eric devotes a whole section to “Refactoring towards deeper insight”. His message is that we are used to refactoring to make the software better, but what about refactoring to improve what the software does? My experience is that making a good software library comes from a series of “deeper insights” followed by some key refactoring.

The development of my GenericServices library proceeded with a mixture of hard work and “deeper insights”. Below is a list of the “deeper insight” moments that I think improved the library.

  1. The service layer has a lot of nearly duplicate code. I could build a library to do that.
  2. Generics would be a really good approach to use.
  3. The class signatures are hard to use, especially with dependency injection. I need to make them simpler.
  4. Having database and business logic together is confusing. I should split them.
  5. The stress test (see next section below) has shown up some issues I need to improve.
  6. I need computed parameters. Let’s use DelegateDecompiler to do that.

I already have my eyes on a few more changes:

  1. I could simplify even further the class signatures.
  2. Domain-Driven Design databases are important to me. I could improve the support for that.

Of course I would love to have been able to go immediately to the final design, but I just don’t think I could have done that. You have to live with a library and see it being used to find out what is wrong. In fact my eldest son Ben, who is a good programmer in his own right, says “Build one to throw away”. It does seem like you need to build something to see what is wrong before you can make it better – well I do at least.

Stress your library to find the pinch points

I built hundreds of Unit Tests and even built a whole demo web site to show GenericServices in action. This weeded out lots of issues, but because I wrote the code it was subconsciously biased towards the GenericServices way of doing things.

Normally the ‘stress test’ is on a real project of some size. However I didn’t have a big project on the go at that stage so I decided to stress test GenericServices myself by using someone else’s database. I used the Microsoft AdventureWorks2012Lite database which is a) not designed by me and b) has some database features I had not come across before. That really pushed my library in ways I had not pushed it before, and revealed some areas that needed more work. I recommend anyone who is trying to perfect a software library to ‘stress test’ their library.

The users of the library will test your library, but not as well as you can, and they aren’t likely to feed back if they just find it too hard. For an investment of only two weeks I could really dig into what worked/did not work myself and I think it radically improved my library. Maybe you can stress your library on a real project you are working on, but allow time to tweak the library.

You can read about some of the things I found in the article called ‘Using Entity Framework with an Existing Database: User Interface’ on the Simple-Talk web site.

The end

I hope you have enjoyed reading this article and I expect you have your own views/experiences. I would love to hear them as it is only by sharing ideas that we get better at what we do.

Entity Framework: working with an existing database – Part 2, MVC

Last Updated: December 16, 2014 | Created: November 16, 2014
Quick Summary
This two-part article looks at the software issues raised when presented with a database that was designed and built prior to software development. Part 1 explores the low-level problems of connecting to that database using Entity Framework (EF). Part 2 (this article) looks at the high-level problems of transforming the EF data into a form suitable for display via an ASP.NET MVC web site.

Part 2: ASP.NET MVC and Entity Framework with an existing database

  1. PART 1. The best way to get Microsoft’s Entity Framework version 6 (EF) data access tool to import and access the database.
  2. PART 2: (this article) This looks at the higher level functions needed to transform that data into a user-focused display. In this case I use the GenericServices open-source library to build a service/application layer that connects to a modern web front end using ASP.NET MVC5.

Introduction

There are many examples of how to build ASP.NET MVC (MVC) web applications on the web. However in order to get the ideas across they often pick rather simplistic database schemas, as I did myself when I built the SampleMvcWebApp. The problem with these examples is that they do not deal with the more complex issues found in real world systems.

I wanted to stress-test some of my own libraries by building an MVC application with a database that has a more realistic level of complexity. For this I chose Microsoft’s sample AdventureWorksLT2012 database, which is a cut-down version of the larger AdventureWorks OLTP database. The AdventureWorks database is for a fictitious multinational manufacturing company producing and selling pedal cycles (see pdf of database diagram).

In the first part of this article I looked at the problems of getting Entity Framework (EF) to work with an existing database like AdventureWorks. In this second article I look at how to build an MVC web application that provides a comprehensive user interface, as would be used by an employee of AdventureWorks. In doing this I try to pull out some of the key design and implementation issues that software developers face when working on real-life business systems.

The issues faced in building a web application

One of the fundamental issues that we face in building a complex web application is that the business data held in the database is often not in the right format to display directly to the user. Let me give you a simple example from AdventureWorks.

I wanted to put up a list of existing companies that had bought from AdventureWorks before. This needed to quickly summarise who they were and give an idea of what level of purchasing they had done in the past. To do that I came up with the list shown below:

Customer list

Now, to produce this I had to:

  1. Get the company name from the Customer table – easy
  2. Get the contact name. That needed me to combine the following properties from the Customer table:
    • Title, which can be null
    • FirstName
    • MiddleName, which can be null
    • LastName
    • Suffix, which could be null
  3. For the Total Sales Value, I needed to sum the TotalDue property in the SalesOrderHeader table.

On top of that I want to:

  1. Have a pre-filter to only show customers who had bought before in this list.
  2. Be able to search by a company or contact name.
  3. Sort on any column.
  4. Be able to page the data, i.e. only show a selection of say 10 or 20 customers at a time and be able to step forwards/backwards through the list. This is because there are hundreds of customers.

This is a very typical requirement and it happens hundreds of times in a reasonably sized application. The architecture of the system needs careful thought as to how to provide these capabilities with a) good performance and b) the minimum of development effort. That is a challenge.

The power of LINQ with Entity Framework

Before I go on to the solution to this challenge I want to delve into the two technologies that form the cornerstone of my solution: LINQ/IQueryable and Microsoft’s data access technology, Entity Framework (EF).

Note: for those of you who are familiar with LINQ and EF you can skip to the next section, ‘Separation of Concerns and Layered Architecture’.

a. LINQ/IQueryable

LINQ (which stands for Language-Integrated Query) is a way of writing queries on data. The important point is that any query written in LINQ is held as a query and only run when certain commands, like .ToList(), are applied to that query. This allows software developers to write part of a filter/select command that can be added to or extended later, anywhere in the application that understands IQueryable.

Let me take a simple example using a subset of the Customer List problem posed above. First let us select the customers who have bought from us before:

var realCustomers = Customers
                    .Where (c => c.SalesOrderHeaders.Any());

Then I select the CompanyName and form the ContactName from this filtered list:

var details = realCustomers.Select( x => new
   {
       CompanyName = x.CompanyName,
       ContactName = x.Title + " " + x.FirstName + " " +
                     x.MiddleName + " " + x.LastName +
                     " " + x.Suffix
   });

This returns an IQueryable<> class, but it has not been ‘realised’, i.e. it is just a command and has not accessed the database yet. This means I can add the following filter commands onto the query:

var filteredQuery = details.Where (x => x.CompanyName.Contains("Bike"));

Finally I might add something to add paging on to that query, e.g.

var finalQuery = filteredQuery.OrderBy(x => x.CompanyName)
                 .Skip(pageSize*pageNumber).Take(pageSize);

Hold that finalQuery as we step onto the next part of the solution which is…

b. Entity Framework

ListResultingFromFinalQuery

EF is a database access framework that works by converting LINQ queries into SQL Server database commands, i.e. T-SQL code, and it does it very well. So, if we set pageSize to 5 and the pageNumber to 0 and then realise the finalQuery mentioned previously by putting a .ToList() on the end we get the results on the right.

EF has converted the finalQuery into T-SQL and applied it to the database. If we look at the final SQL command then we will see that it has:

  1. Only selected the columns that it needs.
  2. The concatenation of the names has been turned into a T-SQL string addition, including replacing null items with an empty string.
  3. The two .Where()’s, one looking for customers who have bought and the other looking for company names which contain “Bike”, are combined into a single SQL WHERE clause.
  4. The SQL command orders the data by CompanyName and then does a Skip + Take to return only the data that was requested.

For those who are interested here is the SQL command (text file) that EF produced from these concatenated queries.

Separation of Concerns and Layered Architecture

There is a software approach, almost a rule, called Separation of Concerns. A Microsoft document defines this term as follows:

Separation of Concerns: Divide your application into distinct features with as little overlap in functionality as possible. The important factor is minimization of interaction points to achieve high cohesion and low coupling…
(Microsoft, Software Architecture and Design 2009, Key Principles of Software Architecture)

Applying Separation of Concerns to what we have learnt about LINQ and EF, I have separated the application into Layers (see Multilayered architecture). In .NET these are separate projects/assemblies but are bound together into one application, i.e. there are direct method calls between layers. (Note: In today’s cloud systems, such as Azure, scaling up is done by having the whole application running multiple times rather than the previous approach of having each layer running on a separate server). My layers are:

  1. Data Layer: sometimes called the Infrastructure Layer, or Persistence Layer.
    This deals with the reading and writing data to the database. It should provide IQueryable versions of each table/class for reading. Depending on the software design used it will either allow direct write/update of a database class, or if using Domain-Driven Design may impose some factories/methods that need to be called to accomplish this.
  2. Business Layer:
    This will contain all the business logic that goes beyond simple read/write of data. For instance in the AdventureWorks system a new order starts in the ‘Pending’ state and needs to be ‘Approved’ before it can be taken forward. That sort of logic should live in this Layer and it will directly interact with the Data Layer.
  3. Service Layer: sometimes called the Application layer.
    This section’s job is to format the data flowing from the Data Layer and the Business Layer to/from the format that the Presentation Layer needs. It may need specialised handling for changing the data in the Data Layer if a Domain-Driven Design approach is used.
  4. Presentation Layer: sometimes called the UI (User Interface) Layer.
    This section’s job is to present the formatted data from the Service Layer either to a user via a web page, or to another system via, say, a RESTful interface. This section will also handle any paging, filtering, sorting etc. as required.

As you can see I have apportioned different parts of the query handling to different layers, i.e. queries/updates can be passed up the system and modified as they move through the layers.

Before I leave this section I should say this is one of the possible ways in which a system could be architected. I recommend the book ‘Microsoft .NET: Architecting Applications for the Enterprise’ for an excellent discussion of other approaches.

The Service Layer is key

My experience is that the Service Layer is one of the most useful layers because it totally separates the data model, i.e. the key data classes/tables that we have in the system, from the presentation layer.

Why is the service layer so useful? Because in any reasonably complex application the data model should be defined by the business process/data and not by how the user wants to see the data. By adding a service layer to ‘adapt’ the data we keep the data model focused solely on the business needs and do not allow it to be contaminated by features that are only used for display.

Obviously we need another class more attuned to the Presentation Layer’s needs. These are often called Data Transfer Objects (DTOs). In my system the Service Layer methods do two things:

  1. They take on the work of reformatting the data in the Data Layer or Business Layer to/from a DTO, which holds the data in format that is suitable for the Presentation Layer.
  2. They call the correct commands in the Data Layer or Business Layer to execute certain commands. The most obvious are the Create, Update or Delete commands to the Data Layer to update data held in the database.

The only problem is that the Service Layer has a lot of really repetitive code, which is boring to write! So I set about fixing that with a library called GenericServices.
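
To show the sort of repetitive code I mean, below is the kind of hand-written projection method that ends up being repeated, with small variations, for every DTO in the Service Layer (the Customer and DTO classes here are cut-down, illustrative versions):

using System.Linq;

public class Customer            // cut-down stand-in for the EF entity
{
    public string CompanyName { get; set; }
    public string Title { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class CustomerSummaryDto
{
    public string CompanyName { get; set; }
    public string ContactName { get; set; }
}

public static class CustomerQueries
{
    // One of these per DTO, all nearly identical - this is what GenericServices removes
    public static IQueryable<CustomerSummaryDto> ToSummaryDtos(this IQueryable<Customer> customers)
    {
        return customers.Select(c => new CustomerSummaryDto
        {
            CompanyName = c.CompanyName,
            ContactName = c.Title + " " + c.FirstName + " " + c.LastName
        });
    }
}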

GenericServices and how it solves certain problems

GenericServices uses C# Generics to provide a set of CRUD (Create, Read, Update and Delete) services to the database. The developer only needs to write a DTO which inherits from a GenericServices’ template DTO to gain access to GenericServices CRUD methods which includes intelligent copying of data layer classes to/from DTOs.

You might not want to use GenericServices but you might find it useful to understand how it solves certain problems when working with the AdventureWorksLt2012 database. Note: If you do want to look at GenericServices, which is an open-source project, then you can find out more about GenericServices on GitHub or via the SampleMvcWebApp site.

In the next four sections I describe various issues that I came across in my design and how I solved them in GenericServices.

1. Copying properties to/from Data Layer classes and DTOs using LINQ

I think I have made a case for using DTOs – they reduce the data transferred and allow data from various relationships to be pulled together into one place (called flattening). I am also saying that using LINQ commands to select the data means that EF can make a very efficient database access, only pulling in the specific columns needed for the Presentation Layer.

In my early projects I hand-coded the data-to-DTO and DTO-to-data code. This worked but was incredibly repetitive, boring, and needed Unit Tests to check I had not fallen asleep during the process. I was committed to automating this process, which I have achieved in GenericServices.

GenericServices uses AutoMapper for the task of data<->DTO copying. AutoMapper is an excellent, convention-based mapper that can handle some complex mappings between two classes, i.e. the data class and the DTO. I had come across AutoMapper years ago, but had rejected it because it did not support LINQ commands. However it now supports LINQ projections, which means it converts the copy into a LINQ expression which EF then uses to make an efficient SQL query.

So, my solution was to put AutoMapper at the core of my data<->DTO copying process and I have not been disappointed. If you have a similar need to copy data between similar classes then you should look at the AutoMapper Wiki site and Jimmy Bogard’s AutoMapper blogs for newer information.
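
Below is a minimal sketch of what AutoMapper’s LINQ projection support looks like. The exact API has changed between AutoMapper versions – this uses the MapperConfiguration/ProjectTo style – and the classes are cut-down illustrations:

using System.Linq;
using AutoMapper;
using AutoMapper.QueryableExtensions;

public class Customer            // cut-down stand-in for the EF entity
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
}

public class CustomerListDto
{
    public string CompanyName { get; set; }
}

public static class ProjectionExample
{
    private static readonly MapperConfiguration Config =
        new MapperConfiguration(cfg => cfg.CreateMap<Customer, CustomerListDto>());

    // ProjectTo builds a LINQ Select, so EF only reads the columns the DTO needs
    public static IQueryable<CustomerListDto> ProjectCustomers(IQueryable<Customer> customers)
    {
        return customers.ProjectTo<CustomerListDto>(Config);
    }
}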

2. Adding new calculated properties for display or editing

The other issue you saw in the Customer List example at the start was that I needed to come up with some new, calculated values to use to help display the customer list. In early projects I simply hand-coded these using LINQ expressions. Getting a solution to this was a little harder.

Just to recap, from the earlier example the Customer list needs three calculated properties:

  1. HasBoughtBefore, which should be true if the customer has any sales orders
  2. FullName, which combines the parts of the name to make something sensible for the user.
  3. TotalAllOrders, which sums all the order values to give a total.

I initially used custom mapping in AutoMapper, where I basically inserted my hand-coded LINQ expressions. This works but Jimmy Bogard’s AutoMapper blog put me onto a very elegant solution called DelegateDecompiler which turns calculated properties into LINQ expressions. That may not make sense so let me give an example.

If I added a calculated property to the EF data class as shown below, then using it in a LINQ expression would throw an exception, as EF cannot handle properties that have calculations in them.

public bool HasBoughtBefore { get { return SalesOrderHeaders.Any(); } }

What DelegateDecompiler does is allow you to mark such properties with a [Computed] attribute; in your query you then add a call to .Decompile(). This turns the calculated properties into LINQ expressions, which means EF can then execute that code.

So in my data layer I can add the following partial class to the Customer data class to add two of the three properties I need.

public partial class Customer : IModifiedEntity
{
    [Computed]
    public string FullName { get { return Title + " " + FirstName
                                   + " " + LastName + " " + Suffix; } }

    /// <summary>
    /// This is true if a 'Customer' has bought anything before
    /// </summary>
    [Computed]
    public bool HasBoughtBefore { get { return SalesOrderHeaders.Any(); } }
}

GenericServices notices that the class has some properties marked with the [Computed] attribute and invokes DelegateDecompiler’s .Decompile() method before passing the query up to the Presentation Layer.

DelegateDecompiler is a very elegant, but complicated, library so some types of LINQ commands are not supported. In particular a calculation like TotalAllOrders does not work, nor does it support properties that need to access a local variable. Therefore I have also provided an entry point in GenericServices to allow additional AutoMapper mappings to handle these complex cases. See the overridden AddedDatabaseToDtoMapping property in the GenericServices DTO definition below.

public class ListCustomerDto : EfGenericDto<Customer, ListCustomerDto>
{

    // … other properties removed for clarity 

    /// <summary>
    /// This will hold the total value of all orders for this customer
    /// </summary>
    [Display(Name = "Total Sales Value")]
    [DataType(DataType.Currency)]
    public decimal TotalAllOrders { get; set; }

    // … other code removed for clarity

    protected override Action<IMappingExpression<Customer, ListCustomerDto>>
        AddedDatabaseToDtoMapping
    {
        get
        {
            return m => m.ForMember(d => d.TotalAllOrders,
                opt => opt.MapFrom(c =>
                       c.SalesOrderHeaders.Sum(
                           x => (decimal?)x.TotalDue) ?? 0));
        }
    }
}

Note: The LINQ calculation for TotalAllOrders is complex because it is summing a property in a collection relationship. Because a collection can be empty LINQ correctly insists on the nullable version of Sum. This returns null for empty collections, which I replace with zero.

Presentation Layer

Before I leave this example you might like to see the MVC controller code that displays the result. My design aim is to make the code in the MVC controller as simple and generic as possible. This will allow me to use scaffolding for building the controller code and the views. Below is the code for displaying a list.

public class CustomersController : Controller
{
    [DisplayName("Current Customers")]
    public ActionResult Index()
    {
        //the ‘true’ tells the Grid to only show rows where HasBoughtBefore is true
        return View(true);
    }

    public JsonResult IndexListRead(
        [DataSourceRequest]DataSourceRequest request,
        IListService service)
    {
        var result = service.GetAll<ListCustomerDto>()
                     .OrderBy( x => x.CustomerID).ToDataSourceResult(request);
        return Json(result, JsonRequestBehavior.AllowGet);
    }
    //… more create, update, delete methods left out for clarity
}

Note the IListService, which is the GenericServices List service where we call GetAll<ListCustomerDto>() to get the formatted DTO data. I then display this using a Kendo UI Pro Grid, which takes an IQueryable query and adds its own filter, sort, group and paging controls before executing it. Note: Kendo UI Pro is a paid-for UI library for MVC and JavaScript. The result is a grid with paging, sorting, filtering etc.

Analysis of final SQL, and resulting changes

Finally I used Red Gate’s ANTS Performance Profiler to analyse the final SQL command (text file) that EF produced after all these different parts are combined. The display is for existing customers, i.e. HasBoughtBefore is true, and filters the results to only include customers whose company name contains “bike”.

I am no SQL expert but, looking at the SQL command produced, the sum of SalesOrderHeader.TotalDue seemed to be done twice: once to see if it was null and then again if it wasn’t. This didn’t seem very efficient, so I changed the LINQ code to first check whether there were any SalesOrderHeaders before attempting the sum. The new LINQ code to define the TotalAllOrders property is shown below:

SalesOrderHeaders.Any() ? SalesOrderHeaders.Sum(x => x.TotalDue) : 0;

This produced the following SQL command (text file), which a) seems simpler and b) was faster than the original SQL command. As I say I am not a SQL expert, but I would call that an improvement.
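
For completeness, here is a sketch of how that revised calculation slots into the AddedDatabaseToDtoMapping property shown earlier. The nullable cast is no longer needed because the Any() check guards against an empty collection:

protected override Action<IMappingExpression<Customer, ListCustomerDto>>
    AddedDatabaseToDtoMapping
{
    get
    {
        return m => m.ForMember(d => d.TotalAllOrders,
            opt => opt.MapFrom(c => c.SalesOrderHeaders.Any()
                ? c.SalesOrderHeaders.Sum(x => x.TotalDue)
                : 0));
    }
}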

In real life you should look at areas of poor performance and then apply increasing levels of refactoring, right up to using direct SQL commands, to obtain the performance you need. However this does take development time so in some projects a quickly written but possibly poor performing database access may be acceptable.

3. Handling composite keys

Quite a few of the tables in the AdventureWorks Lite database have composite keys, i.e. a primary key made up of two or more columns in the table. Now GenericServices is designed to handle composite keys and the Detail, Update and Delete services all take multiple keys. However there is one table in the AdventureWorks Lite database where the keys can be a challenge: CustomerAddress. It is worth pointing out how to handle this in EF.

The picture below shows that the Customer has a collection of CustomerAddresses, each of which consists of a property called AddressType, which says whether it is their main office etc., and a link to a standard Address class. CustomerAddress has a composite key consisting of the CustomerID and the AddressID.

CustomerAddress-table-with-relationships

So, the problem comes when creating a new CustomerAddress. We need to create the CustomerAddress and the Address in the same commit so that they are consistent, but we also need to get the new AddressID to fill in part of the composite key on the CustomerAddress.

The answer is that EF does this as part of its relational fixup. To get this to work I create the CustomerAddress and set its EF ‘Address’ relational property to a new Address instance. We add the CustomerAddress to the database using db.CustomerAddresses.Add(myNewCustomerAddress), call EF’s SaveChanges() and it’s done.
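
A minimal sketch of that create (the property values and the customerId variable are made up for illustration) looks like this:

//EF's relational fixup copies the newly generated AddressID into the
//composite key of the CustomerAddress when SaveChanges() is called
var myNewCustomerAddress = new CustomerAddress
{
    CustomerID = customerId,          //one half of the composite key
    AddressType = "Main Office",      //example value
    Address = new Address             //setting the relational property creates both rows together
    {
        AddressLine1 = "1 Some Street",
        City = "Seattle"
        //...other required Address columns left out
    }
};
db.CustomerAddresses.Add(myNewCustomerAddress);
db.SaveChanges();                     //both rows are inserted in the same commit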

4. Deleting database classes that have relationships

When deleting items from the database that have relationships, there are a few issues we need to think about. Firstly, if something is ‘pointing to’ that item – in SQL terms another row is referencing its key, and that reference is enforced by a foreign key constraint – then you cannot delete the row while it is being referenced by another table. You need to remove the foreign key references to the key value of that row; in our case that means deleting all the rows that reference the row we wish to delete. You also need a way of handling this gracefully so the user is not faced with a ‘There was an Error’ message.

In GenericServices I have used a layered approach. You can add your own code to check before you delete, but by default the GenericServices Delete method relies on a special .SaveChangesWithValidation() method which catches certain SqlExceptions. In this case the SQL error number is 547, which is a constraint error triggered on an attempt to delete a row that another row is relying on. I describe this approach in a previous Simple-Talk article called Catching Bad Data in Entity Framework.
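
Stripped right down, the idea is something like the sketch below. This is not the actual GenericServices code, just an illustration of spotting SQL error 547 and turning it into a friendly message (DbUpdateException comes from System.Data.Entity.Infrastructure and SqlException from System.Data.SqlClient):

string errorMessage = null;
try
{
    db.CustomerAddresses.Remove(customerAddressToDelete);
    db.SaveChanges();
}
catch (DbUpdateException ex)
{
    var sqlError = ex.GetBaseException() as SqlException;
    if (sqlError != null && sqlError.Number == 547)
    {
        //a foreign key constraint stopped the delete, so report something friendly
        errorMessage = "This entry is used elsewhere and cannot be deleted.";
    }
    else
    {
        throw;    //not a constraint problem, so let it bubble up
    }
}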

5. Properly deleting rows and their relationships

The other delete issue is the opposite of the previous one. What happens if I want to delete a database row that has a relationship to other database entries? In cases where the relationship is only between these two items then the lower relationship is often set as ‘cascade on delete’, so deleting the main row will cause its child relationship to be deleted. However there can be complex cases where you have to handle this yourself. That is true of the CustomerAddress and its Address listed in the last section.

When I delete a CustomerAddress I most likely want to delete the Address entry that it points to. I say ‘most likely’ as it is quite possible that the Address was used in an old Sales Order as the delivery or billing address. Because of this dual use there is no cascade delete on the Address, so we need to do this by hand. In GenericServices I added a DeleteWithRelationships(Func<>) method which allows a small helper Func<> to be written by the developer that looks after any relationship deletion within the same SQL commit as the primary delete. That way if the delete of the Address entry fails because it is used as a delivery or billing address then the delete of the CustomerAddress is rolled back. The system also provides a rather generic, but friendly, message.
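
To give a feel for it, a call looks something like the sketch below. The exact signatures of DeleteWithRelationships and the status class are best checked against the GenericServices Wiki; the delegate body and key values here are illustrative only:

var service = new DeleteService(db);
var status = service.DeleteWithRelationships<CustomerAddress>(
    (context, customerAddress) =>
    {
        //delete the Address this CustomerAddress points to, inside the same commit;
        //if the Address is still referenced, e.g. by an old order, the constraint
        //error rolls back the delete of both rows
        context.Set<Address>().Remove(customerAddress.Address);
        return SuccessOrErrors.Success("Removed the linked Address.");
    },
    customerId, addressId);    //the composite key of the CustomerAddress to delete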

 

Any down sides to this approach?

It is always good to think of what could be done better – that is how I got to write GenericServices. Looking at the approach outlined above I would make these comments:

  1. The proposed architecture is very reliant on LINQ and its conversion to database commands. I don’t think this is a major problem, but it does limit you to data providers that handle LINQ. Here are some providers that do:
    1. Entity Framework (EF) is acknowledged as having the best LINQ-to-SQL conversion. EF currently only supports Microsoft SQL Server, although the EF v7 roadmap includes support for other database types.
    2. NHibernate, MySQL and other database access providers have LINQ-to-database converters and between them support a lot of database types. However I cannot vouch for the depth of their implementations.
    3. You can write your own LINQ data source provider already, and it is said to become easier in EF v7, which is out ‘soon’.
  2. LINQ/IQueryable is not easily serializable, so a system that has separated systems talking over some message bus would not work. However current scaling practice, see Azure’s How to Scale an Application, is to run multiple instances of the same application which works for this approach.
  3. Some people say that passing IQueryable right up to the Presentation Layer is an anti-pattern. The main concern is that used wrongly a developer can add a .ToList() to the end of a query and load thousands of rows of data. That is a concern, which is why I make the Presentation layer calls very simple and standard. My long term aim is to scaffold most of Presentation Layer, but I need to do more projects before I really know what is needed.
  4. Similar to the previous point, in untrained hands software developers can write really bad queries that hit performance issues. This is true of nearly every ORM (Object-Relational Mapping) system, but EF is big and it has taken me years to really understand what EF is doing under the hood. However the software development part of a project is likely to be the biggest cost, and a good ORM, plus training, is likely to reduce timescales.

Finally I should say that I describe a Layered Architecture but for really big systems this may not scale enough. However the overall approach of using LINQ and separating different parts of the processing of the query would be valid in other architectures like CQRS (Command and Query Responsibility Segregation) architecture or within each service in a MicroService architecture. I would recommend looking at the pdf of an excellent book called ‘Microsoft .NET: Architecting Applications for the Enterprise’ for a detailed discussion of other approaches.

Conclusions

Getting EF and GenericServices to work with the AdventureWorks Lite database was very revealing. Some things worked really well first time and some needed quite a bit of fiddling around. As a result of this I changed the implementation of GenericServices to better handle these more complex cases.

Overall it took about two weeks to build a web application which included List, Show Detail, Create, Update and Delete of Customers and Orders, plus adding/editing line items in an order, plus Product listing and lookup. Compared to hand-coding a web application like that, I think that is very fast, so I am pleased.

I hope you have found this article useful in describing an approach to building complex, database driven web applications. Hopefully it has suggested some approaches that may well save you time in the future.

Happy coding!

Entity Framework: working with an existing database

Last Updated: December 9, 2014 | Created: November 1, 2014
Quick Summary
This two-part article looks at the software issues raised when presented with a database that was designed and built prior to software development. Part 1 (this article) explores the low-level problems of connecting to that database using Entity Framework. Part 2 looks at the high-level problems of transforming the EF data into a form suitable for display via an ASP.NET MVC web site.

Part 1: Using Entity Framework with an existing database

In many large-scale projects software developers often have to work with existing SQL Server databases with predefined tables and relationships. The problem can be that some predefined databases have aspects that are not so easy to deal with from the software side.

This is a two part article following my experiences of putting together a web application using the AdventureWorksLT2012 database, which is a cut-down version of the larger AdventureWorks OLTP database. I am using Microsoft’s ASP.NET MVC5 (MVC) with the proprietary Kendo UI (KendoUI) package for the UI/presentation layer. The two parts to the article are:

  1. PART 1 (this article). The best way to get Microsoft’s Entity Framework version 6 (EF) data access tool to import and access the database.
  2. PART 2: (this link) This looks at the higher-level functions needed to transform that data into a user-focused display, in this case using the GenericServices open-source library to build a service/application layer that connects to a modern web front end.

Let us start on…

Using Entity Framework with an existing database

Entity Framework 6 has a number of features to make working with existing databases fairly straightforward. In this article I detail the steps I needed to take on the EF side to build a fully featured web application to work with the AdventureWorks database. At the end I also mention some other techniques that I didn’t need for AdventureWorks, but I have needed on other databases. The aim is to show how you can use EF with pre-existing databases, including ones that need direct access to T-SQL commands and/or Stored Procedures.

1. Creating the Entity Framework Classes from the existing database

Entity Framework has a well documented approach, called reverse engineering, to create the EF Entity Classes and DbContext from an existing database, which you can read here. This produces data classes with various Data Annotations to set some of the properties, such as string length and nullability (see below), plus a DbContext with an OnModelCreating method to set up the various relationships.

namespace DataLayer.GeneratedEf
{
    [Table("SalesLT.Customer")]
    public partial class Customer
    {
        public Customer()
        {
            CustomerAddresses = new HashSet<CustomerAddress>();
            SalesOrderHeaders = new HashSet<SalesOrderHeader>();
        }

        public int CustomerID { get; set; }

        public bool NameStyle { get; set; }

        [StringLength(8)]
        public string Title { get; set; }

        [Required]
        [StringLength(50)]
        public string FirstName { get; set; }

        [StringLength(50)]
        public string MiddleName { get; set; }

        //more properties left out to shorten the class...
        //Now the relationships

        public virtual ICollection<CustomerAddress> CustomerAddresses { get; set; }

        public virtual ICollection<SalesOrderHeader> SalesOrderHeaders { get; set; }
    }
}

This does a good job of building the classes. Certainly having the Data Annotations is very useful as front-end systems like MVC use these for data validation during input. However I did have a couple of problems:

  1. The default code generation template includes the `virtual` keyword on all of the relationships. This enables lazy loading, which I do not want. (see section 1.1 below)
  2. The table SalesOrderDetail has two keys: one is the SalesOrderHeaderID and one is an identity, SalesOrderDetailID. EF failed on a create and I needed to fix this. (See section 1.2 below)

I will now describe how I fixed these issues.

1.1. Removing lazy loading by altering the scaffolding of the EF classes/DbContext

As I said earlier the standard templates enable ‘lazy loading’. I have been corrected in my understanding of lazy loading by some readers. The documentation states that ‘Lazy loading is the process whereby an entity or collection of entities is automatically loaded from the database the first time that a property referring to the entity/entities is accessed’. The problem with this is it does not make for efficient SQL commands, as individual SQL SELECT commands are raised for each access to virtual relationships, which is not such a good idea for performance.

As you will see in part 2 of this article I use a technique that ‘shapes’ the SQL read to only load the individual properties or relationships I need. I therefore do not need, or want, lazy loading of relationships.
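
To illustrate what I mean by ‘shaping’, a hand-written version of such a read would use a Select projection, so the generated SELECT only touches the columns it needs and never follows the virtual relationships. GenericServices builds this projection for me from the DTO, but the idea is the same:

var customers = db.Customers
    .Select(c => new
    {
        c.CustomerID,
        c.CompanyName,
        HasBoughtBefore = c.SalesOrderHeaders.Any()   //becomes a SQL EXISTS, not a second query
    })
    .ToList();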

Now you could hand edit each generated class to remove the ‘virtual’, but what happens if (when!) the database changes? You would then reimport the database and lose all your edits, which you or your colleague might have forgotten about by then, and suddenly your whole web application slows down. No, the common rule with generated code is not to edit it. In this case the answer is to change the code generated during the creation of the classes and DbContext. This leads me on to…

Note: You can turn off lazy loading via the EF Configuration class too, but I prefer to remove the virtual keyword as it ensures that lazy loading is definitely off.
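
For reference, the configuration route mentioned in the note is a one-liner in the DbContext’s constructor (the connection string name below is illustrative, matching the generated constructor):

public AdventureWorksLt2012() : base("name=AdventureWorksLt2012")
{
    //turn off lazy loading for every query made through this context
    Configuration.LazyLoadingEnabled = false;
    //optionally stop EF creating dynamic proxies as well
    Configuration.ProxyCreationEnabled = false;
}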

The generation of the EF classes and DbContext is done using some t4 templates, referred to as scaffolding. By default it uses some internal scaffolding, but you can import the scaffolding and change it. There is a very clear explanation of how to import the scaffolding using NuGet here, so I’m not going to repeat it.

Once you have installed the EntityFramework.CodeTemplates you will find two files called Content.cs.t4 and EntityType.cs.t4, which control how the DbContext and each entity class are built respectively. Even if you aren’t familiar with t4 (a great tool) you can understand what it does – it’s a code generator and anything not surrounded by <# #> is standard text. I found the word virtual in the EntityType.cs.t4 and deleted it. I also removed virtual from the Content.cs.t4 file on the declaration of the DbSet<>.

You may want to alter the scaffolding more extensively, say to add a [Key] attribute on primary keys for some reason. All is possible, but you must dig into the .t4 code in more depth.

One warning about importing scaffolding – Visual Studio threw a nasty error message when I first tried to import using the EntityFramework.CodeTemplates scaffolding (see stackoverflow entry). It took a bit of finding, but it turns out that if you have Entity Framework Power Tools Beta 4 installed then they clash. If you have Entity Framework Power Tools installed then you need to disable it and restart Visual Studio before you can import/reverse engineer a database. I hope that gets fixed as Entity Framework Power Tools is very useful.

Note: There are two other methods to reverse engineer an existing database:

  1. EntityFramework Reverse POCO Code First Generator by Simon Hughes. This is a Visual Studio extension recommended by the EF guru, Julia Lerman, in one of her MSDN magazine articles. I haven’t tried it, but if Julia recommends it then it’s good.
  2. Entity Framework Power Tools Beta 4 can also reverse engineer a database. It’s quicker, only two clicks, but it’s less controllable. I don’t suggest you use this.

1.2 Fixing a problem with how the two keys are defined in SalesOrderDetail table

The standard definition of the SalesOrderDetail table’s key parts is as follows:

[Table("SalesLT.SalesOrderDetail")]
public partial class SalesOrderDetail
{
    [Key]
    [Column(Order = 0)]
    [DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int SalesOrderID { get; set; }

    [Key]
    [Column(Order = 1)]
    public int SalesOrderDetailID { get; set; }

    //other properties left out for clarity...
}

You can see it marks the first as not database generated, but it does not mark the second as an Identity key. This caused problems when I tried to create a new SalesOrderDetail so that I could add a line item to an order. I got the SQL error:

Cannot insert explicit value for identity column in table ‘SalesOrderDetail’ when IDENTITY_INSERT is set to OFF.

That confused me for a bit, as other two-key items had worked, such as CustomerAddress. I tried a few things, but as it looked like an EF error I tried telling EF that the SalesOrderDetailID was an Identity key by using the attribute [DatabaseGenerated(DatabaseGeneratedOption.Identity)]. That fixed it!

The best solution would be to edit the scaffolding again to always add that attribute to identity keys. That needed a bit of work, and the demo was two days away, so in the meantime I added the needed attribute using the MetadataType attribute and a ‘buddy’ class. This is a generally useful feature, so I use this example to show you how to do it in the next section.

1.3 Adding new DataAnnotations to EF generated classes

Being able to add attributes to properties in already generated classes is a generally useful thing to do. I needed it to fix the key problem (see 1.2 above), but you might want to add some DataAnnotations to help the UI/presentation layer, such as marking properties with their datatype, e.g. [DataType(DataType.Date)]. The process for doing this is given in the Example section of this link to the MetadataType attribute. I will show you my example of adding the missing Identity attribute.

The process requires me to add a partial class in another file (see section 3 later for more on this), mark it with the [MetadataType(typeof(SalesOrderDetailMetaData))] attribute, and then define the attributes I want applied to the SalesOrderDetailID property in a new class, sometimes called a ‘buddy’ class. See below:

[MetadataType(typeof(SalesOrderDetailMetaData))]
public partial class SalesOrderDetail : IModifiedEntity
{
}

public class SalesOrderDetailMetaData
{
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public int SalesOrderDetailID { get; set; }
}

The effect is to apply those attributes to the existing properties. That fixed my problem with EF creating new SalesOrderDetails properly and I was away.

2. What happens when the database changes?

Having sorted the scaffolding as discussed above, just repeat step 1, ‘Creating the Entity Framework Classes from the existing database’. There are a few things you need to do before, during and after the reimport.

  1. You should remember/copy the name of the DbContext so you use the same name when you reimport. That way it will recompile properly without major name changes.
  2. Because you are using the same name as the existing DbContext you must delete the previous DbContext, otherwise the reimporting process will fail. If it’s easier you can delete all the generated files, as they are replaced anyway. That is why I suggest you put them in a separate directory with no other files added.
  3. When reimporting by default the process will add the connection string to your App.Config file again. I suggest you un-tick that otherwise you end up with lots of connection strings (minor point, but can be confusing).
  4. If you use source control (I really recommend you do) then a quick compare of the files to check what has changed is worthwhile.

3. Adding new properties or methods to the Entity classes

What happens if I want to add some more properties or methods to a class? Clearly I can’t add properties that change the database – I would have to talk to the DBA to change the database definition and import the new database schema again. However in my case I wanted to add properties that access existing database properties to produce more useful output, or to have an intention-revealing name, like HasSalesOrders.

You can do this because the scaffolding produces ‘partial’ classes, which means I can have another file which adds to that class. To do this it must:

  1. Have the same namespace as the generated classes
  2. The class is declared as public partial <same class name>.

I recommend you put them in a different folder to the generated files. That way they will not be overwritten by accident when you recreate the generated files (note: the namespace must be the original namespace, not that of the new folder). Below I give an example where I added two calculated properties to the Customer class. Ignore for now the IModifiedEntity interface (dealt with later in this article) and the [Computed] attribute (dealt with in part 2 of this article).

public partial class Customer : IModifiedEntity
{
    [Computed]
    public string FullName { get { return Title + " " + FirstName + " " + LastName + " " + Suffix; } }

    /// <summary>
    /// This is true if any sales orders. We use this to decide if a 'Customer' has actually bought anything
    /// </summary>
    [Computed]
    public bool HasSalesOrders { get { return SalesOrderHeaders.Any(); } }
}

Note that you almost certainly will want to add to the DbContext class (I did – see section 4 below). This is also defined as a partial class so you can use the same approach. Which leads me on to…

4. Dealing with properties best dealt with at the Data Layer

In the AdventureWorks database there are two properties called ‘ModifiedDate’ and ‘rowguid’. ModifiedDate needs to be updated on create or update and the rowguid needs to be added on create.

Many databases have properties like this and they are best dealt with at the Data/Infrastructure layer. With EF this can be done by providing a partial class and overriding the SaveChanges() method to handle the specific issues your database needs. In the case of AdventureWorks I added an IModifiedEntity interface to each partial class that has the ModifiedDate and rowguid properties.

Then I added the code below to the AdventureWorksLt2012 DbContext to provide the functionality required by this database.

public partial class AdventureWorksLt2012 : IGenericServicesDbContext
{
    /// <summary>
    /// This has been overridden to handle ModifiedDate and rowguid
    /// </summary>
    /// <returns></returns>
    public override int SaveChanges()
    {
        HandleChangeTracking();
        return base.SaveChanges();
    }

    /// <summary>
    /// This handles going through all the entities that have
    /// changed and seeing if we need to do anything.
    /// </summary>
    private void HandleChangeTracking()
    {
        foreach (var entity in ChangeTracker.Entries()
           .Where(e => e.State == EntityState.Added
               || e.State == EntityState.Modified))
        {
            UpdateTrackedEntity(entity);
        }
    }

    /// <summary>
    /// Looks at everything that has changed and
    /// applies any further action if required.
    /// </summary>
    /// <param name="entityEntry"></param>
    /// <returns></returns>
    private static void UpdateTrackedEntity(DbEntityEntry entityEntry)
    {
        var trackUpdateClass = entityEntry.Entity as IModifiedEntity;
        if (trackUpdateClass == null) return;
        trackUpdateClass.ModifiedDate = DateTime.UtcNow;
        if (entityEntry.State == EntityState.Added)
            trackUpdateClass.rowguid = Guid.NewGuid();
    }
}

The IModifiedEntity interface is really simple:

//This interface is added to all the database entities
//that have a modified date and rowGuid. Save Changes uses this
// to find entities that need the date updating, or a new rowguid added
public interface IModifiedEntity
{
    DateTime ModifiedDate { get; set; }
    Guid rowguid { get; set; }
}

5. Using SQL Stored Procedures

Some databases rely on SQL Stored Procedures (SPs) for insert, update and delete of rows in a table. AdventureWorksLT2012 did not, but if you need to, EF 6 has added a neat way of linking to stored procedures. It’s not trivial, but you can find good information here on how to do it.
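
In outline the mapping is declared in OnModelCreating. The sketch below uses EF 6’s default naming convention, which expects stored procedures called Customer_Insert, Customer_Update and Customer_Delete; the names and parameters can be overridden to match existing SPs:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    //use stored procedures for insert, update and delete of Customer rows
    modelBuilder.Entity<Customer>().MapToStoredProcedures();
}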

Clearly if the database needs SPs for CUD (Create, Update and Delete) actions then you need to use them. However using EF’s CUD actions is easier from the software point of view, and EF’s CUD actions have some nice features. For instance EF has an in-memory copy of the original values and uses this for working out what has changed.

The benefit is that the EF updates are efficient – you update one property and only that cell in the row is updated. The more subtle benefit is in tracking changes and handling SQL security, i.e. if you use SQL column-level security (Grant/Deny) then, as long as the property is unchanged, the update does not touch that column and so does not trigger a security violation. This is a bit of an esoteric feature, but I have used it and it works well.
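
As a quick illustration of that: if I read a Customer, change just the Title and call SaveChanges(), the change tracker compares against the original values and the UPDATE statement only sets the Title column (plus whatever my SaveChanges override touches, such as ModifiedDate):

var customer = db.Customers.Find(customerId);
customer.Title = "Dr";     //the only property that has changed
db.SaveChanges();          //generates an UPDATE that sets Title (and ModifiedDate), nothing else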

6. Other things you could do

This is all I had to do to get EF to work with an existing database, but there are other things I have had to use in the past. Here is a quick run through of other items:

6a. Using Direct SQL commands

Sometimes it makes sense to bypass EF and use a SQL command, and EF has all the commands to allow you to do this. The EF documentation has a page on this here which gives a reasonable overview, but I recommend Julia Lerman’s book  ‘Programming Entity Framework: DbContext’ which goes into this in more detail (note: this book is very useful but it covers an earlier version of EF so misses some of the latest commands like the use of SPs in Insert, Update and Delete).

For certain types of reads SQL makes a lot of sense. For instance in my GenericSecurity library I need to read the current SQL security setup (see below). I think you will agree it makes a lot of sense to do this with a direct SQL read rather than defining multiple data classes just to build the command.

var allUsers = db.Database.SqlQuery<SqlUserAndRoleRow>(
@"select mp.name as UserName, rp.name as RoleName, mp.type as UserType
from sys.database_role_members drm
join sys.database_principals rp on (drm.role_principal_id = rp.principal_id)
join sys.database_principals mp on (drm.member_principal_id = mp.principal_id)
ORDER BY UserName");

For SQL commands such as create, update and delete it is less obvious, but I have used it in some cases. For these you use the ExecuteSqlCommand method; see the example based on Microsoft’s documentation below:

using (var context = new BloggingContext())
{
    context.Database.ExecuteSqlCommand(
        "UPDATE dbo.Blogs SET Name = 'Another Name' WHERE BlogId = 1");
}

Neither of these examples had parameters, but if you need them the SqlQuery and ExecuteSqlCommand methods can take parameters, which are passed as real SQL parameters and so protect against SQL injection attacks. The Database.SqlQuery method documentation shows this.
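
For example, both methods take positional parameters which are sent to SQL Server as parameters rather than being concatenated into the command text (a sketch, reusing the Blogs table from the Microsoft example above):

var newName = "Another Name";   //could come from user input
context.Database.ExecuteSqlCommand(
    "UPDATE dbo.Blogs SET Name = @p0 WHERE BlogId = @p1",
    newName, 1);

var blogs = context.Database.SqlQuery<Blog>(
    "SELECT * FROM dbo.Blogs WHERE Name = @p0", newName).ToList();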

One warning on SqlCommands. Once you have run a SqlCommand then EF’s view of the database, some of which is held in memory, is out of date. If you are going to close/dispose of the DbContext straight away then that isn’t a problem. However if the command is followed by other EF accesses, read or write, then you should use the EF ‘Reload’ command to get EF back on track. See my stackoverflow answer here for more on this.
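
The Reload itself is a one-liner on the entity’s entry (a sketch, assuming blog is an entity the context is already tracking):

//bring the tracked entity back into line with what is now in the database
db.Entry(blog).Reload();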

6b. SQL Transaction control

When using EF to do any database updates via the .SaveChanges() method, all the changes are done in one transaction, i.e. if one fails then none of the updates are committed. However if you are using raw SQL updates, or a combination of EF and SQL updates, you may well need these to be done in one transaction. Thankfully EF version 6 introduced commands to allow you to control transactions.

I used these commands in my EF code to work with SQL security. I wanted to execute a set of SQL commands to set up SQL Security roles and grant/deny access, but if any one failed I wanted to roll back. The code to execute a sequence of SQL commands and roll back if any single command fails is given below:

using (var dbContextTransaction = db.Database.BeginTransaction())
{
    try
    {
        foreach (var text in sqlCommands)
        {
            db.Database.ExecuteSqlCommand(text);
        }
        dbContextTransaction.Commit();
    }
    catch (Exception ex)
    {
        dbContextTransaction.Rollback();
        //report the error in some way
    }
}

You can also use the same commands in a mixed SQL commands and EF commands. See this EF documentation for an example of that.

Conclusion

There were a few issues to sort out but all of them were fixable. Overall getting EF to work with an existing database was fairly straightforward, once you know how. The problem I had with multiple keys (see section 1.2) was nasty, but now I, and you, know about it we can handle it in the future.

I think the AdventureWorks Lite database is complex enough to be a challenge: with lots of relationships, composite primary keys, computed columns, nullable properties etc. Therefore getting EF to work with AdventureWorks is a good test of EF’s capability to work with existing SQL databases. While the AdventureWorks Lite database did not need any raw SQL queries or Stored Procedures, other projects of mine have used these, and I have mentioned some of these features at the end of the article to complete the picture.

In fact version 6 of EF added a significant amount of extra features and commands to make mixed EF/SQL access very possible. The more I dig into things the more goodies I find in EF 6. For instance EF 6 brought in Retry logic for Azure, Handling transaction commit failures, SQL transaction control, improved sharing of connections between SQL and EF, plus a number of other things. Have a good look around the EF documentation – there is a lot there.

So, no need to hold back on using Entity Framework on your next project that has to work with an existing SQL database. You can use it in a major role as I did, or now you have good connection sharing just use it for the simple CRUD cases that do not need heavy T-SQL methods. Happy coding.

Now read Part 2, which looks at the data-to-user-interface side.