Handling Entity Framework database migrations in production – part 1, applying the updates

Last Updated: July 31, 2020 | Created: November 4, 2015

This is the first article in a series about using Microsoft’s data access technology, Entity Framework (EF), in an application that needs to be ‘always on’ (in my case an ASP.NET MVC e-commerce web site) yet is still being developed, so there is a need to update the database schema, i.e. the tables, views, constraints etc. that define how the data is held in the database.

This is a series, and in this first article I deal with the issue of updating the database schema in a robust and testable way.

However, before detailing my update method I will start by describing the general problems around a database update.

Why production database updates are so challenging

Databases contain data, and that data is often very valuable to someone. In the e-commerce site I am working on, the database holds customer orders which customers have paid for and expect to be delivered. If we want the business to succeed, that data had better not be lost or corrupted.

The problem is that as the business grows we are likely to need to change the type/format of data we hold in the database, known as the database schema. Simple changes, like adding a column to hold say a comment, are normally easy to make, as the modified table will still work with the existing software. However more complex additions, say a new table which links to existing data, i.e. in SQL terms it has a foreign key, are a lot more difficult to do.

Updates also become especially challenging if you want the existing web site to stay ‘live’ while you change the database, i.e. the old software still works with the new database while you are updating its schema and data.

In my case the application is an e-commerce web site using ASP.NET MVC with EF and is running on Microsoft Azure. In the diagram below we see a typical ‘live’ deployment via Azure’s staging slot.

Azure staging-live diagram

The point about this arrangement is that the staging slot holds the new version of the software, which is accessing the same database as the live site. When the new version checks out, the DevOps person swaps the staging and live slots. The users should see no interruption to their use of the site.

The challenge is if the new software needs a change to the database schema the database must be updated while the existing version of the software is still running. Also the database must support both the old and the new software until the swap is done. That is quite a challenge.

Note: You can read more about the challenges of deploying and updating databases, and the Simple-Talk web site has a whole section on Database Lifecycle Management.

My requirements for the database deployment process

After a lot of research I came up with a list of requirements for a system that deploys a database. I decided the process must be:

  • Testable: I can test any database change before running it on the Production database.
  • Automated: I can automate the whole process so that I can’t get it wrong.
  • Trackable: Each database should have a log of what has been done to its schema.
  • Atomic: The update process must either complete successfully or be entirely rolled back.
  • Recoverable: Each update should automatically make a backup in case the worst happens.
  • EF-compatible: The database changes have to work with Entity Framework.
  • Comprehensive: There must be full SQL access; I needed to change tables, views, stored procedures, constraints etc.

Why I decided EF’s standard migration feature was not sufficient

I am very familiar with EF’s database migration feature and I have used it in the early stages of developing an application. However, it’s this knowledge that made me decide it wasn’t sufficient for this project. The primary reason is that it is not very testable, i.e. it is hard to run the update on a test database. A secondary reason is that I had come across problems when applying migrations to a Production system: the update runs as part of the application start-up and feedback of errors is very limited. There is a way to run EF migrations from a command line, which does improve the process, but the problem of testability still remains.

All in all I just didn’t feel comfortable with using EF’s own database migration feature, so I set out to find a better way. After a lot of searching I found a package called DbUp.

Note: I talk more about how EF matches its view of the database with the actual schema of the database in the second article.

Database Migrations the DbUp way

DbUp is an open-source, .NET-based tool for running scripts on a database. DbUp is elegantly simple, and uses an “Apply These Scripts…” approach, i.e. you write a list of scripts, SQL or C# based, with a name that sets the order, e.g. Script001, Script002 etc. You can call DbUp via a Console App or directly from PowerShell. It uses a special table in the database to see which scripts have been run and runs any scripts that are new.

Here is an example of using DbUp from my application:

var upgrader = DeployChanges.To
        .SqlDatabase(dbConnectionString)
        .WithScriptsAndCodeEmbeddedInAssembly(Assembly.GetExecutingAssembly())
        .WithTransaction()
        .LogToConsole()
        .Build();

var result = upgrader.PerformUpgrade();

if (result.Successful)
{
    //all the new scripts ran successfully
}
else
{
    //result.Error holds the exception that stopped the update
}

Looks simple, but there are lots of little things in DbUp that show someone has really thought the process through. I have to admit I found the documentation a little obscure in places, but that is most likely because I was thinking in an EF way. Once I got my head around it I started to appreciate the quality of DbUp.

A snippet from one of my test T-SQL scripts looks like this:

create table $schema$.[DataTop](
	[DataTopId] [int] IDENTITY(1,1) PRIMARY KEY,
	[MyString] [nvarchar](25) NULL,
	[DataSingletonId] [int] NULL,
)
go

create table $schema$.[DataChild](
	[DataChildId] [int] IDENTITY(1,1) PRIMARY KEY,
	[MyInt] [int] NOT NULL,
	[MyString] [varchar](max) NULL,
	[DataTopId] [int] NOT NULL
)
GO
ALTER TABLE $schema$.[DataChild]  
ADD  CONSTRAINT [FK_dbo.DataChild_dbo.DataTop_DataTopId] FOREIGN KEY([DataTopId])
REFERENCES $schema$.[DataTop] ([DataTopId])
ON DELETE CASCADE
GO

A couple of pointers for people that are used to EF Migrations:

  1. DbUp only does forward changes, i.e. there is no DOWN script like in EF migrations. The philosophy of DbUp is that at each stage you are transitioning the database to the new state, so going back is simply another transition.
    Considering that I have only used an EF DOWN script about once in four years, I don’t think this is a problem.
  2. There is no ‘run the seed every time’ approach like EF’s. When you think about it, you run the seed once at the start, so why would you need to run it again? In DbUp, if you want to change the seed data you just add a new script that updates or adds to the original seed.

Note: DbUp does have a way of running a script every time if you want to, see DbUp’s NullJournal.
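For example, a second runner dedicated to scripts that should run on every deploy might look something like the sketch below. The ‘EveryTime’ naming convention is mine, not DbUp’s, and NullJournal lives in the DbUp.Helpers namespace.

//A second runner just for 'run every time' scripts. Journalling to NullJournal
//means DbUp keeps no record of them, so they are re-run on every deploy.
var everyTimeRunner = DeployChanges.To
        .SqlDatabase(dbConnectionString)
        .WithScriptsEmbeddedInAssembly(
            Assembly.GetExecutingAssembly(),
            name => name.Contains("EveryTime")) //my own naming convention
        .JournalTo(new NullJournal())
        .LogToConsole()
        .Build();

var everyTimeResult = everyTimeRunner.PerformUpgrade();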

One downside of using SQL scripts to apply database changes is that, unless the project has a DBA (Database Administrator), the developer will need to learn T-SQL. While I am not a DBA I have written quite a bit of T-SQL in my time. In fact I personally prefer T-SQL for configuring a database, as I find using EF’s fluent interface can be quite difficult in some situations.

Conclusion

In this first article I have described the issue I have with EF’s own database migration code and shown an alternative method of applying updates to the schema of a SQL database using DbUp. However, on its own this will not work, as EF will not know you have changed things.

In the second article I will look at a method I have created to make sure that the view of the database schema that EF has does not diverge from the actual database schema.

Happy coding!

When is it not worth writing a Unit Test?

Last Updated: July 31, 2020 | Created: October 1, 2015

I’m currently working on an e-commerce site based on ASP.NET MVC 5 for a new venture (in stealth mode, so I can’t say what it is). I have been working on it for over six months and we set a target date to get a new Beta2 version of the site up by the end of September (yesterday).

That puts a bit of pressure on development and I had a couple of interesting decisions around Unit Testing that I thought I would share.

I also include some quotes from one of the giants of Software Design, Martin Fowler, and at the end I give some criteria to help you decide what to test, and maybe what you might not test.

Setting the scene

I expect everybody who comes to this article will know about Unit Testing (here is link to a definition in case you are new to this). I also expect you have an opinion, ranging from “you need to have 100% test coverage” through to “Unit Testing is a waste of valuable development time”.

All my articles are based on my real-life experiences. In this one I found, just before the deadline, that a small piece of code took maybe 10 minutes to write and over an hour to Unit Test. On the other hand, I realised that two days earlier I had written another piece of code which took about the same time to write, but I didn’t write any Unit Tests for it.

On reflection I decided that both decisions were correct. In this article I go through my reasoning to see why I think this way. You may disagree with my reasons (please do!), but hopefully I will make you think.

Before I start let me give you my first quote from Martin Fowler. If you read my history you will see I came back to programming in 2009 after a 21 year break. I gobbled up all the reading I could find, and Martin Fowler was a great help in putting me on a better track.

It is better to write and run incomplete tests than not to run complete tests.

Martin Fowler in the book ‘Refactoring’

1. Code I tested

Below is the code I wrote in about 10 minutes, which then took over an hour to test. To save you reading it, the name of the method says what it does: if the justification of a field is changed, say from Left to Centre, then it changes the x position to match.

private void HandleChangeOfJustifyAndCompensatePosition(
    FieldJson fieldJson, CrudAdvancedStylesDto dto)
{
    if (fieldJson.styleInfo.justify != dto.Justify)
    {
        //The justification has changed, so we compensate the position

        if (fieldJson.styleInfo.rotation != 0 || dto.Rotation != 0)
            throw new NotImplementedException(
            "HandleChangeOfJustifyAndCompensatePosition does not currently support rotation.");
        if (fieldJson.styleInfo.shear != 0 || dto.Shear != 0)
            throw new NotImplementedException(
            "HandleChangeOfJustifyAndCompensatePosition does not currently support shear.");

        var leftX = 0.0;
        switch (fieldJson.styleInfo.justify)
        {
            case JustifyOptions.Left:
                leftX = fieldJson.xPosMm;
                break;
            case JustifyOptions.Center:
                leftX = fieldJson.xPosMm - (fieldJson.widthmm/2);
                break;
            case JustifyOptions.Right:
                leftX = fieldJson.xPosMm - fieldJson.widthmm;
                break;
            default:
                throw new ArgumentOutOfRangeException();
        }

        switch (dto.Justify)
        {
            case JustifyOptions.Left:
                fieldJson.xPosMm = leftX;
                break;
            case JustifyOptions.Center:
                fieldJson.xPosMm = leftX + (fieldJson.widthmm / 2);
                break;
            case JustifyOptions.Right:
                fieldJson.xPosMm = leftX + fieldJson.widthmm;
                break;
            default:
                throw new ArgumentOutOfRangeException();
        }
    }
}

Now, you might ask, why did it take me so long to write a test for such a small method? Well, it is a private method so I can only get to it via the business logic that uses it. (Note: I really don’t want to change the natural design of the software just to make this method easier to test. I have found that leads to all sorts of other problems.)

So, why did the tests take so long? Here is a quick summary:

  • The new method was added to an existing piece of business logic that already had quite a few tests, so it wasn’t a green field site.
  • The previous tests needed a significant amount of set-up, especially of the database, to run. These needed changing.
  • Because the original tests had been written some time ago they didn’t use the new database mock I had developed, which would help with the new tests. I therefore decided to move all the tests in this group over to the mock database.
  • The Mock database didn’t have some of the features I needed so I had to add them.
  • … I could go on, but I am sure you get the picture.

This leads me onto a key point.

I find 80%+ of the work of writing Unit Tests is in getting the right set-up to allow you to exercise the feature you want to test.

So let me explore that.

Set up of the test environment

In any significant software development there is normally some sort of data hierarchy that can be quite complex. Often, to test say one bit of business logic, you need to set up other parts of the system that the business logic relies on. When that data is stored in a database as well, setting up the right environment becomes really complex.

This might sound wrong for Unit Testing, which is about isolating sections of code and testing just that, but I find in the real world the lines are blurred. In the example code above I really don’t want to make a separate class, which I inject at runtime, just to isolate that method for testing. IMHO that leads to bad design.

Over the years I have tried lots of approaches and here is what I do:

1. Write ‘Set-up Helpers’ to create data for Unit Tests.

My test code has a Helpers directory with a range of Helper classes (27 so far in this six-month+ project) which create data suitable for testing specific parts of the system. In the past I skimped on these, but I soon learnt that was counter-productive.

Most build the data by hand, but a few load it from json or csv files held in the Test project.
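To give a feel for what these helpers look like, here is a cut-down, hypothetical example. The entity names are made up for this post, but the shape is typical: the helper returns fully saved data so the test that calls it can get straight to the code under test.

public static class OrderTestHelper
{
    //Creates and saves an Order with the given number of line items.
    //Returning the saved entity gives the caller access to the primary keys.
    public static Order CreateOrderWithLineItems(MyDbContext db, int numLineItems)
    {
        var order = new Order
        {
            CustomerReference = "Unit test order",
            DateCreatedUtc = DateTime.UtcNow,
            LineItems = new List<LineItem>()
        };
        for (var i = 0; i < numLineItems; i++)
            order.LineItems.Add(new LineItem { Quantity = i + 1 });

        db.Orders.Add(order);
        db.SaveChanges();   //write it out so foreign keys are in place
        return order;
    }
}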

2. Use mocks where you can.

Mocks, or fakes, are pieces of code that replace a more complex subsystem or class. Mostly they replace a system library, but sometimes they replace some of your own code so you can trigger things that the normal code would rarely do, e.g. error conditions.

Mocks/fakes are great and I use them as much as I can. They can provide really good control of what goes on for a small amount of work. I personally use Moq, which I find good.

However there are limits to replicating all the features of a big system like the Entity Framework ORM (EF), or some of the MVC Controller systems. I do have a mock for Entity Framework, which is pretty good, but it doesn’t do relational fix-up and currently doesn’t support .Include().
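As an aside, the amount of code a mock needs is tiny. A minimal Moq sketch, with made-up interface and class names, looks like this:

//Replace one of my own interfaces so the test can force an error path
//that the real implementation would rarely take.
var mockEmailSender = new Mock<IEmailSender>();
mockEmailSender
    .Setup(x => x.Send(It.IsAny<string>(), It.IsAny<string>()))
    .Throws(new InvalidOperationException("SMTP server down"));

//inject the mock into the class under test and check it fails softly
var confirmer = new OrderConfirmer(mockEmailSender.Object);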

3. Decide on how you will use a real database

I have found out the hard way that I need to use a real database for testing. The downside is that the tests take a lot more time to start (EF6 has a fairly long start-up time). However I have found it is, up to now (see the comment at the end of this section), the only way to truly check some code out.

There are a number of ways you can go, from clearing and reloading the database for every test through to just adding to the database. I have found clearing and reloading pretty difficult/time consuming, especially with complex foreign keys, so I use unique strings, via GUIDs, in strategic places to create a unique set of entries for each test.

Whatever way you go you need to think through a system and keep to it, otherwise things get messy.
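The GUID trick is as simple as it sounds: each test puts a GUID into a strategic string property so its rows cannot clash with, or be confused with, rows from any other test. A sketch, with illustrative names:

var uniqueName = Guid.NewGuid().ToString("N");   //unique per test
db.Designers.Add(new Designer { Name = uniqueName });
db.SaveChanges();

//... run the code under test, then assert only on the rows this test created
var myDesigners = db.Designers.Where(x => x.Name == uniqueName).ToList();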

Possible future options on EF?

  • Two separate readers of this blog have recommended the Effort library, which provides an in-memory EF database. That sounds like a great idea, but once I am in a project I try not to make big changes to the system, so I haven’t tried it yet, although EF 7 may be better (see next).
  • Entity Framework 7 will have an in-memory database, which sounds perfect for Unit Testing. I definitely plan to look at that.

Another Martin Fowler quote:

I’ve often read books on testing, and my reaction has been to shy away from the mountain of stuff I have to do to test. This is counterproductive…

Martin Fowler in the book ‘Refactoring’

The Unit Test framework has a BIG effect

I have found that the Unit Test framework is crucial. I started using NUnit many years ago and found it great, especially when run from inside Visual Studio via Resharper. However, when I started testing JavaScript I found other Unit Testing frameworks had some really useful features.

I will only summarise my thoughts on this because I wrote a long article called Reflections on Unit Testing in C# and JavaScript, which gives more detail. However let me give you the places where NUnit skews my Unit Testing approach.

NUnit does not have nested set-up calls

As I said earlier, I find most of the work goes into setting things up to run a test. NUnit has FixtureSetup/FixtureTearDown, which run once at the start/end of a group of tests in a class, and Setup/TearDown, which run before/after each test. These are useful, but not sufficient.
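For anyone not familiar with NUnit, these set-up methods are flat, i.e. one level only per test class. The attribute names below are the NUnit 2.x ones I use:

[TestFixture]
public class MyBusinessLogicTests
{
    [TestFixtureSetUp]       //runs once, before any test in this class
    public void FixtureSetup() { /* expensive shared set-up, e.g. the database */ }

    [SetUp]                  //runs before every test
    public void Setup() { /* per-test set-up */ }

    [Test]
    public void CheckSomething() { /* ... */ }

    [TearDown]               //runs after every test
    public void TearDown() { }

    [TestFixtureTearDown]    //runs once, after all tests in this class
    public void FixtureTearDown() { }
}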

JavaScript Unit Test frameworks, like Mocha and Jasmine, have a BeforeEach and AfterEach, which can be nested. Now why is that useful? I find that for some of the more complex methods I want to check multiple things when the method under test has finished, e.g. was it successful, did it return the right data and did it change the database in the way I expected.

In Mocha and Jasmine I would have a group of tests with maybe an outer BeforeEach that sets up the environment and then an inner BeforeEach which calls the method, followed by separate tests for each of the parts I want to check. This means that a) the setup code is in one place, b) the call to the item under test is in one place and c) the tests are separately shown.

In NUnit I tend to find I either duplicate code or test lots of things in one test. (Note: if this doesn’t make sense then there is an example of nested set-ups in my article Reflections on Unit Testing in C# and JavaScript.)

Possible future options:

One of my readers,

NUnit does not support async Setup/TearDowns

.NET 4.5’s async/await is a big change – see my full article on async/await for more. I find I use async/await in 90%+ of my database calls and maybe 80% of my business logic. NUnit supports async tests, but not async FixtureSetup/FixtureTearDown or Setup/TearDown methods. That causes me to have to put more setup code in the actual tests.

2. The code I didn’t test

We have been off on a journey into why testing can be difficult. Now I’m back to the decision whether or not to Unit Test. Below is a fairly simple piece of business logic. It most likely took a bit longer to write than the first bit of code, as there was a DTO class to set up and the normal interface plumbing. What it does is find a customer order connected to something called a FOL.

public class GetFilledOutLabelLinksAsync : IGetFilledOutLabelLinksAsync
{
    private readonly IGenericActionsDbContext _db;
    private readonly IGenericLogger _logger;

    public GetFilledOutLabelLinksAsync(IGenericActionsDbContext db)
    {
        _db = db;
        _logger = GenericLibsBaseConfig.GetLogger(GetType().Name);
    }

    /// <summary>
    /// This returns information on the FOL and any associated order
    /// </summary>
    /// <param name="folId"></param>
    /// <returns></returns>
    public async Task<ISuccessOrErrors<FolAssociatedData>>
         RunActionAsync (int folId)
    {
        var status = new SuccessOrErrors<FolAssociatedData>();

        var fol = await _db.Set<FilledOutLabel>()
              .SingleOrDefaultAsync(x => x.FilledOutLabelId == folId);
        if (fol == null)
        {
            //very odd, but we fail softly
            _logger.WarnFormat(
                 "Could not find the FOL of id {0}", folId);
            return status.AddSingleError(
                  "Could not find the Label");
        }

        //Note: a FOL should only be associated with one LabelOrder,
        //but we have written this in a 'safe' way
        var associatedOrder = await _db.Set<LabelOrder>()
            .Include(x => x.Order)
            .OrderByDescending(x => x.Order.DateCreatedUtc)
            .FirstOrDefaultAsync(x => x.FilledOutLabelGuid == fol.GroupBy);
            
        return status.SetSuccessWithResult(
             new FolAssociatedData(fol, associatedOrder), "Got data");
    }
}

So why didn’t I write a Unit Test for this, and more importantly why am I happy with that decision? Well, let me start my list of decision points with another quote from Martin Fowler.

 The style I follow (for Unit Testing) is to look at all the things the class should do and test each one for any condition that might cause a class to fail. This is not the same as “test every public method”, which some programmers advocate.

Testing should be risk driven; remember, you are trying to find bugs now or in the future.

Martin Fowler in the book ‘Refactoring’

My Criteria for Testing/Not Testing

I don’t have any hard and fast rules for Unit Testing, but over the years I have found, sometimes the hard way, what works for me. Here are some guiding principles, starting with the most important, that I use to decide if I need to test something.

Will it fail silently?

If the code could ‘fail’ silently, i.e. it goes wrong but doesn’t throw an Exception, then it needs watching! Any method that does maths, manipulates strings etc. can get the wrong answer, and you need to check it because it could cause problems without you even knowing.

What is the cost of it going wrong?

It is a risk/reward thing. The more important/central the piece of code is to your application then the more you should consider Unit Testing it. (I can tell you that the payment side of my application is covered with Unit Tests!)

Will it change in the future?

If some code is likely to be changed/refactored in the future then adding Unit Tests makes sure it doesn’t get broken by accident. I love Unit Tests when it comes to Refactoring – I can change vast chunks of code and the Unit Tests will check I didn’t break it in the process.

Does it have lots of conditional paths?

Methods with lots of conditional paths or loops are complex by nature. The logic could be wrong and it’s hard to test all the paths with functional testing. Unit Tests allow you to exercise each path and check that the logic is correct.

Does it have special error handling?

If a path through a method is only triggered on errors then Unit Testing may be the only way to check it out. If you added error handling then it must be important, so maybe it should be Unit Tested.

I’m sure there are more, but I’ll stop there. You can always leave a comment and tell me what I have missed.

I should say there are some negative points, mainly around how hard it is to test. For me this isn’t the deciding factor, but it does play into the risk/reward part.

Final quote from Martin Fowler.

You get many benefits from testing even if you do a little testing. The key is to test the areas that you are most worried about going wrong. That way you get the most benefit for your testing effort.

Martin Fowler in the book ‘Refactoring’

Conclusion

So, if you look at the criteria and then look back at the two pieces of code, you will see that the code I tested a) could fail silently, b) will change in the future (see the NotImplementedExceptions), and c) had lots of conditional paths. It was hard work writing the tests but it was a no-brainer decision. (Note: I did find one bug around the tests for the not implemented ‘rotate’ and ‘shear’ checks.)

The second piece of code wouldn’t fail silently, is unlikely to change and has one error path and one success path. You might also note the defensive style of coding as this is about a potential order a customer wants to make, so I don’t want it going wrong. Time will tell if that was a good decision or not, but I am happy for now.

Please feel free to add your own comments.

GenericServices Masterclass: a deeper look at deleting entries

Last Updated: July 31, 2020 | Created: June 23, 2015

I have been building a new application using GenericServices and I thought it would be useful to others to reflect on the approaches I have used when working with more complex data. In this second master class article I am looking at deleting entries, i.e. data rows, in a SQL Server database using Entity Framework. The first master class article looked at creating and updating entries and can be found here.

I wrote this article because the more I used GenericServices’ IDeleteService in real applications, the more I ended up using the more complex DeleteWithRelationships version of the command. This article says why, and suggests some ways of tackling things, including not deleting at all!

What is GenericServices?

For those of you who are not familiar with GenericServices it is an open-source .NET library designed to simplify the interface between an ASP.NET MVC application and a database handled by Microsoft’s Entity Framework (EF) data access technology. You can read more about it at https://github.com/JonPSmith/GenericServices where the Readme file has links to two example web sites. The GenericServices’ Wiki also has lots more information about the library.

While this article talks about GenericServices I think anyone who uses EF might find it useful as it deals with the basics behind deleting entities.

An introduction to relationships and deletion

Note: If you already understand the different types of referential relationships in a relational database and how they affect deletion then you can skip this part and go straight to the section titled ‘Deleting with GenericServices’.

On the face of it deleting rows of data, known in EF as entities, seems like a simple action. In practice, in relational databases like SQL Server, it can have far reaching consequences, some of which might not be obvious at first glance when using EF. Let me give you a simple example to show some different aspects of delete.

Below is a database diagram of the data used in the example application SampleMvcWebApp, which contains:

  1. A selection of Tags, e.g. Programming, My Opinion, Entertainment, etc. that can be applied to a Post.
  2. A list of Posts, where each is an article on some specific topic.
  3. A list of Bloggers, which are the authors of each of the Posts.

tagpostblog

Now this example shows the two most used types of relationships we can have in a relational database. They are:

  • One-to-many (1-*): A Post has one Blogger, and a Blogger can have from zero to many Posts.
  • Many-to-many (*-*): A Post can have zero to many Tags, and Tags may be used in zero to many Posts.

The other type of basic relationship is a one-to-one relationship, where one entity/table is connected to a single other entity/table. You don’t see this type of relationship so much because, if both ends are required (see below), the two can be combined. However you do see one-to-one relationships where one end is optional.

So the other aspect to the ‘one’ part of a relationship is whether it is required or optional (required: 1, optional: 0..1). An example of a required relationship is that the Post must have a Blogger, as that defines the author. An example of an optional relationship would be allowing the Blogger to add an optional ‘more about the author’ entry in another table. The author can choose to set up that data, or not bother.

How EF models these relationships

EF has multiple ways of setting up relationships and I would refer you to their documentation. Here is my super simple explanation to help you relate the section above to Entity Framework classes (there is a small code sketch after the list):

  • The ‘One’ end of a relationship has a property to hold the key (or multiple properties if a composite key).
    • If the key property(ies) is/are nullable, e.g. int? or string without [Required] attribute, then the relationship is optional.
    • If the key property(ies) is/are not nullable, e.g. int or string with [Required] attribute, then the relationship is required. See the BlogId property in the Post class of SampleMvcWebApp.
  • The ‘Many’ end of a relationship is represented by a collection, normally ICollection<T> where T is the class of the other end of the relationship. See the Posts property in the Blog class of SampleMvcWebApp.
  • Many-to-Many relationships have Collections at each end, see the Tags property in the Post class and the Posts property in the Tag Class of SampleMvcWebApp.
    EF is very clever on many-to-many relationships and automatically creates a new table that links the two classes. See my article Updating a many to many relationship in entity framework for a detailed look at how that works.
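Putting those points together, a cut-down sketch of the sort of classes behind the Blog/Post/Tag example might look like this (simplified here; see SampleMvcWebApp itself for the real classes):

public class Blog
{
    public int BlogId { get; set; }
    public ICollection<Post> Posts { get; set; }    //the 'many' end: this Blogger's Posts
}

public class Post
{
    public int PostId { get; set; }

    public int BlogId { get; set; }                 //not nullable, so a Blogger is required
    public Blog Blogger { get; set; }

    public ICollection<Tag> Tags { get; set; }      //many-to-many: EF creates the link table
}

public class Tag
{
    public int TagId { get; set; }
    public ICollection<Post> Posts { get; set; }    //the other end of the many-to-many
}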

How these relationships affect Deletion?

If you delete something in a relational database that has some sort of relationship then it is going to affect the other parts of the relationship. Sometimes the consequences are so small that they don’t cause a problem. However, especially in one-to-one or one-to-many relationships, the effect of a delete has consequences that you need to think about. Again, let me give you two examples you can actually try yourself on the SampleMvcWebApp web site.

  1. Delete a Many-to-Many relationship. If you go to the Tags Page of SampleMvcWebApp and delete a Tag then when you look at the list of Posts then you will see that that tag has been removed from all Posts that used it (Now press Reset Blogs Data to get it back).
  2. Delete a One-to-Many. However if we go to the list of Bloggers on SampleMvcWebApp and delete one of the Bloggers then when you look at the list of Posts you will see that all Posts by that author have gone. (Now press Reset Blogs Data to get them all back).

So, what happened in the second case? Well, here the database could not keep the relationship intact without doing something, because the Post’s Blog link wasn’t optional. There were two possibilities: either it could delete all the Posts for that Blogger, or it could refuse to carry out the delete.

By default EF sets up what are called ‘cascade deletes‘, which is how SampleMvcWebApp is set up. That is what deleted the related Posts for that Blogger. If we turned off cascade deletes then we would get a ‘DELETE statement conflicted with COLUMN REFERENCE’ error (547) and the delete would fail.

The simplified rule is: if entity A is involved in a required relationship with entity B, then when A is deleted something has to happen; either B is deleted too, or the delete fails. Which happens depends on how you configure EF.
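For reference, turning cascade delete off on a specific required relationship looks something like this in EF6’s fluent API. This uses the sketch classes from earlier in the article, not SampleMvcWebApp’s actual configuration:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    //Deleting a Blogger who still has Posts will now fail with SQL error 547
    //rather than silently cascade-deleting the Posts.
    modelBuilder.Entity<Post>()
        .HasRequired(p => p.Blogger)
        .WithMany(b => b.Posts)
        .HasForeignKey(p => p.BlogId)
        .WillCascadeOnDelete(false);
}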

Bookmark

Deleting with GenericServices

GenericServices has a service called IDeleteService which has two delete methods:

  1. Delete<T>(key1, key2…) (sync and async), which deletes the row with the given key, or keys if it has a composite key, from the EF entity referred to by class T, e.g. Delete<Post>(1) would delete the Post with the key of 1.
  2. DeleteWithRelationships<T>(removeRelationshipsFunc, key1, key2…) (sync and async), which does the same, but calls the removeRelationshipsFunc as part of the delete.

I am not going to detail how to use them as the GenericServices Wiki has a good description. You can also find an example of the use of Delete<T> at line 121 in PostsController and an example of the use of DeleteWithRelationships<T> at line 218 in CustomerController.
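Just to give a flavour, a Delete<T> call inside an MVC action looks roughly like the sketch below. I am writing this from memory, so treat the status-checking property names as illustrative and check them against the linked controllers.

public ActionResult Delete(int id, IDeleteService service)
{
    //delete the Post with the given primary key
    var status = service.Delete<Post>(id);
    if (status.IsValid)
        TempData["message"] = status.SuccessMessage;
    else
        TempData["errorMessage"] = status.Errors.First().ErrorMessage;
    return RedirectToAction("Index");
}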

The only other thing I would say is that deleting entries with composite keys is straightforward: just supply the keys in the order in which they occur in the database. Note: some programmers don’t like composite keys, but I do find them useful. There are places where composite keys are good at segregating data into groups: the primary key can be the group name, the secondary key the name of the item itself.


When I first wrote GenericServices I just had a Delete<T> method. I very soon found that wasn’t enough, so I added DeleteWithRelationships<T>. Now I find I am using DeleteWithRelationships 80% of the time.

The rest of the article is about why I use DeleteWithRelationships so much, and a few pointers on alternatives to using Delete at all.

Why I use DeleteWithRelationships so much

I have used DeleteWithRelationships in three situations that I will describe:

  1. To provide better error messages when a delete would fail.
  2. To delete other associated entities that would not be caught by cascade deletes.
  3. To delete associated files etc. outside the relational database.

1. To provide better error messages when a delete would fail

I have many instances where I don’t want cascade deletes to happen, so if I deleted the entry I would get a SQL error 547. While GenericServices catches this and provides a fairly helpful error, it isn’t that informative. I therefore often (actually, nearly always) use DeleteWithRelationships to provide a better error message. Let me give you an example.

In a web site I was building, designers could set up designs with text fields. Each field had a ColourSpec, which is a database entity. I allowed a designer to delete a ColourSpec as long as that colour isn’t used in any of the text fields. If it is used, then I output a list of designs where it is used so that the designer can decide if they want to remove those references and try the delete again.

2. To delete other associated entities that would not be caught by cascade deletes

This happens rarely, but sometimes I have a complex relationship that needs more work. I found one in the AdventureWorks database that I use in my example application complex.samplemvcwebapp.net. In this the customer address consists of two parts: the CustomerAddress, which has a relationship to an Address. Now, if we want to delete one of the customer’s addresses, say because they have closed that office, we want to try and delete the associated Address part, which isn’t linked by cascade deletes.

By using DeleteWithRelationships I can pick up the associated Address relationship and delete that too. In fact I might need to do a bit more work to check if I can delete it, as it might be referenced in an old order. By calling a method which I can write specifically for this case, I can insert special logic into the delete service.

NOTE: the delete done by GenericServices and any extra deletes done in the DeleteWithRelationships method are all executed in one commit. That means if either part of the delete fails then both are rolled back, which is exactly what you need.

3. To delete associated files etc. outside the relational database

In a recent web application the system included image files. I chose to store them in Azure blob storage, which I have to say works extremely well. The database stored all the information about the image, including a url to access it from blob storage, while the images themselves, which can be very large, were stored as blobs.

Now, when I delete the database entity about the image I would end up with an orphan image in the blob storage. I could have a WebJob that runs in the background to delete orphan images, but that is complicated. What I did instead was add code to the DeleteWithRelationships method to delete the images as part of the delete process.

There is obviously a danger here. Because the database and the blob storage are not connected there is no combined ‘commit’, i.e. I might delete the images and the SQL database delete might then fail. Then my url links don’t point to a valid image. In this case I have made a pragmatic choice: I check that the delete should work by checking all the SQL relationships before I delete the blob images. I could still get a failure at the SQL end, which would make things go wrong, but other parts of the system are designed to be resilient to not getting an image, so I accept that small possibility for a simpler system. Note that if I fail to delete the images from the blob storage I don’t stop the SQL delete; I just end up with orphan images.

An alternative to Delete

In working web applications I find it is sometimes better not to delete important entities, especially if I want to keep a history of orders or usage. In these cases, which happen a lot in live systems, I am now adding an enum ‘Status’ to my key entities which has an ‘Archive’ setting. So, instead of deleting something I set its Status to Archive.

All I have to do to make this useful is filter which entities are shown based on this Status flag. That way entries with a Status setting of ‘Archive’ are not shown on the normal displays, but can be seen in audit trails or admin screens.

I needed a Status flag anyway, so there is little extra overhead in providing this on my key entities. Also, in EF 6 you can now set a property to have an index, so the query is fairly quick. You need to think about your application, but maybe this would work for you too.
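In code terms the pattern is as simple as it sounds. Here is an illustrative sketch, not code from my application:

public enum EntryStatus : byte { Normal, Archived }

public class Design
{
    public int DesignId { get; set; }

    [Index]                              //EF 6.1+ attribute, so the filter query is quick
    public EntryStatus Status { get; set; }
}

//'delete' becomes a status change...
design.Status = EntryStatus.Archived;
db.SaveChanges();

//...and the normal displays simply filter the archived entries out
var visibleDesigns = db.Designs.Where(x => x.Status != EntryStatus.Archived);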

Note: See the comment below by Anders Baumann, with a link to an article called ‘Don’t Delete – Just Don’t‘, for a well-formed argument for not using delete.

Conclusion

Deleting looks so simple, but my experience is that it is often more complicated than you think. Hopefully this article gives you both the background on why, and the detail on how, to use GenericServices to delete entities. I hope it’s helpful.

Happy coding!

User impersonation in MVC using ASP.NET Identity 2

Last Updated: July 31, 2020 | Created: June 19, 2015
Quick Summary
This post is about how to impersonate another user in a modern ASP.NET MVC web application that uses Identity 2. It describes a system that takes advantage of MVC 5’s new AuthenticationFilter to add or change the claims on a user so that they can ‘semi-impersonate’ another user without giving them too much access.

ASP.NET Identity 2 (referred to as Identity2 from now on) is a great system for handling site membership, i.e. who can log in, referred to as authentication, and what they are allowed to do, called authorisation (UK spelling, authorization in .NET code). Identity2 has a host of new features, such as allowing social logins, e.g. login with your Google or Facebook account, and it also works well with modern, scaled applications because it uses a cookie for holding session information.

The problem I had was that I wanted to allow a suitably authorised person to ‘impersonate’ another user, i.e. gain access to certain data that was normally read-only to anyone other than the author. Identity2 changes how you do that, so I went hunting for solutions.

I found a great solution by Max Vasilyev, see his article User impersonation with ASP.Net Identity 2, which I nearly went with. However my need was slightly different to Max’s, so I tackled it a different way, which I call ‘semi-impersonation’ for reasons you will see later. Like Max, having got something to work I thought I would also write a blog post to share the solution.

My Use-Case

Before I describe the solution I wanted to say why I needed this impersonation method, then the code should make more sense.

I am working on a web application where graphic designers can set up designs for others to buy. They have their own ‘key’ which protects their designs from being changed by anyone other than them. The problem is, sometimes they need some help on setting up a design and it’s often easier for the site’s chief designer, called SuperDesigner from now on, to sort it out for them.

There are all sorts of ways of solving this, from the downright dangerous (i.e. sharing passwords) to the not that nice (i.e. allowing the SuperDesigner to have write access to all designs all of the time). In the end I decided to use a ‘semi-impersonate’ method, i.e. the SuperDesigner can simply change their ‘key’ to the ‘key’ of the designer that needs help.

Technically the ‘key’ is a Claim, which is what Identity2 uses for authorisation. If you are not familiar with claims, they are Key/Value pairs that each user has. Some are fixed and some you can add yourself.

My semi-impersonation solution

The solution uses a semi-impersonation approach, i.e. I change a small part of the user information, which allows access to the Designer’s work. The solution consists of four main parts:

  1. A session cookie to control when we are in ‘impersonate’ mode.
  2. Use of MVC 5’s new AuthenticationFilter OnAuthentication method, which changes the claims on the current user.
  3. Some change to my application code that provides the ‘key’ for each designer.
  4. Feedback to the SuperDesigner that they are in ‘impersonate’ mode and a button to drop out of impersonation.

The big picture of the design is as follows:

  1. The SuperDesigner looks through a list of designers and clicks ‘Impersonate’ on the designer that needs help.
  2. That creates an Impersonate session cookie which holds the ‘key’ of that designer, encrypted of course!
  3. My Impersonate AuthenticationFilter is registered at startup and runs on every request. If it finds an Impersonate session cookie it adds a new, temporary Impersonate claim with the ‘key’ from the cookie to the current ClaimsPrincipal.
  4. The code that provides the unique ‘key’ to each designer was changed to check for an Impersonate claim first, and if present it uses that key instead.
  5. The _LoginPartial.cshtml code is changed to visually show the SuperDesigner is in ‘impersonate’ mode and offers a ‘Revoke’ button, which deletes the cookie and hence returns to normal.

1. The Session Cookie

In MVC, cookies are found by looking in the HttpRequestBase Cookies collection, and added or deleted via the HttpResponseBase Cookies collection. I didn’t find the MSDN documentation on Cookies very useful, so here are some code snippets to help. Note that in the code snippets the ‘_request’ variable holds the MVC Request property and the ‘_response’ variable holds the MVC Response property.

Adding a Cookie

public HttpCookie AddCookie(string value)
{
    var cookie = new HttpCookie(ImpersonationCookieName, Encrypt(value));
    _response.Cookies.Add(cookie);
    return cookie;
}

This code adds a Cookie to the HttpResponseBase cookie collection. A few things to note:

  • Because I don’t set an expiry date it is a Session Cookie, i.e. it only lasts until the browser is closed, or until I set a negative Expires date. That way it can’t hang around.
  • I encrypt the value using the MachineKey.Protect method to keep it safe (see the sketch after this list).
  • I have a setting in my Web.Config to set all cookies to HTTPOnly, and Secure when running on Production systems.
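For completeness, here is a sketch of the Encrypt/Decrypt pair, based on ASP.NET’s MachineKey.Protect/Unprotect methods. The ‘purpose’ string is my own choice for this sketch; it just has to be the same for both calls.

private const string ProtectPurpose = "ImpersonationCookie";

private static string Encrypt(string value)
{
    //Protect ties the encryption to this machine's (or farm's) machine key
    var protectedBytes = MachineKey.Protect(
        Encoding.UTF8.GetBytes(value), ProtectPurpose);
    return Convert.ToBase64String(protectedBytes);
}

private static string Decrypt(string protectedValue)
{
    var unprotectedBytes = MachineKey.Unprotect(
        Convert.FromBase64String(protectedValue), ProtectPurpose);
    return Encoding.UTF8.GetString(unprotectedBytes);
}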

Cookie Exists and GetValue

public bool Exists()
{
    return _request.Cookies[ImpersonationCookieName] != null;
}

public string GetValue()
{
    var cookie = _request.Cookies[ImpersonationCookieName];
    return cookie == null ? null : Decrypt(cookie.Value);
}

Fairly obvious: Decrypt is the opposite of Encrypt.

Deleting the Cookie

public void DeleteCookie()
{
    if (!Exists()) return;
    var cookie = new HttpCookie(ImpersonationCookieName) { Expires = DateTime.Now.AddYears(-1) };
    _response.Cookies.Add(cookie);
}

Deleting a cookie proved to be a bit troublesome and I found some misleading information on stackoverflow. The best article is the MSDN one here. Watch out: the cookie used for the delete must be a new one, don’t reuse the one from the request, and the new cookie must have exactly the same attributes as the original cookie. I had a .Path constraint on the added cookie and not on the delete cookie and it didn’t work.

2. AuthenticationFilter OnAuthentication

The AuthenticationFilter is a great addition that came in with MVC version 5. It allows the current Principal, both Thread.CurrentPrincipal and HttpContext.User, to be changed. The key to doing this is the OnAuthentication method inside the AuthenticationFilter, which I have listed in full below:

/// <summary>
/// This looks for an Impersonation Cookie. If not found then it simply returns.
/// If found it changes the filterContext.Principal to include an
/// ImpersonateClaimType claim with the Designer Key value taken from the cookie.
/// </summary>
/// <param name="filterContext"></param>
public void OnAuthentication(AuthenticationContext filterContext)
{
    var impersonateCookie = new ImpersonationCookie(
        filterContext.HttpContext.Request,
        filterContext.HttpContext.Response);

    if (!impersonateCookie.Exists()) return;

    //There is an impersonate cookie, so we build a new
    //ClaimsPrincipal to replace the current Principal
    var existingCp = filterContext.Principal as ClaimsPrincipal;
    if (existingCp == null)
        throw new NullReferenceException("Should be claims principal");

    var claims = new List<Claim>();
    claims.AddRange(existingCp.Claims);
    claims.Add(new Claim(ImpersonateClaimType,
         impersonateCookie.GetValue()));

    filterContext.Principal = new ClaimsPrincipal(
         new ClaimsIdentity(claims, "ClaimAuthentication"));
}

Hopefully the code is fairly obvious. The main thing to know is that if you change the filterContext.Principal then MVC changes Thread.CurrentPrincipal too, so your change gets propagated throughout your application. So, by creating a new ClaimsPrincipal with all the original claims plus the new ImpersonateClaimType claim, the impersonation ‘key’ can be picked up anywhere by casting Thread.CurrentPrincipal to a ClaimsPrincipal.

Note that ‘new ImpersonationCookie(…)’ is just my wrapper round the Cookie code shown earlier.

3. Picking up the key

This bit of code is specific to my application, but worth showing for completeness:

public static string GetDesignerKey()
{
   var claimsPrincipal = Thread.CurrentPrincipal as ClaimsPrincipal;
   if (claimsPrincipal == null) return null;  //system failure

   var impersonateClaim = claimsPrincipal.Claims
       .SingleOrDefault(x => x.Type == ImpersonateClaimType);
   if (impersonateClaim != null) return impersonateClaim.Value;

   var designerKeyClaim = claimsPrincipal
       .Claims.SingleOrDefault(x => x.Type == DesignerKeyClaimType);
   if (designerKeyClaim != null) return designerKeyClaim.Value;

   return null;   //system failure
}

You may wonder why I did not just change the DesignerKeyClaimType claim itself? I could have, but then it would have been hard for other parts of the system to find out if we were in impersonate mode, as we will see in the next section.

4. Feedback to user and ‘Revoke impersonation’

The final part is the feedback to the user that they are in ‘impersonate’ mode, plus a ‘Revoke impersonation’ button. I did this in an Html helper method which went into the ‘if (Request.IsAuthenticated)’ path of _LoginPartial.cshtml. The helper did the following (a simplified sketch follows the list):

  1. It looked for the ImpersonateClaimType. If it was not there then output the normal html.
  2. If the ImpersonateClaimType was there then:
    1. Say we are in impersonation mode
    2. Offer a ‘Revoke’ button that calls an MVC action which deletes the Impersonation cookie.
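A simplified sketch of that helper is shown below. My real version produces proper navbar html, and the Revoke url is made up for this sketch, but the claim check is the important part.

public static MvcHtmlString ImpersonationStatus(this HtmlHelper helper)
{
    var claimsPrincipal = Thread.CurrentPrincipal as ClaimsPrincipal;
    var impersonateClaim = claimsPrincipal == null
        ? null
        : claimsPrincipal.Claims.SingleOrDefault(x => x.Type == ImpersonateClaimType);
    if (impersonateClaim == null)
        return MvcHtmlString.Empty;     //not impersonating, so output the normal html

    //in impersonation mode: say so and offer a Revoke button
    return new MvcHtmlString(
        "<span>Impersonating a designer</span> " +
        "<a class=\"btn btn-warning\" href=\"/Impersonation/Revoke\">Revoke</a>");
}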

Why/how is my implementation different to Max Vasilyev’s solution?

It is always good to compare solutions so you know why they are different. Here are my thoughts on that (Max, please do tell me anything else you would like added).

  • Max Vasilyev’s solution is great if you want to take over all the user’s roles and claims – you become the user in total. In my case I just wanted to change one attribute, the designer key, hence the name ‘semi-impersonate’.
  • If you just need to get impersonation going quickly then use Max’s solution – it just works. My method needs more design work as you need to target specific attributes/features for it to work. Therefore it is only worth considering my method if you want more control over how the impersonation works, see the points below:
    • My method only adds/changes the specific claims that are needed, so limiting the power of the impersonator.
    • My method would work with different levels/types of impersonation by having different cookies. The OnAuthentication method could then obey different rules for each cookie type.
    • My method can use the cookie attributes like Domain and Path to limit what part of my site supports impersonation. In my application I use the .Path attribute to make impersonation only work on the Designer setup area, thus restricting the power of the SuperDesigner to, say, edit the designer’s details or payments.

Note: Max makes a good point in his feedback that using .Path in that way ‘is not future- or refactoring-proof’, i.e. if the routing is changed by someone who does not know about the .Path on the cookie then it could lead to errors that are hard to find at testing time. You may want to take his advice on that and use a more obvious authentication filter attribute that checks for the cookie.

Other, more minor differences are:

  • My method allows audit trails to work properly, as only the key is changed and the rest of the user identity is left alone. It also means the user who is impersonating, SuperDesigner in my case, keeps their enhanced Authorisation Roles.
  • My method will be very slightly slower than Max’s solution as it runs the OnAuthentication method on every HTTP request.
  • My method is limited by how much information you can put in a cookie (4093 bytes). Also, because I encrypt my content it can take up a lot of room, so you might only get four or five big strings in an encrypted cookie.

Conclusion

This article has described a way for one user to gain access to facilities of another user in a secure and controlled way. Hopefully I have given you enough code and links to see how it works. I have also pointed you to Max Vasilyev’s solution, which is very good but uses a different approach, so now you have two ways to impersonate another user.

Do have a look at both and take your choice based on what you need to do, but between Max’s solution and mine I hope they help you with your web application development.

Happy coding!

Architecture of Business Layer – Calling multiple business methods in one HTTP request

Last Updated: July 31, 2020 | Created: April 25, 2015

In this article I explore the issues and problems of calling business logic in a web application. The aim is to supplement my last article, Architecture of Business Layer working with Entity Framework, which looks at the design of the business layer, by describing how a web application user interface or RESTful service might call the business logic to provide some useful function.

In this article I look at a specific issue around calling multiple business methods, some of which write to the database using Microsoft’s Entity Framework (EF) data access technology. My aim is both to show you what issues we need to consider and to describe a solution that overcomes these issues. At the end I give you some further reading for the more advanced situations.

Calling business logic

In most web applications we have some data access methods and we have some business logic. Sometimes they are closely linked and sometimes they are separated. When building an application with EF as your database access technology your choices are constrained. You can go the pure Domain-Driven Design (DDD) route and beef up your EF classes, or you can separate your EF data Entity classes into a Data Layer and place your Business Logic in a separate layer (in this article I use ‘Layer’ to mean a separate .NET assembly).

I don’t want to get into the whole argument about which is the best way, but I have chosen what is a fairly normal arrangement where I keep my business logic in a separate layer to my data classes. This is not a pure DDD design, but I do design my Business Classes to treat the database as far as possible as an in-memory collection. Because of this I architect my business logic in a certain way and use a private library, called GenericActions, for handling calls (see the Architecture of Business Layer working with Entity Framework article for more on this).

Everything was going fine until I had to call multiple business methods in sequence to achieve the result I needed. It is something I have just come across in a project I am working on and have developed a solution for, so it’s a hot topic for me. This is the story of why I needed to do this, what the problems were and how I implemented my final solution.

Calling multiple business methods to achieve one business goal

I am working on an e-commerce site and I came across one business process that needed two parts: the first stage checked the data input by the user and wrote the top level data out to the database. The second stage then imported some files and used the primary keys of the data written by the first stage as a reference name for the files. That meant that the first stage needed to complete, and EF’s .SaveChanges() needed to be called, so that the primary keys were in place before the second stage started.

Couldn’t I combine the two stages into one business method? Well, if you read my other article on Business Layer Architecture you will know that I don’t let my business logic call .SaveChanges(). I could break my own rule, but I have found this level of isolation really useful (read the other article for the full story). Anyway, even if I did that I would still have the problem that if the second half failed I would be left with a bad, half completed database entry.

Now there are other ways round this particular issue (like using a GUID), but the idea of having to run multiple business methods in series is one that I thought would come up again. When the first case occurred I hand-coded a solution, but when I came across another case it was time to create a proper solution.

Solution design

I implemented a solution inside my GenericActions library. While this is private I can outline my approach so you can implement your own version if you need to.

Because my solution uses EF’s transaction control, I called the key internal class the ‘Transactional Runner’, referred to as TR from now on. The primary feature of TR is of course EF V6’s transaction control methods. When running multiple business methods, GenericActions creates a DbContextTransaction using the context.Database.BeginTransaction() method. This means that all writing to the database is controlled and not committed until the DbContextTransaction.Commit() method is called. If a problem is found then the DbContextTransaction.Rollback() method can be called and all database writes done within the transaction scope are undone.

The diagram below shows it running three business methods in turn. The first two write to the database, but the third fails. The runner then rolls back the changes using EF’s Rollback(). If they had all succeeded then it would have called EF’s Commit() to make the changes permanent and available to other applications/threads.

genericactions-transactional-runner

The simplified code for this would be:

using (var context = new MyDbContext())
{
    using (var dbContextTransaction = context.Database.BeginTransaction())
    {
        try
        {
            var biz1Result = CallBizMethod1(input);
            context.SaveChanges();
            var biz2Result = CallBizMethod2(biz1Result);
            context.SaveChanges();
            var biz3Result = CallBizMethod3(biz2Result);
            context.SaveChanges();

            //all three methods worked, so make the writes permanent
            dbContextTransaction.Commit();
            return biz3Result;
        }
        catch (Exception)
        {
            //undo all the database writes done inside the transaction
            dbContextTransaction.Rollback();
            throw;
        }
    }
}

If you look at the code above you will see that I pass data between the business methods. This allows each method to pass information on to the next. If you think back to the e-commerce problem that started it all off, the first business method creates the database entities, which are written to the database. However the second method needs to know which entity to use, which is why the first method passes on the class, with its newly created primary key, to the second method.

As you would expect, my generic solution is quite a bit more complicated than that, but the code above should give you the idea. If you are writing your own solution you might like to consider that some business methods are really simple, with no database writing and maybe just a few lines of code, while some do a lot, with async writes to the database and maybe async outside requests. It is worth a bit of thought to optimise for each type. This leads to two things I do:

  • I have a normal synchronous TR and also an async TR, which is what I mainly use. However I have written the async TR version so that it can take a mix of sync and async business methods and only runs the async ones in async mode. This means small, code-only business methods are not burdened with unnecessary Tasking, which makes writing/debugging them simpler, and they run more quickly.
  • Not all of my business methods write to the database, so I have an attribute on the business classes that access the database and I only call SaveChanges() after those methods (see the sketch below).
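Here is a minimal sketch of that idea, assuming a hypothetical marker attribute called WritesToDatabase and a much-simplified runner loop (my real TR code is different, and MyDbContext stands for the context from the earlier snippet); it just shows the principle of only calling SaveChanges() after the marked actions:

using System;
using System.Linq;

//hypothetical marker attribute - the real GenericActions attribute is different
[AttributeUsage(AttributeTargets.Class)]
public class WritesToDatabaseAttribute : Attribute { }

//hypothetical business action interface, used only for this sketch
public interface IBizAction
{
    object Run(object input);
}

[WritesToDatabase]                           //this action writes, so SaveChanges() is needed
public class PlaceOrderAction : IBizAction
{
    public object Run(object input) { /* add/change tracked entities here */ return input; }
}

public class CalcDeliveryAction : IBizAction //no attribute: pure calculation, no SaveChanges()
{
    public object Run(object input) { return input; }
}

public static class RunnerSketch
{
    //run the actions in order, only calling SaveChanges() after those marked as writing
    public static object RunInOrder(MyDbContext context, object input, params IBizAction[] actions)
    {
        var result = input;
        foreach (var action in actions)
        {
            result = action.Run(result);
            if (action.GetType()
                      .GetCustomAttributes(typeof(WritesToDatabaseAttribute), true).Any())
                context.SaveChanges();
        }
        return result;
    }
}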

Warning: use of DbContext needs careful thought

In looking at this I had to think long and hard about whether my final solution would work properly in a web application. It will, but only because of specific constraints I have placed on my libraries. It is worth spelling out why, so that you understand the issues if you come to write your own version.

The sample code I gave you will work fine in all cases, but in the real world you would not normally create the DbContext in that way. I use Autofac Dependency Injection to create my DbContext, with .InstancePerLifetimeScope() set on the registration (see the registration sketch after the list below). This means that a single instance of DbContext is created for each HTTP request. This is important for the following reasons:

  • If two business methods use the same DbContext then any changes made by one method will be seen by the second method. If they each had their own DbContext then, depending on when each DbContext accessed the database, they might be out of step.
  • If a method writes to the database, but then fails and .SaveChanges() is not called, then the writes are discarded before the next HTTP request. Because I control when .SaveChanges() is called, by using either my open-source GenericServices (database access) or my private GenericActions (business logic calling) library, I know a stray .SaveChanges() will not happen once a method has returned an error. This means any data written to the database from a failed method will not be persisted.
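For completeness, here is a minimal sketch of the sort of Autofac registration I mean; MyDbContext and the config class name are placeholders, but .InstancePerLifetimeScope() is the real Autofac call that, combined with Autofac’s ASP.NET MVC integration, gives one DbContext per HTTP request:

using System.Web.Mvc;
using Autofac;
using Autofac.Integration.Mvc;

public static class AutofacConfig
{
    public static void Register()
    {
        var builder = new ContainerBuilder();

        //one MyDbContext instance per lifetime scope, i.e. per HTTP request in MVC
        builder.RegisterType<MyDbContext>()
               .AsSelf()
               .InstancePerLifetimeScope();

        //register the MVC controllers so they can take MyDbContext in their constructors
        builder.RegisterControllers(typeof(AutofacConfig).Assembly);

        var container = builder.Build();
        DependencyResolver.SetResolver(new AutofacDependencyResolver(container));
    }
}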

However in another application I ran a long-running task, and there I had to create a separate DbContext. Why was that? The primary reason is that DbContext is not thread safe, so you would get an error if you used the same instance in multiple threads. Getting an error is a good thing, because silently sharing a DbContext across threads would cause all sorts of problems elsewhere.

So, if I am running multiple methods, some of which are async, why does my solution work with a single instance of DbContext? It is because they all run on the same thread. Async simply releases the thread when it is waiting – it does not run things in parallel – so all is fine.

I was recommended a great article called ‘Managing DbContext the right way with Entity Framework 6: an in-depth guide‘ on this whole subject by one of my readers, khalil. I found this very helpful and did consider using his library. However I am clear that, with the limits I have placed on my libraries, the current design will work reliably for now. Maybe I will have to look at this again if I need to extend to parallel running, but that is a fight for another day.

Conclusion

I have shown that there are cases where multiple business methods, one or more of which write to the database, need to be run in series. I have then described an approach using Entity Framework where you can ensure that the database is only updated if the whole series of business methods finishes successfully. While the solution is fairly simple there are both performance and tasking issues to consider.

Hopefully this article has given you some pointers on what the problem is, what sort of solution you could use and what to watch out for. If anything is unclear then please leave a comment and I will try to improve the information.

Happy coding!

GenericServices Masterclass: a deeper look at Create and Update

Last Updated: July 31, 2020 | Created: April 21, 2015

I have been building a new application using GenericServices and I thought it would be useful to others to reflect on the approaches I have used when working with more complex data. In this first master class article I am looking at Create and Update of data, especially via DTOs (Data Transfer Objects).

What is GenericServices?

For those of you who are not familiar with GenericServices it is an open-source .NET library designed to simplify the interface between an ASP.NET MVC application and a database handled by Microsoft’s Entity Framework (EF) data access technology. You can read more about it at https://github.com/JonPSmith/GenericServices where the Readme file has links to two example web sites. The GenericServices’ Wiki also has lots more information about the library.

If you are not familiar or interested in GenericServices then save yourself time and skip this post.

Introduction to create and update with DTOs

GenericServices has five primary data access methods: List, Detail, Create, Update and Delete. Create and Update are the most complicated as they write data back to the database. If you are working directly with the database classes then it is fairly straightforward: in this case GenericServices is executing the EF commands for you through a standardised interface that adds some extra code to check and return any validation or selected SQL errors.

Create and Update become more complex when you are using DTOs. However, if you have read any of my articles, such as Why DTOs, then you will know that 95% of my accesses to the database go through DTOs. Therefore handling DTOs is critical, and GenericServices is built primarily for DTO-mode database accesses.

The most common use of DTOs is in reading data, but for Create or Update the DTO needs to be transformed back to the database entity before that entity is written out to the database. There are four levels of complexity, starting with the simplest and working up to the most complex. I will list them here and then go into more detail on what to do later on.

  1. Simple one-to-one mapping: Each property is a simple data type, i.e. not another user class or relationship, and has an equivalent DTO property. In this case GenericServices ‘just works’.
  2. One-to-one mapping, but some properties are read-only: This case is like the first, but maybe you have properties you want to show the user but not let them update. An obvious example might be the price of something they can buy. In this case you will use the [DoNotCopyBackToDatabase] attribute – see later for an example.
  3. Contains properties that are developer-defined classes or collections: In this case you might have a relationship class, a collection or an EF complex type. AutoMapper will read these properties using its conventions, but by default will not write them back. You need to add extra code to handle this, which I will cover later.
  4. Contains properties that must be calculated on write: This happens when the value to be written depends on the user’s input, or we need to do some drastic reformat of the data before writing it back. Again, I will show you some examples later.

Before I describe these four cases I would like to point out that AutoMapper and the DelegateDecompiler’s [Computed] attribute (see the GenericServices Wiki page on Calculated Properties) are designed to work when READING from the database. When writing to the database GenericServices does use AutoMapper, but it has to supplement it with extra commands because, by default, AutoMapper is not meant to be used for write-to-database actions. That is why this article and the various examples exist – to show you how to handle this write operation successfully.

So, with that background, let me go through the four cases above.

1. Simple one-to-one mapping

To be honest, if you have a simple one-to-one mapping between the data class and the DTO then you most likely should be using GenericServices with the data class itself, as that is quicker and more transparent. I can’t think of any case where I have used a DTO for a ‘simple’ mapping. However it will work if you do.

2. One-to-one mapping, but some properties are read-only

[DoNotCopyBackToDatabase] is GenericServices’ version of MVC’s [Bind(Exclude = “SomeProperty”)]. Both allow a property to be marked as not being overwritten by user input. You can read more about this in the GenericServices Wiki DoNotCopyBackToDatabase attribute page.

A good example is given in the Wiki page, i.e. you want a user to see the price of their purchase, but they should not be able to change that price. I use this myself in one of my applications to hold a status flag that can be seen by the user, but can only be changed by the business logic. I therefore mark the property in the DTO with [DoNotCopyBackToDatabase] to ensure it cannot be changed either by accident or maliciously.

In SampleMvcWebApp I apply [DoNotCopyBackToDatabase] in the DetailPostDto.cs class to stop the LastUpdated property being copied back. However that is really there as documentation, as any value would be overwritten by the code inside SampleMvcWebApp’s .SaveChanges() method, which sets that property at save time.
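To make the pattern concrete, here is a minimal sketch of a DTO using the attribute. The class and property names are made up, and I have left out the rest of the EfGenericDto plumbing the library needs; it just shows where [DoNotCopyBackToDatabase] goes:

public class OrderLineDto   // : EfGenericDto<OrderLine, OrderLineDto> plus its required overrides
{
    public int OrderLineId { get; set; }

    public int Quantity { get; set; }        //the user may change this

    [DoNotCopyBackToDatabase]
    public decimal UnitPrice { get; set; }   //shown to the user, but never copied back on update
}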

3. Contains properties that are developer-defined classes or collections

This is where it starts to get more complicated. There are a few possible scenarios and I will deal with each in turn:

a. A relationship class

This is where you have a class that is a relationship. An example from SampleMvcWebAppComplex is the CustomerAddress class, which has a one-to-one relationship with the Address class, which holds the address details. In SampleMvcWebAppComplex I use a top-level DTO called CrudCustomerAddressDto to hold all the customer address data. This type of situation doesn’t happen often, but when it does you need a way to handle it.

Firstly let me explain why the normal mappings wouldn’t work. If we used AutoMapper’s flattening convention it could read properties in the associated Address class by concatenating the names, e.g. AddressCity would read the City property in the Address part of CustomerAddress. The problem is it can read, but it won’t write, as AutoMapper doesn’t ‘un-flatten’ things, i.e. setting the property AddressCity in the DTO will not be mapped back to Address.City in the CustomerAddress class.
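Here is a minimal stand-alone sketch of that asymmetry, using AutoMapper’s classic static API from that era; the class shapes are simplified versions of the real SampleMvcWebAppComplex ones:

using AutoMapper;

public class Address { public string City { get; set; } }
public class CustomerAddress { public Address Address { get; set; } }
public class CrudCustomerAddressDto { public string AddressCity { get; set; } }

public static class FlatteningDemo
{
    public static void Run()
    {
        //reading: the flattening convention maps CustomerAddress.Address.City -> dto.AddressCity
        Mapper.CreateMap<CustomerAddress, CrudCustomerAddressDto>();
        var dto = Mapper.Map<CrudCustomerAddressDto>(
            new CustomerAddress { Address = new Address { City = "London" } });
        //dto.AddressCity is now "London"

        //writing: the reverse map does NOT 'un-flatten', so AddressCity is ignored
        Mapper.CreateMap<CrudCustomerAddressDto, CustomerAddress>();
        var entity = Mapper.Map<CustomerAddress>(dto);
        //entity.Address is still null - this is the gap the nested mappings below fill
    }
}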

There are a few ways round this. One is to hand-code the copy back, but we can also use AutoMapper’s nested mappings feature, which is what I actually did in this case. To use this I need to create a sub-DTO, called CrudAddressDto. I then need to do two extra things in the CrudCustomerAddressDto class.

Firstly I have to tell GenericServices to make sure that AutoMapper knows how to map the CrudAddressDto, otherwise I will get an error from AutoMapper when I try to do the mapping. That is achieved by overriding the AssociatedDtoMapping property in CrudCustomerAddressDto to add the extra mapping, as shown below (note: there is an AssociatedDtoMappings version if you need multiple extra mappings):

protected override Type AssociatedDtoMapping
{
   get { return typeof(CrudAddressDto); }
}

Secondly, for the update to work I need to make sure I have loaded both the CustomerAddress and the associated Address class. This is done by overriding the FindItemTrackedForUpdate method (see below). If I don’t do this then updating the CustomerAddress will create a NEW Address row rather than update the existing one. As that Address may be used elsewhere, such as in the delivery address, it is important to get that right.

protected override CustomerAddress FindItemTrackedForUpdate(IGenericServicesDbContext context)
{
    return context.Set<CustomerAddress>()
        .Where(x => x.CustomerID == CustomerID && x.AddressID == AddressID)
        .Include(x => x.Address)
        .SingleOrDefault();
}

b. A Complex Type

A ‘complex type’ is the term EF uses for a reference to a class which has no key – see the section entitled ‘Complex Types Convention’ in the Entity Framework documentation. This is often used to simplify or separate parts of a class.

If you want to do this then you can use the nested mapping approach shown above to do the to/from mapping of the complex types. However, as complex types are simply a way of arranging things, I tend to ‘flatten at design time’, i.e. I don’t use complex types. That means that AutoMapper will do both the read and the write.

If that simplification does not work for you and you want to use complex types then you need to create a new DTO for each complex type and override the AssociatedDtoMapping or AssociatedDtoMappings to allow GenericServices to set up the AutoMapper mappings. You don’t need to override FindItemTrackedForUpdate, as the class is in fact one single row in the database.

c. A Collection

This is the most complex case because a collection refers to multiple relationships (EF does not support lists or arrays of simple data like bytes or strings). As you can imagine, updating multiple relationships is complex and has to be hand-coded by overriding CreateDataFromDto and UpdateDataFromDto, which I explain in the next section.

I should say that updating multiple relationships isn’t a simple problem anyway. This normally falls under the next section, Contains properties that must be calculated on write, as it is complex enough that GenericServices cannot make a guess as to your intentions.

4. Contains properties that must be calculated on write

There are times where a property in a class is either set by the user via some other means, like a dropdownlist, or the data in the DTO needs a more complicated reformatting/parsing to get the required value for the property. In these cases the developer needs to intercept the CreateDataFromDto and UpdateDataFromDto methods and write code to set the properties.

Let us start by looking at the anatomy of the two methods. The first thing to note is that they are very similar, but have slightly different signatures. The signatures differ because update is handed a database entity to update, while create returns a new instance of the database entity class (this was done because in databases using a Domain-Driven Design (DDD) approach the creation of an entity may be done via a factory). This means that the handling of the return status is slightly different between the two methods.

However when overriding these methods the code inside has a common approach. Let us take UpdateDataFromDto as an example and look at the three stages the update and create goes through.

protected override ISuccessOrErrors UpdateDataFromDto(
    IGenericServicesDbContext context, TDto source, TEntity destination)
{
   //1) Before copy/create: If you put your code here then it should
   //set properties in the source, which AutoMapper will then copy over

   //2) Call the base method which uses AutoMapper to create/copy the data
   var status = base.UpdateDataFromDto(context, source, destination);

   //3) After copy/create: If you put your code here then it should
   //set properties in the destination, ready to be written out

   return status;
}

So, your options are:

  1. Change properties in the DTO and then let AutoMapper copy them over for you.
  2. Wait until AutoMapper has run and then change the properties in the database entity class.
  3. Don’t call the base method and handle the whole copy/conversion yourself (this is what you would need to do for DDD databases).

I have used options one and two, depending on the need. Option one is good if the property needs to be in the DTO, while option two is useful when you want to exclude a property from the DTO and only set it via your code. In both cases I try to do any validation of the values early, so that if it fails the method returns immediately and saves a redundant call to AutoMapper.

SampleMvcWebApp has a classic example of option one in DetailPostDto. DetailPostDto is complex because it allows the user to select the post’s author from a dropdownlist and the post’s tags from a multi-select list. If you look at the code you will see the overrides of CreateDataFromDto and UpdateDataFromDto around line 140 or so. To save you swapping pages here is a copy of the overridden methods from DetailPostDto:

protected override ISuccessOrErrors<Post> CreateDataFromDto(IGenericServicesDbContext context,
    DetailPostDto source)
{
    var status = SetupRestOfDto(context);
    return status.IsValid
        ? base.CreateDataFromDto(context, this)
        : SuccessOrErrors<Post>.ConvertNonResultStatus(status);
}

protected override ISuccessOrErrors UpdateDataFromDto(IGenericServicesDbContext context,
    DetailPostDto source, Post destination)
{
    var status = SetupRestOfDto(context, destination);
    return status.IsValid
        ? base.UpdateDataFromDto(context, this, destination)
        : status;
}

As you can see, both methods call a private method called SetupRestOfDto, which handles the setting up of the BloggerId and the Tags properties in the DTO. The SetupRestOfDto method needs to know whether it is a create or an update, which it can tell because on a create the Post is not supplied, i.e. it is null. On return, if the status from the SetupRestOfDto method is OK, the code then calls the base method to do the create/update. This is a typical way of handling user selections from a dropdown list.

I used option two, setting the value in the database entity class, in a recent project where I wanted to make the user input much simpler. In this case the user needed to input a CMYK colour and an RGB colour. The colours were stored as a series of numbers, but it was simpler for the user to enter the CMYK and the RGB as two strings of digits, with different options on hex/decimal etc. I therefore overrode the create/update methods and parsed the two strings into the right format directly into the database entity class.

I am sure your project has other requirements that GenericServices does not handle as it stands. Overriding CreateDataFromDto and UpdateDataFromDto is the go-to place to add your code and still benefit from all of GenericServices’ features.

Note: Writing this section made me realise I could have simplified the method signatures of CreateDataFromDto and UpdateDataFromDto, as I don’t (now) need to pass in the DTO. However that would be a breaking change to the interface, so I’ll leave that for the next big change of GenericServices.

Conclusion

In this article I have covered in detail what happens in a Create or Update inside GenericServices. It is complex, especially around handling relationships. Hopefully this article will help anyone who needs to implement any of the more complex uses of Create and Update.

Do ask any questions or make comments here, or raise an issue on the GenericServices GitHub site if you think there is a bug. Happy to answer any questions as it helps increase the documentation for the project.

Happy coding.

 

Using .NET Generics with a type derived at runtime

Last Updated: July 31, 2020 | Created: February 7, 2015

I have used .NET Generic types a lot and found them really useful. They form a core part of the solution in my open-source GenericServices library and my private GenericActions library. This article explores why and how I use Generics and some of the techniques I use to hide the more complex type definitions.

Even if you are not interested in my particular use of Generics this article has some useful information on:

  1. How to create an instance of a generic type where the type is defined at runtime.
  2. The two approaches to calling methods or read/write properties of the created generic instance.
  3. The performance issues around the two approaches to calling methods or read/write properties.

Only interested in how to use Generics with a runtime-derived type? Just skip to here in this article.

What are Generic Types

Generics are everywhere – if you use List<T> etc. you are using Generics. If you are not familiar with Generics then have a look at this introduction. The primary thing about Generics is that they allow you to write one code implementation that can work on a range of types. In particular they produce efficient code because they are ‘type safe’, i.e. the type is known, so the code can access what it needs.

Without Generics we would either:

  • Have to write the same code for each type, which is bad practice.
  • Write one method but use Reflection to access the items we want, which is not type-safe.

The down side of Generics is you (normally) have to know the type at compile time. However I will show you later how to overcome this.

Where I have found Generic types really useful

I developed a web application called Spatial Modeller that does mathematical predictive modelling for healthcare. The modelling is complex and I used a  Domain-Driven Design (DDD) approach, which said I should keep the database/business classes focused just on the modelling problem. That worked really well, but it meant that the database/business classes weren’t right for display to the user.

I therefore had a Service Layer which had classes more attuned to the display and visualisation of the data. These are typically called Data Transfer Objects (DTOs), or in ASP.NET MVC they are also called ViewModels. The Service Layer code transformed the data passing between the Data/Business Layers and the Presentation/WebApi Layer.

The problem was I ended up writing very similar code, but with different data types, for the transformation of the database/business classes to DTOs. That is bad software practice, time consuming and boring!

After that project I was determined to create a solution for this. It took a bit of experimenting, but Generics was the answer. However the final solution wasn’t quite as straightforward as you might think.

The problem of complex Generic type definitions

Let me give you a real example of where my presentation layer wants to run a piece of business logic. In Spatial Modeller to model a scenario the following happens:

  1. The user picks the potential hospital locations from dropdown lists and fills in various selections.
  2. These are handed to the business layer modeller as primary keys for the hospital locations and various enums/parameters.
  3. The modeller does its work, writing some data to the database.
  4. The modeller returns the results as primary keys.
  5. These need to be looked up to show the user.

So, in that process we have five classes:

  1. The presentation input DTO, called DtoIn.
  2. The business modeller data input, called BizIn.
  3. The class/method to call in Business Layer, called IBizAction.
  4. The output of the business modeller, called BizOut
  5. The presentation output DTO, called DtoOut.

So, if we define a Generic class to handle this its class definition would look like this:

var service = new ActionService<DtoIn, BizIn, IBizAction, BizOut, DtoOut>
     (db, new BizAction(db));
var dataIn = new DtoIn( /* some data here */ );
var result = service.RunAction(dataIn);

That is a) quite hard to write, b) not easy to use with Dependency Injection and c) downright ugly! In fact in both of my libraries it is even more complex than that, with different types of input. Even worse, the code will be the same, but with different transform parts. In GenericActions I calculated there were sixteen versions, all of which would need separate implementations.

GenericServices was much easier, but still produced ‘ugly’ generic type definitions, which needed special handling at dependency injection time.

Getting rid of the ‘Ugly’ type definitions

OK, so Generics got me so far but I needed to do something else to combine and simplify the type definition of the Generics. I did two things:

  1. For both libraries I hide the complex type definition via an outer, non-generic helper class, where the called method decodes the types and creates the correct Generic class instance with the right types.
  2. In the case of GenericActions, which has so many versions, I did a further decode of the in/out types inside the Service Layer code, which reduced the sixteen possible versions down to six. I could have gone further, but this was the right level to keep performance up.

Before I describe the solution I will show you the improved code for the example in the last section (see below). I think you will agree that is much nicer, and the creation of the service is now very Dependency Injection friendly.

var service = new ActionService<IBizAction>(db, new BizAction(db));
var dataIn = new DtoIn( /* some data here */ );
var result = service.RunAction<DtoOut>(dataIn);

So, let us go through the steps to achieve this.

Part 1: Decode the underlying types

I’m going to use an example from GenericServices, as it is open-source and I can therefore link to the actual code. So the example for GenericServices is:

var service = new UpdateService(db);
var dto = new DetailPostDto( /* some data here */ );
var response = service.Update(dto);

The steps are:

  1. service.Update<T>(dto) is a top-level method (see first class) that can take either a database class, like Post, or a dto.
  2. If it is a dto then it is derived from abstract EfGenericDto<TEntity, TDto>. This forces the dto definition to include the database class. See the DetailPostDto.cs as an example.
  3. The Update method calls DecodeToService<UpdateService>.CreateCorrectService<T> (see this class, first method). It gets a bit complicated in this class because of the sync/async versions of the DTOs, but the bottom line is it:
    1. Finds whether it is an EfGenericDto or not. If not then it assumes it is a direct access.
    2. If it inherits from EfGenericDto it finds the database class and the DTO class.


Part 2: Create the Generic Class from the runtime types

OK, the last process found whether it was a database class or a DTO class. There are different classes to handle these two cases. The direct update has one Generic Type parameter, the database class. The DTO version has two Generic Type parameters: the database class and the DTO class. However in the example below I look at a very simple case to make it easier to understand. GenericServices is a bit more complicated, but follows the same approach.

The code to create an instance of a Generic class is pretty straightforward. For example, if I wanted to create a List<string> at runtime I would:

  1. Produce an array of the type(s) needed to form the generic type, in this example ‘string’
  2. Get the generic type, in this example ‘List’
  3. Combine them using the ‘.MakeGenericType’ method
  4. Create an instance of that type using ‘Activator.CreateInstance’

The code below shows an example of creating ‘List<string>’ at runtime.

var dataType = new Type [] { typeof(string)};
var genericBase = typeof(List<>);
var combinedType = genericBase.MakeGenericType(dataType);
var listStringInstance = Activator.CreateInstance(combinedType);

Part 3: Calling methods from the created instance

You should be aware that the ‘Activator.CreateInstance’ method returns an object, so you can’t just call ‘listStringInstance.Add(“hello world”)’ as the compiler will give an error something like “‘object’ does not contain a definition for ‘Add’“. You have two choices:

1. Use Dynamic Type

You can place the output of ‘Activator.CreateInstance’ into a dynamic type (see the start of line 4 below). This turns off compile-time type checking, which allows you to call any method, or access any property, with the type checking done at runtime. So in our List<string> case it would look like this:

var dataType = new Type [] { typeof(string)};
var genericBase = typeof(List<>);
var combinedType = genericBase.MakeGenericType(dataType);
dynamic listStringInstance = Activator.CreateInstance(combinedType);
listStringInstance.Add("Hello World");

Dynamic is easy to use, and allows much more freedom. However the dynamic runtime library takes a lot of time on the first call of the method. See the Performance section below for a more detailed analysis.

2. Use Reflection

Reflection allows you to find methods and properties by name. You can then call the method or access the property via the appropriate Reflection methods. In our example we find the simple method ‘Add’ on the instance type (GetMethod would return null if no method of that name existed) and then ‘Invoke’ that method, i.e.

var dataType = new Type [] { typeof(string)};
var genericBase = typeof(List<>);
var combinedType = genericBase.MakeGenericType(dataType);
var listStringInstance = Activator.CreateInstance(combinedType);
var addMethod = listStringInstance.GetType().GetMethod("Add");
addMethod.Invoke(listStringInstance, new object[]{"Hello World"});

The reflection approach does have some complications, such as method or property accesses returning an object, which can be a problem if you need a typed result. Also, if the method you want to call is itself generic, you need to build the closed method using ‘MakeGenericMethod’. All in all, reflection is harder to handle than dynamic.
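As an aside, here is a minimal sketch (not from my libraries) of building and invoking a generic method via MakeGenericMethod; the Describe<T> helper is made up for the example:

using System;
using System.Reflection;

public static class Helper
{
    //a made-up generic helper method, only here to demonstrate MakeGenericMethod
    public static string Describe<T>(T item)
    {
        return typeof(T).Name + ": " + item;
    }
}

public static class MakeGenericMethodDemo
{
    public static string Run(Type runtimeType, object value)
    {
        //find the open generic method, close it over the runtime type, then invoke it
        MethodInfo openMethod = typeof(Helper).GetMethod("Describe");
        MethodInfo closedMethod = openMethod.MakeGenericMethod(runtimeType);
        return (string)closedMethod.Invoke(null, new[] { value });
    }
}

//usage: MakeGenericMethodDemo.Run(typeof(DateTime), DateTime.Now) returns "DateTime: ..."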

However the reflection approach has a tiny first-use cost compared to the dynamic approach, while for lots of subsequent calls to the method dynamic is quicker. See the Performance section below for a more detailed analysis.

Performance issues – simple types

The way you access the method or property does have a big effect on the performance of the command, as we are doing more work. I care about performance, so I have studied this in detail. There are two costs: a one-off first-use compute time and a per instance compute time.

Let me start with a simple table giving the times for my simple List<string> example. What I did was run the same test on three types:

  1. Normal code, i.e var list = new List<string>(); list.Add(“Hello World”);
  2. Using reflection, i.e. listStringInstance.GetType().GetMethod(“Add”);
  3. Using dynamic, i.e. listStringDynamicInstance.Add(“Hello World”);

To take out any first-use costs I ran it on List<string> twice, followed by List<DateTime> three times, as it did seem to change. The List<DateTime> is there to check if building a different type has the same first-use cost. Here are the results, which were measured using the ANTS Profiler.

BE WARNED: This table is a bit misleading as it implies reflection is always faster than dynamic – it isn’t. See the ‘Performance issues – GenericActions library’ section where a real-life example shows that dynamic wins in the end.

Type             Order                          Compiled     Reflection   dynamic
List<string>     First                          0.8 ms       37.08 ms     5.0 ms (was 600 ms)
List<string>     Second                         < 0.001 ms   17.05 ms     1.0 ms
List<string>     Third                          < 0.001 ms   0.01 ms      0.6 ms
List<DateTime>   First (after List<string>)     < 0.001 ms   0.03 ms      2.7 ms
List<DateTime>   Subsequent                     < 0.001 ms   0.03 ms      0.5 ms

 

When I first started testing I was getting a 600 ms first-use cost on the first dynamic method call, of which about 400 ms came from the Dynamic Language Runtime. However, once I installed .NET 4.5.2 on my Windows system this dropped to 5 ms. I cannot confirm why, as the release notes do not say there is a change in this area – maybe it was just reinstalling the .NET library. Anyway, be warned that things might have changed.

The way that .NET handles dynamic types is complex and I don’t claim to understand it properly. I would refer you to Simon Cooper’s series on ‘Inside DLR’ (Dynamic Language Runtime), starting with ‘Inside the DLR – Callsites‘ and then ‘Inside DLR – Invoking methods’, which describes the caching of methods.

Performance issues – GenericActions library

In GenericServices and GenericActions I have used the dynamic approach. Seeing the figures above made me wonder if that was the best way. I therefore forked a version of GenericActions and changed it to use a Reflection approach, with no use of dynamic anywhere. Here are my findings:

  1. When comparing the dynamic-based and the Reflection-based versions of GenericActions, the first-use costs of dynamic are much less significant than the table above suggests. The figures are:
    1. Reflection-based: 300 ms first-use cost, due to AutoMapper first-use costs.
    2. Dynamic-based: 700 ms first-use cost which, if you exclude the AutoMapper part, means the dynamic part is only 400 ms. I haven’t re-measured after installing .NET 4.5.2, but it now seems quicker.
  2. If a created instance is used many times then dynamic wins in the end. As an example, in my test system with 1000 accesses dynamic was between 120% and 200% faster than reflection (on my tests anyway).
  3. Dynamic is much easier to use. I found it really hard to do everything with reflection, mainly because you cannot cast to a type that is only known at runtime. My altered GenericActions version worked, but some of the error checking on incorrect types did not work any more. I might have fixed that, but it wasn’t worth it.

So, lets look at the different parts of the performance problem.

1. One-off first-use compute time costs

With dynamic there is a cost on the first decode and call of the method in my libraries that use dynamic – about 0.2 seconds or more on my system. As I explained earlier, this is because the first time you call a method on a dynamic type the call has to be set up, which takes some time. However the method is then cached, so later calls are very fast.

Any first-use performance cost is a pity, but I have other first-use time costs, like MVC, Entity Framework, AutoMapper etc., so I accepted it, mainly because I can mitigate it (see next section) and overall the dynamic approach is faster.

Mitigating first-use costs

One thing that really helps is keeping the application alive so that the first-use cost only happens when you start/restart your application. I mainly develop web applications and one great feature introduced in .NET 4.5.1 was ‘Suspend‘ on an ASP.NET web site.

On shared hosting (maybe other hosting as well, not sure) when a web site has no traffic, i.e. all user sessions have timed out, then it stops. This means for low-traffic sites the web application would stop and the next user coming to it would have to wait for it to start. The ‘Suspend’ feature keeps the application alive so that this does not happen.

Suspend support does depend on your hosting provider setting up the feature in IIS, so please read this post on it. My non-Azure shared hosting provider WebWiz supports this. Azure web sites don’t support ‘Suspend’, but address this with an ‘Always On’ feature which polls the site regularly enough that it stays up.

2. Per instance compute time

The whole decode and create takes about 0.0024 ms on my system, excluding first-use compute time. The actual creation of an instance of the generic type isn’t that costly (I haven’t measured it separately); it is the decoding of the types etc. that takes the time.

In the case of GenericServices the database call is so large, > 1 millisecond in all measured cases, that it is not worth trying to improve the decode/create time.

However in GenericActions, which may call a very simple business method that returns very quickly, it is worth looking at. I therefore implemented a caching system using a ConcurrentDictionary in my GenericActions library. It took quite a bit of tuning, but the effect was worthwhile, as it brought the worst case, which has multiple decode calls, down from 0.037 ms to 0.018 ms. These are small figures, but I think worth the effort to make calls to the business logic have a low performance cost.
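My caching code is private, but a minimal sketch of the general idea – caching the closed generic type, keyed on the runtime type, in a ConcurrentDictionary – might look like this (the names are made up for the example):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class GenericTypeCache
{
    //cache the result of the relatively expensive MakeGenericType call per item type
    private static readonly ConcurrentDictionary<Type, Type> ClosedTypes =
        new ConcurrentDictionary<Type, Type>();

    public static object CreateListOf(Type itemType)
    {
        //GetOrAdd only runs the factory when the type is not already cached,
        //so the MakeGenericType cost is paid once per item type
        var closedType = ClosedTypes.GetOrAdd(
            itemType, t => typeof(List<>).MakeGenericType(t));
        return Activator.CreateInstance(closedType);
    }
}

//usage: var list = (System.Collections.IList)GenericTypeCache.CreateListOf(typeof(string));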

Conclusion

I have described not only a way of creating a Generic Type from a runtime type, but a whole system for using complex Generic Types, yet still having type definitions that are simple to use and Dependency Injection friendly. I have also looked how you call the methods or access the properties in the runtime created type, focusing on ease of use and detailed performance figures. I hope that is useful to people.

Happy coding!

Architecture of Business Layer working with Entity Framework

Last Updated: July 31, 2020 | Created: January 27, 2015

Jonathan Crosby asked a question on one of my GenericServices  GitHub projects about how to handle a Business Layer. I thought it was a good question so I have written an article to detail my approach to building business layers that use Entity Framework.

UPDATE 2016: See new, improved approach here

I have revisited this topic and refined my approach to Business Logic using EF. I suggest you read the new article called Architecture of Business Layer working with Entity Framework (Core and v6) – revisited.

UPDATE 2017: Book Entity Framework Core in Action

I have been commissioned by Manning Publications to write the book Entity Framework Core in Action, in which chapter 4 is all about business logic in an Entity Framework Core environment (but the ideas are applicable to EF 6 too).

This original article is old, but I have kept it because of all the comments.

What is the Business Layer?

GenericServices four layer design

The diagram shows the type of layered application I normally build. This article is talking about the Business Layer (orange in the diagram), which is called the “Domain Model” in Martin Fowler’s Service Layer diagram.

The Business Layer is the place where all the business/domain logic, i.e. the rules that are particular to the problem that the application was built to handle, lives. This might be salary calculations, data analysis modelling, or a workflow such as passing an order through different stages.

You can get a more in-depth coverage of the overall architecture in this  Simple-Talk article.

Aside: While having all the business rules in the Business Layer is something we should always strive for, I find that in practice some issues go outside the Business Layer for good reasons. For instance, validation of data often flows up to the Presentation Layer so that the user gets early feedback. It may also flow down to the database, which checks data written to the database, to ensure database integrity. Other business logic appears in the Presentation Layer, like not allowing users to buy unless they provide a credit card, but the status should be controlled by the Business Layer. Part of the skill is to decide whether what you are doing is justifiable or will come back and bite you later! I often get that wrong, but hopefully I learn from my mistakes.

My philosophy on the Business Layer

I have written numerous applications for predictive modelling of various healthcare issues, all of which have some quite complex business logic. Over the years I have tried different ways of organising these applications and I have come up with one key philosophy – that “The Business Layer is King“, i.e. its design and needs drive everything else.

I am a big fan of Domain-Driven Design (DDD) which has the same philosophy and provides some approaches to help keep the business logic at the forefront of any design. Here are some of the approaches from DDD that drive my designs.

1. The Business Layer defines the data structures

The problem we are trying to solve, often called the “Domain Model”, is the heart of the problem. These can be complex so the core data structures should be defined by, and solely focused on the business problem. How it is stored and how it is viewed are secondary issues.

2. The Business Layer should find persistence of data simple

While DDD accepts that the way data is stored/persisted will have an effect on the design (see Eric Evans’ book, page 159), the aim is to allow Business Layer code to treat the database as almost an in-memory collection. Entity Framework really helps with this.

The architecture of my Business Layer

Based on the philosophy listed above the implementation I use has the following characteristics.

1. I use Adapters widely

As I said above the Business Layer is in control of the data structures – what makes the most sense for the Business Layer is what it does. This means it normally deals in data classes and/or entity keys (DDD term for the primary keys of data/entity classes).

This means the data needed/produced by the business logic is often not in the right format for communicating to the user and/or other external APIs. I therefore use Adapters, i.e. something that transforms the data to/from what the business logic needs. The Service Layer does this for the Business-to-Presentation layer communication, while external services are adapted inside the Data Layer.
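As a sketch of what I mean by an adapter, here is a deliberately simple, hypothetical example; none of these classes come from my libraries, they just show the Service Layer translating between business-layer and presentation-layer shapes:

//hypothetical types, just to illustrate the adapter idea
public class BizOrderIn   { public int CustomerId; public int[] ProductIds; }
public class BizOrderOut  { public int OrderId; public decimal Total; }

public class OrderRequestDto { public int CustomerId { get; set; } public int[] ProductIds { get; set; } }
public class OrderResultDto  { public int OrderId { get; set; } public string TotalForDisplay { get; set; } }

//the adapter lives in the Service Layer: the business logic never sees the DTO shapes
public class OrderAdapter
{
    public BizOrderIn ToBizInput(OrderRequestDto dto)
    {
        return new BizOrderIn { CustomerId = dto.CustomerId, ProductIds = dto.ProductIds };
    }

    public OrderResultDto ToPresentation(BizOrderOut bizResult)
    {
        return new OrderResultDto
        {
            OrderId = bizResult.OrderId,
            TotalForDisplay = bizResult.Total.ToString("C")   //formatting is a presentation concern
        };
    }
}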

Further reading:

  • See Alistair Cockburn’s Hexagonal/Adapter-Port architecture . My applications are mainly in this style. I use Dependency Injection to link layers.
  • DDD call these anti-corruption layer – See Eric Evans’ talk on DDD in which he talks about different ways to adapt across Bounded Contexts (whole talk is great. Skip to 20mins in for part on linking bounded contexts).

2. The Business Layer uses EF directly

My early applications used the repository pattern and UnitOfWork pattern. However as EF has improved, and I have learnt more I have moved over to using EF directly in the Business Layer.

I found that the repository pattern in the end got in the way and made life much harder. Going for direct EF access allows the developer to use all the features of EF and avoids some of the rather tortuous approaches I used to need with repositories.

You can read more about this in my two blog posts:

  1. Is the Repository pattern useful with Entity Framework?
  2. Is the Repository pattern useful with Entity Framework? – part 2

3. The Business Layer does not do the final data save.

This is subtle, but I have found this very helpful. I don’t want the Business Layer to really know about saving data. I can’t totally ignore the data access code, in my case Entity Framework (EF), in the Business Layer, but I do minimise it.

Therefore Business Layer methods add/insert new data into the in-memory data classes or simply change any data loaded from the database. However the Business Layer never calls EF’s SaveChanges. That is done in the Service Layer that called it (there is a small sketch of this after the note below).

Why do I find this useful? There are a number of reasons:

  • The Business Layer does not have to handle database validation errors, which can occur when ‘SaveChanges’ is called. The Service Layer does that, which has more information available to it on what is going on.
  • I know that all the data will be saved in one transaction. It is either all there or it’s not.
  • If I need to chain multiple business methods I can use a transaction scope to rollback changes if one of the later methods fails.
  • The Service Layer can check other things to decide whether the data should be saved. I use this to allow users to save/not save on warnings.
  • It makes the Unit Testing of the Business Logic much easier as the database has not been changed. Most of my Business Logic has a property holding the data structures about to be written, so I can just check that.

Note: The discarding of data by not calling ‘SaveChanges’ only works in situation where each call has its own DbContext. This is the state in a web application as each HTTP request gets a new DbContext.
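A minimal sketch of that split, with made-up class names (my real code goes through GenericActions, but the shape is the same, and MyDbContext stands for your EF DbContext): the business method only adds/changes tracked entities, and the Service Layer decides whether SaveChanges() is called.

//a tiny hypothetical status type, standing in for my ISuccessOrErrors interface
public class Status
{
    public bool IsValid { get; private set; }
    public Status(bool isValid) { IsValid = isValid; }
}

//hypothetical business method: it adds/changes entities on the DbContext but never saves
public class AllocateStockAction
{
    private readonly MyDbContext _db;
    public AllocateStockAction(MyDbContext db) { _db = db; }

    public Status Run(int orderId)
    {
        //... business rules here that add to / change tracked entities on _db ...
        return new Status(true);
    }
}

//the Service Layer owns the decision to persist
public class AllocateStockService
{
    private readonly MyDbContext _db;
    private readonly AllocateStockAction _action;

    public AllocateStockService(MyDbContext db, AllocateStockAction action)
    {
        _db = db;
        _action = action;
    }

    public Status Allocate(int orderId)
    {
        var status = _action.Run(orderId);
        if (status.IsValid)
            _db.SaveChanges();   //only persist when the business logic succeeded
        return status;           //on an error the uncommitted writes are simply discarded
    }
}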

My implementation: GenericActions library

I have a library called GenericActions which has a very similar interface to the GenericServices library. Note: This isn’t an open-source library, but is part of my proprietary Rapid-Build™ library set that I use when building applications for clients.

Like GenericServices GenericActions can either work directly with the business class or more usually via a Data Transfer Object (DTO) – See Why DTO? in the GenericServices documentation for more on DTOs. It uses DI to pull in a Business Layer class, hence making sure the Business Class can pick up any services it needs through constructor injection.

A typical call of a business method at the MVC controller level looks very much like a GenericServices call.

[HttpPost]
[ValidateAntiForgeryToken]
public ActionResult SetupGrader(GraderDto dto,
            IActionService<IBizGraderSetup> service)
{
    if (!ModelState.IsValid)
        //model errors so return immediately
        return View(dto);

    var response = service.RunAction(dto);
    if (response.IsValid)
    {
        TempData["message"] = response.SuccessMessage;
        return RedirectToAction("Index");
    }

    //else errors, so copy the errors over to the ModelState and return to the view
    response.CopyErrorsToModelState(ModelState, dto);
    return View(dto);
}

The main difference is the definition of the service (see line 4 above). The service contains the interface of the Business Layer class that contains the method it needs to call. DI is used to inject the right class into the service.

In the case above the action only returned a status to say if it was successful or not. However other Business Layer methods may need to return data, so there is another type of call that returns a class. This class can either be the class result produced by the business class or another DTO to do the conversion.

You can read about the way I implemented the GenericActions (and GenericServices) library in my new post, Using .NET Generics with a type derived at runtime.

Conclusions

Because GenericServices is focused on the database-to/from-Presentation Layer communication I have not previously described my view of the Business Layer. Hopefully this article fills in that gap by describing the whys and wherefores of the Business Layer in my architectural view of applications.

Happy coding!

Announcing NuGet release of GenericServices

Last Updated: July 31, 2020 | Created: January 22, 2015

I am pleased to say that I have finally released a version of the GenericServices on NuGet!

For those who are not aware of GenericServices, it is an open-source .NET library designed to help developers build a service layer, i.e. a layer that acts as a facade/adapter between your database in the data layer and your User Interface or HTTP service.

GenericServices makes heavy use of Entity Framework 6 (EF6) and .NET 4.5’s async/await commands. Its aim is to make the creation of the service layer simple while providing robust implementations of standard database access commands. You can get a summary of its features from the Why GenericServices? page on the GenericServices documentation site or the README page on the GenericServices GitHub home page.

Link to NuGet Version

The library is known as GenericServices on NuGet, so simply search with that name in the ‘Manage NuGet Packages’ in Visual Studio. You can also find it on NuGet at https://www.nuget.org/packages/GenericServices/ .

Why have you released on NuGet now?

I did not feel confident to release GenericServices onto NuGet, where it is likely to get much more use, until I had good documentation and examples. In the past two weeks I have:

  1. I have written quite a bit of documentation and added a useful index in the sidebar. Do have a look at the documentation in the GenericServices’ Wiki.
  2. I have made available a much more advanced example called SampleMvcWebAppComplex – an ASP.NET MVC5 web site built as a ‘stress test’ of how well the GenericServices library and Microsoft’s Entity Framework V6 could cope with an existing SQL database, AdventureWorksLT2012.
    1. The example web site is at http://complex.samplemvcwebapp.net/.
    2. The source code is available as a reference source. Note: this project is not fully open-source because it uses a paid-for library, Kendo UI MVC.
  3. There are two articles I wrote on the Simple-Talk web site that describe in detail the process I went through to build SampleMvcWebAppComplex. They are:

I still have the original, more basic example web site called SampleMvcWebApp. You can access this via:

  1. The live example web site is at http://samplemvcwebapp.net/
  2. The source code, which is fully open-source, is available too.

Happy coding!

What makes a good software library?

Last Updated: July 31, 2020 | Created: January 5, 2015

If you are a professional software developer then you will have used lots of software libraries/frameworks to help you build an application. In fact, nowadays one of the key skills a developer needs is picking the right software libraries to use in a given situation. I also expect most development teams have built their own software libraries to help them with some specific aspect of the work they do.

In this article I explore what I think makes a good, even great, software library. I would appreciate your thoughts on this so do add comments at the end.

NOTE: By software libraries I mean a software subsystem with some sort of Application Programming Interface (API). In the Microsoft/.NET approach software libraries are normally a Class Library with a known API, which are downloaded as a .dll file or most likely using NuGet. JavaScript libraries are normally a series of JavaScript files, possibly provided by Bower or Node.js npm, that again have some sort of API.

What motivated me to write this article?

I am building a number of .NET software libraries, one of which is an open-source project called GenericServices. Recently I built a small, but fairly complex test application to check out GenericServices and various other supporting libraries. This ‘stress test’ was really interesting in ways I did not expect (you can read more about what I did in this article).

In writing the code I found some really complex uses ‘just worked’ – in fact in a couple of cases I wrote Unit Tests that I thought should fail but they passed first time, which really threw me for a while. In other parts I had a problem in someone else’s software library and, even when looking at their code, I hadn’t a clue how to fix the problem.

All of this got me thinking about “what makes a good software library?” This is especially important to me as I want my libraries to be easy to use by anybody. I have therefore been thinking about this for some months and I thought I would write down my thoughts.

Categorising software libraries

To help me in considering this issue I came up with categories for the types of software libraries I have come across. This helped me work out what I thought worked well and what didn’t, and more importantly why the library was like that.

Over a few months I came up with six categorisations for software libraries. They are:

1. Normal software libraries

These are the normal, run of the mill software libraries. Their APIs are predictable and maybe a bit boring, but they get the job done. However they are great for simple problems and can save you time.

I would say that some loggers, like Log4Net, are like this. Log4Net is a fine package and it does a great job. I understand the problem of logging and I could write it myself, but this library has already implemented a great solution so I use that. Job done.

2. Verbose software libraries

These are also run-of-the-mill software libraries. They work, but their APIs might be a bit long-winded or verbose.

One example of a library I use that has some aspects of this is Kendo UI. I pay good money to use this library because of what it provides, but the API can be hard work. It uses a jQuery-type approach where everything is configurable. That can be useful, but in Kendo UI there can be lots of things that need to be configured. For instance, for Charts there are over a thousand configuration options, which makes finding the right one quite hard work. In contrast, NVD3.js charting has separate methods for each chart type, which exposes only the configurations needed by that chart.

3. Clever software libraries

These are software libraries that do really clever things, but as a developer you really need to know what you are doing. You get comments like “the trick is…”

While not technically a software library, I think SQL (T-SQL in this case) is an example of a ‘clever’ library/API. T-SQL is the language for talking to a SQL Server, which is a form of API, sometimes called a Domain Specific Language. Using SQL I have built some amazing queries which perform complex modelling really quickly, but it is NOT simple. That is because the problem isn’t simple and the SQL API doesn’t try to hide it from you.

4. ‘Magic’ software libraries

These are the software libraries that do things that are useful, but they are so intricate that it is hard to understand what they are doing – I call these ‘magic’ software libraries. If they are written well they can be great, but if the underlying problem is complex then they are hard to write.

I think Entity Framework (EF) is a really good example of a good ‘magic’ software library. EF is a database front-end library, known as an ORM, which tries to hide the complexities of the T-SQL API mentioned above to make it easier to work with databases. It is a great library, but it does take a while to really understand EF because there is so much ‘magic’ going on inside it.

5. ‘Mysterious’ software libraries

These are software libraries that are really hard to understand. This could simply be because it’s a very complex API. However, ‘magic’ software libraries can often become ‘mysterious’ when their standard ‘magic’ doesn’t work.

Entity Framework (EF) can suffer from this in places. For instance, if you have a many-to-many relationship in a database then EF inserts an extra database table which it looks after, to make the relationship really easy to use. The problem is this ‘magic’ has some limitations and when it doesn’t work you need to know how to fix it. In actual fact the link to Updating a many to many relationship in entity framework on my site gets the highest number of hits of all of my posts.

6. Elegant software libraries

These are software libraries that are a joy to use. They may do simple or clever things, but they are a) short and to the point and b) clear about what is happening. This is what all of us strive to achieve when we write a software library.

One example of this is Microsoft’s LINQ. LINQ is a data query language which I think is really elegant. Yes, there are a few complicated bits in it, like Grouping, but most of the time it’s easy to string together a series of commands to achieve quite complex things. It is clean, clear and only mentions the things you need.

NOTE: I haven’t included a ‘must work’ category as I take that as a prerequisite. However some libraries, especially ones that have not been used much, can be suspect. Caveat emptor.

Having finished these software categories let us look at some particular issues that we should consider when building software libraries.

Think about how the developer will use your library

Here are some ideas to make you as the library writer think about the experience the end-user developer has when using your library.

1. Think from the user’s perspective

As library developers we tend to be focused on the implementation, not on the user. That makes the library writer think about what he/she wants to achieve, not how the user wants to use it. Let me give you an example from my own experience of developing my GenericServices library.

When I started development of GenericServices I saw that the access to the database and the calling of business logic had some similarities. I therefore ‘saved time’ by writing code that handled both. Problem is the code was a little bit complex and a little bit opaque to the end user.

This became apparent when I started to use it. Even I might hesitate for a second or two about a certain term, so what would another developer feel like? I therefore split the two libraries, with GenericServices focused on the core problem of database access. Yes, there is some duplication, but it is much clearer to the user/developer and the two libraries are much simpler.

2. Focus on the most important use-cases

A Use Case is a term to define how a user, in this case a software developer, will use your tools, in this case the API of your software library, to achieve a specific action. I found it very useful to order the use cases.

The most obvious way to order them is on how often a particular use case is likely to be used. Sometimes there isn’t this sort of split, but often there is – sometimes called the 80/20 rule. If you do have this split then you need to make the API for the ‘normal’ (80%) cases short and easy. The ‘special’ (20%) use cases still need to be supported, but their API can have a few more lines of code to achieve the result, so that the ‘normal’ use cases stay simple. This can make a big difference to how many lines of code a developer will need to write in a complete project.

In the case of my GenericServices library I had another ordering based on the architecture used in the data layer: a CRUD design or a Domain-Driven Design (DDD) design. These two design approaches have very different needs when it comes to adding or changing data. DDD is more difficult to support because the data layer imposes some rules that GenericServices has to abide by. I managed to support DDD while having no impact on the simpler CRUD usages, which was great. Maybe you have another perspective/ordering that will help you focus your API.

3. Principle of least astonishment

I came across the term ‘principle of least astonishment’ in one of Addy Osmani’s blogs, although he didn’t coin the term. Addy explains it as follows:

Try not to violate the principle of least astonishment, i.e. the users of your component API shouldn’t be surprised by behaviour. Hold this true and you’ll have happier users and a happier team.

I really like this term. It sums up the need for the API to be a) obvious, b) consistent and c) in tune with what developers expect from a library.
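As a small, made-up example of staying ‘in tune’ with what .NET developers expect: the standard libraries use the Try-Parse pattern (return false rather than throw when the input cannot be handled), so a library method that follows the same convention will not surprise anyone.

```csharp
// Made-up example: follow the .NET Try-Parse convention rather than
// surprising callers with an exception or a silent null.
public static class SkuParser
{
    public static bool TryParse(string text, out int skuNumber)
    {
        skuNumber = 0;
        if (string.IsNullOrWhiteSpace(text) || !text.StartsWith("SKU-"))
            return false;

        return int.TryParse(text.Substring(4), out skuNumber);
    }
}
```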

To that end, if you are developing for the Microsoft .NET arena I would recommend the book ‘Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries’. This book is packed with well-researched guidelines that will make your .NET library look and feel like the standard .NET libraries.

NOTE: This book was written in 2008, so it does not cover LINQ lambdas or the newer async/await, both of which I think change some of the guidelines. However I still recommend it.

Complex problems demand a lot of thought

If your problem is complex then you have to be really careful when contemplating writing a good library. What a library normally does is provide a set of commands (an API) that help you do a job more quickly than if you wrote the code yourself. This often requires you to ‘abstract’ the underlying problem into that set of commands.

With small problems like a logger the abstraction isn’t that deep, but the more complex the problem, the more you are abstracting it. The problems come if your abstraction is not perfect – what is called ‘a leaky abstraction’. In his post ‘The Law of Leaky Abstractions‘ Joel Spolsky talks about abstractions. He says:

All non-trivial abstractions, to some degree, are leaky.

Abstractions fail. Sometimes a little, sometimes a lot. There’s leakage. Things go wrong. It happens all over the place when you have abstractions.

Get the abstraction right and you have a ‘magic’, possibly Elegant design. Get it wrong and you can have a ‘Mysterious’ software library that people will give up on. Here are a few ideas I have on helping with this problem:

  • Include ‘get out of jail’ access points
    By this I mean the library tries to abstract things but still provides direct access to the underlying system. Entity Framework does this by including access to direct SQL commands for the times when its abstraction gets in the way (see the sketch after this list).
  • Don’t bite off more than you can chew
    Sometimes a complex problem can be broken down into sub-problems. This allows you to build a few smaller, focused libraries rather than one big complex library. I have used this a few times and found it has worked well for me.
  • If all else fails, don’t abstract too much
    The T-SQL mentioned earlier does this. It does not abstract the problem much at all, which leaves it to the developer to learn the complexities of the problem space. That way it minimises the leakiness of the abstraction.
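To illustrate the first bullet, here is a minimal EF6-style sketch of that ‘get out of jail’ route. The table, column and class names are made up; the point is simply that `Database.SqlQuery<T>` lets you drop below the abstraction when you need to.

```csharp
using System.Collections.Generic;
using System.Data.Entity;   // EF6
using System.Linq;

// Simple class to receive the raw SQL result (made-up shape)
public class YearlySales
{
    public int Year { get; set; }
    public decimal Total { get; set; }
}

public class ShopContext : DbContext
{
    // the normal, abstracted DbSet<> properties would live here
}

public static class ReportQueries
{
    public static List<YearlySales> GetYearlySales(ShopContext db)
    {
        // When the LINQ abstraction gets in the way, EF6 lets you run raw SQL
        // and map the rows back onto a plain class
        return db.Database
            .SqlQuery<YearlySales>(
                "SELECT YEAR(OrderDate) AS [Year], SUM(Total) AS [Total] " +
                "FROM dbo.Orders GROUP BY YEAR(OrderDate)")
            .ToList();
    }
}
```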

Striving for Elegance

Many years ago I read a brilliant book by Eric Evans called Domain-Driven Design. In it Eric devotes a whole section to “Refactoring toward deeper insight”. His message is that we are used to refactoring to make the software better, but what about refactoring to improve what the software does? My experience is that a good software library comes from a series of “deeper insights” followed by some key refactoring.

The development of my GenericServices library proceeded through a mixture of hard work and “deeper insights”. Below is a list of the “deeper insight” moments that I think improved the library.

  1. The service layer has a lot of nearly duplicate code. I could build a library to do that.
  2. Generics would be a really good approach to use.
  3. The class signatures are hard to use, especially with dependency injection. I need to make them simpler.
  4. Having database and business logic together is confusing. I should split them.
  5. The stress test (see next section below) has shown up some issues I need to improve.
  6. I need computed parameters. Let’s use DelegateDecompiler to do that (see the sketch after this list).
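To give a flavour of point 6, here is a minimal sketch of the DelegateDecompiler approach (the class and property names are made up): mark the computed property with [Computed] and call .Decompile() on the query so it can still be translated into SQL.

```csharp
using System.Linq;
using DelegateDecompiler;   // the NuGet package mentioned in point 6

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // EF cannot normally translate this computed property into SQL...
    [Computed]
    public string FullName
    {
        get { return FirstName + " " + LastName; }
    }
}

public static class CustomerQueries
{
    public static IQueryable<string> FullNames(IQueryable<Customer> customers)
    {
        // ...but .Decompile() rewrites the [Computed] property into an
        // expression that the database provider can understand
        return customers
            .Select(c => c.FullName)
            .Decompile();
    }
}
```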

I already have my eyes on a few more changes:

  1. I could simplify the class signatures even further.
  2. Domain-Driven Design databases are important to me. I could improve the support for that.

Of course I would love to have been able to go straight to the final design, but I just don’t think I could have done that. You have to live with a library and see it being used to find out what is wrong. In fact my eldest son Ben, who is a good programmer in his own right, says “Build one to throw away”. It does seem like you need to build something to see what is wrong before you can make it better – well I do at least.

Stress your library to find the pinch points

I built hundreds of Unit Tests and even built a whole demo web site to show GenericServices in action. This weeded out lots of issues, but because I wrote the code the testing was subconsciously biased towards the GenericServices way of doing things.

Normally the ‘stress test’ is a real project of some size. However I didn’t have a big project on the go at that stage, so I decided to stress test GenericServices myself by using someone else’s database. I used the Microsoft AdventureWorks2012Lite database, which is a) not designed by me and b) has some database features I had not come across before. That really pushed my library in ways I had not pushed it before, and revealed some areas that needed more work. I recommend anyone who is trying to perfect a software library to ‘stress test’ it in the same way.

The users of the library will test it, but not as well as you can, and they aren’t likely to give feedback if they simply find it too hard. For an investment of only two weeks I could really dig into what worked and what didn’t, and I think it radically improved my library. Maybe you can stress your library on a real project you are working on, but allow time to tweak the library.

You can read about some of the things I found in the article called ‘Using Entity Framework with an Existing Database: User Interface’ on the Simple-Talk web site.

The end

I hope you have enjoyed reading this article, and I expect you have your own views and experiences. I would love to hear them, as it is only by sharing ideas that we get better at what we do.