Entity Framework: working with an existing database – Part 2, MVC

Quick Summary
This two-part article looks at the software issues raised when presented with a database that was designed and built prior to software development. Part 1 explores the low-level problems of connecting to that database using Entity Framework (EF). Part 2 (this article) looks at the high-level problems of transforming the EF data into a form suitable for display via an ASP.NET MVC web site.

Part 2: ASP.NET MVC and Entity Framework with an existing database

  1. PART 1. The best way to get Microsoft’s Entity Framework version 6 (EF) data access tool to import and access the database.
  2. PART 2: (this article) This looks at the higher level functions is needed to transform that data into a user-focused display. In this case using the GenericServices open-source library to build a service/application layer to connect to modern web front end using ASP.NET MVC5.

Introduction

There are many examples of how to build ASP.NET MVC (MVC) web applications on the web. However in order to get the ideas across they often pick rather simplistic database schemes, as I did myself when I build the SampleMvcWebApp. The problem with these examples is that they do not deal with the more complex issues found in real world systems.

I wanted to stress-test some of my own libraries by building a MVC application with a database that has a more realistic level of complexity. For this I chose Microsoft’s sample AdventureWorksLT2012 database, which is a cut-down version of the larger AdventureWorks OLTP database. The AdventureWorks database is for a fictitious multinational manufacturing company producing and selling pedal cycles. (see pdf of database diagram).

In the first part of this article I looked at the problems of getting Entity Framework (EF) to work with existing database like AdventureWorks. In this second article I look at how to build a MVC web application that provided a comprehensive user interface as would be used by an employee of AdventureWorks. In doing this I try to pull out some of the key design and implementation issues that software developers face when working on real-life business systems.

The issues faced in building a web application

One of the fundamental issues that we face in building a complex web application is that what the business data held in the database is often not in the right format to display directly to the user. Let me give you a simple example from AdventureWorks.

I wanted to put up a list of existing companies that had bought from AdventureWorks before. This needed to quickly summarise who they were and give an idea of what level of purchasing they had done in the past. To do that I came up with following list as shown below:

Customer list

Now, to produce this I had to:

  1. Get the company name from the Customer table – easy
  2. Get the contact name. That needed me to combine the following properties from the Customer table:
    • Title, which can be null
    • FirstName
    • MiddleName, which can be null
    • LastName
    • Suffix, which could be null
  3. For the Total Sales Value, I needed to sum the TotalDue property in the SalesOrderHeader table.

On top of that I want to:

  1. Have a pre-filter to only show customers who had bought before in this list.
  2. Be able to search by a company or contact name.
  3. Sort on any column.
  4. Be able to page the data, i.e. only show a selection of say 10 or 20 customers at a time and be able to step forwards/backwards through the list. This is because there are hundreds of customers.

This is a very typical requirement and it happens hundreds of times in a reasonably sized application. The architecture of the system need careful thought as to how to provide these capabilities with a) good performance and b) with the minimum of development effort. That is a challenge.

The power of LINQ with Entity Framework

Before I go onto the solution of this challenge I want to delve into the two technologies that will form the cornerstone of my solution, that is LINQ/IQueryable and Microsoft’s data access technology, Entity Framework (EF).

Note: for those of you who are familiar with LINQ and EF you can skip to the next section, ‘Separation of Concerns and Layered Architecture’.

a. LINQ/IQueryable

LINQ (LINQ stands for Language-Integrated Query) is a way of writing queries on data. The important point is that any query written in LINQ is held as a query and only run when certain commands, like .ToList() are applied to that query. This allows software developers to write part of a filter/select command that can be added/extended later anywhere in the application that understands IQueryable.

Let me take a simple example using a subset of Customer List problem posed above. First let us select the customers who have bought from us before:

var realCustomers = Customers
                    .Where (c => c.SalesOrderHeaders.Any());

Then I select the CompanyName and form the ContactName from this filtered list:

var details = realCustomers.Select( x => new
   {
       CompanyName = x.CompanyName,
       ContactName = x.Title + " " + x.FirstName + " " +
                     x.MiddleName + " " + x.LastName +
                     " " + x.Suffix
   });

This returns an IQueryable<> class, but it has not been ‘realised’, i.e. it is just a command and has not accessed the database yet. This means I can add the following filter commands onto the query:

var filteredQuery = details.Where (x => x.CompanyName.Contains("Bike"));

Finally I might add something to add paging on to that query, e.g.

var finalQuery = filteredQuery.OrderBy(x => x.CompanyName)
                 .Skip(pageSize*pageNumber).Take(pageSize);

Hold that finalQuery as we step onto the next part of the solution which is…

b. Entity Framework

ListResultingFromFinalQueryEF is a database access framework that works by converting LINQ queries into SQL Server database commands, i.e. T-SQL code, and it does it very well. So, if we set pageSize to 5 and the pageNumber to 0 and then realise the finalQuery mentioned previously by putting a .ToList() on the end we get the results on the right.

EF has converted the finalQuery into T-SQL and applied it to the database. If we look at the final SQL command then we will see that it has:

  1. Only selected the columns that it needs.
  2. The concatenation of the names has been turned into a T-SQL string addition, including replacing null items with an empty string.
  3. The two .Where()’s, one looking for customers who have bought and the other looking for company names which contain “Bike”, are combined into a single SQL WHERE clause.
  4. The SQL command orders the data by CompanyName and then does a Skip + Take to return only the data that was requested.

For those who are interested here is the SQL command (text file) that EF produced from these concatenated queries.

Separation of Concerns and Layered Architecture

There is a software approach, almost a rule, called Separation of Concerns. In a Microsoft document it defines this term as follows:

Separation of Concerns: Divide your application into distinct features with as little overlap in functionality as possible. The important factor is minimization of interaction points to achieve high cohesion and low coupling…
(Microsoft, Software Architecture and Design 2009, Key Principles of Software Architecture)

Applying Separation of Concerns with what we have learnt about LINQ and EF I have separated the applications into Layers (see Multilayered architecture). In .NET these are separate projects/assemblies but are bound together into one application, i.e. direct method calls between layers. (Note: In today’s cloud system, such as Azure, scaling up is done by having the whole application running multiple times rather than the previous approach of having each layer running on a separate server). My layers are:

  1. Data Layer: sometimes called the Infrastructure Layer, or Persistence Layer.
    This deals with the reading and writing data to the database. It should provide IQueryable versions of each table/class for reading. Depending on the software design used it will either allow direct write/update of a database class, or if using Domain-Driven Design may impose some factories/methods that need to be called to accomplish this.
  2. Business Layer:
    This will contain all the business logic that goes beyond simple read/write of data. For instance in the AdventureWorks system a new order starts in the ‘Pending’ state and needs to be ‘Approved’ before it can be taken forward. That sort of logic should live in this Layer and it will directly interact with the Data Layer.
  3. Service Layer: sometimes called the Application layer.
    This section’s job is to format the data flowing from the Data Layer and the Business Layer to/from the format that the Presentation Layer It may need specialised handling for changing the data in the Data section if a Domain-Driven Design approach is used.
  4. Presentation Layer: sometimes called the UI (User Interface) Layer.
    This section’s job is to present the formatted data from the Service Layer to either a user via a web page, to another system via say a RESTful interface. This section will also handle any paging, filtering, sorting etc. as required.

As you can see I have apportioned different parts of the query handling to different layers, i.e. queries/updates can be passed up the system and modified as it moves through the layers.

Before I leave this section I should say this is one of the possible ways in which a system could be architected. I recommend the book ‘Microsoft .NET: Architecting Applications for the Enterprise’ for an excellent discussion of other approaches.

The Service Layer is key

My experience is that the Service Layer is one of the most useful layers because it totally separates the data model, i.e. the key data classes/tables that we have in the system, from the presentation layer.

Why is the service layer so useful – because in any reasonably complex application the data model should be defined by the business process/data and not by how the user wants to see the data. By adding a service layer to ‘adapt’ the data we keep the data model focused solely on the business needs and do not allow it to be contaminated by features that are only used for display.

Obviously we need another class more attuned to the Presentation Layer’s needs. These are often called Data Transfer Objects (DTOs). In my system the Service Layer methods do two things:

  1. They take on the work of reformatting the data in the Data Layer or Business Layer to/from a DTO, which holds the data in format that is suitable for the Presentation Layer.
  2. They call the correct commands in the Data Layer or Business Layer to execute certain commands. The most obvious are the Create, Update or Delete commands to the Data Layer to update data held in the database.

The only problem is – the Service Layer has a lot of really repetitive code, and boring to write! So I set about fixing that with a library called GenericServices.

GenericServices and how it solves certain problems

GenericServices uses C# Generics to provide a set of CRUD (Create, Read, Update and Delete) services to the database. The developer only needs to write a DTO which inherits from a GenericServices’ template DTO to gain access to GenericServices CRUD methods which includes intelligent copying of data layer classes to/from DTOs.

You might not want to use GenericServices but you might find it useful to understand how it solves certain problems when working with the AdventureWorksLt2012 database. Note: If you do want to look at GenericServices, which is an open-source project, then you can find out more about GenericServices on GitHub or via the SampleMvcWebApp site.

In the next four sections I describe various issues that I came across in my design and how I solved them in GenericServices.

1. Copying properties to/from Data Layer classes and DTOs using LINQ

I think I have made a case for using DTOs – they reduce the data required and allow data from various relationships to be pulled together into one place (called flattening). I am also saying that using LINQ commands to select the data means the EF can make a very efficient database access, only pulling in the specific cells needed for the Presentation Layer.

In my early projects I hand-coded the data to DTO and DTO to data code. This worked but was incredibly repetitive, boring and needed Unit Tests to check I hand not fallen asleep during the process. I was committed to automating this process, which I have achieved in GenericServices.

GenericServices used AutoMapper for the task of data<->DTO copying. AutoMapper is an excellent, conversion based mapper that can handle some complex mappings between two classes, i.e. the data class and the DTO. I had come across AutoMapper years ago, but had rejected it because I did not support LINQ commands. However now it does support LINQ projections, which means it converts the copy into a LINQ expression which EF then uses to make an efficient SQL query.

So, my solution was to put AutoMapper at the core of my data<->DTO copying process and I have not been disappointed. If you have a similar need to copy data between similar classes then you should look at the AutoMapper Wiki site and Jimmy Bogard’s AutoMapper blogs for newer information.

2. Adding new calculated properties for display or editing

The other issue you saw in the Customer List example at the start was that I needed to come up with some new, calculated values to use to help display the customer list. In early projects I simply hand-coded these using LINQ expressions. Getting a solution to this was a little harder.

Just to recap, from the earlier example the Customer list needs three calculated properties:

  1. HasBoughtBefore, which should be true if the customer has any sales orders
  2. FullName, which combines the parts of the name to make something sensible for the user.
  3. TotalAllOrders, which sums all the order values to give total.

I initially used custom mapping in AutoMapper, where I basically inserted my hand-coded LINQ expressions. This works but Jimmy Bogard’s AutoMapper blog put me onto a very elegant solution called DelegateDecompiler which turns calculated properties into LINQ expressions. That may not make sense so let me give an example.

If I added a calculated property to the EF data class as shown below then using it in a LINQ expression would throw an exception, as it cannot handle properties that have calculations in them.

public bool HasBoughtBefore { get { return SalesOrderHeaders.Any(); } }

What DelegateDecompiler does is allows you to mark such properties with a [Computed] attribute and in your query you add a call to .Decompile(). This then turns the calculated properties into a LINQ expression, which means EF can then execute that code.

So in my data layer I can add the following partial class to the Customer data class to add two of the three properties I need.

public partial class Customer : IModifiedEntity
{
    [Computed]
    public string FullName { get { return Title + " " + FirstName
                                   + " " + LastName + " " + Suffix; } }

    /// <summary>
    /// This is true if a 'Customer' has bought anything before
    /// </summary>
    [Computed]
    public bool HasBoughtBefore { get { return SalesOrderHeaders.Any(); } }
}

GenericServices notices that the class has some properties marked with the [Computed] attribute and invokes the DelegateDecompiler ‘s .Decompile() method before passing the query up to the Presentation Layer.

DelegateDecompiler is very elegant, but complicated library so some types of LINQ commands are not supported. In particular calculation like TotalAllOrders does not work, nor does it support properties that need to access a local variable. Therefore I have also provided an entry point in GenericServices to allow additional AutoMapper mapping to handle these complex cases. See the overridden AddedDatabaseToDtoMapping property in the GenericServices DTO definition.

public class ListCustomerDto : EfGenericDto<Customer, ListCustomerDto>
{

    // … other properties removed for clarity 

    /// <summary>
    /// This will hold the total value of all orders for this customer
    /// </summary>
    [Display(Name = "Total Sales Value")]
    [DataType(DataType.Currency)]
    public decimal TotalAllOrders { get; set; }

    // … other code removed for clarity

    protected override Action<IMappingExpression<Customer, ListCustomerDto>>
        AddedDatabaseToDtoMapping
    {
        get
        {
            return m => m.ForMember(d => d.TotalAllOrders,
                opt => opt.MapFrom(c =>
                       c.SalesOrderHeaders.Sum(
                           x => (decimal?)x.TotalDue) ?? 0));
        }
    }
}

Note: The LINQ calculation for TotalAllOrders is complex because it is summing a property in a collection relationship. Because a collection can be empty LINQ correctly insists on the nullable version of Sum. This returns null for empty collections, which I replace with zero.

Presentation Layer

Before I leave this example you might like to see the MVC controller code that displays the result. My design aim is to make the code in the MVC controller as simple and generic as possible. This will allow me to use scaffolding for building the controller code and the views. Below is the code for displaying a list.

public class CustomersController : Controller
{
    [DisplayName("Current Customers")]
    public ActionResult Index()
    {
        //the ‘true’ tells the Grid to only show rows where HasBoughtBefore is true
        return View(true);
    }

    public JsonResult IndexListRead(
        [DataSourceRequest]DataSourceRequest request,
        IListService service)
    {
        var result = service.GetAll<ListCustomerDto>()
                     .OrderBy( x => x.CustomerID).ToDataSourceResult(request);
        return Json(result, JsonRequestBehavior.AllowGet);
    }
    //… more create, update, delete method left out for clarity
}

Note the IListService, which is the GenericServices List service where we call GetAll<ListCustomerDto>(). to get the formatted DTO data. I then display this using a Kendo UI Pro Grid which takes an IQueryable query and adds its own filter, sort, group and paging controls to the query before executing the query. Note: Kendo UI Pro is a paid-for UI library for MVC and JavaScript. The result is a grid with paging, sorting, filtering etc.

Analysis of final SQL, and resulting changes

Finally I using Red Gates’ ANTS Performance Profiler to analyse the final SQL command (text file) that EF produced after all these different parts are combined. The display is for existing customers, i.e. HasBoughtBefore is true, and filters the results to only include customers whose company name containing “bike”.

I am no SQL expert but looking at the SQL command produced it seemed to me that sum of the SalesOrderHeader.TotalDue seemed to be done twice, once to see if it was null and then again if it wasn’t null. This didn’t seem too efficient so I changed the LINQ code to first check if there was any SalesOrderHeader before attempting the sum. The new LINQ code to define the TotalAllOrders property is shown below:

SalesOrderHeaders.Any() ? SalesOrderHeaders.Sum(x => x.TotalDue) : 0;

This produced the following SQL command (text file), which a) seems simpler and b) was faster that the original SQL command. As I say I am not a SQL expert but I would say this is an improvement.

In real life you should look at areas of poor performance and then apply increasing levels of refactoring, right up to using direct SQL commands, to obtain the performance you need. However this does take development time so in some projects a quickly written but possibly poor performing database access may be acceptable.

3. Handling composite keys

Quite a few of the tables in the AdventureWorks Lite database has composite keys, i.e. a primary key made up of two or more columns in the table. Now GenericServices is designed to handle composite keys and the Detail, Update and Delete services all take multiple keys. However there is one table in AdventureWorks Lite database where the keys can be a challenge: CustomerAddress. It is worth pointing out how to handle this in EF.

The picture below shows that the Customer has a collection of CustomerAddresses, which consists of a property called AddressType, which says if it’s their main office etc. and a link to a standard Address class. CustomerAddress has a composite key consisting of the CustomerID and the AddressID.

CsutomerAddress-table-with-relationships

So, the problem comes when creating a new CustomerAddress. We need to create the CustomerAddress and the Address in the same commit so that they are consistent, but we also need to get the new AddressID to fill in part of the composite key on the CustomerAddress.

The answer is that EF does this as part of its relational fixup. To get this to work I create the CustomerAddress and set its EF ‘Address’ relational property to a new Address type. We add the CustomerAddress to the database using db.CustomerAddresses.Add( myNew CustomerAddress) and call EF’s SaveChanges() and its done.

4. Deleting database classes with have relationships

When deleting items from the database that have relationships, there are a few issues we need to think about. Firstly, if something is ‘pointing to’ that item, in SQL terms it is referencing that key and this reference has been enforced by a foreign key constraint. Because of this constraint, you cannot delete the row if it is being referenced by another table. You need to remove the foreign keys that reference the key value for the row. In our case, we’d want to delete all rows that reference the row we wish to delete. You need a way of handling this in a graceful way so the user is not faced with a ‘There was an Error’ message.

In GenericServices I have used a layered approach. You can add your own code to check before you delete, but by default the GenericServices Delete method relies on a special .SaveChangedWithValidation() method which catches certain SQLExceptions. In this case the SQL error code is 547, which is a constraint error triggered on an attempt to delete a row that another row is relying on. I describe this approach in a previous Simple-Talk article called, Catching Bad Data in Entity Framework.

5. Properly deleting rows any their relationships

The other delete issue is the opposite of the previous one. What happens if I want to delete a database row that has a relationship to other database entries? In cases where the relationship is only between these two items then the lower relationship is often set as ‘cascade on delete’, so deleting the main row will cause its child relationship to be deleted. However there can be complex cases where you have to handle this yourself. That is true of the CustomerAddress and its Address listed in the last section.

When I delete CustomerAddress I most likely want to delete the Address entry that it points to. I say ‘most likely’ as it is quite possible that the Address part was used in an old Sales Order as the delivery or billing address. Because of this dual use there is no cascade delete on the Address so we need to do this by hand. In GenericServices I added a DeleteWithRelationships(Func<>) method which allows a small helper Func<> to be written by the developer that looks after any relationship deletion within the same SQL commit as the primary delete. That way if the delete of the Address entry fails because it is using in a delivery or billing address then the delete of the CustomerAddress is rolled back. The system also provides a rather generic, but friendly message.

 

Any down sides to this approach?

It is always good to think of what could be done better – that is how I got to write GenericServices. Looking at the approach outline above I would make these comments:

  1. The proposed architecture is very reliant on LINQ and its conversion to database commands. I don’t think this is a major problem, but it does limit you to providers that handle LINQ. Here are some providers.
    1. Entity Framework (EF) is acknowledged as having the best LINQ-to-SQL conversion. EF currently only supports Microsoft SQL Server, although the EF v7 roadmap includes support for other database types.
    2. NHibernate, MySql and other database access providers support LINQ-to-database converter and supports a lot of database types. However I cannot vouch for the depth of their implementation.
    3. You can write your own LINQ data source provider already, and it is said to become easier in EF v7, which is out ‘soon’.
  2. LINQ/IQueryable is not easily serializable, so a system that has separated systems talking over some message bus would not work. However current scaling practice, see Azure’s How to Scale an Application, is to run multiple instances of the same application which works for this approach.
  3. Some people say that passing IQueryable right up to the Presentation Layer is an anti-pattern. The main concern is that used wrongly a developer can add a .ToList() to the end of a query and load thousands of rows of data. That is a concern, which is why I make the Presentation layer calls very simple and standard. My long term aim is to scaffold most of Presentation Layer, but I need to do more projects before I really know what is needed.
  4. Similar to point 2 but in untrained hands software developers can write really bad queries that hit performance issues. This is true of nearly every ORM (Object-Relational Mapping) software system, but EF is big and it has taken me years to get really understand what EF is doing under the hood. However the fact is the software development part of a project is likely to be the biggest cost and a good ORM, plus training, is likely to reduce timescales.

Finally I should say that I describe a Layered Architecture but for really big systems this may not scale enough. However the overall approach of using LINQ and separating different parts of the processing of the query would be valid in other architectures like CQRS (Command and Query Responsibility Segregation) architecture or within each service in a MicroService architecture. I would recommend looking at the pdf of an excellent book called ‘Microsoft .NET: Architecting Applications for the Enterprise’ for a detailed discussion of other approaches.

Conclusions

Getting EF and GenericServices to work with the AdventureWorks Lite database was very revealing. Some things worked really well first time and some needed quite a bit of fiddling around. As a result of this I changed the implementation of GenericServices to better handle these more complex cases.

Overall it took about two weeks to build a web application with included List, Show Detail, Create, Update, and Delete of Customers and Orders + adding/editing line items in an order plus Product listing and lookup. When compared to hand-coding a web application like that all that then that I think is very fast, so I am pleased.

I hope you have found this article useful in describing an approach to building complex, database driven web applications. Hopefully it has suggested some approaches that may well save you time in the future.

Happy coding!