Flattening Entity Framework relationships with ExpressMapper

Last Updated: April 20, 2016 | Created: April 20, 2016

This article looks at how Microsoft’s Entity Framework (EF) handles extracting data from nested relationships, i.e. linked EF classes. This is a process known as flattening. I will show that EF produces efficient SQL commands for this type of access.

I then talk about making development of such accesses by using an Object-to-Object mapper and introduce the new Flattening feature in ExpressMapper, a relatively new mapper that is quite lean and quick. The last part is a detailed description, with lots of examples, of how to use flattening in ExpressMapper and what is is capable of.

Quick background on flattening

If you understand the issues around relational databases, foreign keys and flattening then skip this.

Well constructed relational databases try to minimise the duplication of data, e.g. only have one copy of the customer details. They do this by creating extra tables to hold each of these unique pieces of data and using ‘foreign keys’ if another piece of data needs to refer to it. Let me give you an example.

In the diagram below, which shows the EF classes, a product has a ‘Variant’, which is linked by the foreign key ‘VariantId’ and that Variant can have a number of size information, given by the ‘Size’, which has a foreign key ‘SizeId’.

Product, ProductVariant, Size

The benefit of splitting the data like this is we can change say the Color information on a specific  ProductVariant and it is automatically picked up by all the products that point to it. In software terms we would call this an application of the DRY (Don’t Repeat Yourself) principal.

The down side if if we want to show a list of products on screen its quite likely that we want to include the Color and Size information. This means we need to ‘pick out’ the Color column from the linked ‘ProductVariant’ table, and the Name column in the Variant’s linked ‘Size’ table. This is called flattening and happens a lot in real-life applications.

How Entity Framework does flattening

I’m going to use a very simple example to show how EF does flattening. The diagram below shows a simple Father->Son->Grandson nested relationship.

ExpressMapper-father-son-grandson

The EF command to get all of the properties from all of the linked records would be:

var flattened = dbContext.Fathers.Select(f => new { 
    f.MyInt,
    f.MyString, 
    SonMyInt = f.Son.MyInt,
    SonMyString = f.Son.MyString, 
    GrandsonMyInt = f.Son.Grandson.MyInt,
    GrandsonMyString = f.Son.Grandson.MyString});

The result is an anonymous class containing all the data from all the relationship.

The T-SQL EF produces for this access

It turns out that EF produces the very efficient T-SQL command for this sort of flattening. The T-SQL command that EF puts out is as follows:

SELECT 
    [Extent1].[MyInt] AS [MyInt], 
    [Extent2].[MyInt] AS [MyInt1], 
    [Extent2].[MyString] AS [MyString], 
    [Extent3].[MyInt] AS [MyInt2], 
    [Extent3].[MyString] AS [MyString1], 
    [Extent1].[MyString] AS [MyString2]
    FROM   [dbo].[Father] AS [Extent1]
    INNER JOIN [dbo].[Son] AS [Extent2] 
        ON [Extent1].[Son_Id] = [Extent2].[Id]
    LEFT OUTER JOIN [dbo].[Grandson] AS [Extent3] 
        ON [Extent2].[Grandson_Id] = [Extent3].[Id]

This is an optimal T-SQL command and here is execution plan that proves it.

ExpressMapper-father-son-grandson-sqlNOTE: Flattening is only useful for reading data, as any update to a column in a related table can only be done in EF by reading in the inner class, updating its data and calling EF’s .SaveChanges(). However reading data for display is very common and therefore we want it to be really efficient.

Using an Object-to-Object Mapper with EF

My experience has been that you spend a LOT of time writing EF .Select() statements like the one above in even a small application. They are repetitive and boring, which often leads to making mistakes. This caused me to build the open-source package GenericServices, which uses an object-to-object mapper  (currently AutoMapper, but l am planning to change to ExpressMapper).

The idea behind using an object-to-object mapper (referred to as mapper from now on) is that you create a class, or classes, which represent what you need from the database. These classes are often called DTOs (Data Transfer Objects), or in ASP.NET MVC style ViewModels – I will refer to them as DTOs in this article.

The role of the mapper is to work out how to map the EF classes to the DTO(s). The mapper does this based on a set of rules. Let’s consider one implementation of an object-to-object mapper that will work with EF – ExpressMapper.

Introducing ExpressMapper

As I said earlier I have used AutoMapper, but I have been tracking the development of a new mapper called ExpressMapper, which has some advantages. ExpressMapper is newer than AutoMapper and relies on building LINQ commands for all its mappings. The main advantage is that its setup is much faster than AutoMapper, and one or two other small things make it fit with EF more easily.

However while I wanted to start using ExpressMapper but it didn’t have the flattening feature. So, with the help of ExpressMapper’s author, Yuriy Anisimov, I added the flattening feature. This is now released in version 1.8.1 which you can find as a Nuget package here.

I will give you three examples of how ExpressMapper’s flattening works and then describe the differences between ExpressMapper and AutoMapper, which is used more widely because it has been around a long time.

1. Flattening nested references

1.a. Non-null nested references

In the case of flattening ExpressMapper and AutoMapper uses a concatenated name rule,  e.g. SonMyInt would match Son.MyInt and SonGrandsonMyString  would match Son.Grandson.MyString. So, looking at our previous example of Father->Son->Grandson if we created a class as shown below and then ran the code at the end then ExpressMapper would produce the same sort of result, and SQL that my hand-coded select would have done.

public class FlattenDto
{
   public int MyInt { get; set; }
   public string MyString { get; set; }

   public int SonMyInt { get; set; }
   public string SonMyString { get; set; }

   public int SonGrandsonMyInt { get; set; }
   public string SonGrandsonMyString { get; set; }
}

...
//code to set up the mapping 
ExpressMapper.Mapper.Register<Father, FlattenDto>().Flatten();
ExpressMapper.Mapper.Compile(CompilationTypes.Source);
//code to execute the EF command and get the data
var flattened = dbContext.Fathers.Project<Father, FlattenDto>().ToList(); 

The SQL code is the same as I showed on the hand-written code.

1.b. Nullable nested references

If either the Son or the Grandson can be null then these relationships are called an ‘optional relationship’ in EF, or One-to-ZeroOrOne references in SQL. ExpressMapper handles null references, so handling null references mapping is fairly straight forward. However you do need to think that some of the properties may be null if a nested relationship is null, otherwise EF will complain that it can put a null in it.

For example if we take our previous example and we make the GrandSon an optional relationship with the following EF configuration (see the HasOptional on line 7) then the Grandson property may be null.

public class SonConfiguration 
   : EntityTypeConfiguration<Son>
{
   public SonConfiguration()
   {
      ToTable("Son");
      HasKey(t => t.Id);
      HasOptional(t => t.Grandson);
   }
}

Having done that ExpressMapper will handle not having a grandson (or a son if you set that as optional too). However we do need to change the DTO, as now the the SonGrandsonMyInt can now be null (see the int? on line 13 of this class definition from the Unit Tests of ExpressMapper).

//code to set up the mapping 
ExpressMapper.Mapper
    .Register<Father, FlattenDtoWithNullable>().Flatten();
ExpressMapper.Mapper.Compile(CompilationTypes.Source);
//code to execute the EF command and get the data
var flattened = dbContext.Fathers
    .Project<Father,FlattenDtoWithNullable>().ToList(); 

The result of this will be that SonGrandsonMyInt and SonGrandsonMyString will both be null if no Grandson relationship exists, but will contain the Grandson’s MyInt and MyString if the relationship does not exist.

NOTE: AutoMapper does not have this feature by default – see later section of differences between ExpressMapper and AutoMapper.

2. Running Linq commands on collections

As well as accessing columns in one-to-one relationships there is another feature that EF and ExpressMapper’s flattening can provide for one-to-many relationships. That is it can convert a small but useful number of LINQ collection commands into SQL commands. The commands that EF supports are:

  • Any: This returns true if there is one or more rows in the collection.
  • Count: This returns an int of the number of rows in the collection.
  • LongCount: Same as Count, but returns a long (64 bit) type.

To use these you simply add the method name on the end of a EF POCO class properly that implements IEnumerable (but not a string). Let me give you a very simple example. The code below shows a EF POCO class called FatherSons which contains a collection called Sons. The DTO below it has a property called SonsCount, which ExpressMapper turns into the LINQ command Sons.Count().

public class FatherSons
{
   public int Id { get; set; }

   public int MyInt { get; set; }
   public string MyString { get; set; }

   public ICollection<Son> Sons { get; set; }
}
public class FlattenFatherSonsCountDto
{
   public int MyInt { get; set; }
   public string MyString { get; set; }

   public int SonsCount { get; set; }
}

...
//code to set up the mapping 
ExpressMapper.Mapper
    .Register<FatherSons, FlattenFatherSonsCountDto>().Flatten();
ExpressMapper.Mapper.Compile(CompilationTypes.Source);
//code to execute the EF command and get the data
var flattened = dbContext.Fathers
    .Project<FatherSons, FlattenFatherSonsCountDto>().ToList(); 

The resulting SQL is again very good (see below):

 
SELECT 
    [Extent1].[Id] AS [Id], 
    (SELECT 
        COUNT(1) AS [A1]
        FROM [dbo].[Son] AS [Extent2]
        WHERE [Extent1].[Id] = [Extent2].[FatherSons_Id]) AS [C1], 
    [Extent1].[MyInt] AS [MyInt], 
    [Extent1].[MyString] AS [MyString]
    FROM [dbo].[FatherSons] AS [Extent1]

Full list of ExpressMapper flattening features

  1. Flattening looks at properties inside classes, either EF relationships or EF Complex Types. For the DTO to match it must match on two counts:
    1. The Name must match: The DTO name should be  a concatenating the names without the dots, e.g. Variant.Size.Name is accessed by having a property called VariantSizeName in the DTO.
    2. The Type must match: The type of the DTO property must be either:
      1. The same type as the inner property, e.g. SonMyInt can be of type int.
      2. A nullable version of type of the inner property, e.g. SonGrandsonMyInt is of type Nullable<int>. This is useful when you have optional relationships (see 3 below).
      3. A class that has been registered with ExpressMapper as mapping between the inner property and the DTO class (see 2 below for a better explanation!)
  2. You can have nested DTOs within DTOs. For instance in our Father->Son->Grandson examples you can have a property called SonGrandson, which is a DTO class. The only rule is you must also register the SonGrandson DTO class as well as the outer DTO. There is an example of this in the ExpressMapper Unit Tests – see test on line 95 of FlattenEntityTests.cs and note the registering of the extra DTO class on line 27 of the same file.
  3. Flattening can handle optional relationships, i.e. null references to a class. In that case it will return null for all the properties you access in the nested class(es). See example 1.b above.
  4. Flattening can handle circular references, e.g. if class person has a property called Boss of type person then you can have a DTO property such as BossBossBoss to get the third boss up (or null if the hierarchy doesn’t go that high).
  5. Flattening also looks at collection properties, i.e. one-to-many relationships.  If the DTO contains a property with the name of the collection, plus an ending of Any, Count or LongCount then it adds the LINQ method of that name. The property must be of the right type, i.e. bool, int or long respectively otherwise you get an ExpressMapperException.

Configuring flattening in ExpressMapper

  • Flattening only happens if you add the method .Flatten() to the registration of the mapping.
  • If you use ExpressMapper’s .Ignore() or .Member() methods in registration with Flattening then your .Ignore() or .Member() calls will take over from flattening, i.e. Flattening will not touch those source properties. The order in which you apply .Flatten, .Ignore and .Member does not matter.
  • The comparison of names follows ExpressMapper configuration, which defaults is case insensitive. For instance in the Variant.Size.Name case the match would work with variantsizename or any other upper/lower sequence of the same letters.
    NOTE: this is useful when you are mapping to json in an API and you often want the first letter of the property to be lower case.
  • Flattening only works on the Source side, i.e. it does not ‘unflatten’ if you do a DTO->class mapping. I did not implement this because it just doesn’t work with EF, which needs you to link to tracked classes. It is also very difficult to get right in all cases, e.g. what do you do with a Count??
    NOTE: That is why you see the command ExpressMapper.Mapper.Compile(CompilationTypes.Source) in the examples. Because I only use the class->DTO direction then I don’t bother to compile the reverse direction.

Comparison with AutoMapper

Many people, including me, are already using AutoMapper, so it is useful to point out the differences.

  1. AutoMapper is a bit more specific on matching the concatenated names, and has some extra features for changing this. My implementation in ExpressMapper simply matches the next part of the string as it goes, plus the type. I think AutoMapper would not accept vAriantsiZenAme but ExpressMapper’s Flatten will.
  2. AutoMapper has other flattening features, like GetMethod. ExpressMapper’s Flattening does not support that as they don’t work with EF.
    NOTE:  The way to overcome this is either by specific .Member() set ups or more elegantly by using DelegateDecompiler. DelegateDecompiler allows you to add ‘computed’ properties which DelegateDecompiler can turn into LINQ code that EF can execute. I use DelegateDecompiler in my GenericServices package and it is very helpful. You should look it up!
  3. AutoMapper does not, by default, handle nullable nested references (see example 1.b above). This is a plus for ExpressMapper.

Conclusion

I wanted to get three main things across in this article:

  1. Entity Framework is creates very efficient SQL code when accessing nested One-to-One or One-to-ZeroOrOne relationships.
  2. Object-to-Object mappers can make your life a lot easier in writing EF POCO classes to DTO code.
  3. ExpressMapper now has a good Flattening feature.

When I have the time (!) I plan to update my GenericServices project to use ExpressMapper. It has some nice features and my very simple testing says its about eight times faster on setup, i.e. the registering of the mappings, than AutoMapper. However ExpressMapper is about the same speed as AutoMapper on the actual mapping of the data.

I should say that EF doesn’t do so well at producing efficient T-SQL commands if you use EF’s .Include() method to eager load properties that are collections. I would like to write about that later if I have time, as there are definitely some things to watch out for in cases like that.

Well done for getting to here. The article is a bit long but hopefully useful.

Happy coding!