Part 4: Building a robust and secure data authorization with EF Core

Last Updated: August 22, 2019 | Created: July 9, 2019

This article covers how to implement data authorization using Entity Framework Core (EF Core), that is only returning data from a database that the current user is allowed to access. The article focuses on how to implement data authorization in such a way that the filtering is very secure, i.e. the filter always works, and the architecture is robust, i.e. the design of the filtering doesn’t allow developer to accidently bypasses that filtering.

This article is part of a series and follows a more general article on data authorization I wrote six months ago. That first article introduced the idea of data authorization, but this article goes deeper and looks at one way to design a data authorization system that is secure and robust. It uses a new, improved example application, called PermissionAccessControl2 (referred to as “version 2”) and a series of new  articles which cover other areas of improvement/change.

UPDATE: See my NDC Oslo 2019 talk which covers these three articles.

Original articles:

TL;DR; – summary

  • This article provides a very detailed description of how I build a hierarchical, multi-tenant application where the data a user could access depends on which company and what role they have in that company.
  • The example application is built using ASP.NET Core and EF Core. You can look at the code and run the application, which has demo data/users, by cloning this GitHub repo.
  • This article and its new (version 2) example application is a lot more complicated than the data authorization described in the original (Part 2) data authorisation article. If you want to start with something, then read the original (Part 2) first.
  • The key feature that makes it work are EF Core’s Query Filters, which provides a way to filter data in ALL EF Core database queries.
  • I break the article into two parts:
    • Making it Secure, which covers how I implemented a hierarchical, multi-tenant application that filters data based on the user’s given data access rules.
    • Making it robust, which is about designing an architecture that guides developers so that they can’t accidently bypass the security code that has been written.

Setting the scene – examples of data authorization

Pretty much every web application with a database will filter data – Amazon doesn’t show you every product it has, but tried to show you things you might be interested in. But this type of filtering for the convenience of the user and is normally part of the application code.

Another type of database filtering is driven from security concerns – I refer to this this as data authorization. This isn’t about filtering data for user convenience, but about applying strict business rules that that dictate what data a user can see. Typical scenarios where data authorization is needed are:

  • Personal data, where only the user can read/alter their personal data.
  • Multi-tenant systems, where one database is used to support multiple, separate user’s data.
  • Hierarchical systems, for instance a company with divisions where the CEO can see all the sales data, but each division can only see their own sales data.
  • Collaboration systems like GitHub/BitBucket where you can invite people to work with you on a project etc.

NOTE: If you are new to this area then please see this section of the original article where I have a longer introduction to data protection, both what’s it about and what the different part are.

My example is a both a multi-tenant and a hierarchical system, which is what one of my client’s needed. The diagram below shows two companies 4U Inc. and Pets2 Ltd. with Joe, our user in charge of the LA division of 4U’s outlets.

The rest of this article will deal with how to build an application which gives Joe access to the sales and stock situation in both LA outlets, but no access to any of the other divisions’ data or other tenants like Pets2 Ltd.

In the next sections I look at the two aspects: a) making my data authorization design secure, i.e. the filter always works, and, b) its architecture is robust, i.e. the design of the filtering doesn’t allow developer to accidently bypasses that filtering.

A. Building a secure data authorization system

I start with the most important part of the data authorization – making sure my approach is secure, that is, it will correctly filter out the data the user isn’t allowed to see. The cornerstone of this design is EF Core’s Query Filters, but to use them we need to set up a number of other things too. Here is a list of the key parts, which I then describe in detail:

  1. Adding a DataKey to each filtered class: Every class/table that needs data authorization must have a property that holds a security key, which I call the DataKey.
  2. Setting the DataKey property/column: The DataKey needs to be set to the right value whenever a new filtered class is added to the database.
  3. Add the user’s DataKey to their claims: The user claims need to contain a security key that is matched in some way to the DataKey in the database.
  4. Filter via the DataKey in EF Core: The EF Core DbContext has the user’s DataKey injected into it and it uses that key in EF Core Query Filters to only return the data that the user is allowed to see.

A1. Adding a DataKey to each filtered class

Every class/table that needs data authorization must have a property that holds a security key, which I call the DataKey.

In my example application there are multiple different classes that need to be filtered. I have a base IDataKey interface which defines the DataKey string property. All the classes to be filter inherit this interface. The DataKey definition looks like this

As you will see the DataKey is used a lot in this application.

A2. Setting the DataKey property/column

The DataKey needs to be set to the right value whenever a new filtered class is added to the database.

NOTE: This example is complex because of the hierarchical data design. If you want a simple example of setting a DataKey see the “personal data filtering” example in the original, Part 2 article.

Because of the hierarchical nature of my example the “right value” is bit complex. I have chosen to create a DataKey than is a combination of the primary keys of all the layers in the hierarchy. As you will see in the next subsection this allows the query filter to target different levels in the multi-tenant hierarchy.

The diagram below shows you the DataKeys (bold, containing numbers and |) for all the levels in the 4U Company hierarchy.

The hierarchy consists of three classes, Company, SubGroup and RetailOutlet, which all inherit from an abstract class called TenantBase. This allows me to create relationships between any of the class types that inherit from TenantBase (EF Core to treat these classes as a Table-Per-Hierarchy, TPH, and stores all the different types in one table).

But setting the DataKey on creation is difficult because the DataKey needs the primary key, which isn’t set until its created in the database. My way around this is to use a transaction. Here is the method in the TennantBase class that is called by the Company, SubGroup or RetailOutlet static creation factory.

protected static void AddTenantToDatabaseWithSaveChanges
    (TenantBase newTenant, CompanyDbContext context)
{
    // … Various business rule checks let out

    using (var transaction = context.Database.BeginTransaction())
    {
        //set up the backward link (if Parent isn't null)
        newTenant.Parent?._children.Add(newTenant);
        context.Add(newTenant);  //also need to add it in case its the company
        // Add this to get primary key set
        context.SaveChanges();

        //Now we can set the DataKey
        newTenant.SetDataKeyFromHierarchy();
        context.SaveChanges();

        transaction.Commit();
    }
}

The Stock and Sales classes are easier to handle, as they use the user’s DataKey. I override EF Core’s SaveChanges/SaveChangesAsync to do this, using the code shown below.

public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    foreach (var entityEntry in ChangeTracker.Entries()
        .Where(e => e.State == EntityState.Added))
    {
        if (entityEntry.Entity is IShopLevelDataKey hasDataKey)
            hasDataKey.SetShopLevelDataKey(accessKey);
    }
    return base.SaveChanges(acceptAllChangesOnSuccess);
}

A3. Add the user’s DataKey to their claims:

The user claims need to contain a security key that is matched in some way to the DataKey in the database.

My general approach as detailed in the original data authorization article is to have claim in the user’s identity that is used to filter data on. For personal data this can be the user’s Id (typically a string containing a GUID), or for a straight-forward multi-tenant it would by some form of tenant key stored in the user’s information. For this you might have the following class in your extra authorization data:

public class UserDataAccessKey
{
    public UserDataAccessKey(string userId, string accessKey)
    {
        UserId = userId ?? throw new ArgumentNullException(nameof(userId));
        AccessKey = accessKey;
    }

    [Key]
    [Required(AllowEmptyStrings = false)]
    [MaxLength(ExtraAuthConstants.UserIdSize)]
    public string UserId { get; private set; }

    [MaxLength(DataAuthConstants.AccessKeySize)]
    public string AccessKey { get; private set; }
}

In this hierarchical and multi-tenant example it gets a bit more complex, mainly because the hierarchy could change, e.g. a company might start with simple hierarchy of company to shops, but as it grows it might move a shop to sub-divisions like the west-coast. This means the DataKey can change dynamically.

For that reason, I link to the actual Tenant class that holds the DataKey that the user should use. This means that we can look up the current DataKey of the Tenant when the user logs in.

NOTE: In Part 5 I talk about how to dynamically update the user’s DataKey claim of any logged in user if the hierarchy they are in changes.

public class UserDataHierarchical
{
    public UserDataHierarchical(string userId, TenantBase linkedTenant)
    {
        if (linkedTenant.TenantItemId == 0)
            throw new ApplicationException(
                "The linkedTenant must be already in the database.");

        UserId = userId ?? throw new ArgumentNullException(nameof(userId));
        LinkedTenant = linkedTenant;
    }

    public int LinkedTenantId { get; private set; }

    [ForeignKey(nameof(LinkedTenantId))]
    public TenantBase LinkedTenant { get; private set; }

    [Key]
    [Required(AllowEmptyStrings = false)]
    [MaxLength(ExtraAuthConstants.UserIdSize)]
    public string UserId { get; private set; }

    [MaxLength(DataAuthConstants.AccessKeySize)]
    public string AccessKey { get; private set; }
}

This dynamic DataKey example might be an extreme case but seeing how I handled this might help you when you come across something that is more complex than a simple value.

At login you can add the feature and data claims to the user’s claims using the code I showed in Part 3, where I add to the user’s claims via a UserClaimsPrincipalFactory, as described in this section of the Part 3 article.

The diagram below shows how my factory method would lookup the UserDataHierarchical entry using the User’s Id and then adds the current DataKey of the linked Tenant.

A4. Filter via the DataKey in EF Core

The EF Core DbContext has the user’s DataKey injected into it and it uses that key in EF Core Query Filters to only return the data that the user is allowed to see.

EF Core’s Query Filters are a new feature added in EF Core 2.0 and they are fantastic for this job. You define a query filter in the OnModelCreating configuration method inside your DbContext and it will filter ALL queries, that comprises of LINQ queries, using Find method, included navigation properties and it even adds extra filter SQL to EF Core’s FromSql method (FromSqlRaw or FromSqlInterpolated in EF Core 3+). This makes Query Filters a very secure way to filter data.

For the version 2 example here is a look the CompanyDbContext class with the query filters set up by the OnModelCreating method towards the end of the code.

public class CompanyDbContext : DbContext
{
    internal readonly string DataKey;

    public DbSet<TenantBase> Tenants { get; set; }
    public DbSet<ShopStock> ShopStocks { get; set; }
    public DbSet<ShopSale> ShopSales { get; set; }

    public CompanyDbContext(DbContextOptions<CompanyDbContext> options,
        IGetClaimsProvider claimsProvider)
        : base(options)
    {
        DataKey = claimsProvider.DataKey;
    }

    //… override of SaveChanges/SaveChangesAsync left out

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //… other configurations left out

        AddHierarchicalQueryFilter(modelBuilder.Entity<TenantBase>());
        AddHierarchicalQueryFilter(modelBuilder.Entity<ShopStock>());
        AddHierarchicalQueryFilter(modelBuilder.Entity<ShopSale>());
    }

    private static void AddHierarchicalQueryFilter<T>(EntityTypeBuilder<T> builder) 
         where T : class, IDataKey
    {
        builder.HasQueryFilter(x => x.DataKey.StartsWith(Datakey));
        builder.HasIndex(x => x.DataKey);
    }
}

As you can see the Query Filter uses the StartsWith method to compare the user’s DataKey and the DataLeys of the tenants and their Sales/Stock data. This means if Joe has the DataKey of 1|2|5| then he can see the Stock/Sales data for the shops “LA Dress4U” and “LA Shirt4U” – see diagram below.

NOTE: You can try this by cloning the PermissionAccessControl2 repo and running it locally (by default it uses an in-memory to make it easy to run). Pick different users to see the different data/features you can access.

B. Building a robust data authorization architecture

We could stop here because we have covered all the code needed to secure the data. But I consider data authorization as a high-risk part of my system, so I want to make it “secure by design”, i.e.  it shouldn’t be possible for a developer to accidently write something that bypasses the filtering. Here are the things I have done to make my code robust and guide another developer on how to do things.

  1. Use Domain-Driven design database classes. Some of the code, especially creating the Company, SubGroup and RetailOutlet DataKeys, are complicated. I use DDD-styled classes which provides only one way to create or update the various DataKeys and relationships.
  2. filter-Only DbContext: I build a specific DbContext to contain all the classes/tables that need to be filtered.
  3. Unit test to check you haven’t missed anything: I build unit tests that ensure the classes in the filter-Only DbContext have a query filter.
  4. checks: With an evolving application some features are left for later. I often add small fail-safe checks in the original design to make sure any new features follow the original design approach.

B1. Use Domain-Driven design (DDD) database classes

DDD teaches us that each entity (a class in .NET world) should contain the code to create and update itself and its aggregates. Furthermore, it should stop any other external code from being able to bypass theses methods so that you are forced to use the methods inside the class. I really like this because it means there is one, and only one, method you can call to get certain jobs done.

The effect is I can “lock down” how something is done and make sure everyone uses the correct methods. Below the TenantBase abstract class which all the tenant classes inherit from showing the  MoveToNewParent method that moves a tenant to another parent, for instance moving a RetailOutlet to a different SubGroup.

public abstract class TenantBase : IDataKey
{
    private HashSet<TenantBase> _children;

    [Key]
    public int TenantItemId { get; private set; }
    [MaxLength(DataAuthConstants.HierarchicalKeySize)]
    public string DataKey { get; private set; }
    public string Name { get; private set; }

    // -----------------------
    // Relationships 

    public int? ParentItemId { get; private set; }
    [ForeignKey(nameof(ParentItemId))]
    public TenantBase Parent { get; private set; }
    public IEnumerable<TenantBase> Children => _children?.ToList();

    //----------------------------------------------------
    // public methods

    public void MoveToNewParent(TenantBase newParent, DbContext context)
    {
        void SetKeyExistingHierarchy(TenantBase existingTenant)
        {
            existingTenant.SetDataKeyFromHierarchy();
            if (existingTenant.Children == null)
                context.Entry(existingTenant).Collection(x => x.Children).Load();

            if (!existingTenant._children.Any())
                return;
            foreach (var tenant in existingTenant._children)
            {                   
                SetKeyExistingHierarchy(tenant);
            }
        }

        //… various validation checks removed

        Parent._children?.Remove(this);
        Parent = newParent;
        //Now change the data key for all the hierarchy from this entry down
        SetKeyExistingHierarchy(this);
    }
    //… other methods/constructors left out
}

The things to note are:

  • Line 3: The Children relationship is held a private field which cannot be altered by outside code. It can be read via the IEnumerable<TenantBase> Children property (line 17) but you can’t add or remove children that way.
  • Lines 6 to 16: Similarly, all the properties have private setters, so it can only be changed by methods inside the TenanBase class.
  • Lines 22 to the end: This has the code to change the relationship and then immediately runs a recursive method to change all the DataKeys of the other tenants underneath this one.

B2. Build a filter-Only DbContext

One possible problem could occur if a non-filtered relationship was present, say a link back to some authorization code (there is such a relationship linking a tenant to a UserDataHierarchical class). If that happened a developer could write code that could expose data that the user shouldn’t see – for instance accessing the UserDataHierarchical class could expose the user’s Id which I wish to keep secret.

My solution was to create a separate DbContext for the multi-tenant classes, with a different (but overlapping) DbContext for the extra authorization classes (see the diagram below as to what this looks like). The effect is to make a multi-tenant DbContext which any contains the filtered multi-tenant data. For a developer makes it clear what classes you can access when using multi-tenant DbContext.

NOTE: having multiple DbContexts with a shared table can make database migrations a bit more complicated. Have a look at my article “Handling Entity Framework Core database migrations in production” for different ways to handle migrations.

B3. Using unit tests to check you haven’t missed anything

With feature and data authorization I add unit tests that check I haven’t left a “hole” in my security. Because I have a DbContext specifically for the multi-tenant data I can write a test to check that every class mapped to the database has a Query Filter applied to it. Here is the code I use for that.

[Fact]
public void CheckQueryFiltersAreAppliedToEntityClassesOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<CompanyDbContext>();
    using (var context = new CompanyDbContext(options, 
        new FakeGetClaimsProvider("accessKey")))
    {
        var entities = context.Model.GetEntityTypes().ToList();

        //ATTEMPT
        var queryFilterErrs = entities.CheckEntitiesHasAQueryFilter().ToList();

        //VERIFY
        queryFilterErrs.Any().ShouldBeFalse(string.Join('\n', queryFilterErrs));
    }
}
public static IEnumerable<string> CheckEntitiesHasAQueryFilter(
    this IEnumerable<IEntityType> entityTypes)
{
    foreach (var entityType in entityTypes)
    {
        if (entityType.QueryFilter == null &&
             entityType.BaseType == null  && //not a TPH subclass
             entityType.ClrType
                 .GetCustomAttribute<OwnedAttribute>() == null && //not an owned type
             entityType.ClrType
                 .GetCustomAttribute<NoQueryFilterNeeded>() == null) //Not marked as global
            yield return 
                 $"The entity class {entityType.Name} does not have a query filter";
    }
}

B4. Adding fail-safe checks

You need to be careful of breaking the Yagni (You Aren’t Gonna Need It) rule, but a few fail-safe checks on security stuff makes me sleep better at night. Here are the two small things I did in this example which will cause an exception if the DataKey isn’t set properly.

Firstly, I added a [Required] attribute to the DataKey property (see below) which tells the database that the DataKey cannot be null. This means if my code fails to set a DataKey then the database will return a constraint error.

[Required] //This means SQL will throw an error if we don't fill it in
[MaxLength(DataAuthConstants.HierarchicalKeySize)]
public string DataKey { get; private set; }

My second fail-safe is also to do with the DataKey, but in this case I’m anticipating a future change to the business rules that could cause problems. The current business rules say that only the users that are directly linked to a RetailOutlet can create new Stock or Sales entries, but what happens if (when!) that business rule changes and divisional managers can create items in a RetailOutlet. The divisional managers don’t have the correct DataKey, but a new developer might miss that and you could “lose” data.

My answer is to add a safely-check to the retail outlet’s DataKey. A retail outlet has a slightly different DataKey format – it ends with a * instead of a |. That means I can check a retail outlet format DataKey is used in the SetShopLevelDataKey and throw an exception if it’s not in the right format. Here is my code that catches this possible problem.

public void SetShopLevelDataKey(string key)
{
    if (key != null && !key.EndsWith("*"))
        //The shop key must end in "*" (or be null and set elsewhere)
        throw new ApplicationException(
             "You tried to set a shop-level DataKey but your key didn't end with *");

    DataKey = key;
}

This is a very small thing, but because I know that change is likely to come and I might not be around it could save someone a lot of head scratching working out why data doesn’t end up in the right place.

Conclusion

Well done for getting to the end of this long article. I could have made the article much shorter if I only dealt with the parts on how to implement data authorization, but I wanted to talk about how handling security issues should affect the way you build your application (what I refer to as have a “robust architecture”).

I have to say that the star feature in my data authorization approach is EF Core’s Query Filters. These Query Filters cover ALL possible EF Core based queries with no exceptions. The Query Filters are the cornerstone of data authorization approach which I then add a few more features to manage user’s DataKeys and do clever things to handle the hierarchical features my client needed.

While you most likely don’t need all the features I included in this example it does give you a look at how far you can push EF Core to get what you want. If you need a nice, simple data authorization example, then please look at the Part 2 article “Handling data authorization in ASP.NET Core and Entity Framework Core” which has a personal data example which uses the User’s Id as the DataKey.