I have been building a new application using GenericServices and I thought it would be useful to others to reflect on the approaches I have used when working with more complex data. In this second master class article I am looking at deleting entries, i.e. data rows, in a SQL Server database using Entity Framework. The master class first article looked at creating and updating entries and can be found here.
I wrote this article because the more I used GenericServices IDeleteService in real applications I ended up using the more complex DeleteWithRelationships version of the command. This article says why and suggests some ways of tackling things, including not deleting at all!
What is GenericServices??
For those of you who are not familiar with GenericServices it is an open-source .NET library designed to simplify the interface between an ASP.NET MVC application and a database handled by Microsoft’s Entity Framework (EF) data access technology. You can read more about it at https://github.com/JonPSmith/GenericServices where the Readme file has links to two example web sites. The GenericServices’ Wiki also has lots more information about the library.
While this article talks about GenericServices I think anyone who uses EF might find it useful as it deals with the basics behind deleting entities.
An introduction to relationships and deletion
Note: If you already understand the different types of referential relations in a relational database and how they affect deletion then you can skip this part and go to the section with title Deleting with GenericServices bookmark.
On the face of it deleting rows of data, known in EF at entities, seems like a simple action. In practice in relational databases like SQL Server it can have far reaching consequences, some of which might not be obvious at first glance when using EF. Let me give you a simple example to show some different aspects to delete.
Below is a database diagram of the data used in the example application SampleMvcWebApp, which contains:
- A selection of Tags, e.g. Programming, My Opinion, Entertainment, etc. that can be applied to a post.
- A list of Posts, where each is a article on some specific topic.
- A list of Bloggers, which are the authors of each of the Posts.
Now this example shows the two most used type of relationships we can have in a relational database. They are:
- One-to-many (1-*): A Post has one Blogger, and a Blogger can have from zero to many Posts.
- Many-to-many (*-*). A Post can have zero to many Tags, and Tags may be used in zero to many Posts.
The other type of basic relationship is a one-to-one relationship. This is where one entity/table is connected to another. You don’t see this type of relationship so much as if both ends are required (see below) then they can be combined. However you do see one-on-one relationships where one end is optional.
So the other aspect to the ‘One’ part of a relationship is whether it is required or optional (required: 1, optional: 0..1). An example of a required relationship is that the Post must have a Blogger, as that defines the author. An example of an optional relationship would allowing the Blogger to add an optional ‘more about the author’ entry in another table. The Author can choose to set up that data, or not bother.
How EF models these relationships
EF has multiple ways of setting up relationships and I would refer you to their documentation. Here is my super simple explanation to help you relate the section above to Entity Framework classes:
- The ‘One’ end of a relationship has a property to hold the key (or multiple properties if a composite key).
- If the key property(ies) is/are nullable, e.g. int? or string without [Required] attribute, then the relationship is optional.
- If the key property(ies) is/are not nullable, e.g. int or string with [Required] attribute, then the relationship is required. See the BlogId property in the Post class of SampleMvcWebApp.
- The ‘Many’ end of a relationship is represented by a collection, normally ICollection<T> where T is the class of the other end of the relationship. See the Posts property in the Blog class of SampleMvcWebApp.
- Many-to-Many relationships have Collections at each end, see the Tags property in the Post class and the Posts property in the Tag Class of SampleMvcWebApp.
EF is very clever on many-to-many relationships and automatically creates a new table that links the two classes. See my article Updating a many to many relationship in entity framework for a detailed look at how that works.
How these relationships affect Deletion?
If you delete something in a relational database that has some sort of relationship then it is going to affect the other parts of the relationship. Sometimes the consequences are so small that they don’t cause a problem. However, especially in one-to-one or one-to-many relationships, the effect of a delete does have consequences that you need to think about. Again, let me give you two examples you can actually try yourself on the SampleMvcWebApp web site.
- Delete a Many-to-Many relationship. If you go to the Tags Page of SampleMvcWebApp and delete a Tag then when you look at the list of Posts then you will see that that tag has been removed from all Posts that used it (Now press Reset Blogs Data to get it back).
- Delete a One-to-Many. However if we go to the list of Bloggers on SampleMvcWebApp and delete one of the Bloggers then when you look at the list of Posts you will see that all Posts by that author have gone. (Now press Reset Blogs Data to get them all back).
So, what has happened on the second one? Well, in this case the database could not keep the relationship without doing something because the Post’s Blog link wasn’t optional. There were two possibilities: either it could delete all the Post for that Blogger or it could have refused to carry out the delete.
By default EF sets up what is called ‘cascade deletes‘, which is how SampleMvcWebApp is set up. In this case is what deleted the related Posts for that Blogger. If we turned off cascade deletes then the we would get a ‘DELETE statement conflicted with COLUMN REFERENCE’ (547) and the delete would fail.
The simplified rule is if entity A is involved in a required relationship with entity B then when A is deleted something has to happen: either B is deleted or the delete fails. Which happens depends on how you configure EF.
Deleting with GenericServices
GenericServices has a service called IDeleteService which has two delete methods:
- Delete<T>(key1, key2…) (sync and async) which deletes the row with the given key, or keys if it has a composite key, from the EF entity referred to by class T, e.g. Delete<Post>(1) would delete the Post with the key of 1.
- DeleteWithRelationships<T>(removeRelationshipsFunc, key1, key2…) (sync and async) which does the the same, but called the removeRelationshipsFunc as part of the delete.
I am not going to detail how to use them as the GenericServices Wiki has a good description.You can also find an example of the use of Delete<T> at line 121 in PostsController and an example of the use of DeleteWithRelationships<T> at line 218 in CustomerController.
The only other thing I would say is that deleting entries with composite keys is straightforward – just supply the keys in the order in which they occur in the database. Note: some programmer don’t like composite keys, but I do find them useful. There are places where composite keys are good at segregating data into groups: the primary key can be the group name, the secondary key is the name of the item itself.
When I first wrote GenericServices I just had a Delete<T> method. I very soon found that wasn’t enough, so I added DeleteWithRelationships<T>. Now I find I am using DeleteWithRelationships 80% of the time.
The rest of the article is about why I use DeleteWithRelationships so much, and a few pointers on alternatives to using Delete at all.
Why I use DeleteWithRelationships so much
I have used DeleteWithRelationships in three situations that I will describe:
- To provide better error messages when a delete would fail.
- To delete other associated entities that would not be caught be cascade deletes.
- To delete associated files etc. outside the relational database.
1. To provide better error messages when a delete would fail
I have many instances where I don’t want cascade deletes to work, so if I deleted I would get a SQL error 547. While GenericServices catches this and provides a fairly helpful error it isn’t that informative. I therefore often (actually, nearly always) use DeleteWithRelationships to provide a better error message. Let me give you an example.
In a web site I was building designers could set up designs with text fields. Each field had a ColourSpec, which is a database entity. I allowed a designer to delete a ColourSpec as long as that colour isn’t used in any of the text fields. If it is used then I output a list of designs where it is used so that the designer can decide if they want to remove those reference and try the delete again.
2. To delete other associated entities that would not be caught be cascade deletes
This happens rarely, but sometimes I have a complex relationship that needs more work. I found one in the AdvertureWorks database that I use in my example application complex.samplemvcwebapp.net. In this the customer address consists of two parts: the CustomerAddress with has a relationship to an Address. Now, if we want to delete one of the customer’s addresses, say because they have closed that office, we want to try and delete the associated Address part, which isn’t linked by cascade deletes.
By using DeleteWithRelationships I can pick up the associated Address relationship and delete that too. In fact I might need to do a bit more work to check if I can delete it as it might be references in an old order. By calling a method which I can write specifically for this case then I insert special logic into the delete service.
NOTE: because the delete is done by GenericServices and any extra delete done in the DeleteWithRelationships method are all executed in one commit. That means if either part of the delete fails then both are rolled bakc, which is exactly what you need.
3. To delete associated files etc. outside the relational database
In a recent web application the system included image files. I chose to store then in the Azure blob storage, which I have to say works extremely well. The database stored all the information about the image, include a url to access the image from the blob, while the images, which can be very large were stored in a blob table.
Now, when I delete the database entity about the image I will end up with an orphan image in the blob storage. I could have a WebJob that runs in the background to delete orphan images, but that is complicated. What I did do was add code to the DeleteWithRelationships to delete the images as part of the delete process.
There is obviously a danger here. Because the database and the blob are not connected then there is no combined ‘commit’, i.e. I might delete the images and the SQL database delete might then fail. Then my url links don’t point to a valid image. In this case I have made a pragmatic choice: I check that the delete should work by checking all the SQL relationships before I delete the blob images. I could get a failure at the SQL end which would make things go wrong, but other parts of the system are designed to be resilient to not getting an image so I accept that small possibility for a simpler system. Note that if I fail to delete the images from the blob I don’t stop the SQL delete – I just end up with orphan images.
An alternative to Delete
In working web applications I find it is sometimes better not to delete important entities, especially if I want to keep a history of orders or usage. In these cases, which happen a lot in live systems, I am now adding an enum ‘Status’ to my key entities which has an ‘Archive’ setting. So, instead of deleting something I set the Status to Archive.
All I have to do to make this useful is filter which entities are shown based on this Status flag. That way entries with a Status setting of ‘Archive’ are not shown on the normal displays, but can be seen in audit trails or admin screens.
I needed a Status flag anyway, so there is little extra overhead to providing this on my key entities. Also, in EF 6 you can now set a property to have a index, so the query is fairly quick. You need to think about your application, but maybe this would work for you too.
Note: See the comment below by Anders Baumann, with a link to an article called ‘Don’t Delete – Just Don’t‘ for a well formed argument for not using delete.
Conclusion
Deleting looks so simple, but my experience is that it is often more complicated than you think. Hopefully this article gives you both the background on why and the detail on how to use GenericServices to delete entities. I hope its helpful.
Happy coding!
Interesting article, but in general it is not a good idea to delete anything from the DB. Better to change status of the entity: http://www.udidahan.com/2009/09/01/dont-delete-just-dont/
Hi Anders,
I am coming to the same conclusion – see the last section of the article entitled ‘An alternative to Delete’. However I thought the whole topic of delete was worthy of discussion as I have found it much less straight forward than it looks.
I have added a comment to the article pointing to the article you mention.