Your ORM “Model” is not your Domain Model
There’s been a lot of talk in our company lately about ORMs and the best data access strategy. As .NET developers, we’ve grown accustomed to using an ORM (Linq2Sql, Entity Framework or NHibernate; I assume the same applies to non-.NET ORMs) as our domain model. Sometimes the ORM classes are extended and customized, but at the root the ORM “model” always represents the “domain model” of the application. I’ve even seen a couple of guides saying that an ORM negates the need for a data layer: the ORM is your data layer.
When ORMs went mainstream in the .NET world with the release of Linq2Sql, it completely blew my mind. I learned later that it was not a new concept, with patterns like ActiveRecord being widely applied in other languages, but it was new to me. For my whole development career up to that point (admittedly quite short) I had manually mapped relational database queries to their object counterparts using repetitive, ugly and cumbersome code. I clearly remember watching a video by Anders Hejlsberg, the main developer behind C#, about the mismatch between relational data and classes and objects. The problem presented was exactly the problem I faced. The solution presented was perfect. You could now treat your database rows, with all their related data, as native CLR objects and child objects. “Mindblowingly awesome” does not begin to describe it.
The benefits of going this route seem endless. Refactoring becomes easier. You reduce boring, repetitive boilerplate mapping code. It’s even, dare I say it, elegant.
Since then, using ORM models (hand-coded POCOs or generated code) as the domain model of an application has become commonplace. You’d find it in tutorials and guides from reputable websites, and everyone was doing it, so it must be best practice, right? It seemed so for a while, but with hard experience, issues started to creep in.
The dreaded Select N + 1 problem
If the phrase “Select N + 1” is new to you, it is worth reading up on: one query loads N parent rows, and then one more query runs per parent to load its children. The main problem with an ORM model is that it gives you the illusion of an always-available hierarchy of objects that you can call at will. You can loop over an object’s child properties, and they’ll just be there, thanks to the lazy-loading capabilities of most ORMs.
This means that if you don’t consider the ORM when writing domain logic code, you make an ever-increasing number of trips to the database, one for each child entity. The result can’t be cached either, because the rest of the application depends on that lazy-loading, get-it-if-I-need-it capability always being there. The problem compounds when you have child entities numbering in the thousands.
Of course, you can use a LoadWith() method or something similar to specify which child entities should be loaded with the main object to solve this problem to some degree.
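To make the difference concrete, here is a minimal sketch of the two access patterns. It is written in Python against an in-memory SQLite database rather than a .NET ORM, and the table names, the run() helper and the query counter are all made up for illustration. The "lazy" function imitates what a lazy-loading ORM does behind the scenes; the "eager" function imitates loading the children together with the parents in one query.

```python
import sqlite3

# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

query_count = 0  # number of round trips to the database


def run(sql, params=()):
    """Execute one query and count it as one trip to the database."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def totals_lazy():
    """Select N + 1: one query for the customers, then one per customer."""
    result = {}
    for cust_id, name in run("SELECT id, name FROM customer ORDER BY id"):
        rows = run("SELECT total FROM orders WHERE customer_id = ?", (cust_id,))
        result[name] = sum((total for (total,) in rows), 0.0)
    return result


def totals_eager():
    """Eager load: customers and their orders fetched in a single query."""
    result = {}
    rows = run(
        "SELECT c.name, o.total FROM customer c "
        "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
    )
    for name, total in rows:
        # Customers without orders come back with total = NULL (None).
        result[name] = result.get(name, 0.0) + (total or 0.0)
    return result


def count_queries(fn):
    """Run fn and report (its result, how many queries it issued)."""
    global query_count
    query_count = 0
    result = fn()
    return result, query_count
```

With three customers the lazy version issues four queries and the eager version issues one, for the same result. With thousands of parents the gap grows linearly, which is the problem described above.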
Blurring of application layers
But even if your data layer is optimised to load child entities when needed, the unloaded child entities are still exposed as properties of your ‘main’ entity object, which invites another developer to touch them and trigger a load wherever it happens to be convenient. And having to think about these issues from the higher layers defeats the purpose of treating your data model as native CLR objects in the first place.
Your ORM model is not your Domain Model
The conclusion I’ve come to is that, except in very simple cases, treating your ORM “model” as your domain model is a fundamentally bad idea. Your ORM model is a very convenient representation of your database, nothing more. Sure, you can abstract a little bit using inheritance hierarchies and class extensions, and configure your Repository to load the child entities you might need, but underneath it all, you’re still working with a relational database and SQL queries. Using an immediate-feedback tool like MiniProfiler to see what goes on in the database makes that uncomfortably obvious.
Of course, chucking out the ORM pattern completely deprives you of many benefits. An ORM gives you type-safe access to the database with no use of magic strings. In a statically typed language such as C#, when you change the data model the compiler tells you where else you need to make changes. And it just makes data-access code that much easier to write.
A solution?
So what I’ve started to do with new projects is to treat the ORM as a data-specific layer, similar to how you would treat data consumed from a web service. There’s a separate layer dealing with the mapping logic, and you never expose the underlying data model. It creates some extra work, but makes the rest of the application easier to write and more reliable. It also means that if you decide to use stored procedures or some other custom data access code, the change is hidden from the rest of the application.
For anything vaguely complex, I prefer the second of the following two patterns.
ORM-centered pattern:
(ORM + business logic) -> (UI/ORM mapping) -> UI
ORM-ignorant pattern:
ORM model -> (ORM/Domain model mapping) -> Domain model + business logic -> (UI/Domain model mapping) -> UI
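The ORM-ignorant pattern might look roughly like the sketch below. It is in Python rather than C#, and every name in it (CustomerRow, OrderRow, Customer, Order, to_domain) is hypothetical. CustomerRow is a hand-written stand-in for an ORM entity with a lazy-loaded child collection; it is not the API of any real ORM. The point is only that the mapping layer is the single place that touches the lazy property, and that what comes out is a plain, fully loaded object with no path back to the database.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


# --- Data layer: stand-ins for ORM entities (hypothetical) ---
class OrderRow:
    def __init__(self, order_id: int, total: float):
        self.order_id = order_id
        self.total = total


class CustomerRow:
    """Imitates an ORM entity whose child collection is lazy-loaded."""

    def __init__(self, customer_id: int, name: str,
                 load_orders: Callable[[], List[OrderRow]]):
        self.customer_id = customer_id
        self.name = name
        self._load_orders = load_orders
        self.loads = 0  # each access below stands for one database trip

    @property
    def orders(self) -> List[OrderRow]:
        self.loads += 1
        return self._load_orders()


# --- Domain layer: plain immutable objects plus business logic ---
@dataclass(frozen=True)
class Order:
    total: float


@dataclass(frozen=True)
class Customer:
    name: str
    orders: Tuple[Order, ...]

    def lifetime_value(self) -> float:
        # Works on data already in memory; cannot trigger a query.
        return sum(order.total for order in self.orders)


# --- Mapping layer: the only code that sees both models ---
def to_domain(row: CustomerRow) -> Customer:
    orders = tuple(Order(total=o.total) for o in row.orders)  # single access
    return Customer(name=row.name, orders=orders)


# Usage: map once at the boundary, then work only with the domain object.
row = CustomerRow(1, "Ann", lambda: [OrderRow(1, 10.0), OrderRow(2, 5.0)])
customer = to_domain(row)
print(customer.lifetime_value())
```

Because Customer holds no reference to the row or to the data context, it can be cached or handed to the UI mapping without anyone being able to trigger a lazy load by accident. The cost is the extra mapping code, which is the trade-off described above.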
Nice food for thought. I agree from a purist point of view.
When looking at Entity Framework though, your model is mostly independent from your storage layer (apart from the need to query, at times, by telling the data store what to eager load).
So I don’t feel *too* bad treating my Entity Framework model as my Domain Model.
I do have a problem, however, with treating a Linq2Sql model as a domain model. That’s going too far, i.e. mixing active record with domain model. Eugh.
A few points to consider: your ORM is just an infrastructure concern. Your domain model, in pure DDD, should really be persistence agnostic. Your domain shouldn’t really care if you are saving to a relational db, the file system, xml or an in-memory collection.
That’s where patterns like Repository pattern and Unit of Work come into play. There shouldn’t really be any talking to the ORM in your model.
It is not really an “ORM pattern” though.
The problem sometimes with the designers in tools like EF is that devs create the database, then drag around some tables and classes and assume they have a “domain”. Clearly that is not the case, and I always encourage design to start with the domain model and then map that to a schema when done.
Sam, sounds like you are saying exactly what John is suggesting? Map from ORM model to your actual domain model, so keep them completely separate?
Yip, they are not one and the same. I would always start by creating my domain model first as opposed to the DB schema first. It makes a big difference being able to talk about the actual business domain as opposed to tables. We don’t care about tables, we care about entities, and they may be stored in a bunch of different places, from a relational db to a document db etc.
(Beware anti-DDD-purist comments below)
Having said that, if you know you are always going to use a relational db for persistence, you can get great benefits from letting your ORM leak into your model. There are often powerful and useful ORM-provider-specific features that would be lost when developing against a set of interfaces exposing only the lowest common denominator, like IRepository.
@Willem: I agree that EF/NH makes your model cleaner in that you don’t use generated code. The problem is (as mentioned in the post) that e.g. at Controller level in an MVC app you still have access to the related entities as properties of the main entity. This applies to POCOs as well as Linq2Sql-generated classes.
And if it’s there, it can be used. If the entity was loaded from cache, it’ll break (because the related entities were not loaded pre-caching, and it’s not bound to the DataContext). If it wasn’t loaded from cache, it’ll cause the select n+1 issue. This is the main reason I’m going for the additional layer approach.
@avesse I actually mark almost all my associations and collections as not-lazy to ensure that each method that returns entities will eager-load only what is needed for that request.
This ensures your model is not being called from your view (it is the MVC pattern, after all) and avoids issues like select n+1. Normally I would transform the entities into a simple DTO and cache the DTO, not the entities.
I normally profile every call to see exactly what is being returned and what sql is generated. Lazy loading is great in desktop apps or when the db is close to the app, but I don’t feel it works in web apps.