Ever since I started delving into architecture, and specifically service-oriented architecture, there has been one matter on which opinions are divided. Let me state the problem first, and then take a look at both sides of the barricade.
Given that your service layer needs to access persistent storage, how do you model that layer? It is almost common knowledge what to do here: use the Repository design pattern.
So we look at the pattern and decide that it seems simple enough! Let's implement the shit out of it!
Now, let's say that you will use an ORM - here comes trouble. Specifically we're using EF, but we could be talking about NHibernate or really any other. The real divisive theme is this question: should you be using the repository pattern at all when you use an ORM?
I'll flat out say it: I don't think you should... except with good reason.
So, sharpen your swords, pray to your gods and come with me to fight this war... or maybe stay on the couch?
Reasons to use a repository with an ORM
- You want to separate the database access logic from the business logic.
- You want the ability to use different databases. Imagine you have a product that needs to work with both SQL Server and Postgres.
- You want the ability to use multiple ORMs, so you abstract that.
You think about a repository class per entity. You have the ClientRepository, the InvoiceRepository, etc. Then you notice there is a LOT of duplicated code just from exposing a few basic methods. Time to ABSTRACT!
For Create or Update operations it is easy enough, we just have to push them to the DbContext. But what about the Get method? What parameters should we expose here?
You decide to expose the Expression<Func<T,bool>> type which allows you to pass as parameter the whole filter. Fixed? Well you just destroyed argument #1. Now your filter logic is everywhere and you require your developers to be very disciplined in how they use it. Tough life, let's move on. We need to make some hard decisions, right?
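To make the leak concrete, here is a minimal sketch. It is written in TypeScript with an in-memory array standing in for the DbContext, so a plain predicate function plays the role of Expression<Func<T,bool>>. The Invoice shape and the GenericRepository class are hypothetical names made up for this illustration, not anything from EF.

```typescript
// Hypothetical entity, used only for illustration.
interface Invoice {
  id: number;
  clientId: number;
  total: number;
  paid: boolean;
}

// A generic repository whose only read method accepts an arbitrary filter.
// The array is a stand-in for the ORM-backed table.
class GenericRepository<T> {
  private readonly rows: T[];

  constructor(rows: T[]) {
    this.rows = rows;
  }

  get(filter: (row: T) => boolean): T[] {
    return this.rows.filter(filter);
  }
}

const invoices = new GenericRepository<Invoice>([
  { id: 1, clientId: 10, total: 250, paid: false },
  { id: 2, clientId: 10, total: 90, paid: true },
  { id: 3, clientId: 11, total: 400, paid: false },
]);

// One service spells out what a "large unpaid invoice" is...
const unpaidLarge = invoices.get((i) => !i.paid && i.total > 100);

// ...and another service writes its own, slightly different version.
// The repository has no say in either rule.
const unpaidLargeElsewhere = invoices.get(
  (i) => i.paid === false && i.total >= 100,
);
```

Both calls happen to return invoices 1 and 3 today, but the rule now lives in two places and nothing stops them from drifting apart.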
In the future comes the need for ordering. And grouping too. Maybe some paging too? Do you include all that in the generic Get method? Might as well expose the IQueryable<T>! Big mistake because now you have an even leakier abstraction. You are essentially already exposing your DbContext.
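Here is roughly where the generic Get ends up once ordering and paging are bolted on. Again this is only a sketch in TypeScript over an in-memory array; QueryOptions, GenericRepository and Client are hypothetical names, and in EF the equivalent would be composing operators on an IQueryable<T>.

```typescript
// Hypothetical options bag: every new requirement adds another knob.
interface QueryOptions<T> {
  filter?: (row: T) => boolean;
  orderBy?: (a: T, b: T) => number;
  skip?: number;
  take?: number;
}

class GenericRepository<T> {
  private readonly rows: T[];

  constructor(rows: T[]) {
    this.rows = rows;
  }

  get(options: QueryOptions<T> = {}): T[] {
    // filter() returns a new array, so sorting below does not touch this.rows.
    const result = this.rows.filter(options.filter ?? (() => true));
    if (options.orderBy) {
      result.sort(options.orderBy);
    }
    const start = options.skip ?? 0;
    const end = options.take === undefined ? undefined : start + options.take;
    return result.slice(start, end);
  }
}

interface Client {
  name: string;
  active: boolean;
}

const clients = new GenericRepository<Client>([
  { name: "Dana", active: true },
  { name: "Alex", active: true },
  { name: "Cleo", active: true },
  { name: "Bram", active: true },
  { name: "Eli", active: false },
]);

// The caller now dictates filtering, ordering and paging:
// the "repository" is a thin wrapper around whatever the caller asks for.
const secondPage = clients.get({
  filter: (c) => c.active,
  orderBy: (a, b) => a.name.localeCompare(b.name),
  skip: 2,
  take: 2,
});
```

Grouping and projections are still missing, and adding them pushes the options bag even closer to a reimplementation of the query API it was meant to hide.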
Moving on, you realize that to use a different database in EF all you need to do is swap the database provider and change a bit of configuration, such as the connection string. This seems to be similar for other major ORMs, so this advantage is not really given by your repository layer. There goes reason #2.
How about being able to use multiple ORMs? TBH, I never had to do this in my whole career. If you can't do something with your ORM of choice then you go straight to SQL for that operation, but there are other ways of abstracting that without creating a whole new layer in your architecture.
But let's say we ignore all of this and we move on.
If you have a repository per entity (even a generic one), what to do with those queries that must deal with multiple entities? What if you have a query that projects from multiple tables at once, for example the Invoices, the Clients and the Products at the same time? Where should it be?
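A small sketch of such a query, in TypeScript over in-memory arrays rather than real tables. All the names here (Client, Product, Invoice, InvoiceLine, invoiceSummary) are hypothetical and exist only to show that the projection has no natural home in any single per-entity repository.

```typescript
interface Client { id: number; name: string; }
interface Product { id: number; name: string; }
interface Invoice { id: number; clientId: number; productId: number; total: number; }

// The shape the caller actually wants: one row drawing on three entities.
interface InvoiceLine {
  invoiceId: number;
  client: string;
  product: string;
  total: number;
}

// Does this belong to the InvoiceRepository, the ClientRepository,
// or the ProductRepository? None of them is an obvious fit.
function invoiceSummary(
  invoices: Invoice[],
  clients: Client[],
  products: Product[],
): InvoiceLine[] {
  const clientNames = new Map<number, string>(
    clients.map((c): [number, string] => [c.id, c.name]),
  );
  const productNames = new Map<number, string>(
    products.map((p): [number, string] => [p.id, p.name]),
  );
  return invoices.map((i) => ({
    invoiceId: i.id,
    client: clientNames.get(i.clientId) ?? "unknown",
    product: productNames.get(i.productId) ?? "unknown",
    total: i.total,
  }));
}

const summary = invoiceSummary(
  [{ id: 1, clientId: 10, productId: 100, total: 250 }],
  [{ id: 10, name: "Acme" }],
  [{ id: 100, name: "Widget" }],
);
```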
And how do you model such operations in a transactional way? If you are saving data in multiple repositories and you have a DbContext for each one then that means you are possibly using multiple transactions and you will get in trouble.
Enter our friend, the Unit of Work design pattern. The idea is simple. We open a transaction (say, with TransactionScope), record every change we want through the repositories and finally commit the transaction. If any of those changes fails we just roll back the transaction and we're done.
Well that's great, but guess what? EF already implements the Unit of Work pattern and does all of this for you in a DbContext: it tracks your changes and SaveChanges applies them in a single transaction. Again, you don't gain anything here. I think every major ORM does this.
Here are a few more reasons why I don't like to implement the repository pattern when working with ORMs:
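For readers who have not met the pattern, the following is a toy version of the idea, sketched in TypeScript. InMemoryContext and its methods are invented for this example and only mimic the behaviour in memory: changes are queued, then applied all together or not at all. It is not how EF is implemented internally.

```typescript
type Change = () => void;

// Hypothetical context that queues changes and applies them atomically.
class InMemoryContext {
  readonly clients: string[] = [];
  readonly invoices: string[] = [];
  private pending: Change[] = [];

  addClient(name: string): void {
    this.pending.push(() => {
      this.clients.push(name);
    });
  }

  addInvoice(invoiceNumber: string): void {
    this.pending.push(() => {
      if (invoiceNumber === "") {
        throw new Error("invoice number is required");
      }
      this.invoices.push(invoiceNumber);
    });
  }

  // Apply every queued change; if one fails, restore the previous state.
  saveChanges(): void {
    const clientsBefore = [...this.clients];
    const invoicesBefore = [...this.invoices];
    try {
      for (const apply of this.pending) {
        apply();
      }
    } catch (error) {
      this.clients.splice(0, this.clients.length, ...clientsBefore);
      this.invoices.splice(0, this.invoices.length, ...invoicesBefore);
      throw error;
    } finally {
      this.pending = [];
    }
  }
}

const context = new InMemoryContext();

// First unit of work: both changes succeed and are committed together.
context.addClient("Acme");
context.addInvoice("INV-1");
context.saveChanges();

// Second unit of work: the invoice is invalid, so the client is rolled back too.
context.addClient("Globex");
context.addInvoice("");
let secondSaveFailed = false;
try {
  context.saveChanges();
} catch {
  secondSaveFailed = true;
}
```

After the failed save the context still holds only "Acme" and "INV-1", which is the all-or-nothing behaviour you would otherwise be rebuilding by hand on top of your repositories.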
- It is an additional layer of abstraction, which tends to add to the cognitive load you already have when working on significant projects. And it needs to be maintained of course.
- The repository becomes just a pass-through layer to the DbContext adding no value to your architecture.
- We are throwing away the ability to use some of the more advanced features of EF, such as Lazy Loading, accessing the ChangeTracker, etc.
- The queries are also business logic. To me it makes sense to test the query along with everything else.
- It adds significant clutter to things like Unit Testing. Nowadays with EF there are excellent solutions such as the InMemoryProvider specifically designed for this scenario. Do not downplay the difficulty that complex tests can present!
Your mileage may vary of course!
While what you say here is correct, there are ways to make a proper repository over an ORM - use the specification pattern. I think it is not worth it for a regular project but if I ever judge the risk of switching data access methods as high I'd do it that way.
Stilgar, I find that the specification pattern is heavyweight and brings a lot of complexity with it. In my own implementations I found that it becomes confusing fast, and since it is a rarer pattern it adds a lot of cognitive load if one is not familiar with it. It also needs to be very complex to handle operations like grouping, sorting, etc.
At that point you might just give up on using the Specification and simply use custom methods for the queries, of course.
Another problem I found is that I couldn't take advantage of advanced ORM features (which are generally specific to each ORM) unless I exposed them somehow, as an "options" parameter if possible. It's just another very leaky abstraction.
This is where I get off the bus and give up on abstracting the ORM away.
Thank you for the insight!