
The repository's repository



Ever since I started delving into architecture, and specifically service-oriented architecture, there has been one matter on which opinions are divided. Let me state the problem first, and then take a look at both sides of the barricade.

Given that your service layer needs to access persistent storage, how do you model that layer? It is almost common knowledge what to do here: use the Repository design pattern.

So we look at the pattern and decide that it seems simple enough! Let's implement the shit out of it!

Now, let's say that you will use an ORM - here comes trouble. Specifically we're using EF, but we could be talking about NHibernate or really any other. The real divisive theme is this question: should you be using the repository pattern at all when you use an ORM?

I'll flat out say it: I don't think you should... except with good reason.

So, sharpen your swords, pray to your gods and come with me to fight this war... or maybe stay on the couch?

Reasons to use a repository with ORM
  1. You want to separate the database access logic from the business logic.
  2. You want to have the ability to use different databases. Imagine you have a product that needs to work on both SQL Server and Postgres.
  3. You want the ability to use multiple ORMs, so you abstract that.

I'm sure you can find a few more. I've implemented a few architectures and I know that the implementation goes something like this:

You think about a repository class per entity. You have the ClientRepository, the InvoiceRepository, etc. Then you notice there is a LOT of duplicated code just from exposing a few basic methods. Time to ABSTRACT!
For Create or Update operations it is easy enough, we just have to push them to the DbContext. But what about the Get method? What parameters should we expose here?

You decide to expose the Expression<Func<T,bool>> type, which allows you to pass the whole filter as a parameter. Fixed? Well, you just destroyed argument #1. Now your filter logic is everywhere and you require your developers to be very disciplined in how they use it. Tough life, let's move on. We need to make some hard decisions, right?
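To make the problem concrete, here is a minimal sketch of such a generic repository. The names Invoice, Repository<T> and InvoiceService are made up for illustration, and a plain List<T> stands in for the DbContext so that the snippet runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity, for illustration only.
public class Invoice
{
    public int Id { get; set; }
    public decimal Total { get; set; }
    public bool Paid { get; set; }
}

// Hypothetical generic repository. The List<T> is a stand-in for a DbSet<T>.
public class Repository<T> where T : class
{
    private readonly List<T> _store = new List<T>();

    public void Add(T entity) => _store.Add(entity);

    // The caller supplies the whole filter as an expression tree.
    public IReadOnlyList<T> Get(Expression<Func<T, bool>> filter) =>
        _store.AsQueryable().Where(filter).ToList();
}

public static class InvoiceService
{
    // The rule for "unpaid and large" is written here, in the service layer,
    // and not behind the repository.
    public static IReadOnlyList<Invoice> UnpaidLargeInvoices(Repository<Invoice> repo) =>
        repo.Get(i => !i.Paid && i.Total > 1000m);
}
```

The repository has no idea what "unpaid and large" means. Every caller writes its own filter, so the data access logic ends up wherever Get is called.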

In the future comes the need for ordering. And grouping too. Maybe some paging too? Do you include all that in the generic Get method? Might as well expose the IQueryable<T>! Big mistake because now you have an even leakier abstraction. You are essentially already exposing your DbContext.
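Here is a rough sketch of where that road ends. Client, QueryableRepository<T> and ClientService are hypothetical names, and again an in-memory list plays the part of the database.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, for illustration only.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical repository that gives up and hands out IQueryable<T>.
public class QueryableRepository<T> where T : class
{
    private readonly List<T> _store;

    public QueryableRepository(IEnumerable<T> seed) => _store = seed.ToList();

    // At this point the repository is little more than a renamed DbSet<T>.
    public IQueryable<T> Query() => _store.AsQueryable();
}

public static class ClientService
{
    // Ordering, paging and projection are all composed by the caller.
    public static List<string> ClientNamesPage(
        QueryableRepository<Client> repo, int page, int pageSize) =>
        repo.Query()
            .OrderBy(c => c.Name)
            .Skip((page - 1) * pageSize)
            .Take(pageSize)
            .Select(c => c.Name)
            .ToList();
}
```

Everything interesting happens in the service method. The repository contributes nothing except an extra type to maintain.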

Moving on, you realize that to use different databases in EF all you need to do is change a few configurations. This seems to be similar for other major ORMs, so this advantage is not really given by your repository layer. There goes reason #2.
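As a sketch of what "a few configurations" can look like, assuming EF Core with the Microsoft.EntityFrameworkCore.SqlServer and Npgsql.EntityFrameworkCore.PostgreSQL packages installed. The DbConfig helper and the provider strings are made up for this example.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical helper: picks the database provider from a configuration value.
public static class DbConfig
{
    public static DbContextOptions<TContext> Build<TContext>(
        string provider, string connectionString) where TContext : DbContext
    {
        var builder = new DbContextOptionsBuilder<TContext>();
        return provider switch
        {
            // UseSqlServer comes from Microsoft.EntityFrameworkCore.SqlServer.
            "SqlServer" => builder.UseSqlServer(connectionString).Options,
            // UseNpgsql comes from Npgsql.EntityFrameworkCore.PostgreSQL.
            "Postgres" => builder.UseNpgsql(connectionString).Options,
            _ => throw new ArgumentException(
                $"Unknown provider '{provider}'", nameof(provider))
        };
    }
}
```

The entities and queries stay the same. Only the options handed to the DbContext change, although provider-specific SQL and migrations may still need attention.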

How about being able to use multiple ORMs? TBH, I never had to do this in my whole career. If you can't do something with your ORM of choice then you go straight to SQL for that operation, but there are other ways of abstracting that without creating a whole new layer in your architecture.

But let's say we ignore all of this and we move on.



If you have a repository per entity (even a generic one), what do you do with queries that must deal with multiple entities? What if you have a query that projects from multiple tables at once, for example the Invoices, the Clients and the Products at the same time? Where should it live?
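To illustrate the kind of query that has no obvious home, here is a small sketch. The record types and the InvoiceQueries class are hypothetical, and the IQueryable<T> parameters stand in for what would be DbSets on a context.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row and result types, for illustration only.
public record ClientRow(int Id, string Name);
public record ProductRow(int Id, string Title);
public record InvoiceRow(int Id, int ClientId, int ProductId, decimal Total);
public record InvoiceSummary(int InvoiceId, string ClientName, string ProductTitle, decimal Total);

public static class InvoiceQueries
{
    // One projection over three sources. It belongs to none of
    // InvoiceRepository, ClientRepository or ProductRepository.
    public static List<InvoiceSummary> Summaries(
        IQueryable<InvoiceRow> invoices,
        IQueryable<ClientRow> clients,
        IQueryable<ProductRow> products) =>
        (from i in invoices
         join c in clients on i.ClientId equals c.Id
         join p in products on i.ProductId equals p.Id
         orderby i.Id
         select new InvoiceSummary(i.Id, c.Name, p.Title, i.Total)).ToList();
}
```

With a DbContext all three sets are available in one place, so the query can simply be written where it is needed.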

And how do you model such operations in a transactional way? If you are saving data in multiple repositories and you have a DbContext for each one then that means you are possibly using multiple transactions and you will get in trouble.

Enter our friend, the Unit of Work design pattern. The idea is simple. We open a transaction (say, with TransactionScope), record every change we want in the repositories and finally commit the transaction. If any of those changes fails we just roll back the transaction and we're done.

Well that's great, but guess what? EF already implements the Unit of Work pattern and does all of this for you in a DbContext: changes are tracked on the context and committed together when you call SaveChanges. Again, you don't gain anything here. I think every major ORM does this.
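A minimal sketch of the context acting as the unit of work, assuming EF Core (the Microsoft.EntityFrameworkCore package). Customer, Order, ShopContext and Billing are names invented for this example.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entities, for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public Customer? Customer { get; set; }
    public decimal Total { get; set; }
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }

    public DbSet<Customer> Customers => Set<Customer>();
    public DbSet<Order> Orders => Set<Order>();
}

public static class Billing
{
    // Both inserts are tracked by the same context and written by one
    // SaveChanges call. On providers that support transactions, EF Core
    // wraps that call in a transaction by default.
    public static int CreateCustomerWithOrder(ShopContext db, string name, decimal total)
    {
        var customer = new Customer { Name = name };
        db.Customers.Add(customer);
        db.Orders.Add(new Order { Customer = customer, Total = total });
        return db.SaveChanges(); // number of entities written
    }
}
```

There is no separate unit-of-work class and no per-entity repository here. The context tracks both entities, and if the save fails on a transactional provider, neither row is written.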

Here are a few more reasons why I don't like to implement the repository pattern when working with ORMs:

  • It is an additional layer of abstraction, which tends to add to the cognitive load you already have when working on significant projects. And it needs to be maintained of course.
  • The repository becomes just a pass-through layer to the DbContext adding no value to your architecture.
  • We are throwing away the ability to use some more advanced functionalities of EF, such as Lazy Loading, accessing the ChangeTracker, etc. 
  • The queries are also business logic. To me it makes sense to test the query along with everything else.
  • It adds significant clutter to things like unit testing. Nowadays with EF there are excellent solutions such as the InMemory provider, which is designed specifically for this scenario. Do not downplay the difficulty that complex tests can present!

I think I made my point clear that doing repository + ORM is over-architecting your code, but I have definitely read a lot of literature that says otherwise. Again, the scenario I presented is one I found myself in, and I never looked back once I decided to get rid of it.

Your mileage may vary of course!


Comments

  1. While what you say here is correct, there are ways to make a proper repository over an ORM - use the specification pattern. I think it is not worth it for a regular project but if I ever judge the risk of switching data access methods as high I'd do it that way.

  2. Stilgar, I find that the specification pattern is heavyweight and brings a lot of complexity with it. In my own implementations I found that it becomes confusing fast, and since it is a rarer pattern it adds a lot of cognitive load for anyone who is not familiar with it. It also needs to be very complex to handle operations like grouping, sorting, etc.

    At that point you might just give up on using the Specification and simply use custom methods for the queries, of course.

    Another problem I found is that I couldn't take advantage of advanced ORM features (which are generally specific to each ORM) unless I exposed them somehow, for example as an "options" parameter where possible. It's just another very leaky abstraction.
    This is where I get off the bus and give up on abstracting the ORM away.

    Thank you for the insight!
