The repository's repository



Ever since I started delving into architecture, and specifically service-oriented architecture, there has been one matter where opinions are divided. Let me state the problem first, and then take a look at both sides of the barricade.

Given that your service layer needs to access persistent storage, how do you model that layer? It is almost common knowledge what to do here: use the Repository design pattern.

So we look at the pattern and decide that it seems simple enough! Let's implement the shit out of it!

Now, let's say that you will use an ORM - here comes trouble. Specifically we're using EF, but we could be talking about NHibernate or really any other. The real divisive theme is this question: should you be using the repository pattern at all when you use an ORM?

I'll flat out say it: I don't think you should... except with good reason.

So, sharpen your swords, pray to your gods and come with me to fight this war... or maybe stay on the couch?

Reasons to use a repository with ORM
  1. You want to separate the database access logic from the business logic.
  2. You want to have the ability to use different databases - imagine you have a product that needs to work with both SQL Server and Postgres.
  3. You want the ability to use multiple ORMs, so you abstract that.
I'm sure you can find a few more. I've implemented a few architectures and I know that the implementation goes something like this:

You think about a repository class per Entity. You have the ClientRepository, the InvoiceRepository, etc. Then you notice there is a LOT of duplicated code just by exposing a few basic methods. Time to ABSTRACT!
For Create or Update operations it is easy enough, we just have to push them to the DbContext. But what about the Get method? What parameters should we expose here?

You decide to expose the Expression<Func<T,bool>> type, which allows you to pass the whole filter as a parameter. Fixed? Well, you just destroyed reason #1. Now your filter logic is everywhere, and you require your developers to be very disciplined in how they use it. Tough life, let's move on. We need to make some hard decisions, right?
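To make the problem concrete, here is a minimal sketch of such a generic repository. It is written in Python rather than C# to stay self-contained, a plain list stands in for the DbContext, and every name in it (GenericRepository, Invoice, the fields) is hypothetical. The point is only to show where the filter logic ends up.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class GenericRepository(Generic[T]):
    """Hypothetical generic repository; a plain list stands in for the DbContext."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def add(self, item: T) -> None:
        self._items.append(item)

    def get(self, predicate: Callable[[T], bool]) -> list[T]:
        # The caller supplies the whole filter; the repository knows nothing about it.
        return [item for item in self._items if predicate(item)]


@dataclass
class Invoice:
    number: int
    client: str
    paid: bool


invoices: GenericRepository[Invoice] = GenericRepository()
invoices.add(Invoice(1, "Acme", paid=False))
invoices.add(Invoice(2, "Acme", paid=True))
invoices.add(Invoice(3, "Globex", paid=False))

# Two unrelated callers each re-encode the "unpaid" rule in their own filter.
unpaid_for_report = invoices.get(lambda i: not i.paid)
unpaid_for_acme = invoices.get(lambda i: i.client == "Acme" and not i.paid)
```

Both call sites carry their own copy of what "unpaid" means, so the data access rule now lives in the business code instead of behind the repository.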

In the future comes the need for ordering. And grouping too. Maybe some paging too? Do you include all that in the generic Get method? Might as well expose the IQueryable<T>! Big mistake because now you have an even leakier abstraction. You are essentially already exposing your DbContext.
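As a rough illustration of that growth, this is what the generic Get tends to look like once ordering and paging are bolted on. Again this is a Python sketch with a list standing in for the database; the function, its parameters and the Client type are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Client:
    name: str
    country: str


def get(
    items: list[Client],
    predicate: Callable[[Client], bool],
    order_by: Optional[Callable[[Client], Any]] = None,
    descending: bool = False,
    skip: int = 0,
    take: Optional[int] = None,
) -> list[Client]:
    """Hypothetical repository Get after ordering and paging were added."""
    result = [c for c in items if predicate(c)]
    if order_by is not None:
        result = sorted(result, key=order_by, reverse=descending)
    end = None if take is None else skip + take
    return result[skip:end]


clients = [
    Client("Zed", "PT"),
    Client("Ana", "PT"),
    Client("Bob", "US"),
    Client("Rui", "PT"),
]

# Second page (page size 1) of the Portuguese clients, ordered by name.
page = get(clients, lambda c: c.country == "PT", order_by=lambda c: c.name, skip=1, take=1)
```

Every new query feature becomes another parameter, and grouping or projections would still not fit. At that point the signature is a poor copy of the query API it was supposed to hide.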

Moving on, you realize that to use different databases in EF all you need to do is change a few configurations. This seems to be similar for other major ORMs, so this advantage is not really given by your repository layer. There goes reason #2.

How about being able to use multiple ORMs? TBH, I never had to do this in my whole career. If you can't do something with your ORM of choice then you go straight to SQL for that operation, but there are other ways of abstracting that without creating a whole new layer in your architecture.

But let's say we ignore all of this and we move on.



If you have a repository per entity (even a generic one), what to do with those queries that must deal with multiple entities? What if you have a query that projects from multiple tables at once, for example the Invoices, the Clients and the Products at the same time? Where should it be?

And how do you model such operations in a transactional way? If you are saving data in multiple repositories and you have a DbContext for each one then that means you are possibly using multiple transactions and you will get in trouble.

Enter our friend, the Unit Of Work design pattern. The idea is simple. We open a transaction (say, TransactionScope), record every change we want in the repositories and finally we commit the transaction. If any of those fails we just roll back the transaction and we're done.
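The all-or-nothing idea can be sketched in a few lines. This is a toy in Python, not how TransactionScope or any ORM implements it: a dictionary of lists plays the role of the database, and the UnitOfWork class and its methods are hypothetical names.

```python
class UnitOfWork:
    """Hypothetical unit of work: stages changes and applies them all or not at all."""

    def __init__(self, store: dict[str, list[str]]) -> None:
        self._store = store
        self._pending: list[tuple[str, str]] = []

    def add(self, table: str, row: str) -> None:
        # Record the change only; nothing touches the store until commit().
        self._pending.append((table, row))

    def commit(self) -> None:
        # Work on a copy so that a failure leaves the real store untouched.
        staged = {table: list(rows) for table, rows in self._store.items()}
        for table, row in self._pending:
            if table not in staged:
                raise KeyError(f"unknown table: {table}")
            staged[table].append(row)
        # Every change succeeded, so publish them together.
        self._store.clear()
        self._store.update(staged)
        self._pending.clear()

    def rollback(self) -> None:
        # Discard everything that was recorded but not committed.
        self._pending.clear()


store = {"clients": [], "invoices": []}

ok = UnitOfWork(store)
ok.add("clients", "Acme")
ok.add("invoices", "INV-1")
ok.commit()

failing = UnitOfWork(store)
failing.add("invoices", "INV-2")
failing.add("products", "Widget")  # no such table, so commit() raises
try:
    failing.commit()
except KeyError:
    failing.rollback()
```

The second unit of work fails on its last change, so neither of its changes reaches the store, while the first one is applied in full.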

Well that's great, but guess what? EF is also implementing the Unit of Work pattern and does all of this for you in a DbContext. Again, you don't gain anything here. I think every major ORM does this.

Here are a few more reasons why I don't like to implement the repository pattern when working with ORMs:

  • It is an additional layer of abstraction, which tends to add to the cognitive load you already have when working on significant projects. And it needs to be maintained of course.
  • The repository becomes just a pass-through layer to the DbContext adding no value to your architecture.
  • We are throwing away the ability to use some more advanced functionalities of EF, such as Lazy Loading, accessing the ChangeTracker, etc. 
  • The queries are also business logic. To me it makes sense to test the query along with everything else.
  • It adds significant clutter to things like unit testing. Nowadays with EF there are excellent solutions, such as the in-memory database provider, designed specifically for this scenario. Do not downplay the difficulty that complex tests can present!
I think I made my point clear that doing repository + ORM is over-architecting your code, but I have definitely read a lot of literature that says otherwise. Again, the scenario I presented is one I found myself in and I never looked back when I decided to get rid of it.

Your mileage may vary of course!


Comments

  1. While what you say here is correct, there are ways to make a proper repository over an ORM - use the specification pattern. I think it is not worth it for a regular project but if I ever judge the risk of switching data access methods as high I'd do it that way.

  2. Stilgar, I find that the specification pattern is a heavyweight that brings a lot of associated complexity. In my own implementations I found that it becomes confusing fast, and since it is a rarer pattern it adds a lot of cognitive load if one is not familiar with it. It also needs to be very complex to handle operations like grouping, sorting, etc.

    At that point you might just give up on using the Specification and simply use custom methods for the queries, of course.

    Another problem I found is that I couldn't take advantage of advanced ORM features (which are generally specific to each ORM) unless I exposed them somehow, for example as an "options" parameter where possible. It's just another very leaky abstraction.
    This is where I get off the bus and give up on abstracting the ORM away.

    Thank you for the insight!
