I tweeted about ORM last week, and since then several people have asked me to clarify what I meant. I have actually previously written about ORM, but it was in the context of a larger discussion about SQL and I shouldn’t have confused the two issues. So here I’m going to focus on ORM itself. I’m also going to try to be very brief, since it became very apparent from my SQL article that people tend to stop reading at the first sentence that makes them angry (and then leave a comment about it, whether or not their point is addressed later on).
What’s an anti-pattern?
I was pleased to discover that Wikipedia has a comprehensive list of anti-patterns, both from within the world of programming and outside of it. The reason I call ORM an anti-pattern is because it matches the two criteria the author of AntiPatterns used to distinguish anti-patterns from mere bad habits, specifically:
It initially appears to be beneficial, but in the long term has more bad consequences than good ones
An alternative solution exists that is proven and repeatable
It is the first characteristic that has led to ORM’s maddening (to me) popularity: it seems like a good idea at first, and by the time the problems become apparent, it’s too late to switch away.
What do you mean by ORM?
The chief offender that I’m talking about is ActiveRecord, made famous by Ruby on Rails and ported to half a dozen languages since then. However, the same criticisms largely apply to other ORM layers like Hibernate in Java and Doctrine in PHP.
The benefits of ORM
Simplicity: some ORM layers will tell you that they “eliminate the need for SQL”. This is a promise I have yet to see delivered. Others will more realistically claim that they reduce the need to write SQL but allow you to use it when you need it. For simple models, and early in a project, this is definitely a benefit: you will get up and running
faster with ORM, no doubt about it. However, you will be running in the wrong direction.
Code generation: eliminating user-level code from the model through ORM opens the way for code generation, the “scaffolding” pattern which can give you a functional interface to all your tables through a simple description of your schema. Even more magically, you can change your schema description and re-generate the code, eliminating CRUD. Again, this definitely works initially.
Efficiency is “good enough”: none of the ORM layers I’ve seen claim efficiency gains. They are all fairly explicit that you are making a sacrifice of efficiency for code agility. If things get slow, you can always override your ORM methods with more efficient hand-coded SQL. Right?
The problems with ORM
Inadequate abstraction
The most obvious problem with ORM as an abstraction is that it does not adequately abstract away the implementation details. The documentation of all the major ORM libraries is rife with references to SQL concepts. Some introduce them without indicating their equivalents in SQL, while others treat the library as merely a set of procedural functions for generating SQL.
The whole point of an abstraction is that it is supposed to simplify. An abstraction of SQL that requires you to understand SQL anyway is doubling the amount you need to learn: first you need to learn what the SQL you’re trying to run is, then you have to learn the API to get your ORM to write it for you. In Hibernate, to perform complicated SQL you actually have to learn a third language, HQL, which is maddeningly almost-but-not-quite SQL, which then gets translated to SQL for you.