Subtle issues with ORMs and how to avoid them

One of the most common change requests I get with my Web Development course is to stop using GORM, and instead use the database/sql package that is part of Go’s standard library.

When I receive feedback like this I often respond asking, “Why?” Why do they feel the database/sql package would be better suited for their education? Why do they feel ORMs will be problematic long term?

I don’t ask these questions to be snarky or because I don’t care; I ask because I truly want to understand a problem before I look for solutions, and oftentimes people will tell you their problem is X, when in reality X is only a symptom of a deeper problem.

What is shocking to me is that an overwhelming number of the people who request this change don’t actually have a compelling reason. Instead, they respond with answers like, “everyone says ORMs are bad” or “someone on r/golang said I should be using the database/sql package instead.”

I can’t really blame new developers for believing what they read. All it takes is a week or two hanging out on the Go subreddit and you will quickly start to believe that ORMs are clearly the spawn of Satan and should never be used. But as I saw this occurring more often, I started to realize that it was a symptom of a deeper problem - a lack of understanding. A lack of knowing when an ORM is appropriate, how to design your code so that an ORM won’t cause you issues long term, and many other subtleties involved with using ORMs.

The goal of this post isn’t to convince everyone to use an ORM. They aren’t always appropriate, and if your team already knows SQL really well then you should definitely lean more heavily on a library like database/sql. The goal of this post is to discuss the pros and cons of ORMs so that you can make an educated decision about using them on your own rather than parroting what everyone else says.

To accomplish this goal, we will also need to look at some of the subtle issues that stem from using ORMs and discuss ways to avoid those issues. Oh and just a forewarning - none of these issues are actually limited to ORMs. You can make any of these mistakes when using the database/sql package, but developers just tend to fall for them more often when using ORMs, which gives ORMs the bad rap.

ORMs can hide complexity

ORMs can make it incredibly easy to write code that masks a great deal of complexity. Normally this is a good thing - we don’t want to have to write complex queries on our own all the time - but when these things get chained together to create unexpected queries it can make it hard to design a database that scales efficiently.

Ruby on Rails’ Active Record is likely the biggest culprit here, but truthfully this can happen with any ORM. In fact, it could even happen without an ORM, but ORMs tend to mask what is going on just enough to let bad things slip through the cracks.

First, let’s discuss what I mean. Imagine you are writing an application and somewhere inside of your code you write some code like this:

# I'm using rails for this example, but it applies to Go
# and other languages as well.
user.orders.each do |order|
  # ... use the order
end

When this code is run, it will end up executing some SQL behind the scenes (with a few exceptions). While this might not be problematic at first, over time this can become an issue. For example, if you start to chain together different clauses you could end up with something like the code below:

user.orders.not_shipped.high_value.with_issue.each do |order|
  # ... use the order
end

As queries become more complex, they tend to become less performant. This is especially pronounced with queries that involve multiple joins with conditional clauses.
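To make the hidden SQL a little more visible, here is a rough Go sketch of what a chain like the one above might build behind the scenes. The Query type, the scope functions, and every table and column name are hypothetical and made up for illustration; this is not GORM's or Active Record's API, just a toy builder that prints the query it has accumulated.

```go
package main

import (
	"fmt"
	"strings"
)

// Query is a hypothetical, minimal query builder. It exists only to show
// how chained scopes accumulate SQL.
type Query struct {
	table  string
	joins  []string
	wheres []string
	args   []any
}

// From starts a query against a table.
func From(table string) Query { return Query{table: table} }

// Where returns a copy of the query with one more condition.
func (q Query) Where(cond string, args ...any) Query {
	q.wheres = append(append([]string{}, q.wheres...), cond)
	q.args = append(append([]any{}, q.args...), args...)
	return q
}

// Join returns a copy of the query with one more JOIN clause.
func (q Query) Join(clause string) Query {
	q.joins = append(append([]string{}, q.joins...), clause)
	return q
}

// SQL renders the accumulated query text with ? placeholders.
func (q Query) SQL() string {
	var b strings.Builder
	b.WriteString("SELECT " + q.table + ".* FROM " + q.table)
	for _, j := range q.joins {
		b.WriteString(" " + j)
	}
	if len(q.wheres) > 0 {
		b.WriteString(" WHERE " + strings.Join(q.wheres, " AND "))
	}
	return b.String()
}

// The scopes below mirror not_shipped, high_value and with_issue.
// The column names and the threshold are assumptions.
func NotShipped(q Query) Query { return q.Where("orders.shipped_at IS NULL") }

func HighValue(q Query) Query { return q.Where("orders.total_cents >= ?", 100000) }

func WithIssue(q Query) Query {
	return q.Join("JOIN issues ON issues.order_id = orders.id").
		Where("issues.resolved = ?", false)
}

func main() {
	q := From("orders").Where("orders.user_id = ?", 42)
	q = WithIssue(HighValue(NotShipped(q)))
	fmt.Println(q.SQL())
	fmt.Println(q.args)
}
```

Each scope looks harmless on its own, but the final query filters on three columns and joins a second table. That is exactly the kind of detail someone planning indexes would want to know about.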

As this happens, the engineers designing the database will have a hard time making decisions because they won’t have a clear picture of what is happening. They won’t know which columns need to be indexed, or which joins need to be optimized, unless they scan through the entire application digging for any code that generates SQL queries.

Sure, we could add in some profiling and start to measure which queries are taking the most time, but this will always be reactive. We can’t be proactive and avoid potential major issues. We can’t dictate what queries should be run versus which shouldn’t because they will bring our database to a crawl.
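For reference, the kind of reactive profiling I mean can be as simple as timing each query and ranking the totals. The sketch below is one way to do that in Go with only the standard library. QueryStats and its methods are hypothetical names I made up for this example, and the "query" in main is just a sleep standing in for a real database call.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// QueryStats accumulates the total time spent per named query.
type QueryStats struct {
	mu     sync.Mutex
	totals map[string]time.Duration
}

func NewQueryStats() *QueryStats {
	return &QueryStats{totals: map[string]time.Duration{}}
}

// Record adds one observation for the named query.
func (s *QueryStats) Record(name string, d time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.totals[name] += d
}

// Time runs fn, records how long it took under name, and returns fn's error.
func (s *QueryStats) Time(name string, fn func() error) error {
	start := time.Now()
	err := fn()
	s.Record(name, time.Since(start))
	return err
}

// Slowest returns up to n query names ordered by total time, largest first.
// Ties are broken alphabetically so the result is deterministic.
func (s *QueryStats) Slowest(n int) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	names := make([]string, 0, len(s.totals))
	for name := range s.totals {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		a, b := s.totals[names[i]], s.totals[names[j]]
		if a != b {
			return a > b
		}
		return names[i] < names[j]
	})
	if n < 0 {
		n = 0
	}
	if n < len(names) {
		names = names[:n]
	}
	return names
}

func main() {
	stats := NewQueryStats()
	// In a real application fn would run a query; here it only sleeps.
	_ = stats.Time("UserOrders", func() error {
		time.Sleep(2 * time.Millisecond)
		return nil
	})
	stats.Record("UserByEmail", 500*time.Microsecond)
	fmt.Println(stats.Slowest(1))
}
```

This tells you which queries were slow after they have already run in production, which is the limitation described above.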

As I said before, this isn’t limited to ORMs but it is a bigger issue with ORMs because developers can write complex queries without actually seeing the SQL being generated and thus be ignorant to the fact that they are hurting the performance of the application.

Writing SQL, on the other hand, doesn’t mask this, so when you write a complex query it is hard to say, “Oh, I didn’t know!”

And that brings us to the other major issue with ORMs.

ORMs allow developers to remain ignorant

ORMs allow developers to remain ignorant of the tech they are using and how their decisions will impact the overall performance of the application.

In a small application this typically doesn’t matter - an SQL database can operate on thousands of records with relative ease. But as an application scales, these issues become more pronounced. Iterating over thousands of records might be okay, but iterating over millions is going to start to take a toll.

One of the primary motivations behind writing pure SQL is that developers have to learn enough SQL to understand the complex queries they are writing, so they should realistically understand how they are going to affect the performance of the application. If they are writing queries that do joins or filter by specific attributes they likely also know enough to figure out which indexes need to be created to keep things performant.

Avoiding these issues

While both of these issues are possible with ORMs, the truth is neither of them is directly caused by the ORM. Instead, both are caused by poor code design and a lack of education.

The best way to mitigate these issues isn’t to stop using an ORM, but to learn better design patterns, to educate your team members and yourself, and to incorporate solid code review. That way, if a developer unfamiliar with SQL needs to update your code, a developer familiar with the ramifications of their changes can review any database-specific changes.

One way to do this is to create a database layer in your code and to isolate all of your database interactions to this code. This might be one package, or split into several. The important thing here is that the only code that ends up creating SQL statements is contained within this layer of your application.

package database

type DB struct {
  // ...
}

func (db *DB) UserOrders(user *User) ([]Order, error) {
  // ...
}

By writing code this way, you can clearly separate the database interactions from the rest of your code. You can also easily incorporate better code reviews when code here is changed, allowing multiple developers who are familiar with SQL and your database design to review these changes before they get shipped to production. This also presents an opportunity to educate developers who may be less familiar with SQL.

While this approach is more rigid because you need to expose functions for each query you want the rest of your application to have access to, it is significantly easier to test, maintain, and keep efficient over time.

The coolest thing about this approach is that it doesn’t matter if you use an ORM or not. In the example above you got an idea of what we are doing, but I didn’t have to write any code using the database/sql or gorm package because it simply does not matter. An ORM might be used to aid in building SQL queries, but it doesn’t have to be. We could even move from using an ORM to using pure SQL without changing any other code in our application.
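If you want to see what that swap might look like, here is a minimal sketch. It assumes a hypothetical orders table with id, user_id, and total_cents columns and uses PostgreSQL-style $1 placeholders. The OrderStore interface, SQLStore, MemStore, and TotalSpent are all names I invented for this example. The rest of the application only depends on the interface, so a database/sql version, an ORM-backed version, or an in-memory version for tests can be dropped in without touching it.

```go
package main

import (
	"database/sql"
	"fmt"
)

// User and Order are simplified, hypothetical records.
type User struct{ ID int64 }

type Order struct {
	ID         int64
	UserID     int64
	TotalCents int64
}

// OrderStore is the only thing the rest of the application sees.
type OrderStore interface {
	UserOrders(user *User) ([]Order, error)
}

// SQLStore implements OrderStore with database/sql. It needs a real
// driver and database to run, so main below does not use it.
type SQLStore struct{ DB *sql.DB }

var _ OrderStore = (*SQLStore)(nil) // compile-time interface check

func (s *SQLStore) UserOrders(user *User) ([]Order, error) {
	rows, err := s.DB.Query(
		`SELECT id, user_id, total_cents FROM orders WHERE user_id = $1`,
		user.ID)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var orders []Order
	for rows.Next() {
		var o Order
		if err := rows.Scan(&o.ID, &o.UserID, &o.TotalCents); err != nil {
			return nil, err
		}
		orders = append(orders, o)
	}
	return orders, rows.Err()
}

// MemStore is an in-memory OrderStore, handy for tests.
type MemStore struct{ Orders []Order }

func (m *MemStore) UserOrders(user *User) ([]Order, error) {
	var out []Order
	for _, o := range m.Orders {
		if o.UserID == user.ID {
			out = append(out, o)
		}
	}
	return out, nil
}

// TotalSpent is application code. It depends only on the interface.
func TotalSpent(store OrderStore, user *User) (int64, error) {
	orders, err := store.UserOrders(user)
	if err != nil {
		return 0, err
	}
	var total int64
	for _, o := range orders {
		total += o.TotalCents
	}
	return total, nil
}

func main() {
	store := &MemStore{Orders: []Order{
		{ID: 1, UserID: 7, TotalCents: 1500},
		{ID: 2, UserID: 8, TotalCents: 900},
		{ID: 3, UserID: 7, TotalCents: 2500},
	}}
	total, err := TotalSpent(store, &User{ID: 7})
	if err != nil {
		panic(err)
	}
	fmt.Println(total) // prints 4000
}
```

Swapping MemStore for SQLStore (or for a store built on an ORM) changes one line where the store is constructed, and TotalSpent never notices.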

But ORMs slow down your app!

It is hard to talk about ORMs without someone jumping in and saying, “But ORMs will slow down your application! Use this other library for faster performance.”

As Mark Bates puts it, “The Go community loves benchmarks. It is obsessed with them.”

The problem with this obsession with speed and benchmarks, as Mark continues to explain in his post, is that very few applications actually need to be as fast as possible. Instead, they simply need to be “fast enough”.

End users don’t notice the difference between a 31ms and a 30ms response time. Now if ORMs were causing 100ms delays then sure, you could make this argument, but the actual speed cost of using an ORM is negligible in a real application. It will amount to less than 1% of your application’s total latency.

Rather than spending too much time here, I suggest you check out Mark’s post: http://www.metabates.com/2017/03/03/youre-benchmarking-the-wrong-thing/

It isn’t specifically directed at ORMs or web applications, but the point still stands. For most applications there are other, more important factors to consider than a very minor slowdown caused by using an ORM.

We shouldn’t learn something new when we already know SQL!

This is probably the most common reason I hear for avoiding ORMs, and I agree with it. If you or your team already know SQL and prefer it, then using an ORM is a bad idea.

The problem with this mindset is that it only applies to one group of developers - those that already know SQL very well and prefer to use it.

On the other hand, there is a large group of developers who either (a) DO NOT know SQL very well, or (b) prefer using an ORM or other SQL building library.

Most of the people I teach fall into the first category - they do not know SQL very well. In fact, many of them are learning about web development for the first time, so adding SQL to that already massive list of things to learn isn’t likely to turn out well.

Instead, I find that an ORM (or SQL builder) that works similarly to raw SQL is a better option. Not only does this help get beginners up and running faster, but it also helps them learn SQL. For example, you can enable logging in GORM to see what SQL query ends up being executed for each piece of code you write, and the code looks very similar to SQL.

db.Where("email = ?", "jon@calhoun.io").First(&user)

Want to know (roughly) what the SQL generated by this is?

SELECT * FROM users WHERE email='jon@calhoun.io' LIMIT 1

As I said, they don’t look that different and can be a great tool for learning.

So yes, if you already know SQL and prefer it you shouldn’t use an ORM. But this doesn’t prove that ORMs are a bad idea. It simply demonstrates that if you know and prefer SQL then you should clearly not use an ORM.

In summary…

If you are new to SQL, or you simply want to use an ORM then go for it. There isn’t anything “bad” or “evil” about them. When others tell you they are bad, what they are really expressing is an opinion. A preference, created by their very different educational background or experiences with teams that may or may not have used an ORM effectively.

And if you dislike ORMs, then great. More power to you. But please stop telling everyone that ORMs are awful tools just because you don’t prefer them. You are actively making it harder for beginners to get into development by making them believe they need to learn everything you know before they can even get started, and that simply is not possible.

This article is part of the series, Using PostgreSQL with Go.
