As a Ruby programmer, I often hear that Ruby and Ruby on Rails are very slow, but are they really? After a few years working with Ruby code, I think that the problem is more complex. In this short article, I would like to show you where the main problems are and how not to fall into their trap. I hope that after reading it, you will look at code in a different way.
Why is Ruby slow?
Before looking at code, you have to understand Ruby’s biggest historical problem. When you look at the documentation for version 1.9, you will see that the core team added a virtual machine that executes code faster. But the main change came in versions 2.1 and 2.2, when a new garbage collector (GC) was implemented in Ruby. It changed the language’s performance dramatically, but it did not change the way that programmers write code. When I look at some of my own code, I’m no exception. The key here is to refactor code not only to shorten it, but also to optimize its performance. Whenever you estimate a project, remember to add extra time for refactoring and testing. The client will probably complain about it, but you will save them money later on fixes and performance improvements.
You might be thinking that this is not possible – there must be a problem with the language and not with my code. So let’s take a look at an example. Here is a very simple program:
```ruby
require "benchmark"

num_rows = 100_000
num_cols = 10
data = Array.new(num_rows) { Array.new(num_cols) { "x" * 1000 } }

time = Benchmark.realtime do
  csv = data.map { |row| row.join(",") }.join("\n")
end
puts time.round(2)
```
Now let’s run it on different Ruby versions and compare the performance. I won’t test the code on 1.8.x because you probably won’t work with such an old version.
| Ruby version | 1.9.3 | 2.0 | 2.1 | 2.2 |
|---|---|---|---|---|
| Execution time (s) | 9.18 | 11.42 | 2.65 | 2.43 |
As I mentioned before, Ruby became much faster starting with version 2.1. Implementing the new GC dramatically improved Ruby’s performance. To confirm this, let’s execute the same code with the GC disabled:
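The GC can be switched off around the measurement with Ruby’s built-in `GC.disable` and `GC.enable`. A minimal sketch, scaled down to 10,000 rows (the tables below use 100,000) so it stays light on memory:

```ruby
require "benchmark"

num_rows = 10_000  # scaled down from the article's 100_000 to keep memory use low
num_cols = 10
data = Array.new(num_rows) { Array.new(num_cols) { "x" * 1000 } }

GC.disable  # stop the garbage collector for the duration of the measurement
time = Benchmark.realtime do
  data.map { |row| row.join(",") }.join("\n")
end
GC.enable   # turn the garbage collector back on afterwards

puts time.round(2)
```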
| | 1.9.3 | 2.0 | 2.1 | 2.2 |
|---|---|---|---|---|
| GC enabled (s) | 9.18 | 11.42 | 2.65 | 2.43 |
| GC disabled (s) | 1.14 | 1.15 | 1.19 | 1.16 |
| % of time spent in GC | 88% | 90% | 55% | 52% |
After disabling the GC, we can see that the execution time is almost the same across all versions. So the first step for speeding up your Ruby app is to update Ruby to at least 2.1.
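The “% of time spent in GC” row can be reproduced on Ruby 2.1+ with the built-in `GC::Profiler`. A rough sketch with a scaled-down dataset (exact percentages will vary by machine and Ruby version):

```ruby
require "benchmark"

data = Array.new(10_000) { Array.new(10) { "x" * 1000 } }

GC::Profiler.enable
total = Benchmark.realtime do
  data.map { |row| row.join(",") }.join("\n")
end
gc_time = GC::Profiler.total_time  # seconds spent inside the GC while profiling
GC::Profiler.disable

puts "#{((gc_time / total) * 100).round}% of time spent in GC"
```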
How to speed up?
Now that you know that Ruby’s main problem is the GC, you have to keep that in mind when you write code. The less memory you use, the less time the GC will need to clear it. Even the example above can be optimized to run faster.
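You can watch this relationship directly with `GC.stat`, which (among other counters) reports how many GC runs have happened so far in the process. A small illustrative sketch:

```ruby
# GC.stat(:count) returns the number of GC runs since the process started.
before = GC.stat(:count)
100_000.times { "x" * 1000 }  # allocate many short-lived strings
after = GC.stat(:count)

puts "GC ran #{after - before} times during the allocation loop"
```

The more short-lived objects a piece of code allocates, the more often the GC has to fire to reclaim them.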
First, let’s see where the problem in the code is:
```ruby
require "benchmark"

num_rows = 100_000
num_cols = 10
data = Array.new(num_rows) { Array.new(num_cols) { "x" * 1000 } }

time = Benchmark.realtime do
  csv = data.map do |row|
    row.join(",")
  end.join("\n")
end
puts time.round(2)
```
The row strings generated inside the block are intermediate results that the program has to store in memory and which the GC then has to clear. Those extra results add about 1GB of memory. So how can you rewrite it to get rid of that extra memory?
```ruby
require "benchmark"

num_rows = 100_000
num_cols = 10
data = Array.new(num_rows) { Array.new(num_cols) { "x" * 1000 } }

time = Benchmark.realtime do
  csv = ''
  num_rows.times do |i|
    num_cols.times do |j|
      csv << data[i][j]
      csv << "," unless j == num_cols - 1
    end
    csv << "\n" unless i == num_rows - 1
  end
end
puts time.round(2)
```
The new code is much longer and uglier, so you might think that refactoring should work the other way around. But first, let’s look at the execution times:
| | 1.9.3 | 2.0 | 2.1 | 2.2 |
|---|---|---|---|---|
| GC enabled (s) | 9.18 | 11.42 | 2.65 | 2.43 |
| GC disabled (s) | 1.14 | 1.15 | 1.19 | 1.16 |
| Optimized (s) | 1.01 | 1.06 | 1.05 | 1.09 |
After the code change, we see even better times than with the GC disabled. Less memory, less execution time. Keep this simple rule in mind.
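Putting both versions in one script makes the comparison easy to reproduce. A sketch scaled down to 10,000 rows (absolute numbers will therefore differ from the table above, which used 100,000):

```ruby
require "benchmark"

num_rows = 10_000  # scaled down from the article's 100_000 to keep the run short
num_cols = 10
data = Array.new(num_rows) { Array.new(num_cols) { "x" * 1000 } }

# Original version: builds intermediate row strings that the GC must clean up.
map_time = Benchmark.realtime do
  data.map { |row| row.join(",") }.join("\n")
end

# Optimized version: appends into a single buffer, no intermediate row strings.
csv = ''
concat_time = Benchmark.realtime do
  num_rows.times do |i|
    num_cols.times do |j|
      csv << data[i][j]
      csv << "," unless j == num_cols - 1
    end
    csv << "\n" unless i == num_rows - 1
  end
end

puts "map: #{map_time.round(2)}s, concat: #{concat_time.round(2)}s"
```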
When should we optimize?
We all know that optimization is important, but when should you actually refactor your code for performance? For me, the first clue is the cache key. When your cache key is long and you have to remember which rule you used while clearing the cache, you have probably implemented bad code. You won’t always get rid of caching, but the code behind the cache should be better, and maybe after refactoring the keys will be shorter.
The second sign is big models and controllers – the more code, the more likely it is that some of it is not optimized. A very quick example is loops. Depending on how you iterate, Ruby can keep a whole array alive in memory; imagine an array of 100,000 elements held through every step. A much better approach is to shift elements off, so on every iteration Ruby holds less data. This approach is very often used in functional programming. If you would like to know more about it, check out my other article.
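The shift-based pattern described above can be sketched like this (the per-item work here is a hypothetical placeholder):

```ruby
# Instead of iterating a large array while every element stays referenced,
# destructively shift items off the front: once processed, each item becomes
# eligible for garbage collection instead of living until the loop finishes.
data = Array.new(1_000) { "x" * 100 }

processed = 0
while (item = data.shift)
  processed += 1  # placeholder for real per-item work on `item`
end

puts processed
```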
I hope that this brief introduction will help you in writing Ruby code. If you would like an article with more examples, please let me know in the comments below.
Examples’ source: *Ruby Performance Optimization* by Alexander Dymo
Author: Przemek Olesiński