Bogdan Gusiev's blog

How to make good software for people

Benchmark your performance patches
01 May 2012

After contributing a dozen performance patches to various gems, I want to share some practical experience I gained. The tools I picked up were perftools.rb and Ruby's built-in Benchmark library. They fit well when the optimization stays at the Ruby level and doesn't require fixing something in native extensions or digging into IO operations.

The general flow for tackling performance is clear:

  • perftools.rb shows slow method calls in a nice call graph format.
  • Benchmark proves that we made an improvement.

Perftools does all the hard work for you.

require 'perftools'
PerfTools::CpuProfiler.start("/tmp/profile_result") do
  # code to profile
end

Run the test and open the result:

gem install perftools.rb
ruby test.rb
pprof.rb --gif /tmp/profile_result > /tmp/profile_result.gif
$IMAGE_VIEWER /tmp/profile_result.gif 

Example output: Rails route generator call graph

Here is a short guide on how to read the call graph:

Each block represents a method that was called. The percentages in a block show how much time was spent in that method, with and without all its nested calls, relative to the overall time. The number on an arrow shows how many times the parent method called the child method.

That's it. All you need to do now is find the blocks with the highest percentages and try to reduce them.
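Once a hot spot has been rewritten, the built-in Benchmark library can confirm the gain. Below is a minimal sketch, assuming two hypothetical implementations of the same header parsing step. The names `parse_slow` and `parse_fast` and the sample input are made up for illustration and do not come from any gem:

```ruby
require 'benchmark'

# Hypothetical "before" version: builds a new hash on every iteration.
def parse_slow(lines)
  lines.inject({}) do |acc, line|
    name, value = line.split(": ", 2)
    acc.merge(name => value)
  end
end

# Hypothetical "after" version: fills a single hash in place.
def parse_fast(lines)
  lines.each_with_object({}) do |line, acc|
    name, value = line.split(": ", 2)
    acc[name] = value
  end
end

# Made-up input: 200 header lines.
headers = Array.new(200) { |i| "X-Header-#{i}: value-#{i}" }
iterations = 200

# Benchmark.bm prints one row per report and returns the timings.
results = Benchmark.bm(8) do |x|
  x.report("before:") { iterations.times { parse_slow(headers) } }
  x.report("after:")  { iterations.times { parse_fast(headers) } }
end
```

Both variants have to live in the same process here, which is rarely the case when the "before" code is the previous commit of a gem.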

This flow works fine until the benchmark results from before and after the patch need to be compared. Ruby's Benchmark cannot merge them and present them in a human-readable form like this:

Running benchmark with current working tree
Checkout HEAD^
Running benchmark with HEAD^
Checkout to previous HEAD again

                    user     system      total        real
----------------------------------headers parsing when long
After patch:    0.100000   0.000000   0.100000 (  0.089926)
Before patch:   0.700000   0.000000   0.700000 (  0.697444)

----------------------------------headers parsing when tiny
After patch:    0.000000   0.000000   0.000000 (  0.009930)
Before patch:   0.020000   0.000000   0.020000 (  0.024283)

---------------------------------headers parsing when empty
After patch:    0.010000   0.000000   0.010000 (  0.002160)
Before patch:   0.000000   0.000000   0.000000 (  0.002354)
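The grouped layout above can be approximated with the standard library alone. This is a rough sketch, assuming both versions of the code can be loaded into one process, which the checkout steps in the log above make unnecessary. The `compare` helper, the labels, and the lambdas are hypothetical names used for illustration and are not DiffBench's API:

```ruby
require 'benchmark'

# Hypothetical helper: time each labelled variant and return report lines.
def compare(title, variants)
  lines = [title.rjust(59, "-")]
  variants.each do |label, code|
    tms = Benchmark.measure { code.call }
    # Benchmark::FORMAT is the standard "user system total (real)" row.
    lines << label.ljust(14) + tms.format(Benchmark::FORMAT).chomp
  end
  lines
end

# Made-up workload: upcase a long string in two different ways.
input = "a" * 20_000

report = compare("string upcasing", {
  "After patch:"  => lambda { 100.times { input.upcase } },
  "Before patch:" => lambda { 100.times { input.chars.map(&:upcase).join } }
})

# Indent the standard caption so it lines up with the labelled rows.
puts " " * 14 + Benchmark::CAPTION
puts report
```

Each call to `compare` produces one titled group, so several scenarios can be printed one after another under a single caption.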

So I decided to create a library that fixes this problem. Try out DiffBench the next time you work on a performance patch.

DiffBench in action examples:
