Having the need to profile a rake task in order to figure out why it was taking so long, I decided to take the opportunity to check out the perftools.rb gem. It proved to be interesting, though I’m still working out the best way to get useful information from it. Getting it running on my MacBook Air running Lion (OS X 10.7) was a little involved so I thought I’d write up a post here both for my own memory as well as to help anyone else who might be interested. And it’s been a while since my last post because things have been so busy at work. Time to get back to writing things here.
The perftools.rb gem is of course on GitHub: here. Because I needed to profile a rake task rather than part of our Rails app, I couldn’t use the rack-perftools_profiler, so I had to do it a bit more manually. First thing of course was to add the gem to the Gemfile:
gem "perftools.rb", "~> 2.0.0"
After a bundle install, I was good to go there. Next I had to put the profiling around the block of code I wanted to investigate:
require 'perftools' # At the top of the class PerfTools::CpuProfiler.start("/tmp/my_method_profile") my_method_call() PerfTools::CpuProfiler.stop
It’s pretty easy. I ran my code, and then I had the file /tmp/my_method_profile. But what to do with this binary file? Well, the simplest option is to run pprof.rb on it:
pprof.rb --text /tmp/my_method_profile
That will output a bunch of information, basically a list of the “samples” the profiler took in descending order of percentage of samples in the function — that is, the first entry is the one which was “seen” the most by the profiler. I recommend taking a look at the pprof documentation here.
That textual information is surely useful, but I hoped for more. Using callgrind output seemed very interesting, but I couldn’t use kcachegrind on my Mac. Thanks to this gist I was able to get qcachegrind, a Qt version, running. It’s not quite 100% happy with OS X 10.7 but it works. You’ll need Xcode installed, then follow the directions in that gist. I installed the latest “libs only” version of Qt, 4.8.0. Installing Graphviz was straightforward; you can follow the instructions in the gist or use Homebrew as well. I was then able to do the svn checkout of qcachegrind and build it. Note that the patch gist mentioned to let it open any file is no longer necessary — since that was written, a change to qcachegrind was made to allow it to open any file. Go ahead and do the qmake and make, and you should be good to go.
Okay, so now I could process the profile into callgrind format, and view it:
pprof.rb --callgrind /tmp/my_method_profile > /tmp/my_method_profile.grind open qcachegrind.app
Using qcachegrind I opened the /tmp/my_method_profile.grind file, and there it was! I could view the call graphs, and get some nice views of what was going on in the code.
If all you want is a straight-forward call graph, though, you can also (once you have Graphviz installed) generate a call graph GIF image with “pprof.rb –gif /tmp/my_method_profile > /tmp/my_method_profile.gif”. Open that GIF and you’ll have a very useful view of what was happening in the block of code you’re profiling.
I hope this quick summary is helpful — it’s not always easy to figure out precisely what’s happening your code, and perftools.rb can help out a great deal.