The long-awaited Ruby virtual machine shootout is here. In this report I’ve compared the performance of several Ruby implementations against a set of synthetic benchmarks. The implementations that I tested were Ruby 1.8 (aka MRI), Ruby 1.9 (aka Yarv), Ruby Enterprise Edition (aka REE), JRuby 1.1.6RC1, Rubinius, MagLev, MacRuby 0.3 and IronRuby.
Just as with the previous shootout, before proceeding to the results, I urge you to consider the following important points:
- Engine Yard sponsors this website, and also happens to sponsor, to a much greater extent, the Rubinius project. Needless to say, there is no bias in the reporting of the data below concerning Rubinius;
- Don’t read too much into this and don’t draw any final conclusions. Each of these exciting projects has its own reason for being, as well as different pros and cons, which are not considered in this post. They each have a different level of maturity and completeness. Furthermore, not all of them have received the same level of optimization yet. Take this post for what it is: an interesting and fun comparison of Ruby implementations;
- The results here may change entirely in a matter of months. There will be other future shootouts on this blog. If you wish, grab the feed and follow along;
- The scope of the benchmarks is limited because they can’t test every single feature of each implementation nor include every possible program. They’re just a sensible set of micro-benchmarks which give us a general idea of where we are in terms of speed. They aren’t meant to be absolutely accurate when it comes to predicting real world performance;
- Many people are interested in the kind of improvements that the tested VMs can bring to a Ruby on Rails deployment stack. Please do not assume that if VM A is three times faster than VM B, that Rails will serve three times the amount of requests per minute. It won’t. That said, a faster VM is good news and can definitely affect Rails applications positively in production;
- These tests were run on the machines at my disposal, so your mileage may vary. Please do test the VMs that interest you on your hardware and against programs you actually need/use;
- In this article, I sometimes blur the distinction between “virtual machine” and “interpreter” by simply calling them “virtual machines” for the sake of simplicity;
- Some of the benchmarks are more interesting for VM implementers than for end users. That said, if you think the benchmarks being tested are silly/inadequate/lame, feel free to contribute code to the Ruby Benchmark Suite and if accepted, they’ll make it into the next shootout;
- Finally, keep in mind that there are three kinds of lies: lies, damned lies, and statistics.
Ruby implementations being tested
All of the Ruby implementations that were able to run the current Ruby Benchmark Suite have been grouped together in one main shootout. This group consists of Ruby 1.8.7 (p72, built from source, and installed through apt-get), Ruby 1.9.1 (from trunk, p5000 revision 20560), Ruby Enterprise Edition (1.8.6-20081205), JRuby 1.1.6RC1 and Rubinius (from trunk), all of which were tested on Ubuntu 8.10 x64, plus Ruby 1.8.6 (p287, from the One-Click Installer) on Windows Vista Ultimate x64. The hardware used for this benchmark was my desktop workstation with an Intel Core 2 Quad Q6600 (2.4 GHz) CPU and 8 GB of RAM. JRuby was run with the -J-server option enabled and with 4 MB of stack (required to pass certain recursive benchmarks). The best times out of five iterations were reported, and these do not include startup times or the time required to parse and compile classes and methods for the first time. Several of these new tests also have variable input sizes.
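To make the "best time out of five iterations" idea concrete, here is a minimal sketch in plain Ruby. It is not the Ruby Benchmark Suite's actual harness: the best_of helper and the fib workload are hypothetical names introduced only for illustration, and the untimed warm-up call is just one simple way to keep first-run costs out of the measurement.

```ruby
require 'benchmark'

# Hypothetical helper (not part of the Ruby Benchmark Suite):
# runs the given block `iterations` times and returns the fastest
# wall-clock time in seconds.
def best_of(iterations = 5)
  times = Array.new(iterations) { Benchmark.realtime { yield } }
  times.min
end

# Example workload, assumed for illustration only.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Untimed warm-up call, so that the first execution of the method
# is not counted in any of the timed iterations.
fib(10)

best = best_of(5) { fib(20) }
puts format('best of 5: %.6f s', best)
```

Taking the minimum rather than the average favors the least disturbed run, which is why a stray background process during one iteration has little effect on the reported figure.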
The MagLev team provided me with an early alpha version of MagLev for the purpose of testing it in this shootout. Since this VM is not mature enough yet to run the Ruby Benchmark Suite, I used custom scripts against an old version of the Ruby Benchmark Suite on Ubuntu 8.10 x64. MagLev was tested, along with Ruby 1.8.6 (p287), on the same machine as that of the main shootout, though the benchmarks were different (even when they had the same names as the ones in the main shootout).
MacRuby 0.3 and Ruby 1.8.6 (p114) were tested on Mac OS X Leopard using the previous version of the Ruby Benchmark Suite. Since my MacBook Pro died (sigh), for this benchmark I used a Mac Pro, with two Quad-Core Intel Xeon 2.8 GHz processors and 18 GB of RAM.
IronRuby (from trunk) and Ruby 1.8.6 (p287) were tested on a previous version of the Ruby Benchmark Suite on Windows Vista x64 on the same quad-core used for the main shootout. The MagLev, MacRuby and IronRuby numbers reported here were the best times out of five iterations, and include startup time. IronRuby on Mono was not tested because I couldn’t get it to work on my machine, despite having tried several IronRuby versions and two different Mono versions. Please also notice that Ruby 1.8.6 (p287) was tested twice on Windows, once for the main shootout against the current Ruby Benchmark Suite, and a second time to compare it with IronRuby, against the old benchmarks.
Note: As tempting as it is, do not compare implementations that belong to different shootouts directly to one another. It would be very disingenuous to directly compare VMs tested with different benchmarks and/or different machines. The only comparisons that make sense are the ones within each of the four groups.
The following table shows the run times for the main implementations. The table is fairly wide, so you’ll have to click on the image to view the data in a new tab.
Green, bold values indicate that the given virtual machine was faster than Ruby 1.8.7 on GNU/Linux (our baseline), whereas a yellow background indicates the absolute fastest implementation for a given benchmark. Values in red are slower than the baseline. Timeout indicates that the script didn’t terminate in a reasonable amount of time and was (automatically) interrupted. The values reported at the bottom are the total amounts of time (in seconds) that it would take to run the common subset of benchmarks which were successfully executed by every virtual machine. When our baseline VM generated an error, others were used, starting with Ruby 1.8.6 on Vista (for color coding purposes only).
The following image shows a bar chart of the total time required for the common subset of successfully executed benchmarks (those whose names are in blue within the tables):
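As a rough illustration of how a total over the common subset can be computed, here is a small Ruby sketch. The VM names, benchmark names and timings are all made up for the example, and nil is simply my stand-in for an Error or Timeout entry; this is not the spreadsheet logic behind the chart.

```ruby
# Illustrative timings in seconds. Every name and number is invented;
# nil stands for a benchmark that ended in an Error or Timeout.
times = {
  'baseline' => { 'bm_a' => 4.0, 'bm_b' => 9.0, 'bm_c' => 2.0 },
  'vm_x'     => { 'bm_a' => 2.0, 'bm_b' => nil, 'bm_c' => 1.0 },
  'vm_y'     => { 'bm_a' => 1.0, 'bm_b' => 3.0, 'bm_c' => 4.0 }
}

# Keep only the benchmarks that every VM completed successfully.
benchmarks = times.values.first.keys
common = benchmarks.select do |name|
  times.values.all? { |results| results[name] }
end

# Sum each VM's times over that common subset only.
totals = {}
times.each do |vm, results|
  totals[vm] = common.inject(0.0) { |sum, name| sum + results[name] }
end

totals.each { |vm, total| puts format('%-8s %.2f s', vm, total) }
```

Restricting the sum to the common subset matters: if a failed benchmark were counted as zero seconds, a VM would look faster precisely because it failed.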
More interestingly, the following table shows the ratios of each Ruby implementation based on the baseline (MRI):
The baseline time is divided by the time at hand to obtain a number that tells us “how many times faster” an implementation is for a given benchmark. 2.0 means twice as fast, while 0.5 means half the speed (so twice as slow). The geometric mean at the bottom of the table tells us how much faster or slower a virtual machine was when compared to the main Ruby interpreter, on “average”. Just as with the totals above, only those 101 tests which were successfully run by each VM were included in the calculation.
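The ratio and geometric mean described above can be sketched in a few lines of Ruby. The method names and the sample timings are assumptions made for illustration; the geometric mean is computed through logarithms, which is a standard way of taking the n-th root of a product without overflow.

```ruby
# Ratio per benchmark: baseline time divided by the candidate's time.
# A value above 1.0 means the candidate was faster than the baseline.
def ratios(baseline, candidate)
  baseline.keys.map { |name| baseline[name] / candidate[name] }
end

# Geometric mean of positive values, computed as exp(mean(log(v))).
def geometric_mean(values)
  log_sum = values.inject(0.0) { |sum, v| sum + Math.log(v) }
  Math.exp(log_sum / values.size)
end

# Made-up timings in seconds for two benchmarks.
baseline = { 'bm_a' => 4.0, 'bm_c' => 2.0 }
vm_x     = { 'bm_a' => 2.0, 'bm_c' => 4.0 }

r  = ratios(baseline, vm_x) # twice as fast on bm_a, half the speed on bm_c
gm = geometric_mean(r)
puts format('geometric mean of ratios: %.2f', gm)
```

In this example a 2.0 and a 0.5 cancel out to a geometric mean of about 1.0, whereas an arithmetic mean would report 1.25 and make the candidate look faster overall than it is.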
More concisely, here is a bar chart showing the geometric mean of the ratios for the various implementations tested:
I prefer to let the data speak for itself, but I’d like to briefly comment on these results. Just a few quick considerations.
Working off of the geometric mean of the ratios for the successful tests, Ruby MRI compiled from source is twice as fast as the Ruby shipped by Ubuntu and by the One-Click Installer on Vista. The huge performance gap between ./configure && make && sudo make install and sudo apt-get install ruby-full should not be taken lightly when deploying in production. These numbers also reveal what most of us already knew: Ruby is particularly slow on Windows (800-pound gorillas in the room, or not).
Performance-wise Rubinius has more work left to be done to catch up with Ruby 1.8.7 and other faster VMs, particularly if we take into account the number of timeouts. But it has improved in the past year and I think it’s on the right track.
Ruby Enterprise Edition is about as fast as Ruby 1.8.7 compiled from source, which is reasonable considering that it’s a patched version of Ruby 1.8.6 aimed at the reduction of memory consumption (a parameter which wasn’t tested within the current shootout).
Speaking of excellent results, Ruby 1.9.1 and JRuby 1.1.6 both did very well. It looks like we finally have a couple of relatively fast alternatives to what is a slow main interpreter. According to the results above, and with the exception of a few tests, on average they are respectively 2.5 and 2 times faster than Ruby 1.8.7 (from source), and 5 and 4 times faster than Ruby 1.8.7 installed through apt-get on Ubuntu or Ruby 1.8.6 installed through the One-Click installer on Vista. Again, this does not mean that every program (particularly Rails) will gain that kind of speed, but these results are very encouraging nevertheless.
There has been a lot of buzz about MagLev since Avi Bryant’s first benchmarks were shown a few months ago. Here we finally see it being put to the test. The table below shows the times obtained by running MagLev and Ruby 1.8.6 (p287) against MagLev’s set of benchmarks based on the old Ruby Benchmark Suite:
And here are the ratios:
You’ll notice how MagLev swings from being much faster than MRI to being much slower. I believe there is much room for improvement, but at almost twice the speed of MRI, these early results are definitely promising.
These are the times for MacRuby 0.3 on Mac OS X 10.5.5:
And of course, the ratios against the MRI baseline:
MacRuby is relatively new, so these are not bad results. More work is required, but it’s a good start.
Finally (I promise these are the last ones), here are the two tables for IronRuby and Ruby 1.8.6:
IronRuby is slower than Ruby 1.8.6 on Windows, which in turn is much slower than Ruby 1.8.7 on GNU/Linux. This is not very surprising. This project has been focusing on integrating with .NET and catching up with the implementation of the language by improving the RSpec pass rate, as opposed to performing any optimizations and/or fine tuning (as per John Lam’s presentation at RubyConf 2008). We’ll measure its improvements in the next shootouts.
Overall I think these are great results. Ruby 1.8 (MRI), with its slowness and memory leaks, belongs to the past. It’s time for the community to move forward and on to something better and faster – and we don’t lack interesting alternatives to do so at this stage.
I hope that for the next shootout, MagLev, MacRuby and IronRuby will be able to run the benchmark suite, so that they can all be tested and directly compared with each other. I also hope to include Tim Bray’s XML benchmark, some sort of “Pet Shop” sample Rails and Merb application and, above all, include memory usage statistics.
You can find the Excel file for the main shootout here. That’s all for now. Feel free to comment, subscribe to my feed, share this link and promote it on Hacker News, Reddit, DZone, StumbleUpon, Twitter, and Co. Putting together this shootout was a lot of work, so I definitely appreciate you spreading the word about it. Until next time…
Update (December 10, 2008): This article has been updated to correct a couple of major issues with yesterday’s results. I adjusted my commentary as well, in light of the corrected figures.
Update (February 7, 2009): Thanks to Makoto Kuwata, a Japanese version of this article was published in the Rubyist Magazine.
Does this mean that JRuby is the best?
@lg2047: JRuby was the second best, Ruby 1.9.1 was fastest.
Thanks for putting this together. I had recently tested JRuby, IronRuby and the Ruby versions myself, but MacRuby, Rubinius and MagLev I had yet to test… 🙂
JRuby is decidedly a good 1.8.6 implementation at this point. I have recently switched to it on my Web applications serving in order to test it in a longer running application and so far it looks to be stable:
It may use more memory than Ruby 1.9 when doing the same sort of thing, but Java can handle lots of memory due to its maturity and excellent garbage collectors and stuff. 🙂
But Ruby 1.9 is decidedly the way to go for pure Ruby.
Regarding the other Ruby implementations, while they had an easier time trying to beat the oldie Ruby 1.8, with Ruby 1.9 it has become a little tougher, so in terms of incentive, they have just got what they asked for. Go and create great Ruby implementations for all of us! Thanks!
The results look very nice, thanks for updating the suite and running this shootout again.
I’m looking forward to seeing another update once JRuby finishes 1.9 support, since these JRuby numbers don’t reflect execution and library optimizations 1.9 provides. I expect that JRuby in 1.9 mode will perform much closer to Ruby 1.9.
Until then, it’s good to hear we’re the fastest 1.8 implementation. Thanks again!
Well, Jruby is doing very well!
I hope we can benchmark Rails or Merb with these various implementations also.
This is really encouraging for all of the new Ruby implementations.
I’d also be very interested in seeing how JRuby does in 1.9 mode when it’s finished (some point in the next couple of months, I believe).
Suggest you label the “Total Time” and “Geometric Mean” title or x-axis so that they can survive being cut/pasted out of context.
Time – what units?
Geometric Mean – relative to what?
As the charts only show those tests which every implementation successfully completed – how many tests out of the total were successfully completed?
The tests are run with a range of input values – so are timings from all input values included in the “Total Time” and “Geometric Mean” or only times for some particular input value?
Could you comment on the errors? E.g., were they a matter of memory exhaustion, or bad memory references (e.g., segfault), or wrong results? Thank you for distinguishing timeout from error.
Isaac, you make a good point about them being pasted out of context. I’ll see if I can update the two bar charts. All the input values were included in the total time.
Thanks for taking the time to run these tests! really interesting 🙂
Stephen, I didn’t want to overload the post with details about errors, but since you asked, let me give you a bit more information regarding the more established VMs.
Ruby 1.8 on Ubuntu: Each error resulted in a “stack level too deep” message. On Windows, the first is a “stack level too deep”, the second a “wrong constant name (null)”, while the last three errors were all “failed to allocate memory” ones.
Ruby 1.9.1: The first error was a “stack level too deep” and the second a “undefined method `zero?’ for “\x01″:String”.
Ruby Enterprise Edition: “stack level too deep” for all three errors.
JRuby: The first error is an Internal Error (Error: guarantee(false,”missing exception handler”)) that dumped a pid log, while the last two errors were “org.jruby.util.ByteList:193:in `ensure’: java.lang.NegativeArraySizeException”.
Rubinius: There was a segmentation fault.
We should all switch to JRuby.
Second in the performance test and first concerning security. The sandbox built into the JVM is really great and is one of the reasons tons of server-side code is written in Java and not C++ nowadays.
The JVM is a mature piece of software and JRuby benefits from this naturally.
+1, sir. Good show
Thanks so much for this, it was a fascinating read.
Now we know which Ruby runtime is the best. How about a shootout with PHP, Perl, Python, ASP and JSP? That’ll be very interesting too.
Ruby 1.8.7 uses green threads, right?
How is it that JRuby fell behind Ruby 1.8.7 in the thread management set of tests?
Any chance you could throw into the shootout a java-language-based application that runs in the same environment as the JRuby application?
Thank you for this enlightening article.
Thank you for your benchmarks!
Submitted to Digg
Just to be clear, these results are not very representative of real-life scenarios. Assuming, from these benches, that “Ruby 1.9 is fastest” and “Ruby 1.8.7 is slowest” would be a mistake, and not borne out by benches on even slightly complex scenarios (like running a Rails or Merb request through the dispatch cycle or even running a simple Rack or Mongrel request).
Of course, real-life scenarios are confounded by other variables (is it really fair to compare MRI’s mongrel with JRuby’s?) so these VM benches provide some measure of head-to-head performance, but again, I’d be wary about drawing any high-level conclusions from these results.
I’d be very interested in participating in a project to create a more representative suite for real-life results, as Antonio seems to be interested in according to his post 🙂
I am in no way demeaning Antonio’s work. I think it’s useful and important. The question is what exactly it tells us.
Overall I think these are great results. Ruby 1.8 (MRI), with its slowness and memory leaks, belongs to the past.
What in this test led you to the conclusion that MRI 1.9.1 doesn’t suffer from 1.8’s memory leaks?
I’m not familiar with the test suite, does it deliberately try to exercise those bugs?
Great, great, great work!
Some higher level conclusions:
– These results give a boost in confidence that Matz made a great choice with YARV and that the upgrade to 1.9 is looking like a success (duh?). It is crucial that the leader of the pack stay ahead of the pack so nobody loses confidence in their decisions and starts a rebellion.
– JRuby is the leader in the alternative implementations and is a viable alternative especially to those already on the JVM.
– MagLev looks like the rising star but they have yet to prove themselves. Still, this platform is extremely secretive, closed and will probably be expensive.
1) Antonio, I can’t get the Excel file “Times” to sum (using OpenOffice) to the same totals you show!
For example, even if I set all the Error and Timeout values to 0 Rubinius sums to 714.65 not 601.01?
2) “All the input values were included in the total time.”
Doesn’t that mean some implementations had 0s included in the total time when they failed or timed out while successful implementations were disadvantaged because they succeeded? That would mean the “Total Times” chart is really really misleading.
The Geometric Mean values seem to sum differently also – JRuby 4.68 not 3.62?
Seems like when a test is only run with one input size the input size is set to n/a – why not just provide the information, that single input value?
For example, micro-benchmarks/bm_nbody.rb takes 9.96s with Ruby 1.8.7 so I’d guess something like 100,000 but why make me guess?
@Isaac, thank you so much for inspecting my results. There are a couple of mistakes that affect the results. Tonight I’ll update this post with the due corrections. Regarding the input size parameter, it’s there to distinguish amongst identical benchmarks, but yours is a good suggestion for the next shootout.
This is a very interesting article. Thanks for the excellent work. One thing that I am still wondering about and cannot get a straight answer for is – why is rails so slow on Windows. While the performance of the VM is not very different on Windows vs. linux, the observed difference in our practice has been much larger. Rails startup time on Windows is on an order of 10x slower than on linux on the same machine. Our rspec suite (not a performance benchmark) runs similarly slow mostly due to the poor performance of boot.rb on Windows (I was using Vista 64bit on a dual core w/ 4GB RAM for those observations).
I noticed that the benchmarks that performed poorly on Windows in your shootout seemed to be focused on string processing more than others (maybe? am I off?). Does this account for the Rails performance difference? Not sure.
Another thing is that Rails produces a large amount of transient objects, which can tax the VM garbage collector. Could this account for a large difference? Does this hint at possibly better performance of JRuby, especially on Windows, if garbage collection is delegated to the Java VM (is it?)?
So bottom line – I’ve been unable to get straight answers from the Rails community, as there seems to be overt hostility to Windows, and commonly the answer one gets is “switch to Linux”, which is no answer at all.
I’d be *very* interested in seeing Rails-based benchmark numbers, and would also like to get solid numbers on JRuby performance differences between Linux and Windows. Do you know of any that were done already? Does any reader know *objectively* why there are such huge performance differences between Rails performance on Windows and Linux, even though the underlying VM has very similar performance characteristics? Does anyone know of any serious effort to fix this problem?
I recommend testing JRuby with the -J-client option. Because of how JRuby is implemented, some programs can run faster with this option than with -J-server.
I also share Arnon’s curiosity about the performance issues surrounding Ruby on Windows.
Switching to Linux is not an option for us. All our internal (yes, these are ‘enterprise’ apps) applications run on Windows and management will not budge on this issue.
Has anyone done any benchmarks of JRuby and Ruby 1.9.1 (if possible) on Windows?
Has anyone successfully deployed JRuby on an IBM AS400, iSeries or Power products?
I would love to see JRuby and Rails/Merb applications running on these products. RPG, COBOL and C are WAY long in the tooth and I’m surprised that IBM hasn’t supported Ruby or Python on these platforms. If they did, these platforms may prove viable for some time to come, instead of dying in the slow death spiral in which they currently find themselves.
Looks like I’m going to look at JRuby again. Thanks!
Man, these tests are great!
Good job man, very good.
also has some nice ‘gc related’ benchmarks that might be nice to add to the suite [with some modification].
Thanks for your good work on this.
Also interesting would be to try to ‘eke’ out performance of MRI, using compile options [compilers, etc], though I suppose that merits its own separate test 🙂
Isn’t Ruby supposed to be horribly slow on Windows? Possibly we should ignore the 1.8.6 results.
Assuming 1.8.7 is on Linux or some such then this test confirms again that Yarv is about 3 times faster than the old Ruby. I was hoping for 10x. So it does look like Ruby is always going to be slower than Python.
why was REE slow?
You can also get better windows performance out of the mingw version in the works.
Balast: Yes, we have heard of at least one person deploying on AS/400. I don’t know about the others, but we recently made an effort to get JRuby running well on IBM JDKs. It ought to work fine.
All interested in Windows: Hotspot has had a lot of work done to run well on Windows, and I doubt we’d see any significant performance drop from using JRuby on Windows. Give it a try yourself, but if Ruby has such severe perf issues on Windows, then JRuby could be a really excellent option for you.
BlackHand: The -client option is usually on by default, and switches to the faster-starting, non-optimizing JIT compiler. So it’s faster for some things, but in almost all cases the -server option will perform a lot better.
Arnon: JRuby is internally “just Java” so we’re taking full advantage of the JVM’s GC, threading, and optimization capabilities.
Shimnon: JRuby is much more than a good option for those already on the JVM; we have many users starting to switch to JRuby simply because it performs so well and scales well on multicore systems. The JVM is just a means for us to create an excellent Ruby impl.
Hey man, great! It’s nice to see Ruby rising up! Good job!
Could you please tell us which version of the JVM was used to test JRuby? If I’m not mistaken, Ubuntu includes many implementations of the JVM, the 3 most popular being Sun’s 6.10, Sun’s 220.127.116.11 and OpenJDK’s 6b12.
Nicolas, I used Sun’s 6.10.
Are there plans to do a shootout for 2009? It’ll be interesting with all the progress rubinius has made.