A newer shootout is available.
Many brilliant developers are working on improving the current implementation of Ruby and on creating alternatives. I was curious about their current respective speeds, so I installed and ran some benchmarks for the most popular implementations. In this article, I’m sharing the results for the community to see.
Disclaimer
- Don’t read too much into this and don’t draw any final conclusions. Each of these exciting projects has its own reason for being, as well as different pros and cons, which are not considered in this post. They each have a different level of stability and completeness. Furthermore, some of them haven’t been optimized for speed yet. Take this post for what it is: an interesting experiment;
- The results may entirely change in the next 3, 6, 12 months… I’ll be back!
- The scope of the benchmarks is limited, because they can’t stress every single feature of each implementation. It’s just a sensible set of benchmarks that gives us a general idea of where we are in terms of speed;
- These tests were run on my machine; your mileage may vary;
Benchmark Environment
My tests were conducted on an AMD Athlon™ 64 3500+ processor, with 1 GB of RAM.
The tested Ruby implementations were:
- Ruby 1.8.5-p12 stable on Linux;
- Ruby 1.8.5-p12 stable on Windows Vista;
- Ruby 1.9 on Linux;
- JRuby on Linux;
- Rubinius on Linux;
- Cardinal on Linux;
- Gardens Point Ruby .NET Beta 0.6 on Windows Vista;
The operating system used for all of them but Ruby.NET was Ubuntu 6.10 (for x86). Ruby.NET currently runs only on Microsoft Windows, so I used Vista with the .NET Framework 2.0 and also ran Ruby 1.8.5-p12 on Windows to allow a more direct comparison with Ruby.NET.
Ruby 1.9, JRuby, Rubinius and Cardinal were all installed using their respective latest development versions from trunk.
Tests used
The 41 tests used to benchmark the various Ruby implementations can be found within the benchmark folder in the repository of Ruby 1.9. The following is a list of the tests, with a direct link to the source code for each of them (a sketch of the timing harness follows the list):
- bm_app_answer.rb
- bm_app_factorial.rb
- bm_app_fib.rb
- bm_app_mandelbrot.rb
- bm_app_pentomino.rb
- bm_app_raise.rb
- bm_app_strconcat.rb
- bm_app_tak.rb
- bm_app_tarai.rb
- bm_loop_times.rb
- bm_loop_whileloop.rb
- bm_loop_whileloop2.rb
- bm_so_ackermann.rb
- bm_so_array.rb
- bm_so_concatenate.rb
- bm_so_count_words.rb
- bm_so_exception.rb
- bm_so_lists.rb
- bm_so_matrix.rb
- bm_so_nested_loop.rb
- bm_so_object.rb
- bm_so_random.rb
- bm_so_sieve.rb
- bm_vm1_block.rb
- bm_vm1_const.rb
- bm_vm1_ensure.rb
- bm_vm1_length.rb
- bm_vm1_rescue.rb
- bm_vm1_simplereturn.rb
- bm_vm1_swap.rb
- bm_vm2_array.rb
- bm_vm2_method.rb
- bm_vm2_poly_method.rb
- bm_vm2_poly_method_ov.rb
- bm_vm2_proc.rb
- bm_vm2_regexp.rb
- bm_vm2_send.rb
- bm_vm2_super.rb
- bm_vm2_unif1.rb
- bm_vm2_zsuper.rb
- bm_vm3_thread_create_join.rb
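For the curious, here is a minimal sketch of the kind of driver one could use to collect these timings. This is illustrative, not my exact harness; the interpreter labels and paths are placeholders for your own setup, and the ‘Too long’ timeout is omitted for brevity.

# Illustrative benchmark driver: runs each bm_*.rb file under each
# interpreter and prints the wall-clock time, or 'Error' on failure.
# The interpreter commands below are assumptions; adjust the paths.
INTERPRETERS = {
  'ruby-1.8.5' => 'ruby',
  'yarv'       => '/opt/yarv/bin/ruby',
  'jruby'      => 'jruby'
}

Dir.glob('benchmark/bm_*.rb').sort.each do |test|
  INTERPRETERS.each do |label, cmd|
    start = Time.now
    ok = system("#{cmd} #{test}")
    elapsed = Time.now - start
    puts "#{label}\t#{File.basename(test)}\t#{ok ? '%.2f' % elapsed : 'Error'}"
  end
end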
Results
The following table shows the execution time expressed in seconds for Ruby 1.8.5 on Linux, Ruby 1.8.5 on Windows, Ruby 1.9 (Yarv/Rite) on Linux, JRuby on Linux, Gardens Point Ruby.NET on Windows, Rubinius on Linux and finally Cardinal on Linux.
- A blue bold font indicates that the given Ruby implementation was faster than the current stable, mainstream one (Ruby 1.8.5 on Linux);
- A baby blue background indicates that the given Ruby implementation was the fastest of the lot for the given test;
- ‘Error’ indicates an abnormal interruption of the program; ‘Too long’ indicates that the execution took longer than 15 minutes and was manually interrupted;
- Average and Median values take into consideration only the working tests (they exclude ‘Too long’ programs as well); see the sketch after this list.
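As a rough sketch of how those two summary values are derived (illustrative; it assumes the raw times have already been collected, with failed runs recorded as nil):

# Average and Median over working tests only: failed runs
# ('Error' or 'Too long') are stored as nil and excluded.
def average(times)
  valid = times.compact
  valid.inject(0.0) { |sum, t| sum + t } / valid.size
end

def median(times)
  valid = times.compact.sort
  mid = valid.size / 2
  valid.size % 2 == 1 ? valid[mid] : (valid[mid - 1] + valid[mid]) / 2.0
end

average([1.2, nil, 0.8])  # => 1.0
median([1.2, nil, 0.8])   # => 1.0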
Below is a chart which shows the average and median values, visually:
Of course, the bold green values indicate improved performance, so for example Cardinal was 4 times faster than Ruby 1.8.5 on Linux for the test vm1_swap, but it was also 18 times slower for so_matrix (hence shown in red).
I won’t offer too many personal considerations; I’d rather let you enjoy the numbers. Generally speaking though, Ruby on Windows was about 1.5 times slower than on Linux. YARV (now merged into the development version of Ruby) is clearly the fastest by a long shot. This is good news (there are hopes for a fast Ruby 2.0), and it is not an unexpected result.
Ruby.NET and JRuby showed similar performance and were able to execute most of the tests. It is clear, though, that they will need to focus on improving their speed in order to be ready for prime time.
Cardinal wasn’t able to complete most tests, and was extremely slow in some others. On a few occasions, however, it showed decent results (beating Ruby 1.8.5 in 3 tests). Rubinius was extremely slow too, but it correctly handled a larger number of tests than Cardinal did (and it was significantly faster in executing so_sieve.rb).
I’d like to conclude by saying that all the people involved with these projects are doing an amazing job. While some implementations are clearly at an early stage of development, this in no way diminishes the great effort and work done by their developers, nor is it an attempt to predict their future success or failure. So once again, great job guys; all of this is nothing short of exciting!
UPDATE 02/21/07: Wow, it looks like this article received a lot of attention, and naturally I’m glad it did. Slashdot linked to it and traffic skyrocketed, giving major exposure to all these projects.
Most importantly, I initially thought I’d run another batch of tests in 3 months time, but given the amount of feedback that I’ve received, I’ll carry out another test run fairly soon to incorporate many of the insightful suggestions and requests that were received.
By the way, Ruby 1.8.6 is out in preview, and some of you sent me emails asking me to test it. Running the tests shows that it’s usually slightly faster than 1.8.5, and it seems to notably speed up recursion-based tests. The next test run will include details for Ruby 1.8.6 as well.
Very interesting, thanks for doing this, Antonio.
Great to see how the different implementations compare. It is also another example of the speed boost Yarv will provide.
In another comparison I saw, Ruby with YARV was actually the fastest scripting language, ahead of PHP and Python.
The new JRuby compiler stuff should make it a lot faster.
Awesome benchmarks. Ruby 2.0 is going to be fantastic then! Thanks so much for doing this.
I was worried Yarv was not going to be fast enough, your tests show otherwise, I will sleep better :)
Michael Silver:
Would you care to publish the URL? Seems interesting.
Very interesting results, and very promising for JRuby. I’d like to point out the following details about the JRuby results:
– We’ve only recently started to work on performance, so these numbers actually look pretty good to me.
– This is JRuby in interpreted mode, compared to YARV, Ruby.NET, and Rubinius running with compiled code. Our prototype compiler has shown we can improve speed many times over.
– Some of these are IO bound, and our IO subsystem is in bad need of improvement. But we know what’s wrong, and we know we can fix it.
– We’ve focused on correctness, which counts for a lot. We’re the only alternative implementation that can run Rails out of the box. Now for the next step :)
But it’s great to see these numbers and see how we compare. It will motivate us to continue working (and I see from the mailing lists and IRC channel that’s already happening :))
Just wanted to drop a note to let people know they shouldn’t worry about Rubinius. The project is so young that we’ve not yet begun to perform any of the normal speed optimizations, such as selector caching. These optimizations are now being added, and I hope the tests can be rerun against Rubinius soon with these in place.
Hi Charles,
thank you so much for stopping by and adding details and a good contribution to the discussion. I’m glad that you found this post somewhat useful. I believe JRuby is very promising; in fact, it’s always part of my slides whenever I present Ruby and Ruby on Rails at meetings and conferences.
Hi Evan,
thanks for commenting. As I wrote in the disclaimer “… some of them haven’t been optimized for speed yet”, and Rubinius, being a fairly young project, is one of them. I plan to re-run these tests periodically so there will be plenty of space for testing improvements.
Also, feel free to drop me a line by email whenever you feel like Rubinius is ready for another test.
Thanks,
Antonio
PS: we’re being slashdotted, so hopefully this will give more exposure to each single project as well.
I’d just like to say thanks, Antonio. It’s exciting to see just how much Ruby 2.0 is going to improve over the current codebase. I’m really happy to see how much Ruby is spreading, and to see not only YARV but other great projects working to better the language. I’m looking forward to the stable 2.0 release, as well as to what JRuby and Rubinius can do.
These numbers have me totally psyched for Ruby 2.0. Thanks for pulling them together.
Your background-repeat does not work in IE7.
Thanks Drew, I’m aware of the issue and I’ll try to fix it soon.
How did you calculate the values for the final table? I’m a bit confused by it.
The mixture of negative values and no values between -1 and 1 suggests to me some sort of mixture of subtraction and multiplication that probably wasn’t intended.
If we’re going by “how many times faster it runs” then I would expect 0.5 to mean it took twice as long and 2.0 to mean it ran twice as fast, calculated by taking the baseline and dividing it by the particular value.
Also, I think a geometric mean would have been a better choice, so that if one odd test runs 20 times faster and another takes 20 times as long, the average doesn’t get artificially skewed toward the larger value.
Hi Grant,
I set the formula to generate a conventional positive value for improvements over the current interpreter, and a negative value to indicate how many times it was slower (taking Ruby 1.8.5 as 1 time). Therefore if you get 1.5 times in green, it means that it was 50% faster, while a -1.5 times in red indicates that it took 50% longer. Perhaps a more common convention would be +0.5 and -0.5, setting Ruby to zero.
Given the skewed distribution, I’ve also provided the Median. A geometric mean would have yielded results that were slightly different but comparable to the Median.
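To make the convention concrete, here is a small sketch of the formula described above (illustrative, not the exact spreadsheet formula I used):

# Signed-ratio convention for the final table: positive means "times
# faster" than the Ruby 1.8.5 baseline, negative means "times slower".
# Values between -1 and 1 cannot occur by construction.
def signed_ratio(baseline, time)
  time <= baseline ? baseline / time : -(time / baseline)
end

signed_ratio(10.0, 5.0)   # =>  2.0 (twice as fast)
signed_ratio(10.0, 15.0)  # => -1.5 (took 50% longer)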
I know scripting languages emphasize startup times, but the tests might want to consider the JIT factor. The exact testing methodology isn’t described, but possibly the JRuby and Ruby.NET implementations are suffering from starting a larger more generic VM and having a JIT kick in part way through the run. A longer running program that amortized for such ‘boot up’ time might favour such implementations a little more.
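For instance, a harness along these lines would let a JIT warm up before measuring (a minimal sketch; the warm-up and run counts are arbitrary):

# Warmed-up timing loop: run the workload a few untimed times so a
# JIT-based VM can compile the hot paths, then average the timed runs.
def bench(warmup = 2, runs = 5)
  warmup.times { yield }           # untimed warm-up iterations
  start = Time.now
  runs.times { yield }
  (Time.now - start) / runs        # average time per timed run
end

puts bench { 100_000.times { |i| i * i } }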
Thanks for running these comparative tests. These results, along with future test runs, will be invaluable in gauging the progress of the different implementations of Ruby. I look forward to seeing the results in 6 and 12 months. Ruby is not going away, and the variety of implementations will only allow for wider adoption…
To all the projects: Keep up the excellent work!! It is much appreciated by us mere users in the trenches.
It seems suspicious that most of the Windows times are slower on what are mostly non-I/O tests.
Ruby.NET works on Mono in Linux, I have not tried recompiling it, but the binaries published work out of the box with Mono 1.2.3
Miguel.
It seems suspicious that most of the Windows times are slower on what are mostly non-I/O tests
How so? It’s well known that Ruby is slower on Windows, man.
I must say I worry more about the fact that all implementations except 1.8 on Linux and YARV failed to run valid code. What kind of “Error”s did you encounter?
Miguel, thank you very much for stopping by and pointing this out. Rest assured that Ruby.NET will be tested on Windows and on Linux through Mono in the next test run.
Did you build an optimized version of Parrot to use with Cardinal? By default, Parrot builds with all optimizations off and with all debug flags on. This makes it slower but makes development a lot easier.
To build an optimized bird, pass the “--optimize” flag to Configure.pl:
perl Configure.pl --optimize && make
Cardinal is still in its very early stages, so I don’t expect anything remarkable yet, but this should make a difference.
That’s a good suggestion Matt, so far I’ve used the installation instructions provided by each candidate, to be fair to everyone.
Perhaps Cardinal should indicate this default option in their README. I’ll keep this in mind for the next test run and hopefully we’ll see some improvements.
There were all sorts of errors, from ‘stack level too deep’ in a recursive method to rarer but more serious errors like segmentation faults.
Many of these implementations are early versions, so I wouldn’t worry too much.
JRuby and Ruby 1.8.5 on Windows failed app_answer and app_factorial with a ‘stack level too deep’ error. This doesn’t happen if the number of required recursions is lowered. JRuby also failed so_ackermann for the same reason and raised a java.lang.OutOfMemoryError while executing vm1_swap. Other than that, JRuby didn’t show any serious problems, and neither did Ruby.NET.
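For anyone curious what that failure looks like, here is a tiny standalone example (not one of the benchmark files) that triggers the same error:

# Deep recursion exhausts the interpreter's stack and raises
# SystemStackError, which Ruby reports as "stack level too deep".
def countdown(n)
  countdown(n - 1) unless n.zero?
end

begin
  countdown(1_000_000)
rescue SystemStackError
  puts "stack level too deep"
end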
Thanks for this. Between Rails and YARV, Ruby should really grow in adoption.
In my last post, I failed to give the URL for a speed test comparing Ruby with other languages including yarv:
http://www.ruby-forum.com/topic/72785
Not scientific at all, but the Ruby numbers are consistent with this blog’s comparison.
…Michael…
Hi!
Really nice tests! Ruby is hot these days.
Can you add XRuby implementation?
http://xruby.com/default.aspx
Adios!
How about memory usage stats? Not everything is solved with faster execution. :)
Antonio: With Rubinius, did you bench the first run (compiling of bytecode included) or a second run (.rbc already generated)? At the moment Rubinius only supports bytecode interpretation.
I would be curious to see the benchmark results on Windows XP instead of Vista. There are many things that actually run slower on this marvelous new shiny OS…
I also agree with the comment about taking into account the startup/JIT time of the various platforms.
When will 2.0 be released?
Can we run YARV now?
I’m pretty excited about the Ruby support in Objective-C 2.0 (Apple’s rewrite of Cocoa). I can see a renaissance of Mac desktop apps inheriting from the experience of today’s web-app crew. And what if it becomes relatively simple to port your web app to the Mac by just putting your Ruby code into a Cocoa app? That could get interesting.
I think it’s clear that the success of TextMate is due to Rails, but I also think that the success of Rails is partly due to the use of TextMate in that first screencast. The bright colors and snippet expansion were hard not to love. And I definitely associate the mate with Rails – it helps us write our sites; it even helped David write Basecamp. To some degree, TextMate catalyzed Rails, which catalyzed Ruby. Now I think they’ll combine forces to catalyze not just a new generation but a new lineage of desktop apps. And maybe Rails 4.0 will be written on one of them.
If you can find a way to meaningfully test the implementation of Ruby in Objective-C 2.0, great.
This is a very exciting time for Ruby indeed. Great work here.
I’d be interested to know what options you are using to launch the JVM when testing JRuby. As you probably know, a Java 5 JVM will by default limit itself to (I believe, on your platform) 64 MB of memory on a non-server-class machine. The command line parameter -Xmx512m can be used to raise that value. This will probably eliminate your OutOfMemoryErrors on heavily recursive code.
There is also an option to increase the JVM stack size, e.g. -Xss2048k.
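If I recall correctly, JRuby’s launcher forwards options prefixed with -J straight to the JVM, so something like the following should raise both limits (illustrative; the benchmark file name is just an example):
jruby -J-Xmx512m -J-Xss2048k bm_so_ackermann.rb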
Comments from the Mono creator are fun… sorry Miguel, I’m afraid nobody trusts that Mono has any future, and especially not Ruby fans :o) Well, if you add in GPL Java, not Java fans either, I think… so only .NET fans? Well, yes, if running on MS Windows with MS VS :o)
Open source is just about liberty!
Thanks for providing this comparison. It’s good to know that YARV is a) successfully running all these tests and b) kicking butt performance-wise! It sounds like my ruby app servers are going to get a bit of a rest in a few months time. :)
Did tests on Vista include the time it takes to click OK on all the UAC prompts? :)
@TSO
Hilarious.
Another YARV rival for the performance comparison:
http://luajit.luaforge.net/luajit.html
LuaJIT is a JITted implementation of the Lua scripting language, much as YARV is for Ruby.
Thanks to Antonio for compiling the stats. So far in the Ruby.NET project we have only been concentrating on getting rid of the “error” lines. We will turn to performance when we have more interop and RoR working.
@John Gough
That’s definitely a sound approach. Thank you for stopping by, Prof Gough.
Using the raw time values to compute the median isn’t appropriate here, because the different tests’ times vary quite a bit. It would probably be better to normalize the times by dividing each by a “standard” running time and then compute the median of those values. The most obvious choice for the standard running time would be that of the current stable Ruby implementation.
The computation of the mean could also benefit from the use of normalized running times.
Great job dude! I’m looking forward to seeing JRuby improvements in the next test. When will it be?
Nice work. I would also be very interested in a memory usage comparison.
The code for fib(n) isn’t exactly correct.
def fib(n)
  if n < 3
    1
  else
    fib(n-1) + fib(n-2)
  end
end
It produces fib(0)=1; it should be fib(0)=0.
This produces the correct result for all n:
def fib(n)
  if n < 2
    n
  else
    fib(n-1) + fib(n-2)
  end
end
see
http://en.wikipedia.org/wiki/Fibonacci_number
http://goldennumber.net/fibonser.htm
Hi Jabari,
you’re correct; whoever wrote the test missed the degenerate case. This, of course, does not affect the performance results. =)
Could you please zip all the source code and make it available for download?
I am sorry, SVN helped me to get the source code.
This is great work!! Thank you, Antonio!! I would add one note. When you are summarizing the relative performance statistics — reducing the results for all of the benchmarks to a single relative performance number for the implementation — you want to compute the *geometric* mean of the ratios! See the paper “How not to lie with statistics: the correct way to summarize benchmark results”, which can be found at
http://portal.acm.org/citation.cfm?id=5673
Another way to look at this is to take the logs of all the ratios, do your descriptive/summary stats on the logs, and then take the antilogs of the summaries. So you could compute “geometric medians” or even do a Tukey jackknife. :)
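A quick sketch of that log-based computation (illustrative):

# Geometric mean of normalized ratios via logs, as suggested above:
# average the logs, then take the antilog. A 20x win and a 20x loss
# cancel out to 1.0 instead of skewing the summary.
def geometric_mean(ratios)
  logs = ratios.map { |r| Math.log(r) }
  Math.exp(logs.inject(0.0) { |sum, x| sum + x } / ratios.size)
end

geometric_mean([20.0, 0.05])  # => 1.0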
By the way — I just returned from the Mountain West Ruby Conference and talked to the JRuby, Rubinius and Parrot/Cardinal implementors. I’ve thrown another benchmark into the mix, which you are welcome to pick up from RubyForge using
svn checkout svn://rubyforge.org/var/svn/cougar/MatrixBenchmark
Oh yeah:
1. My matrix benchmark runs about 4 times as fast with YARV as it does with “stock Ruby”.
2. 1.8.6 runs it about 11 percent faster than 1.8.5
3. If you look in the project, it also *profiles* the Ruby interpreter on the benchmark using “gprof”. I’d be really interested if someone would merge the profiles of all of these benchmarks for YARV and stock Ruby. If the other environments are implemented in C, this could be done for them too. I’m somewhat tied up in another project right now, but I’ve volunteered to mentor for Summer of Code, and if a student wants to pick this project up — profiling the Ruby interpreter — I’d be happy to give it away. :)
Thank you very much M. Edward, your tips on statistical reporting are much appreciated and I’ll try to include them in my next test run. :)
Hi Antonio!
In your next test, please include the factorial of 6120.
Thanks!
Antonio, thanks for these benchmarks. Did you ever think about setting these up as automated benchmarks? If it could check out the source for each implementation nightly, run the benchmarks, and then automatically update the graphs and charts, we’d be able to watch the progress as they improve.
Great stuff Antonio!
Why is it always writing NULL in empty fields?
Beck ham, please stop posting the same question over and over. NULL fields appear only in IE. You can get a decent browser here: http://www.mozilla.com/en-US/firefox/