The Great Ruby Shootout (Windows Edition)

This post contains the results of a Ruby shootout on Windows that I recently conducted. You can find the Mac edition, published last month, here. I was planning to have this one ready much sooner, but a couple of serious events in my personal life prevented that from happening. Be sure to grab my feed or join the newsletter to avoid missing the upcoming Linux shootout.

The setup

For this shootout I included a subset of the Ruby Benchmark Suite. I mostly excluded tests that ran in fractions of a second on most VMs, focusing instead on more substantial benchmarks (several of which come from the Computer Language Benchmarks Game). The best time out of five runs is reported here for each benchmark.
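
To make the "best of five" idea concrete, here is a minimal sketch of how such a measurement could be taken. It is not the actual Ruby Benchmark Suite harness; the helper name `best_of` and the sample workload are assumptions made purely for illustration.

```ruby
require 'benchmark'

# Hypothetical helper (not part of the Ruby Benchmark Suite): run the
# given block `runs` times in the same process and keep the fastest
# wall-clock time, in seconds.
def best_of(runs = 5)
  times = Array.new(runs) { Benchmark.realtime { yield } }
  times.min
end

# Made-up workload, only here so the sketch runs on its own.
elapsed = best_of(5) { (1..200_000).reduce(:+) }
puts format('best of 5: %.4f s', elapsed)
```

Because all runs happen in one process, the earlier iterations also act as a rough warm-up for the VMs that compile code at runtime.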

All tests were run on Windows 7 x64, on an Intel Core 2 Quad Q6600 2.40 GHz, 8 GB DDR2 RAM, with two 500 GB 7200 rpm disks.

The implementations tested were:

Ruby 1.8.7 (2010-01-10 patchlevel 249) [i386-mingw32] (RubyInstaller)
Ruby 1.9.1 p378 (2010-01-10 revision 26273) [i386-mingw32] (RubyInstaller)
Ruby 1.9.2 dev (2010-05-31) [i386-mingw32] (experimental)
JRuby 1.5.1 (ruby 1.8.7 patchlevel 249) (2010-06-06 f3a3480) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_20) [amd64-java]
IronRuby 1.0 x64 for .NET 4.0

JRuby was run with the --fast and --server optimization flags.

Disclaimer

Synthetic benchmarks cannot predict how fast your programs will be when dealing with a particular implementation. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. The values reported here should be assumed to be characteristic of server-side – and long running – processes and should be taken with a grain of salt.

The results

Please find below the execution times for the selected tests. Timeouts indicate that the execution of a single iteration for a given test took more than 60 seconds and had to be interrupted. Bold values indicate the best performance for each test.
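
A 60 second cutoff per iteration can be sketched as follows. The mechanism the suite actually uses is not described in this post, so treat this as one possible approach built on Ruby's standard `timeout` library; the helper name `timed_iteration` is hypothetical.

```ruby
require 'benchmark'
require 'timeout'

# Hypothetical helper: time a single iteration and return the elapsed
# seconds, or :timeout if it runs longer than `limit` seconds
# (60 in this shootout).
def timed_iteration(limit = 60)
  Timeout.timeout(limit) { Benchmark.realtime { yield } }
rescue Timeout::Error
  :timeout
end

# Made-up workloads, only here so the sketch runs on its own.
p timed_iteration(60) { 10_000.times.map { |i| i * i } }
p timed_iteration(0.1) { sleep 1 } # prints :timeout
```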

RBS Windows Shootout

Conclusions

Despite a couple of errors and a few timeouts, JRuby was the fastest of the lot, which can be seen as impressive ~~if we consider that this is Windows we are talking about after all~~.

Ruby 1.9.1 and 1.9.2 were almost as fast as JRuby on these tests. With a few exceptions, the performance of the two 1.9 implementations was, as expected, very similar.

JRuby, 1.9.1 and 1.9.2 were all faster than the current MRI implementation, which can be seen as a prerequisite as we move, as a community, away from Ruby 1.8. Finally, it’s worth noting that IronRuby’s performance was in line with that of Ruby 1.8.7.

Update (July 3, 2010): The following box plot compares the various implementations for the tests for which all the implementations were successful. Only times for the largest successful input number were used in those tests where multiple input numbers were tested.

Windows Shootout Boxplot
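
The selection rule used for the box plot can be sketched roughly like this. The data layout, the implementation and benchmark names, and the numbers are all made up for illustration, and "largest successful input" is read here as the largest input for which every implementation produced a time, which is an assumption.

```ruby
# Hypothetical data: RESULTS[implementation][benchmark][input] = seconds,
# or nil where the run errored or timed out. All values are invented.
RESULTS = {
  'impl_a' => { 'bm_x' => { 10 => 1.2, 20 => 2.9 }, 'bm_y' => { 5 => 0.8 } },
  'impl_b' => { 'bm_x' => { 10 => 1.0, 20 => nil }, 'bm_y' => { 5 => nil } }
}

# For each benchmark, keep the times at the largest input that every
# implementation completed; drop benchmarks with no such input.
def comparable_times(results)
  benchmarks = results.values.flat_map(&:keys).uniq
  benchmarks.each_with_object({}) do |bm, picked|
    per_impl = results.values.map do |by_bm|
      (by_bm[bm] || {}).select { |_, secs| secs }.keys
    end
    inputs = per_impl.reduce(:&) # inputs that succeeded everywhere
    next if inputs.empty?
    largest = inputs.max
    picked[bm] = results.transform_values { |by_bm| by_bm[bm][largest] }
  end
end

p comparable_times(RESULTS)
```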

About The Author

Antonio Cangiano

22 Comments

Luis Lavena June 28, 2010

Thank you Antonio for this shootout.

One point to mention is that these shootouts do not exercise IO operations.

Disk-based IO between 1.9.1 and 1.9.2 has improved a lot, which could be considered one *big* difference.

Of course, JRuby beats MRI on that, too.

Isaac Gouy June 28, 2010

@Luis Lavena >> do not exercise IO operations <<

Please say more.

Doesn't bm_fasta.rb write ~1MB? Doesn't bm_regex_dna.rb read 150kB 20 times? Doesn't bm_sum_file.rb read 390kB 100 times?

Isaac Gouy June 28, 2010

iirc the JRuby bm_meteor_contest.rb error can be avoided by not using --fast (perhaps also the bm_app_pentomino.rb error?)

Luis Lavena June 28, 2010

@Isaac:

bm_fasta: no real IO:
http://github.com/acangiano/ruby-benchmark-suite/blob/master/benchmarks/micro-benchmarks/bm_fasta.rb

bm_regex_dna.rb
File is read 20 times in text mode:
http://github.com/acangiano/ruby-benchmark-suite/blob/master/benchmarks/micro-benchmarks/bm_regex_dna.rb#L10

The actual IO operations are reduced to the brute force required by the iteration itself.

The bm tests are oriented to the operations, not the IO part.

IO benchmarking is something else.

- Isaac Gouy June 28, 2010
  
  >> IO benchmarking is something else. <<
  
  Something more to do with measuring hardware?
  
  - Luis Lavena June 29, 2010
    
    Actually no.
    
    Ruby’s IO implementation on Windows does a lot of stuff in C code where it should be using Windows’ own functionality.
    
    That defined how Ruby works, and can’t be changed.
    
    JRuby had a hard time implementing those “specs” of Ruby, but it leveraged Java NIO functionality, which is pretty damn fast; after all, it’s Java.
    
    Anyhow, your mileage may vary in every case; the shootout is just there to give a comparison, but it is not reality 🙂
    
Jimmy Schementi June 28, 2010

Antonio,

Thanks for doing this shootout! I’ve got a couple of questions about your setup:

How come you picked the 64-bit versions of JRuby and IronRuby to benchmark against the 32-bit versions of MRI 1.8.7, 1.9.1, and 1.9.2? The 64-bit versions will definitely be slower than the 32-bit versions, just by definition. I’m not sure if there are any benefits to using the 64-bit JVM, but for .NET it is preferred to use the 32-bit .NET runtime. I’d suggest re-running this with all 32-bit versions of the Ruby engines; for IronRuby this just means running ir.exe rather than ir64.exe.

Also, IronRuby does have an optimization flag that should be used for these types of raw-performance benchmarks, but whether or not it should be used depends on how you’re running these benchmarks. Ideally, you’d allow for some warm-up time, like run the benchmark for 60 seconds, and then start timing your desired number of iterations; this more accurately simulates the server scenario your disclaimer states, and IronRuby will perform optimally in this case. However, if you are not doing that, then IronRuby should be run with the “-X:NoAdaptiveCompilation” flag, which will force IronRuby to generate optimal code from the start (by default, IronRuby will use an interpreter until a certain method-call threshold is reached, and then start generating .NET bytecode; this lets IronRuby avoid the overhead of emitting bytecode through .NET, but obviously trades off on raw performance as an interpreter is being used). The downside to using “-X:NoAdaptiveCompilation” is that it will force evals to also be compiled, making any eval benchmarks much slower, which is why it’s ideal to just warm the process up first.

~Jimmy

- Antonio Cangiano June 28, 2010
  
  > How come you picked the 64-bit versions of JRuby and IronRuby to benchmark against the 32-bit versions of MRI 1.8.7, 1.9.1, and 1.9.2?
  
  This was done under the (reasonable) assumption that people would be running 64 bit VMs on a 64 bit machine. Unfortunately 64 bit versions of MRI/KRI are not available on Windows yet. I was not expecting a major difference in speed between the two either.
  
  > Ideally, you’d allow for some warm-up time, like run the benchmark for 60 seconds, and then start timing your desired number of iterations; this more-accurately simulates the server-scenario your disclaimer states, and IronRuby will perform optimally in this case.
  
  Only the best time for each test is reported, so you could think of the first few iterations as the warmup (for most tests here). I will add additional warmup time in the future though.
  
  - Jimmy Schementi June 28, 2010
    
    The 64-bit JIT-compiler has very different performance characteristics than the 32-bit JIT. Specifically, the 64-bit JIT is (as you assumed correctly) optimized for the server, so it does a ton more optimizations. For IronRuby, we see a 2x slowdown when JIT-compiling in 64-bit. This is OK if the process is warmed up enough, but it essentially needs double the warm-up time.
    
    WRT warm-up: the tests are run repeatedly in the same process, taking the best run from that? If so, that’s fine. Otherwise, shelling out to ir.exe multiple times doesn’t do any good for the compiler warm-up, so adaptive compilation should be turned off.
    
    Oh the joys of performance testing =)
    
    - Antonio Cangiano June 28, 2010
      
      > WRT warm-up: the tests are run repeatedly in the same process, taking the best run from that? If so, that’s fine.
      
      That’s correct, Jimmy.
      
Brazen June 28, 2010

If this is expected “server-side” performance, shouldn’t this be a Ruby on Linux shootout?

Also, you left out Rubinius :/

- Antonio Cangiano June 28, 2010
  
  > If this is expected “server-side” performance, shouldn’t this be a Ruby on Linux shootout?
  
  Some people use Windows. The Linux one is upcoming.
  
  > Also, you left out Rubinius :/
  
  As far as I know, Rubinius doesn’t have a ready to run Windows version.
  
  - Brazen June 29, 2010
    
    Ah, thanks for the reply. I probably should have put a smiley or something to indicate: the linux comment was meant to be tongue-in-cheek 😉 (although I do use linux, server AND desktop).
    
    I’ll look forward to the linux shootout and seeing how Rubinius compares (I’m assuming you intend to include Rubinius on the linux shootout). I hope you use the same hardware, because it would also be interesting to see how linux compares to the windows shootout.
    
    - Antonio Cangiano June 29, 2010
      
      Yes, Rubinius will be included and the same hardware will be used. Stay tuned. 🙂
      
Jim Gay June 29, 2010

Thanks for posting this. I love coming across your detailed comparisons.

Earle Clubb June 29, 2010

WRT Linux shootout: Will you include REE?

- Antonio Cangiano June 29, 2010
  
  Yes.
  
Michael Campbell June 30, 2010

> JRuby was the fastest of the lot, which can be seen as impressive if we consider that this is Windows we are talking about after all.

Why the Windows caveat? Why should this NOT be seen as impressive if you’re talking about Linux, etc.?

- Antonio Cangiano June 30, 2010
  
  > Why the Windows caveat? Why should this NOT be seen as impressive if you’re talking about Linux, etc.?
  
  This would be impressive on Linux as well, Michael. However, given that it’s a language that’s based on the JVM, it is especially impressive on Windows where you wouldn’t necessarily expect JVM based implementations to shine.
  
  - Isaac Gouy July 1, 2010
    
    Why wouldn’t you necessarily expect JVM based implementations to shine on Windows?
    
    Here are Java measurements on x86 Ubuntu –
    
    http://shootout.alioth.debian.org/u32q/measurements.php?lang=java
    
    Here are the corresponding measurements –
    
    http://shootout.alioth.debian.org/demo/measurements.php?lang=java
    
    The only big differences might be linked to threading: chameneos-redux and thread-ring.
    
raggi July 1, 2010

The JVM has been more available on Windows than it has on Linux for a long time, I remember trying to get that stuff working on Linux many years ago and suffering all kinds of headaches. Solaris is a different story, but seriously, that caveat did seem a bit more like a random stab than based on industry reality.

Hell, you work for IBM, we used to resort to jikes on Linux back in the day…

Anyway, I had originally ignored that statement, which I’m now going to go back to doing.

- Antonio Cangiano July 1, 2010
  
  It wasn’t meant to be a stab. I simply thought that the performance of the JVM on Linux would be better than that on Windows. I’m happy to learn that the JVM performs just as well on Windows. 🙂
  
Pingback: The Great Ruby Shootout (July 2010) July 19, 2010
