The Great Ruby Shootout measures the performance of several Ruby implementations by testing them against a series of synthetic benchmarks. Recently I ran Mac and Windows shootouts as well, which tested a handful of implementations. This article, however, reports on the results of extensive benchmark testing of eight different Ruby implementations on Linux.
The setup
For this shootout I included a subset of the Ruby Benchmark Suite. I opted to primarily exclude tests that were executed in fractions of a second in most VMs, focusing instead on more substantial benchmarks (several of which came from the Computer Language Benchmarks Game). The best times and least memory allocations out of five runs are reported here for each benchmark.
All tests were run on Ubuntu 10.04 LTS x86_64, on an Intel Core 2 Quad Q6600 2.40 GHz, 8 GB DDR2 RAM, with two 500 GB 7200 rpm disks.
8 implementations
The implementations tested were:
JRuby was run with the --fast and --server optimization flags.
Disclaimer
Synthetic benchmarks cannot predict how fast your programs will be when dealing with a particular implementation. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. The values reported here should be assumed to be characteristic of server-side — and long running — processes; they should be taken with a grain of salt.
Time Results
Please find below the execution times for the selected tests. Timeouts indicate that the execution of a single iteration for a given test took more than 300 seconds and had to be interrupted. Bold, green values indicate the best performer out of each test.
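The measurement procedure described so far (best of five runs, with a 300-second limit per iteration) can be sketched roughly as follows. This is not the Ruby Benchmark Suite's actual harness: the best_time helper and the sample command are hypothetical, and timing a whole process from the outside also counts interpreter startup, which a real harness would likely exclude.

```python
import subprocess
import sys
import time

TIMEOUT_SECONDS = 300  # per-iteration limit mentioned in the text
RUNS = 5               # best of five, as described in the setup


def best_time(command, runs=RUNS, timeout=TIMEOUT_SECONDS):
    """Return the best wall-clock time in seconds, or None on a timeout."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        try:
            subprocess.run(command, check=True, timeout=timeout,
                           stdout=subprocess.DEVNULL)
        except subprocess.TimeoutExpired:
            return None  # would be reported as a timeout in the table
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Hypothetical stand-in for something like ["ruby", "bm_example.rb"].
    elapsed = best_time([sys.executable, "-c", "sum(range(100000))"])
    print("Timeout" if elapsed is None else "%.3f s" % elapsed)
```

Taking the minimum rather than the average is a common choice for this kind of measurement, since the fastest run is the one least disturbed by other activity on the machine.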
Warning: The bm_primes.rb benchmark was originally written to aid the development of the Prime class. As such in 1.9.2 it was rewritten in C, which makes it a poor representation of implementation performance. This benchmark will be removed in the future.

If you are not interested in the individual test results, the information presented in the table above is summarized directly below:
Implementation    Min.   1st Qu.   Median     Mean   3rd Qu.      Max.
Ruby 1.9.2       0.013     3.258    4.543    9.262     8.573    45.009
JRuby            0.382     3.051    4.997    9.180     8.969    48.850
MagLev           0.351     2.140    6.069    9.100     9.266    51.221
Ruby 1.9.1       0.015     3.387    6.205   10.860    11.559    46.849
Ruby 1.8.7       0.708     5.102    8.380   18.785    24.793    75.653
IronRuby         3.601    10.505   12.912   26.539    36.115   135.204
Rubinius         0.484     3.087    9.636   13.232    17.674    73.050
REE              0.584     4.343    6.660   15.036    21.336    61.960
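For readers who want to produce the same six figures (minimum, quartiles, median, mean and maximum) from their own timings, here is a minimal sketch using Python's standard library. The summarize helper and the sample data are hypothetical, and the quartile method is an assumption: other tools may interpolate quartiles slightly differently.

```python
import statistics


def summarize(times):
    """Return the six figures shown for each implementation."""
    # "inclusive" treats the smallest and largest values as the 0th and
    # 100th percentiles and interpolates linearly in between.
    q1, median, q3 = statistics.quantiles(times, n=4, method="inclusive")
    return {
        "Min.": min(times),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.mean(times),
        "3rd Qu.": q3,
        "Max.": max(times),
    }


# Hypothetical timings in seconds, not taken from the shootout data.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
for label, value in summarize(sample).items():
    print("%-8s: %.3f" % (label, value))
```

Note how a mean well above the median, as in several of the summaries here, points to a few slow outliers pulling the average up.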
For the sake of convenience, I also produced a box plot from the successful data points:

There are a few considerations based on these results that I feel are worth mentioning:
Memory Results
The following table shows the approximate memory consumption for each implementation when running each benchmark:

Summarized:
Implementation     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
Ruby 1.9.2        4.320     4.378     6.285    20.795    10.162   171.500
Ruby 1.9.1        4.580     4.695     6.920    15.669    11.383   100.570
Ruby 1.8          3.040     4.290     7.745    20.698    11.273   103.520
REE               8.220     9.682    15.565    27.014    38.620   125.910
Rubinius          37.63     39.78     45.48     65.70     58.22    215.33
MagLev            81.74     82.52     83.53     96.29     98.10    175.85
JRuby             49.04     71.23    176.72    169.41    209.04    404.06
And finally, in graph form:

A few considerations on memory:
Linux Vs. Windows
This shootout and the Windows one were both performed on the same machine, thus we can compare how the same implementation performs under different operating systems. The only adjustment that’s required is the timeout. Every result longer than 60 seconds from this shootout needs to be considered a timeout, because the previous shootout did so as well.
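The timeout adjustment is simple enough to express in a few lines. The sketch below is only an illustration: the normalize helper, the benchmark names and the timings are hypothetical, and None is used here to represent a timeout.

```python
WINDOWS_TIMEOUT = 60.0  # seconds; the limit used in the Windows shootout


def normalize(linux_seconds, limit=WINDOWS_TIMEOUT):
    """Apply the Windows rules to a Linux timing.

    Anything longer than the limit becomes a timeout (None). A value of
    None stands for a run that already timed out on Linux.
    """
    if linux_seconds is None or linux_seconds > limit:
        return None
    return linux_seconds


# Hypothetical timings, not taken from the result tables.
linux_results = {"bm_a": 45.0, "bm_b": 75.6, "bm_c": None}
comparable = {name: normalize(t) for name, t in linux_results.items()}
print(comparable)
```

Only the entries that survive this step on both operating systems can be compared directly.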
It is commonly believed that Ruby performs much better on Linux than on Windows (with the exception of IronRuby). Let’s find out if these test results confirm that notion.
Ruby 1.8.7:

Ruby 1.9.2:

JRuby:

Finally, in chart form (yellow entries are on Windows as indicated by the labels containing W):

To use a beloved MythBusters expression, this myth is confirmed.
Note: As requested by a few commenters, here is a comparison of IronRuby as well (.NET 4.0 Vs. Mono 2.4.4):

Conclusion
In conclusion, let me just state that it’s nice to see several implementations getting faster. Ruby 1.9.2, JRuby, MagLev and Rubinius are all becoming serious competitors and working their respective ways closer to a similar performance level. If you think these benchmark shootouts are becoming boring, it’s because the results are becoming more stable and predictable. I suspect that as time goes on, performance will not be the real distinguishing factor when choosing a Ruby implementation; other features will be.
Lisp has had a tremendous impact on the world of programming. Even though Common Lisp and Scheme — the two main Lisp dialects — may not be considered mainstream today, several popular languages have been influenced by one or both of them.
It isn’t stretching things too much to say that both Ruby and Python can be seen as slower, easier (for beginners), object-oriented, infix Lisp dialects.
Some may say Ruby is a bad rip-off of Lisp or Smalltalk, and I admit that. But it is nicer to ordinary people. — Yukihiro “Matz” Matsumoto
Ruby and Python aren’t intimidating and remain very approachable for absolute beginners. Furthermore, their approachability is not confined to the language design itself, but transcends into the community and ecosystem that surrounds them.
I’m not here to discuss how languages like Ruby and Python managed to become more popular than major Lisp dialects nowadays. I’d rather focus on how these gentler introductions to functional programming are acting as gateway drugs to Lisp for many developers.
A community that values metaprogramming and is obsessed with the construction of DSLs (Domain Specific Languages), as Ruby’s is, will no doubt find in Lisp a valuable ally. Plus, if you know Ruby inside and out, you should find Lisp to be easy enough to learn.
To attract Ruby developers though, Lisp has to offer something more than just a set of powerful features. You could say that Rails is enough of a reason to learn and use Ruby. But what is Lisp able to solve better than Ruby? I’ll answer that question by focusing on a specific dialect of Lisp that I, along with a growing number of Ruby developers, am getting into: Clojure.
It wouldn’t be fair to characterize the Lisp community as stagnant, but Clojure is definitely a welcome dose of new blood. Clojure is a JVM-based modern Lisp designed for concurrency, which elegantly includes a set of carefully chosen features that are not easily found in mainstream languages.
In my opinion, Clojure has three main advantages over Ruby:
Clojure’s interoperability with Java resolves the issue of only having a few available libraries, which often affects new languages. It also helps in getting people to use the language within the enterprise world where Java still dominates.
Of all the “new” languages out there, I find Clojure to be the most fun, interesting and pragmatic: it’s something worth getting excited about. I don’t really care if it turns out to be the next Ruby or not; it’s a language that’s worth knowing and using. (If you haven’t tried it yet, a decent, short introductory book is the recently published Practical Clojure.)
Clojure’s popularity may even bring more attention to Lisp in general (for example, most must-read literature uses Scheme or Common Lisp). Perhaps then, it may indirectly help introduce more traditional Lisp dialects to a new generation of programmers.
This post contains the results of a Ruby shootout on Windows that I recently conducted. You can find the Mac edition, published last month, here. I was planning to have this one ready much sooner, but a couple of serious events in my personal life prevented that from happening. Be sure to grab my feed or join the newsletter to avoid missing the upcoming Linux shootout.
The setup
For this shootout I included a subset of the Ruby Benchmark Suite. I opted to primarily exclude tests that were executed in fractions of a second in most VMs, focusing instead on more substantial benchmarks (several of which come from the Computer Language Benchmarks Game). The best times out of five runs are reported here for each benchmark.
All tests were run on Windows 7 x64, on an Intel Core 2 Quad Q6600 2.40 GHz, 8 GB DDR2 RAM, with two 500 GB 7200 rpm disks.
The implementations tested were:
JRuby was run with the --fast and --server optimization flags.
Disclaimer
Synthetic benchmarks cannot predict how fast your programs will be when dealing with a particular implementation. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. The values reported here should be assumed to be characteristic of server-side – and long running – processes and should be taken with a grain of salt.
The results
Please find below the execution times for the selected tests. Timeouts indicate that the execution of a single iteration for a given test took more than 60 seconds and had to be interrupted. Bold values indicate the best performance for each test.
Conclusions
Despite a couple of errors and a few timeouts, JRuby was the fastest of the lot, which can be seen as impressive if we consider that this is Windows we are talking about after all.
Ruby 1.9.1 and 1.9.2 were almost as fast as JRuby on these tests. With a few exceptions, the performances of the two 1.9 implementations were, expectedly, very similar.
JRuby, 1.9.1 and 1.9.2 were all faster than the current MRI implementation, which can be seen as a prerequisite as we move, as a community, away from Ruby 1.8. Finally, it’s worth noting that IronRuby’s performance was in line with that of Ruby 1.8.7.
Update (July 3, 2010): The following box plot compares the various implementations for the tests for which all the implementations were successful. Only times for the largest successful input number were used in those tests where multiple input numbers were tested.

From Padrino’s site:
Padrino is a ruby framework built upon the excellent Sinatra Microframework. Sinatra is a DSL for creating simple web applications in Ruby with speed and minimal effort. This framework makes it as fun and easy as possible to code increasingly advanced web applications by expanding upon Sinatra while maintaining the spirit that made it great.
The Ruby community has plenty of web frameworks at this point. Padrino — self-described as “The Elegant Ruby Web Framework” — is interesting because it’s built on top of Sinatra, it’s highly modular, quite fast, and provides a drop-in admin interface. It fits between Sinatra and a large framework like Rails.
If it weren’t for the fact that Rails 3 is about to be released, Padrino might have had a fighting chance at acquiring a good market share within the Ruby community. Rails 3 is here though, and it too is very modular and fast. Plus, it’s hard to beat the huge ecosystem that’s already built around it.
That said, the presence of an admin interface, a la Django, and the Sinatra core are definitely inviting features. Check out their documentation and screencast, to see if you think it’s worth considering for your own web development needs.
Recently MacRuby 0.6 was released. The development team put a lot of emphasis on improving compatibility with Ruby 1.9, and the viability of MacRuby as a tool for developing Mac OS X applications. Focus on these aspects took precedence over performance, but I was still curious to see how well it performed when compared to Ruby 1.8.7 and Ruby 1.9, respectively.
This article showcases the results of a small Ruby shootout for Mac. I plan to publish a Windows one by next week, and then a week or two after that, a complete Linux shootout that will have many more implementations. Grab my feed or join the newsletter to avoid missing upcoming shootout posts.
The setup
The tests are a large subset of the Ruby Benchmark Suite. Each test was run 10 times, five to detect the best execution time, and five to detect the minimal memory consumption. All of the tests were run on Mac OS X 10.6.3, on my MacBook Pro 2.66 GHz Intel Core 2 Duo, 4 GB 1067 MHz DDR3 RAM, 320 GB 7200 rpm disk.
Stable implementations tested:
Disclaimer
Synthetic benchmarks cannot predict how fast your programs will be with one implementation or another. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. Furthermore, the Ruby Benchmark Suite has many tests that don’t provide much insight when it comes to comparing implementations. They are there for legacy reasons and will probably be removed in the future. For the time being, take them with a grain of salt.
The results
Without further hesitation, here are the execution times for the tests (divided in A-L, M-Z). Timeouts indicate that the execution of a single run took more than 60 seconds and had to be interrupted. Bold values indicate the best performance for each test.


And here is the estimated memory usage:


Conclusions
MacRuby 0.6 is faster than Ruby 1.9.1 at times, but it can also be significantly slower. Overall, as things stand now, its performance appears to be between that of Ruby 1.9.1 and Ruby 1.8.7, with several outliers and a greater variance compared to those two implementations. Memory wise, MacRuby appears to be significantly more “memory hungry” than the other implementations (even though this wasn’t all that much of a surprise to me).
I’m interested in seeing how future releases that will be focused more on performance will affect these preliminary results. For the time being however, don’t let this outcome discourage you from using MacRuby 0.6, which is the first release that’s considered stable for Mac OS X development.
Download: CSV Files
PS: If you are looking for a fun and easy way to get started with MacRuby, check out ThinkCode.TV’s screencast on the subject.
Update (July 3, 2010): The following box plot compares the various implementations for the tests for which all the implementations were successful. Only times for the largest successful input number were used in those tests where multiple input numbers were tested.

The following is a very short guide on setting up Ruby Enterprise Edition (REE), nginx and Passenger, for serving Ruby on Rails applications on Ubuntu. It also includes a few quick and easy optimization tips.
We start with setting up REE (x64), using the .deb file provided by Phusion:
wget http://rubyforge.org/frs/download.php/66163/ruby-enterprise_1.8.7-2009.10_amd64.deb
sudo dpkg -i ruby-enterprise_1.8.7-2009.10_amd64.deb
ruby -v
In the output you should see “ruby 1.8.7 (2009-06-12 patchlevel 174)…” or similar. If this is the case, good; while you are there, update RubyGems and the installed gems:
sudo gem update --system
sudo gem update
Next, you’ll need to install nginx, which is a really fast web server. The Phusion team has made it very easy to install, but if you simply follow most instructions found elsewhere, you’ll get the following error:
checking for system md library ... not found
checking for system md5 library ... not found
checking for OpenSSL md5 crypto library ... not found
./configure: error: the HTTP cache module requires md5 functions
from OpenSSL library. You can either disable the module by using
--without-http-cache option, or install the OpenSSL library in the system,
or build the OpenSSL library statically from the source with nginx by using
--with-http_ssl_module --with-openssl=<path> options.
Instead, we are going to install libssl-dev first and then nginx and its Passenger module:
sudo aptitude install libssl-dev
sudo passenger-install-nginx-module
Follow the prompt and accept all the defaults (when prompted to choose between 1 and 2, pick 1).
Before I proceed with the configuration, I like to create an init script and have it boot at startup (the script itself is adapted from one provided by the excellent articles at slicehost.com):
sudo vim /etc/init.d/nginx
The content of which needs to be:
#! /bin/sh

### BEGIN INIT INFO
# Provides:          nginx
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: starts the nginx web server
# Description:       starts nginx using start-stop-daemon
### END INIT INFO

PATH=/opt/nginx/sbin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/opt/nginx/sbin/nginx
NAME=nginx
DESC=nginx

test -x $DAEMON || exit 0

# Include nginx defaults if available
if [ -f /etc/default/nginx ] ; then
    . /etc/default/nginx
fi

set -e

. /lib/lsb/init-functions

case "$1" in
  start)
    echo -n "Starting $DESC: "
    start-stop-daemon --start --quiet --pidfile /opt/nginx/logs/$NAME.pid \
        --exec $DAEMON -- $DAEMON_OPTS || true
    echo "$NAME."
    ;;
  stop)
    echo -n "Stopping $DESC: "
    start-stop-daemon --stop --quiet --pidfile /opt/nginx/logs/$NAME.pid \
        --exec $DAEMON || true
    echo "$NAME."
    ;;
  restart|force-reload)
    echo -n "Restarting $DESC: "
    start-stop-daemon --stop --quiet --pidfile \
        /opt/nginx/logs/$NAME.pid --exec $DAEMON || true
    sleep 1
    start-stop-daemon --start --quiet --pidfile \
        /opt/nginx/logs/$NAME.pid --exec $DAEMON -- $DAEMON_OPTS || true
    echo "$NAME."
    ;;
  reload)
    echo -n "Reloading $DESC configuration: "
    start-stop-daemon --stop --signal HUP --quiet --pidfile /opt/nginx/logs/$NAME.pid \
        --exec $DAEMON || true
    echo "$NAME."
    ;;
  status)
    status_of_proc -p /opt/nginx/logs/$NAME.pid "$DAEMON" nginx && exit 0 || exit $?
    ;;
  *)
    N=/etc/init.d/$NAME
    echo "Usage: $N {start|stop|restart|reload|force-reload|status}" >&2
    exit 1
    ;;
esac

exit 0
Change its permissions and have it start up at boot:
sudo chmod +x /etc/init.d/nginx
sudo /usr/sbin/update-rc.d -f nginx defaults
From now on, you’ll be able to start, stop and restart nginx with it. Start the server as follows:
sudo /etc/init.d/nginx start
Heading over to your server IP with your browser, you should see “Welcome to nginx!”. If you do, great, we can move on with the configuration of nginx for your Rails app.
Edit nginx’s configuration file:
sudo vim /opt/nginx/conf/nginx.conf
Add a server section within the http section, as follows:
server {
    listen 80;
    server_name example.com;
    root /somewhere/my_rails_app/public;
    passenger_enabled on;
    rails_spawn_method smart;
}
The server name can also be a subdomain if you wish (e.g., blog.example.com). It’s important that you point the root to your Rails app’s public directory.
The smart spawn method, selected here through the rails_spawn_method directive, is very efficient: it allows Passenger to consume less memory per process and speeds up the spawning process, as long as your Rails application is not affected by its limitations (for a discussion about this you can read the relevant section in the official guide).
If you have lots of RAM (e.g., more than 512 MB) on your server, you may want to consider increasing your maximum pool size from its default of 6 with the passenger_max_pool_size directive. Conversely, if you want to limit the number of processes running at any time and consume less memory on a small VPS (e.g., 128 to 256 MB), you can decrease that number to 2 (or something in that range). Always test a few configurations to find one that works for you. You can read more about this directive in the official guide.
While you are modifying nginx’s configuration, you may also want to increase the worker processes (e.g., to 4, on a typical VPS) and add a few more tweaks (such as enabling gzip compression):
# ...
http {
    passenger_root /usr/local/lib/ruby/gems/1.8/gems/passenger-2.2.5;
    passenger_ruby /usr/local/bin/ruby;
    include mime.types;
    default_type application/octet-stream;
    access_log logs/access.log;
    sendfile on;
    keepalive_timeout 65;
    tcp_nodelay on;
    gzip on;
    gzip_comp_level 2;
    gzip_proxied any;
    server {
        #...
When you are happy with the changes, save the file, and restart nginx:
sudo /etc/init.d/nginx restart
If you wish to restart Passenger in the future, without having to restart the whole web server, you can simply run the following command:
touch /somewhere/my_rails_app/tmp/restart.txt
Passenger also provides a few handy monitoring tools. Check them out:
sudo passenger-status
sudo passenger-memory-stats
That’s it, you are ready to go! I hope that you find these few notes useful.
There is major news in Rubyland today. MacRuby’s team just released their first beta of version 0.5 (an experimental, still incomplete version of Ruby), which brings to the table JIT compilation, removal of the dreaded GIL (Global Interpreter Lock), native threads, GCD (Grand Central Dispatch) for multicore computing, and a whole new set of features listed in the release announcement.
The most important new feature is the presence of a compiler. That’s right, thanks to this release, Ruby code can now become highly optimized executable code. How awesome is that? I can sense that you’re pumped by this news, so why not head over to MacRuby.com and download the installation file for yourself? After you’ve done that, the next thing you’re going to want to do is run a small test like the following:
$ macrubyc world_domination.rb -o world_domination
Can't locate program `llc'
Oh noes! llc is a tool that ships with the LLVM (upon which MacRuby is built), however it’s not included with MacRuby’s installer (it will be in the future). But fear not my friends, there is a solution:
$ svn co -r 82747 https://llvm.org/svn/llvm-project/llvm/trunk llvm-trunk
$ cd llvm-trunk
$ ./configure
$ UNIVERSAL=1 UNIVERSAL_ARCH="i386 x86_64" ENABLE_OPTIMIZED=1 make -j2
$ sudo env UNIVERSAL=1 UNIVERSAL_ARCH="i386 x86_64" ENABLE_OPTIMIZED=1 make install
If your machine does not have 2 cores, remove the -j2 option from the fourth line or adjust the number accordingly.
The compilation phase may take a couple of centuries, depending on your machine’s speed, but it should eventually build the LLVM.
llc will be placed in your PATH, and you’ll finally be able to compile Ruby code and obtain an executable to help you carry out your world domination plans.
$ macrubyc world_domination.rb -o world_domination
$ ./world_domination
MUAHAHAHAHA!
An easy way to improve the performance and security of SQL queries is to replace literals with parameters. By replacing literal values with parameters, advanced relational databases will be able to compile your queries and have their execution plans cached. This saves time and precious resources when the same query (minus the actual values) is executed over and over.
Consider the following series of queries:
SELECT * FROM users WHERE karma BETWEEN 100 AND 499;
SELECT * FROM users WHERE karma BETWEEN 500 AND 999;
SELECT * FROM users WHERE karma BETWEEN 1000 AND 1999;
SELECT * FROM users WHERE karma BETWEEN 2000 AND 4999;
SELECT * FROM users WHERE karma BETWEEN 5000 AND 9999;
SELECT * FROM users WHERE karma BETWEEN 10000 AND 50000;
These each represent the same query and can be transformed into a single parameterized query:
SELECT * FROM users WHERE karma BETWEEN ? AND ?;
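To make the transformation concrete, here is a small self-contained sketch that runs the single parameterized statement once per karma range. It uses Python's built-in sqlite3 module and an in-memory database purely as a stand-in for a real server; the table contents and the users_by_range helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, karma INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ann", 150), ("bob", 750), ("cat", 1200), ("dan", 7000)],
)

QUERY = "SELECT name, karma FROM users WHERE karma BETWEEN ? AND ?"
RANGES = [(100, 499), (500, 999), (1000, 1999),
          (2000, 4999), (5000, 9999), (10000, 50000)]


def users_by_range(connection):
    """Run the same statement text once per karma range."""
    return {
        bounds: connection.execute(QUERY, bounds).fetchall()
        for bounds in RANGES
    }


for bounds, rows in users_by_range(conn).items():
    print(bounds, rows)
```

Because the SQL text never changes between calls, the database (or the driver) only needs to parse and plan it once; only the bound values differ.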
Trying to use clever tricks with quotes in order to inject arbitrary SQL code becomes futile. Parameters are considered values, and have no effect on the structure of the query itself.
Parameterized queries are therefore efficient and go a long way towards preventing SQL injection attacks in your applications. They have virtually no downside.
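The security argument can be demonstrated with a short experiment. The sketch below again assumes Python's built-in sqlite3 module and a hypothetical customers table rather than a production database; it compares a query built by string interpolation with its parameterized equivalent when given a classic injection string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (last_name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?)",
                 [("Smith",), ("Jones",)])

# A classic injection attempt supplied as user input.
malicious = "x' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and changes its structure.
unsafe_sql = "SELECT * FROM customers WHERE last_name = '%s'" % malicious
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is only compared to last_name.
safe_sql = "SELECT * FROM customers WHERE last_name = ?"
safe_rows = conn.execute(safe_sql, (malicious,)).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The interpolated query returns every row in the table, while the parameterized one returns none, since no customer has that literal string as a last name.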
Newbie developers often ignore the existence of this feature and end up irritating seasoned DBAs who have to deal with the consequences of their incompetence. Leon Katsnelson argues that this is such an important matter that every DBA should forward this Computerworld article to their developers. I tend to agree on how important this issue is.
That article provides the following example in Java:
String lastName = req.getParameter("lastName");
String query = "select * from customers where last_name = ?";
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1, lastName);
// executeQuery() returns the ResultSet; execute() only returns a boolean
ResultSet results = pstmt.executeQuery();
Here I’ll show you an example of how to work with parameterized queries from Ruby and Python. I’ll use the Ruby and Python drivers for DB2.
Ruby first:
require 'ibm_db'
conn = IBM_DB.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = IBM_DB.prepare(conn, query)
values = [500, 999]
IBM_DB.execute(pstmt, values)
while row = IBM_DB.fetch_array(pstmt)
  puts "#{row[0]}:#{row[1]}"
end
We load the driver (use mswin32/ibm_db on Windows, and ibm_db.bundle on Mac), create a prepared statement, and then bind the two parameter values to it through the execute method. We then fetch the resultset one row at a time and print the value of the first two fields for each record. For fine-tuned control we could have used the IBM_DB::bind_param method.
The Python version is very similar:
import ibm_db
conn = ibm_db.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = ibm_db.prepare(conn, query)
values = (500, 999)
ibm_db.execute(pstmt, values)
# fetch_tuple returns False once the result set is exhausted
row = ibm_db.fetch_tuple(pstmt)
while row:
    # %s formatting also works when the fields are not strings
    print("%s:%s" % (row[0], row[1]))
    row = ibm_db.fetch_tuple(pstmt)
As you can see, working with parameterized queries is not any harder than dynamically generating SQL queries. Yet the benefits of doing so are huge.
Unfortunately, despite being a very sound choice to base an Object-Relational Mapper (ORM) on, ActiveRecord does not use parameterized queries. Even when it looks like you are passing parameters to a given method, these are actually used to dynamically form an SQL query. Of course you are still free to use parameterized queries in your Rails applications by employing the driver directly. But I really think this is something ActiveRecord should be built upon.
Luckily for Django developers, Django’s ORM uses parameterized queries, thus improving both performance and security with a single design choice. In the Python world you couldn’t get away with ignoring parameterized queries.
For those of you using Rails, all is not lost. DB2 Express-C 9.7 has a killer feature known as the Statement Concentrator, which caches similar queries allowing them to use a shared access plan. It’s not as efficient as using prepared statements in your code, but it’s the best you can do when, as in the case of ActiveRecord, you can’t use parameterized queries directly. Leon’s article explains in greater detail how this feature actually works.
Now that Mac OS X 10.6 is out, it’s time to leave the world of 32 bit computing behind. The pre-installed Ruby interpreter will run in 64 bit mode by default, so you may need to pay attention when installing some C-based gems. The ibm_db Ruby gem for DB2 can easily be installed or updated to the latest available version by following these simple steps:
$ sudo -s
$ export IBM_DB_LIB=/Users/<username>/sqllib/lib64
$ export IBM_DB_INCLUDE=/Users/<username>/sqllib/include
$ export ARCHFLAGS="-arch x86_64"
$ gem install ibm_db
You can verify that the installation was successful by running the following:
$ irb
>> require 'ibm_db.bundle'
=> true
Please let me know if you encounter any issues; I’d be glad to help you.
I finally got around to updating the Ruby and Rails book pages. The existing list was getting a bit obsolete and I didn’t like the idea of recommending old books to newcomers. I also had some interesting new entries.
Without further ado:
A few people may disagree with the choices, but I think most experienced Ruby and Rails programmers, who’ve read those books, will concur with my recommendations. I’m quite confident that these are, all things considered, some of the best books available on the subject.
A word to the publishers
As tempting as it is to collect Ruby and Rails books, these days I don’t feel I can economically justify the act of purchasing every Ruby or Rails book put out there. So if you are a publisher or an author, and you’d like for me to consider your book, you are certainly welcome to send me a review copy. I will definitely read it, but only include it on these lists if it’s either outstanding or as good as the existing ones. If it’s a programming book that’s not related to Ruby/Rails, yet is really good, I would consider reviewing it on my blog.
In a previous article I compared the performance of Ruby on Windows, built through Microsoft Visual C++ and GCC. The numbers for the MinGW version were very impressive. So the question now becomes, how does its performance compare to that of Ruby on Linux? To quote one person (Alex) who commented on the aforementioned post:
With the new mingw32 substantial speed improvements, think it makes sense now to also test at least the baseline (MRI) on Mac/Linux on the same battery of tests, so we Windows folks could get a better idea of how far behind are we yet and what the different Windows interpreters speed target shall be.
Any sort of performance improvement for something that is notoriously slow on Windows is more than welcome, but would this be enough to fill the gap between Ruby’s performance on Windows and on Linux? How much faster is Ruby on Linux? Let’s find out.
Setup
Benchmark results
The following table/image compares the performance of Ruby 1.8.6 on Windows and Linux. A light green background indicates which of the two was faster. The total times exclude tests that raised an error or were not available (N/A) for any of the four implementations, but include timeouts (counted as 300 seconds to provide a lower bound). The ratio column indicates how many times faster Ruby on Linux was:

The second table/image below compares Ruby 1.9.1 on Windows and on Linux, using the same criteria as above.

Note: The totals shown are different from the ones seen in other posts since the subset of benchmarks included in the totals is different.
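The rules used for the totals (skip a test if any implementation errored or lacked it, count a timeout as 300 seconds) can be sketched as follows. The totals helper, the data layout and the figures are all hypothetical, and only two platforms are shown for brevity.

```python
TIMEOUT_SECONDS = 300.0  # timeouts count as this many seconds (a lower bound)


def totals(results):
    """Sum per-platform times over the tests usable for every platform.

    `results` maps a test name to {platform: value}, where value is a time
    in seconds, the string "timeout", or None for an error or N/A entry.
    """
    sums = {}
    for timings in results.values():
        if any(value is None for value in timings.values()):
            continue  # an error or N/A anywhere excludes the whole test
        for platform, value in timings.items():
            seconds = TIMEOUT_SECONDS if value == "timeout" else value
            sums[platform] = sums.get(platform, 0.0) + seconds
    return sums


# Hypothetical figures, not taken from the tables.
results = {
    "bm_a": {"windows": 10.0, "linux": 5.0},
    "bm_b": {"windows": "timeout", "linux": 120.0},
    "bm_c": {"windows": None, "linux": 3.0},  # excluded from both totals
}
print(totals(results))
```

Since the set of excluded tests depends on which implementations are being compared, totals computed this way are not comparable across posts.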
Conclusion
According to the geometric mean of the ratios for these tests, it appears that on average Ruby 1.8.6 on Linux is about twice as fast as Ruby 1.8.6 on Windows. By comparison, Ruby 1.9.1 on Linux is about 70% faster than the Windows version.
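As a reminder of how such a figure is obtained, the sketch below computes the geometric mean of a few per-test ratios. The timings are made up for illustration; only the method mirrors what is described above.

```python
import statistics

# Hypothetical per-test times in seconds (Windows, Linux); not real data.
timings = {
    "bm_a": (8.0, 2.0),   # Linux 4x faster
    "bm_b": (3.0, 3.0),   # same speed
    "bm_c": (10.0, 5.0),  # Linux 2x faster
}

# How many times faster Linux was on each test.
ratios = [windows / linux for windows, linux in timings.values()]
geo_mean = statistics.geometric_mean(ratios)
arith_mean = statistics.mean(ratios)

print("geometric mean: %.2f" % geo_mean)
print("arithmetic mean: %.2f" % arith_mean)
```

The geometric mean is generally preferred for averaging ratios because a single test with a very large ratio inflates it far less than it inflates the arithmetic mean.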
The Windows implementations use GCC 3.4.5 (a four-year-old compiler) at the moment, while I built the implementations on Ubuntu with GCC 4.3.3 (which is available by default). This helps, at least in part, to explain the performance gap. Luis Lavena, leader of the Windows port, confirmed to me that a switch to GCC 4.4.x is planned for the future. This should significantly increase performance on Windows yet again, and bump Ruby’s speed on Windows a bit closer to the speed that’s obtainable on Linux.
For the time being, switching to Ruby 1.9.1 on Windows will give you a performance that is better than what’s obtained by those who are still using Ruby 1.8.x on Linux. If it’s possible, switch.
Kenneth McDonald posted the following question about Scala’s future in the Scala mailing list:
I thought it would be interesting to find out people’s predictions for how much of the Java market Scala will eventually penetrate. It’s nice to see Scala doing reasonably well so far, so now’s your chance to make a prediction on the future of Scala:
a) Scala will remain a niche language, competing with Groovy, JRuby, etc.
b) Scala will become the dominant “second” language for the JVM.
c) Scala will actually become big enough to compete with Java in many respects.
If none of these fit your outlook, feel free to make up your own answer.
In my opinion, the answer will be A (niche) or B (second language), depending on the community’s ability to:
The software development community is embracing dynamically typed languages like Python and Ruby, as well as functional programming. Languages like Python and Ruby simplify the development process for a beginner. They are easy to read, learn and use. Their adoption is therefore growing quickly. They also offer practical solutions to web development needs through frameworks like Rails and Django.
The paradigm shift to functional programming is much slower, however. Functional programming done à la Scala is undoubtedly advantageous for the competent software engineer, but in my opinion moving from Java to Scala requires significant commitment and a deeper understanding of how to program. Are Java developers up to this? Some of them are, but I doubt that a majority of them would be. It should be noted that Scala also has the potential to attract developers who don’t program in Java, but rather come from a different background, such as the aforementioned Python and Ruby. These people may look into Scala as a more functional, robust and faster alternative.
This, coupled with the current, limited popularity of Scala, leads me to believe that option C (let’s say ~50% market share) is not realistic for the foreseeable future. We are entering the long tail period of programming, where many lesser known languages will start gaining some traction. Achieving a “second language” status is still an ambitious and worthy goal. There are many formidable opponents aiming for that, including JRuby, Jython, Groovy, and Clojure, and the rewards for doing so would be substantial for everyone involved.
The real question is not how far Scala is capable of going in terms of adoption, but rather what can be done to ensure that it will achieve widespread acceptance, however that may be defined. Paying close attention to the success stories of Ruby and Python may give the community some insight as to what should be done. To a lesser extent, fellow functional programming languages like Erlang and Haskell are doing a fine job marketing-wise and are gaining traction. As of 3:30 pm today, there are 147 members in the IRC channel for Scala, and 576 for Haskell. This is not to say that one is better than the other, or even that Haskell is more popular than Scala in general, but rather that there may be something to learn from a “similar” language that poses comparable challenges for those who intend to approach it.
Being based on the JVM, well integrated with Java, and to a certain extent Java-like are all major advantages that may help Scala gain major popularity. Being ingrained in the Java world will also affect Scala’s image and the ways it can be promoted. Whereas Ruby has a somewhat anti-corporate spirit and image, and is tough to sell to the Enterprise world, Scala doesn’t face these challenges. As such it may appeal to a much larger category of developers and companies. However, a lot of work will be required to reach that tipping point. Scala is already an excellent language, as far as I can see, but a combination of technical effort and marketing will ultimately decide Scala’s popularity.
In yesterday’s post I compared IronRuby 0.9, Ruby 1.8.6 (from the One-Click Installer) and Ruby 1.9.1 (downloaded from the official site) against one another. IronRuby did great, but the discussion in the comment section quickly veered towards what version of the One-Click Ruby Installer should have been used.
I justified my choice of the “old” One-Click Installer by the fact that I wasn’t aware of any official releases of the new installer, and that the old One-Click Installer is the most widely downloaded version. Very few people are familiar with the upcoming version of the project. This is about to change.
Luis Lavena took over the One-Click Installer project and has been working on the next version (RubyInstaller from now on), the aim of which is to replace the One-Click Installer by building Ruby 1.8 and 1.9 with MinGW and GCC. In theory, this brings performance gains on Windows to the table, and gets rid of having to use Visual C++ 6 (a “ten year old compiler”) to build Ruby and other native gems. The project also strives to be lighter by bundling fewer (unnecessary) gems for Windows users.
There’s no doubt that in the long run, this new project will become the de facto standard for Windows, but the questions on everyone’s mind are: should I bother with it now? How much of a performance boost are we talking about here? 10%? 20%? Let’s find out.
In this follow-up article I’m going to compare the performance of Ruby 1.8.6 from the One-Click Installer (mswin32), Ruby 1.8.6 from RubyInstaller (mingw32), Ruby 1.9 (mswin32) downloaded from the ruby-lang.org site, and Ruby 1.9 from the RubyInstaller project (mingw32) against one another. I’ll copy and paste part of the setup and disclaimer from yesterday’s post, for those who haven’t read it. Feel free to skip this part if you wish.
Setup
Disclaimer
Benchmark results
The table/image below shows the times for each benchmark, for Ruby 1.8.6 (mswin32), Ruby 1.8.6 (mingw32), Ruby 1.9.1 (mswin32), and Ruby 1.9.1 (mingw32). In the table I used (RI) as shorthand for RubyInstaller to indicate mingw32 versions.

Red values are errors, timeouts and inapplicable tests. Green, bold values indicate better times than what Ruby 1.8.6 (mswin32) delivered. A pale yellow background indicates the best time for a given benchmark. Total time is the run-time for the subset of benchmarks that were successfully executed by all four implementations. Timeouts have been included this time around. Each timeout has been counted as an additional 300 seconds.
The total runtime (including timeouts) is summarized in the chart below:

Conclusion
Wow! Ruby 1.8.6 (mingw32) is anywhere from 3% to 664% faster (depending on the test) than the current One-Click Installer. The geometric mean of the ratios (read “on average”) tells us that it was about 283% faster. The Ruby 1.9.1 version provided by the RubyInstaller was slower than the mswin32 version in a couple of tests, but faster everywhere else. How much faster? Up to 342% faster, with an average (again calculated through the geometric mean of the ratios) of a 77% increase in speed.
These findings prompt me to ask: what are we waiting for? You know how unresponsive Ruby is on Windows, and how tests take forever to execute? These mingw32-based releases may very well solve this. And incidentally the bar has been raised for IronRuby as well. I formally invite the Ruby on Windows community to embrace these two projects.
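For readers unfamiliar with the "geometric mean of the ratios", here is a minimal sketch in Ruby. The timings are made-up placeholders rather than measurements from this shootout, the names `timings` and `geometric_mean` are mine, and I am assuming that "N% faster" corresponds to a time ratio of 1 + N/100.

```ruby
# Illustrative only: placeholder timings, NOT results from the benchmark table.
# Each entry is [seconds on the baseline build, seconds on the new build].
timings = {
  "bench_a" => [10.0, 5.0], # baseline took twice as long
  "bench_b" => [12.0, 3.0], # baseline took four times as long
  "bench_c" => [8.0, 8.0]   # no difference
}

# Geometric mean: the n-th root of the product of the values, computed
# through logarithms so that long lists of ratios stay numerically stable.
def geometric_mean(values)
  Math.exp(values.sum { |v| Math.log(v) } / values.size)
end

# One speed-up ratio per benchmark: baseline time divided by new time.
ratios = timings.values.map { |baseline, candidate| baseline / candidate }
mean_ratio = geometric_mean(ratios)

# Assumption: a ratio of 2.0 is read as "100% faster".
percent_faster = (mean_ratio - 1.0) * 100.0
puts format("geometric mean ratio: %.2f (%.0f%% faster)", mean_ratio, percent_faster)
```

Unlike an arithmetic mean of the times, this treats every benchmark's ratio equally, so a single long-running test cannot dominate the summary.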
In my latest article I discussed the importance of JRuby as a means of introducing Ruby to the Enterprise world. Most of the companies that belong to this ecosystem are Java based, but we cannot forget that a sizable portion of them are Microsoft-centric. Within these companies, Ruby will be far more welcome if a .NET implementation is available. The answer to this need is IronRuby (version 0.9 was just released).
IronRuby has been progressing fast lately. First came support for Rails and then, with this release, a great deal of effort has been placed on improving performance. In the past, IronRuby was anything but fine-tuned. In fact, it was several times slower than Ruby MRI, as the team worked on improving compatibility with Ruby 1.8 and mostly ignored performance.
In this article I’m going to provide some performance results for IronRuby 0.9 on Windows, which I’m sure will interest readers of this blog as well as of my book. Before revealing all the details, let’s start with the setup and a disclaimer. Please read through it carefully, because the old, trite comments about how “micro-benchmarks are useless” won’t be published. We’ve already been there, folks. Thank you all for your understanding.
Setup
Disclaimer
Benchmark results
The table below shows the times for each benchmark, for IronRuby 0.9, Ruby 1.8.6 (2008-08-11 patchlevel 287) and Ruby 1.9.1p0 (2009-01-30 revision 21907):
| Benchmark File | # | Ruby 1.8.6 | IronRuby | Ruby 1.9.1 |
| macro-benchmarks/bm_gzip.rb | 100 | Timeout | IOError | N/A |
| macro-benchmarks/bm_hilbert_matrix.rb | 20 | 1.891 | 0.453 | 0.125 |
| macro-benchmarks/bm_hilbert_matrix.rb | 30 | 7.422 | 1.719 | 0.656 |
| macro-benchmarks/bm_hilbert_matrix.rb | 40 | 21.500 | 4.625 | 2.266 |
| macro-benchmarks/bm_hilbert_matrix.rb | 50 | 56.765 | 10.031 | 5.109 |
| macro-benchmarks/bm_hilbert_matrix.rb | 60 | 111.859 | 18.781 | 11.297 |
| macro-benchmarks/bm_norvig_spelling.rb | 50 | Timeout | 41.313 | 31.453 |
| macro-benchmarks/bm_sudoku.rb | 1 | 43.734 | Timeout | 6.313 |
| micro-benchmarks/bm_app_factorial.rb | 5000 | 1.328 | 0.063 | 0.266 |
| micro-benchmarks/bm_app_fib.rb | 30 | 6.156 | 0.594 | 0.813 |
| micro-benchmarks/bm_app_fib.rb | 35 | 74.125 | 6.922 | 9.344 |
| micro-benchmarks/bm_app_mandelbrot.rb | 1 | 11.953 | 6.922 | 0.641 |
| micro-benchmarks/bm_app_pentomino.rb | 1 | Timeout | 59.938 | 75.859 |
| micro-benchmarks/bm_app_strconcat.rb | 1.5M | 30.469 | 2.141 | 4.813 |
| micro-benchmarks/bm_app_tak.rb | 7 | 5.516 | 0.531 | 0.578 |
| micro-benchmarks/bm_app_tak.rb | 8 | 15.609 | 1.484 | 1.703 |
| micro-benchmarks/bm_app_tak.rb | 9 | 45.843 | 3.953 | 4.531 |
| micro-benchmarks/bm_app_tarai.rb | 3 | 19.985 | 1.844 | 2.156 |
| micro-benchmarks/bm_app_tarai.rb | 4 | 19.796 | 2.219 | 2.656 |
| micro-benchmarks/bm_app_tarai.rb | 5 | 24.235 | 2.688 | 3.063 |
| micro-benchmarks/bm_binary_trees.rb | 1 | Timeout | 53.078 | 37.375 |
| micro-benchmarks/bm_count_multithreaded.rb | 16 | 0.297 | 0.266 | 0.328 |
| micro-benchmarks/bm_count_shared_thread.rb | 16 | 0.250 | 0.188 | 0.203 |
| micro-benchmarks/bm_fannkuch.rb | 8 | 3.625 | 0.344 | 0.563 |
| micro-benchmarks/bm_fannkuch.rb | 10 | Timeout | 40.750 | 65.438 |
| micro-benchmarks/bm_fasta.rb | 1M | 192.937 | 23.703 | 35.234 |
| micro-benchmarks/bm_fractal.rb | 5 | 43.172 | 4.672 | 5.781 |
| micro-benchmarks/bm_gc_array.rb | 1 | 228.672 | 32.031 | 59.828 |
| micro-benchmarks/bm_gc_mb.rb | 500K | 8.109 | 1.109 | 0.469 |
| micro-benchmarks/bm_gc_mb.rb | 1M | 16.172 | 2.391 | 1.016 |
| micro-benchmarks/bm_gc_mb.rb | 3M | 44.953 | 6.906 | 2.938 |
| micro-benchmarks/bm_gc_string.rb | 1 | 47.937 | 25.250 | 11.938 |
| micro-benchmarks/bm_knucleotide.rb | 1 | 9.625 | 2.906 | 2.016 |
| micro-benchmarks/bm_lucas_lehmer.rb | 9689 | 122.672 | 18.125 | 36.250 |
| micro-benchmarks/bm_lucas_lehmer.rb | 9941 | 156.750 | 19.625 | 39.391 |
| micro-benchmarks/bm_lucas_lehmer.rb | 11213 | 179.915 | 28.844 | 61.063 |
| micro-benchmarks/bm_lucas_lehmer.rb | 19937 | Timeout | 159.078 | Timeout |
| micro-benchmarks/bm_mandelbrot.rb | 1 | Timeout | 65.781 | 81.766 |
| micro-benchmarks/bm_mbari_bogus1.rb | 1 | 0.031 | 40.406 | 8.781 |
| micro-benchmarks/bm_mbari_bogus2.rb | 1 | 0.156 | Timeout | N/A |
| micro-benchmarks/bm_mergesort_hongli.rb | 3000 | 25.282 | 3.531 | 6.031 |
| micro-benchmarks/bm_mergesort.rb | 1 | 24.735 | 3.906 | 3.219 |
| micro-benchmarks/bm_meteor_contest.rb | 1 | 147.704 | 19.713 | 19.781 |
| micro-benchmarks/bm_monte_carlo_pi.rb | 10M | 79.406 | 5.109 | 20.672 |
| micro-benchmarks/bm_nbody.rb | 100K | 37.625 | 8.281 | 10.938 |
| micro-benchmarks/bm_nsieve_bits.rb | 8 | 69.656 | 33.156 | 6.531 |
| micro-benchmarks/bm_nsieve.rb | 9 | 58.344 | 5.453 | N/A |
| micro-benchmarks/bm_partial_sums.rb | 2.5M | 93.391 | 10.797 | 26.422 |
| micro-benchmarks/bm_pathname.rb | 100 | Timeout | Timeout | Timeout |
| micro-benchmarks/bm_primes.rb | 3000 | 21.359 | 9.594 | 0.031 |
| micro-benchmarks/bm_primes.rb | 30K | Timeout | Timeout | 0.469 |
| micro-benchmarks/bm_primes.rb | 300K | Timeout | Timeout | 5.281 |
| micro-benchmarks/bm_primes.rb | 3M | Timeout | Timeout | 100.406 |
| micro-benchmarks/bm_quicksort.rb | 1 | 51.046 | 11.594 | 8.703 |
| micro-benchmarks/bm_regex_dna.rb | 20 | 181.172 | 21.188 | 11.938 |
| micro-benchmarks/bm_reverse_compliment.rb | 1 | 61.875 | 48.469 | 138.047 |
| micro-benchmarks/bm_so_ackermann.rb | 7 | 2.234 | 0.563 | 0.484 |
| micro-benchmarks/bm_so_ackermann.rb | 9 | 50.000 | 14.938 | 9.281 |
| micro-benchmarks/bm_so_array.rb | 9000 | 26.328 | 8.984 | 10.781 |
| micro-benchmarks/bm_so_count_words.rb | 100 | Timeout | 60.688 | 42.250 |
| micro-benchmarks/bm_so_exception.rb | 500K | 78.125 | Timeout | 32.672 |
| micro-benchmarks/bm_so_lists_small.rb | 1000 | 13.906 | 4.250 | 3.172 |
| micro-benchmarks/bm_so_lists.rb | 1000 | 64.531 | 22.266 | 16.797 |
| micro-benchmarks/bm_so_matrix.rb | 60 | 8.312 | 2.781 | 2.125 |
| micro-benchmarks/bm_so_object.rb | 500K | 16.375 | 5.313 | 1.672 |
| micro-benchmarks/bm_so_object.rb | 1M | 29.312 | 10.500 | 2.844 |
| micro-benchmarks/bm_so_object.rb | 1.5M | 43.312 | 16.000 | 4.281 |
| micro-benchmarks/bm_so_sieve.rb | 4000 | 241.922 | 37.859 | 35.688 |
| micro-benchmarks/bm_socket_transfer_1mb.rb | 10K | 13.266 | SocketError | 3.359 |
| micro-benchmarks/bm_spectral_norm.rb | 100 | 5.110 | 0.922 | 0.719 |
| micro-benchmarks/bm_sum_file.rb | 100 | Timeout | 20.406 | 23.797 |
| micro-benchmarks/bm_word_anagrams.rb | 1 | 70.828 | 30.188 | 8.125 |
| TOTAL TIME | - | 2933.334 | 607.088 | 664.094 |
Red values are errors, timeouts, inapplicable tests and times that were worse than Ruby 1.8.6. Green, bold values are better times than what Ruby 1.8.6 delivered. A pale yellow background indicates the best time for a given benchmark. Total time is the runtime for the subset of benchmarks that were successfully executed by all three implementations (whose cardinality is 54).
The total runtime is summarized by the chart below:

And let’s compare each of the “macro-benchmarks” on an individual basis:

Conclusions
IronRuby went from being much slower than Ruby MRI to considerably faster across nearly all the tests. That’s major progress for sure, and the team behind the project deserves mad props for it.
One final warning before we get too excited here. IronRuby is not faster than Ruby 1.9.1 at this stage. Don’t let that first chart mislead you. While it’s faster in certain tests, it’s also slower in many others. Currently, it’s situated between Ruby 1.8.6 and Ruby 1.9.1, but much closer to the latter. The reason why this chart is misleading is that it doesn’t take into account any tests that timed out, and several of those timeouts were caused by IronRuby (more than those caused by Ruby 1.9.1). If you were to add, say, 300 seconds to the total for each timeout for the two implementations, you’d quickly see that Ruby 1.9.1 still has the edge. The second chart, which compares macro-benchmarks, does a better job of realistically showing how IronRuby sits between Ruby 1.8.6 and Ruby 1.9.1 from a performance standpoint. If you were to plot every single benchmark on a chart, you’d find similar outcomes for a large percentage of the tests.
Whether it’s faster than Ruby 1.9 or not, now that good performance is starting to show up, it’s easier to see IronRuby delivering on its goal of becoming the main implementation choice for those who both develop and deploy on Windows. This, paired with the .NET and possible Visual Studio integration, the great tools available to .NET developers, and the ability to execute Ruby code in the browser client-side thanks to projects like Silverlight/Moonlight and Gestalt, makes the project all the more interesting.
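The "add 300 seconds per timeout" adjustment can be sketched in a few lines of Ruby. The totals come from the last row of the table, and the timeout counts (7 for IronRuby, 2 for Ruby 1.9.1) are my own tally of the "Timeout" cells in each column, leaving out errors and N/A entries. The constant and variable names are just for illustration.

```ruby
TIMEOUT_PENALTY = 300.0 # seconds charged for every timed-out benchmark

# Totals from the TOTAL TIME row; timeout counts tallied by hand from the
# IronRuby and Ruby 1.9.1 columns (IOError, SocketError and N/A not counted).
results = {
  "IronRuby 0.9" => { total: 607.088, timeouts: 7 },
  "Ruby 1.9.1"   => { total: 664.094, timeouts: 2 }
}

# Penalized total = measured total + 300 s for each timeout.
penalized = results.transform_values do |r|
  r[:total] + r[:timeouts] * TIMEOUT_PENALTY
end

penalized.each do |name, seconds|
  puts format("%-13s %9.3f s", name, seconds)
end
```

With these counts the penalized totals come out at roughly 2,707 seconds for IronRuby and 1,264 seconds for Ruby 1.9.1. This is only a rough adjustment: the time the other implementation actually spent on those benchmarks is still missing from the totals.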
What are your thoughts on IronRuby, and how will this dramatic performance gain affect your projects?
In a recent blog entry, Charles Nutter argues for the importance of JRuby for Ruby’s adoption within the Enterprise. Or, in his own words:
The idea of “Enterprise Ruby” has become less repellant since Dave Thomas’s infamous keynote at RailsConf 2006. There are a lot of large, lumbering organizations out there that have yet to adopt any of the newer agile language/framework combinations, and Rails has most definitely led the way. I personally believe that in order for Ruby to become more than just a nice language with a great community, it needs to gain adoption in those organizations, and it needs to do it damn quickly. JRuby is by far the best way for that to happen.
He has a very good point. Working for IBM (it doesn’t get much more Enterprise than that) I can testify to the number of colleagues and partners who ask me questions like, “Can I interface Rails with Java?”, “Can I deploy it with WebSphere?” or “How can I generate a Rails WAR file?”. The answers to these and similar questions are all found in JRuby.
A couple of years ago I “toured” Canada, speaking at a few internal IBM conferences. The vast majority of my attendees were experienced Java developers who were doing business consulting for IBM’s clients. They were all very enthusiastic about my presentation on Ruby and Rails. It was a break from J2EE’s complexity. These people were genuinely excited about the prospect of using Rails when doing client work.
Mid-conference, one attendee said to me, “This is cool, but they’ll never let us use this stuff”. And that’s when I reached for the JRuby slides. The mood in the room suddenly shifted. These developers started to think “OK, this could actually work”. At the end of my speech, most of the questions I received had to do with JRuby.
As I mentioned during that series of conferences, “JRuby can be your gateway to introducing Rails into your workplace”. Many people within the Enterprise world don’t have an option. It’s either a JVM-based solution or they have to give up on Rails altogether.
JRuby is not only attractive to Ruby fans who’d like to use Ruby/Rails in certain work environments, it’s also appealing to those who are looking for an alternative to Java as a language. Here is where we could hit the jackpot in terms of Ruby’s adoption. There are countless Java programmers in the world. Convincing even just a fraction of them to switch would be enough to drastically increase the size of our community.
As Charles mentioned in his post, people can now pick between Scala, Clojure, Groovy, JRuby and Jython. I believe that the choice developers ultimately make boils down to three key usability aspects:
Charles’ team has been focusing on the right things. If I can be permitted one criticism though, it would be to avoid responding to every post that praises a competing implementation. Openly fighting against other implementations can backfire and looks unprofessional. I understand the desire to set the record straight and to be competitive, but there is no reason to constantly point out that “these implementations are not done” every time an early project shows some form of promise or progress. Otherwise, it’s easy to come across as someone whose “heart turned black as coal, and who finds himself wishing bad luck towards other implementations”.
Today JRuby is an Enterprise-friendly alternative to Ruby MRI/KRI; and Charles is right, JRuby is important for Ruby’s future. It would however be wrong to assume that JRuby is the only sort of future for Ruby and that C/C++ based implementations are becoming irrelevant. Ruby has never been a zero-sum game. Plurality is a substantial part of what the Ruby ecosystem is all about.
Finally, let me conclude by congratulating the JRuby team, who have just been hired by Engine Yard. I think this could be a very strategic move for both JRuby and Engine Yard.
Most programmers I know hate marketing. Their dislike stems from two root causes: the fact that they aren’t naturally good at it, and their misconception of what technical marketing actually is. “Naturally” is the keyword here, given that technical marketing takes a certain sort of conscious effort and is a skill (a social one) that can be learned, just like programming.
I fully understand that most programmers prefer to focus on coding, and coding alone. But being good at marketing is a valuable skill, whether you are the last code monkey on a project or the CTO of an emerging multimillion dollar company.
Technical marketing is not about spamming, selling out, deceiving or spreading FUD regarding your competition, whoever they may be. Technical marketing is about promoting a given product or technology by clearly illustrating its advantages to a technical audience. This audience might consist of your boss at work, who you’re trying to convince to let you use a certain technology or programming language; it could be potential investors for your startup, users of your open source project, or technically-minded customers that you’re selling your product to.
The underlying product has to be good, regardless of how you promote it. Yet just being good is not enough on its own. Many virtually unknown products are good. The way you present your product can make the difference between success and indifference, an unwanted slug-paced growth and the birth of a phenomenon.
Ruby on Rails’ story is a proverbial example of technical marketing done right – and it all started with a convincing screencast by David Heinemeier Hansson. David’s demo was not the most amazing technical demonstration of all time, but it was effective at conveying the potential benefits that could derive from the adoption of this new framework. That’s what got people interested enough to want to take a second look at it. The framework actually being good, did the rest.
Keep in mind that I’m mostly talking about marketing as a mindset, as opposed to a series of actions taken once you’ve released your product. The features you decide to include, the UI, the user experience you ultimately provide, the documentation, the logo, the name, the domain url you choose, are all affected by the way you think about the marketing and promotion of your product, before you ship it.
Take Ruby 1.9 for example, which has clearly improved over its 1.8 predecessor. When asked the question, “Why hasn’t the Ruby community switched to Ruby 1.9 yet?”, most Ruby programmers would tell you it’s due to the fact that many gems and plugins do not work yet. Some may even argue that most hosting providers have not yet extended their support to cover Ruby 1.9. True as these points may be, they are only side effects and not the real cause. From a marketing perspective, using version number 1.9 was a huge mistake.
Arbitrary and meaningless as version numbers can be, when you slap a version number on a piece of software, you are making a statement about it. For example, there’s a commonly held perception that anything below version 1.0 is in the experimental stage, a moving target (and API), and something you’ll most likely want to avoid in production. A 1.0 release is worth considering, but it still doesn’t convey a sense of trust. This take on version numbers is ultimately silly, right? Well, that’s because humans by their very nature are silly at times. People often make emotionally guided decisions, not necessarily rational ones.
1.9 was a terrible choice for several reasons. Within the Ruby community (and it’s not the only one that operates like this) odd numbered releases are considered to be development releases. For a very long time, the plan was to have 1.9 be a development release, while work was underway on Ruby 2.0. This may have taken too long, though, as the plan was altered. Ruby 1.9, with all its drastic changes, is now the stable, official version. The decision to go this route caused confusion for many people and failed to convey the importance and newness of this release.
If Matz manages to release a 2.0 version within the next two years, I can guarantee you that people would rather jump directly to 2.0 (from 1.8) than pass through the gates of 1.9 – even if all he’s done is add a couple of new/improved features. All you have to do is call it 2.0, and people will run, not walk, towards the possibility of dropping Ruby 1.8.x faster than yesterday’s newspaper.
In my opinion, the right thing to do would have been to call this release 2.0, get most of the community to switch to YARV and the language’s new features, and then incrementally develop and bake in additional features for the next version. Ruby is no longer the project that only a few people cared about back in the 90s; it’s now a major player in the programming language arena. As such it’s understandable if things don’t take ten years to move from one version to another. And keep in mind, this is just an arbitrary version number we’re talking about here.
Calling your open source forum “Beast” may not be the smartest idea either. People who go searching for it will suddenly realize that they’ve stumbled upon another kind of beast forum entirely, one of the sort that is illegal in many of the 50 states. Similarly, obscure names that are hard to pronounce and communicate will usually end up damaging the growth of your project. So does having a complicated installation procedure, a getting started guide that makes a lot of assumptions, a crowded and ugly looking site, and so forth.
There are countless other aspects to consider that can improve the appeal and the perceived value of one’s product/project/company. Small adjustments to the way we think about projects and the way we showcase them, can have a huge impact on their success. It’s worth genuinely caring about these details and embracing the possibilities that begin to open up when you make decisions with marketing in the forefront of your mind.
This is a great day for those of us who love DB2, as DB2 Express-C 9.7 has just been released. As mentioned before, this is the best DB2 ever, and an extremely important release.
To learn more about what’s new in this release, please check out the recording of our latest webinar:
If you run Linux, Unix or Windows, download it while it’s hot.
DB2 9.7 on the Cloud
Another great aspect of this release is that for the first time ever, DB2 has been released both as a product and as a deployment on the Cloud. If you pop over to RightScale, you can get a trial account for free and should see DB2 Express-C 9.7 on both CentOS and Ubuntu within the partner catalog. RightScale has been an amazing partner and they really do wonders to simplify Cloud Computing. In ten minutes time you can be up and running on the Cloud, thanks to the templates provided.
DB2 support for Django
But the good times don’t stop there: we are also announcing the first official release of the Django adapter for DB2. It sounded crazy when I first proposed the idea within IBM back in 2006, but now it’s a reality.
You can download the .tar.gz archive from the Google Code homepage for the project, or simply by clicking here. This version fully supports the Django 1.0.2 API. For instructions on how to install it, please read the Getting started with the IBM DB Django adapter guide. The current version supports DB2 for Linux, Unix, Windows and Mac OS X, version 8.2 or higher (9.5 FP2 or higher for Mac OS X). In the future, IBM Cloudscape, Apache Derby, Informix (IDS) and both System i & z/OS will be supported.
ibm_db gem updated to 1.1
I’ll conclude this DB2-centric post with a smaller, but still interesting announcement. The ibm_db gem has been updated to version 1.1. This release includes support for ActiveRecord’s QueryCache mechanism, enhanced support for BigInt (and BigSerial), support for rename_column (requires DB2 9.7), parametrization of the timestamp datatype (requires DB2 9.7), and a few fixes and performance enhancements as well. It is recommended that you upgrade to this version.
Counting rows is a ubiquitous operation on the web, so much so that it’s often overused. Regardless of misuse, there is no denying that the performance of counting operations has an impact on most applications. In this post I’ll discuss my findings about the performance of DB2 9.5 and MySQL 5.1 when counting records.
For those of you who are not into science fiction, let me clarify that the odd title of this post is a tongue-in-cheek reference to the great novel, Do Androids Dream of Electric Sheep?.
I connected to the database, created the table, imported the data and benchmarked counting operations using ActiveRecord in a standalone script. Here is the code I used:
#!/usr/bin/env ruby
require "rubygems"
require "active_record"
require 'benchmark'

# Connect to the database under test
ActiveRecord::Base.establish_connection(
  :adapter  => :mysql,
  :username => "myuser",
  :password => "mypass",
  :database => "mydb")

# (Re)create the people table
ActiveRecord::Schema.define do
  create_table :people, :force => true do |t|
    t.string :name, :null => false
    t.string :fbid, :null => false
    t.string :gender
    t.string :profession
  end
end

class Person < ActiveRecord::Base
end

# This can be sped up by performing an import instead
Person.transaction do
  File.open("person.tsv").each_line do |line|
    line = line.split(/\t/)
    p = Person.new
    p.name = line[0]
    p.fbid = line[1]
    p.gender = line[6]
    p.profession = line[17]
    p.save!
  end
end

# Run each counting query n times and report the elapsed time
n = 100
Benchmark.bm(26) do |x|
  x.report("Count all:") { n.times { Person.count } }
  x.report("Count profession:") { n.times { Person.count(:profession) } }
  x.report("Count females:") do
    n.times { Person.count(:conditions => "gender = 'Female'") }
  end
  x.report("Count males w/ profession:") do
    n.times { Person.count(:profession, :conditions => "gender = 'Male'") }
  end
end
Please note that importing records in a huge transaction containing hundreds of thousands of INSERT operations is far from the most efficient way to import. Massive imports of data using the load/import facilities provided by each database is the way to go (also see the ar-extensions plugin). The lengthy import wasn’t benchmarked here though, so it has no bearing on the results in this article.
person.tsv is a 92.7 MB tab separated values file that contains 875,857 records from the Freebase project (in my file I removed the header line, leaving only records).
For those who are not familiar with ActiveRecord, the queries executed behind the scenes are (in order):
SELECT count(*) AS count_all FROM people
SELECT count(people.profession) AS count_profession FROM people
SELECT count(*) AS count_all FROM people WHERE (gender = 'Female')
SELECT count(people.profession) AS count_profession FROM people WHERE (gender = 'Male')
While the table definition (for MySQL) is:
CREATE TABLE `people` (
  `id` int(11) DEFAULT NULL auto_increment PRIMARY KEY,
  `name` varchar(255) NOT NULL,
  `fbid` varchar(255) NOT NULL,
  `gender` varchar(255),
  `profession` varchar(255)
) ENGINE=InnoDB
Both the queries and the table definition can easily be verified by enabling logging with:
ActiveRecord::Base.logger = Logger.new(STDOUT)
Without further ado, here are the times I obtained on my last generation MacBook Pro (2.66 GHz, 4 GB DDR3 RAM, 320 GB 7200 rpm hard disk) running Mac OS X Leopard:
MySQL:
Count all: 42.467522
Count profession: 52.130935
Count females: 54.575469
Count males w/ profession: 64.046631
DB2:
Count all: 5.818485
Count profession: 7.714391
Count females: 8.556377
Count males w/ profession: 9.656739
Or in graph form:
That’s an impressive difference. To be exact, in this example DB2 was between roughly 6.4 and 7.3 times faster than MySQL. In the case of COUNT(*), DB2 counted almost a million records in 58 milliseconds, or in about the blink of an eye according to Wolfram Alpha.
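The speed-up factors and the per-query time can be recomputed from the timings listed above. The following is a small sketch that does just that. The variable names are mine, and it assumes each reported figure is the elapsed time in seconds for all 100 iterations.

```ruby
ITERATIONS = 100 # n in the benchmark script

# Seconds for 100 iterations, copied from the results listed above.
mysql = {
  "Count all"                 => 42.467522,
  "Count profession"          => 52.130935,
  "Count females"             => 54.575469,
  "Count males w/ profession" => 64.046631
}
db2 = {
  "Count all"                 => 5.818485,
  "Count profession"          => 7.714391,
  "Count females"             => 8.556377,
  "Count males w/ profession" => 9.656739
}

# How many times faster DB2 was for each query (MySQL time / DB2 time).
speedups = mysql.map { |label, seconds| [label, seconds / db2[label]] }.to_h

# Average time of a single DB2 query, in milliseconds.
per_query_ms = db2.transform_values { |seconds| seconds / ITERATIONS * 1000.0 }

speedups.each do |label, ratio|
  puts format("%-27s %.2fx faster, %.1f ms per DB2 query",
              label, ratio, per_query_ms[label])
end
```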
For those who are skeptical, please note that DB2 was not manually fine-tuned in any way. The client codepage was set to 1252 to allow Greek letters, and the log size was increased to permit such a huge transaction during the import. That’s it, no optimizations were attempted. This is DB2 Express-C out of the box. It looks like smart androids count electric sheep with DB2 after all.
The advantages of DB2 over MySQL when dealing with a massive volume of traffic are well known (and not limited to performance either), but DB2 can dramatically improve performance even for your average web application. And DB2 9.7, which will be released this month, further improves both performance and the ability to tune itself to the available resources and required workload. If you’d like to try DB2 Express-C for yourself, you can download it here. It doesn’t cost you a dime to obtain and can be used for development, testing and production absolutely free of charge.
Wikipedia defines memoization as “an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously-processed inputs.” This typically means caching the return value of a function in a dictionary of sorts, using the parameters passed to the function as a key. This is done in order to reuse that return value immediately, without calculating it again, when the function is invoked with the same arguments. Even though we are trading space for time, it is often invaluable for speeding up certain recursive functions and when dealing with dynamic programming, where intermediate calls are often repeated many times.
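To illustrate the idea without any library, here is a minimal hand-rolled sketch in plain Ruby. The helper name `memoized` and the `slow_square` example are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper: wraps a block in a cache keyed by its argument list.
def memoized(&computation)
  cache = {}
  lambda do |*args|
    # Ruby arrays hash by content, so the argument list can serve as the key.
    # fetch only runs its block when the key is missing from the cache.
    cache.fetch(args) { cache[args] = computation.call(*args) }
  end
end

calls = 0
slow_square = memoized do |n|
  calls += 1 # count how often the real computation runs
  n * n
end

slow_square.call(12) # computed
slow_square.call(12) # served from the cache
puts calls
```

The second call never reaches the block, which is the whole point: the cost of the computation is paid once per distinct set of arguments, at the price of keeping the results in memory.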
Using memoization in Ruby is very easy thanks to the memoize gem. The first step to getting started is therefore to install it:
$ sudo gem install memoize
Successfully installed memoize-1.2.3
1 gem installed
Installing ri documentation for memoize-1.2.3...
Installing RDoc documentation for memoize-1.2.3...
Now we can use the memoize method as illustrated in the example below:
require 'rubygems'
require 'memoize'
require 'benchmark'
include Memoize

def fib(n)
  return n if n < 2
  fib(n-1) + fib(n-2)
end

Benchmark.bm(15) do |b|
  b.report("Regular fib:") { fib(35) }
  b.report("Memoized fib:") { memoize(:fib); fib(35) }
end
In the first block we simply invoke fib(35), while in the second one we first invoke the method memoize(:fib) to memoize the method fib. Running this code on my machine prints the following:
                     user     system      total        real
Regular fib:    55.230000   0.160000  55.390000 ( 55.819205)
Memoized fib:    0.000000   0.000000   0.000000 (  0.001305)
We went from almost a minute of run time to practically instantaneous execution. Optionally, we could also pass a file path to memoize, which would then use marshaling to dump the cached values to disk and load them back.
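The gem takes care of the on-disk cache for you. As a rough illustration of the same idea in Python, the sketch below persists the cache with the standard pickle module. The decorator name memoize_to_file and the cache file name are hypothetical, and this is not a description of how the memoize gem is implemented.

```python
import os
import pickle
import tempfile

def memoize_to_file(path):
    """Hypothetical decorator factory: a dict cache backed by a pickle file."""
    def decorator(function):
        # Load previously saved results, if the cache file already exists.
        if os.path.exists(path):
            with open(path, "rb") as f:
                cache = pickle.load(f)
        else:
            cache = {}

        def decorated_function(*args):
            if args not in cache:
                cache[args] = function(*args)
                # Rewrite the whole file on every miss: simple, not fast.
                with open(path, "wb") as f:
                    pickle.dump(cache, f)
            return cache[args]
        return decorated_function
    return decorator

if __name__ == "__main__":
    # Assumed location for the cache file; any writable path would do.
    cache_path = os.path.join(tempfile.gettempdir(), "fib_cache.pickle")

    @memoize_to_file(cache_path)
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(35))
```

On a second run the results are read back from the file, so the function body is not executed again for arguments it has already seen. Only load pickle files you created yourself, since unpickling untrusted data is unsafe.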
For Python we can write a simple decorator that behaves in a similar manner. In its simplest form it can be implemented as follows:
# memoize.py
def memoize(function):
    cache = {}
    def decorated_function(*args):
        try:
            return cache[args]
        except KeyError:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function
Or, testing for membership explicitly, which avoids raising an exception on every cache miss:
# memoize.py
def memoize(function):
    cache = {}
    def decorated_function(*args):
        if args in cache:
            return cache[args]
        else:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function
When the memoized function is invoked, we look in the cache to see if an entry for the given arguments already exists. If it does, we immediately return that value. If not, we call the function, cache the result and return it.
Truth be told, the limit of this approach lies in the fact that, since we are using a dictionary, only hashable objects (in practice, immutable ones) can be used as keys. For example, we can pass a tuple as an argument but not a list. For the example within this article this approach will suffice, but to take advantage of memoization with mutable arguments, you may want to consider the approach described in this recipe.
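One simple way to cope with unhashable arguments is to fall back to calling the function directly whenever the arguments cannot be used as a dictionary key. The sketch below shows that idea; the name memoize_if_hashable is made up for illustration, and this is not necessarily what the linked recipe does.

```python
def memoize_if_hashable(function):
    """Hypothetical variant: cache when possible, otherwise just call through."""
    cache = {}
    def decorated_function(*args):
        try:
            return cache[args]
        except KeyError:
            # Hashable arguments seen for the first time: compute and store.
            val = function(*args)
            cache[args] = val
            return val
        except TypeError:
            # Unhashable arguments (e.g. a list): skip the cache entirely.
            return function(*args)
    return decorated_function

if __name__ == "__main__":
    @memoize_if_hashable
    def total(values):
        return sum(values)

    print(total((1, 2, 3)))  # tuple argument: result is cached
    print(total([1, 2, 3]))  # list argument: computed on every call
```

The trade-off is that calls with mutable arguments get no speedup at all, but they no longer fail with a TypeError.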
We can now rewrite the Ruby example above in Python as follows:
import timeit
from memoize import memoize

def fib1(n):
    if n < 2:
        return n
    else:
        return fib1(n-1) + fib1(n-2)

@memoize
def fib2(n):
    if n < 2:
        return n
    else:
        return fib2(n-1) + fib2(n-2)

t1 = timeit.Timer("fib1(35)", "from __main__ import fib1")
print t1.timeit(1)
t2 = timeit.Timer("fib2(35)", "from __main__ import fib2")
print t2.timeit(1)
Running this code on my machine prints the following:
9.32223105431
0.000314950942993
In Python 2.5’s case, by employing memoization we went from more than nine seconds of run time to a practically instantaneous result.
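The code above targets Python 2.5. On Python 3.2 and later, the standard library includes functools.lru_cache, which does the same job as the hand-written decorator. A minimal sketch, assuming a modern Python 3 interpreter:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # maxsize=None means the cache never evicts entries
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    print(fib(35))
    # cache_info() reports hits, misses and the current cache size.
    print(fib.cache_info())
```

Like the dictionary-based decorator, lru_cache requires the arguments to be hashable.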
Granted, we don’t write Fibonacci applications for a living, but the benefits and principles behind these examples still stand and can be applied to everyday programming whenever the opportunity, and above all the need, arises.
Previously I mentioned the importance of migrating away from Ruby 1.8, in favor of 1.9. Before my business trip to Italy, I had a chance to watch David A. Black’s new videos for Envycast, in which he presents the essential concepts required to migrate from Ruby 1.8 to 1.9. This pair of videos totals roughly an hour and a quarter, and can be purchased in a package deal for $16. You probably won’t find them to be as entertaining as the ones filled with gags by Gregg Pollack and Jason Seifer, but in my opinion these videos are well thought out and highly informative. The price is fair if you consider that they can bring you up to speed with Ruby 1.9 in no time at all and with very little effort on your part.
Speaking of screencasts, in Italy I had a chance to pre-announce my “startup on the side”. It’s called ThinkCode.TV and will, you guessed it, create and sell high quality screencasts about programming. I founded ThinkCode.TV with a couple of friends of mine who are top notch programmers and teachers in the Python and XP/Agile worlds, respectively. Initially we’ll focus on the Italian market (the three of us are Italian) by producing screencasts in Italian about Ruby, Python and TDD. But we plan to expand our horizons by covering more subjects, accepting external authors, and eventually reaching the international market by producing English versions of our best sellers, narrated by native English speakers (to save you the hassle of having to hear a foreign accent).
Should things go well, we may expand beyond the Italian and English markets. But for the time being, I invite Italian speakers to join our newsletter (which is in Italian) to follow the development of this project and be notified when we release the first videos. When we branch out to the English speaking world, my readers who don’t speak Italian will be able to learn about it through this blog.