The Great Ruby Shootout measures the performance of several Ruby implementations by testing them against a series of synthetic benchmarks. Recently I ran Mac and Windows shootouts as well, which tested a handful of implementations. However this article reports on the results of extensive benchmark testing of eight different Ruby implementations on Linux.
The setup
For this shootout I included a subset of the Ruby Benchmark Suite. I opted to primarily exclude tests that were executed in fractions of a second in most VMs, focusing instead of more substantial benchmarks (several of which came from the Computer Language Benchmarks Game). The best times and least memory allocations out of five runs are reported here for each benchmark.
All tests were run on Ubuntu 10.4 LTS x86_64, on an Intel Core 2 Quad Q6600 2.40 GHz, 8 GB DDR2 RAM, with two 500 GB 7200 rpm disks.
8 implementations
The implementations tested were:
JRuby was run with the –fast and –server optimization flags.
Disclaimer
Synthetic benchmarks cannot predict how fast your programs will be when dealing with a particular implementation. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. The values reported here should be assumed to be characteristic of server-side — and long running — processes; they should be taken with a grain of salt.
Time Results
Please find below the execution times for the selected tests. Timeouts indicate that the execution of a single iteration for a given test took more than 300 seconds and had to be interrupted. Bold, green values indicate the best performer out of each test.
Warning: The bm_primes.rb benchmark was originally written to aid the development of the Prime class. As such in 1.9.2 it was rewritten in C, which makes it a poor representation of implementation performance. This benchmark will removed in the future.

If you are not interested in the individual test results, the information presented in the table above is summarized directly below:
Ruby 1.9.2 JRuby
Min. : 0.013 Min. : 0.382
1st Qu.: 3.258 1st Qu.: 3.051
Median : 4.543 Median : 4.997
Mean : 9.262 Mean : 9.180
3rd Qu.: 8.573 3rd Qu.: 8.969
Max. :45.009 Max. :48.850
MagLev Ruby 1.9.1
Min. : 0.351 Min. : 0.015
1st Qu.: 2.140 1st Qu.: 3.387
Median : 6.069 Median : 6.205
Mean : 9.100 Mean :10.860
3rd Qu.: 9.266 3rd Qu.:11.559
Max. :51.221 Max. :46.849
Ruby 1.8.7 IronRuby
Min. : 0.708 Min. : 3.601
1st Qu.: 5.102 1st Qu.: 10.505
Median : 8.380 Median : 12.912
Mean :18.785 Mean : 26.539
3rd Qu.:24.793 3rd Qu.: 36.115
Max. :75.653 Max. :135.204
Rubinius REE
Min. : 0.484 Min. : 0.584
1st Qu.: 3.087 1st Qu.: 4.343
Median : 9.636 Median : 6.660
Mean :13.232 Mean :15.036
3rd Qu.:17.674 3rd Qu.:21.336
Max. :73.050 Max. :61.960
For the sake of convenience, I also produced a box plot from the successful data points:

There are a few considerations based on these results that I feel are worth mentioning:
Memory Results
The following table shows the approximate memory consumption for each implementation when running each benchmark:

Summarized:
Ruby 1.9.2 Ruby 1.9.1
Min. : 4.320 Min. : 4.580
1st Qu.: 4.378 1st Qu.: 4.695
Median : 6.285 Median : 6.920
Mean : 20.795 Mean : 15.669
3rd Qu.: 10.162 3rd Qu.: 11.383
Max. :171.500 Max. :100.570
Ruby 1.8 REE
Min. : 3.040 Min. : 8.220
1st Qu.: 4.290 1st Qu.: 9.682
Median : 7.745 Median : 15.565
Mean : 20.698 Mean : 27.014
3rd Qu.: 11.273 3rd Qu.: 38.620
Max. :103.520 Max. :125.910
Rubinius MagLev
Min. : 37.63 Min. : 81.74
1st Qu.: 39.78 1st Qu.: 82.52
Median : 45.48 Median : 83.53
Mean : 65.70 Mean : 96.29
3rd Qu.: 58.22 3rd Qu.: 98.10
Max. :215.33 Max. :175.85
JRuby
Min. : 49.04
1st Qu.: 71.23
Median :176.72
Mean :169.41
3rd Qu.:209.04
Max. :404.06
And finally, in graph form:

A few considerations on memory:
Linux Vs. Windows
This shootout and the Windows one were both performed on the same machine, thus we can compare how the same implementation perform under different operating systems. The only adjustment that’s required is the timeout. Every result longer than 60 seconds from this shootout needs to be considered a timeout, because the previous shootout did so as well.
It is commonly believed that Ruby performs much better on Linux than on Windows (with the exception of IronRuby). Let’s find out if these test results confirm that notion.
Ruby 1.8.7:

Ruby 1.9.2:

JRuby:

Finally, in chart form (yellow entries are on Windows as indicated by the labels containing W):

To use a beloved MythBusters expression, this myth is confirmed.
Note: As requested by a few commenters, here is a comparison of IronRuby as well (.NET 4.0 Vs. Mono 2.4.4):

Conclusion
In conclusion, let me just state that it’s nice to see several implementations getting faster. Ruby 1.9.2, JRuby, MagLev and Rubinius are all becoming serious competitors and working their respective ways closer to a similar performance level. If you think these benchmark shootouts are becoming boring, then the results are becoming more stable and predictable. I suspect that as time goes on, performance will not be the real distinguishing factor when choosing a Ruby implementation, other features will be.

Title: Practical Clojure
Authors: Luke VanderHart and Stuart Sierra
True pp.: 198
Publisher: Apress
Published on: June 2010
ISBN-13: 978-1430272311
Rating: 6.5/10
Published in June 2010, Practical Clojure by Luke VanderHart and Stuart Sierra is the latest Clojure book to hit stores. Despite the Clojure 1.0 jar shown at the beginning of the book, this title tries to cover the current version of the language, including references to concepts that will be introduced by the upcoming 1.2 version.
The target audience of this book is programmers who are absolutely new to Clojure. It didn’t strike me as being particularly aimed at developers who are coming from the Java camp, or the Lisp camp; in this regard, the book is rather “background agnostic”, even though Lisp programmers will feel much more at home than Java programmers will, due to the nature of the language itself.
The authors of the book are clearly well versed in this new language (Sierra is part of Clojure/core, the equivalent of the A-Team in Clojureland) and their confidence with the concepts presented is demonstrated throughout the book. Their explanations tend to be clear and to the point. Longer discussions are occasionally included when required to introduce concepts that are novel to most programmers, like the Software Transactional Memory (STM), refs, atoms and agents.
The book starts out by presenting a short but well-argued case for why Clojure is a worthwhile language, and then focuses almost exclusively on the core of the language. I’m afraid they do so to the detriment of the ecosystem surrounding Clojure. The authors don’t talk about how to install Clojure, recommend editors and IDEs (albeit a few are casually mentioned), or how to use build tools like Ant, Maven or Leiningen.
clojure.contrib, a fundamental extension library, is barely mentioned and there is no coverage of other important libraries or emerging frameworks. For instance, perhaps expectedly, Compojure (a web framework) and Incanter (a statistical and graphical environment) are only mentioned as examples of DSLs, however examples of their usage are not provided. (I believe the authors mistakenly refer to Compojure as Enclojure, which is a different project).
Despite the narrow focus, Practical Clojure doesn’t shy away from complex subjects and manages to include a chapter on Java interoperability, parallel programming, metaprogramming, and performance considerations. It does so briefly however, favoring a cursory presentation of the fundamental concepts rather than in-depth coverage, which would provide the reader with the degree of confidence required to approach real world problems.
The core language is covered in a manner that acts as both a tutorial and a reference. Major concepts, data structures, and common functions are presented to the reader with an endless supply of tiny examples. It’s easy to fly through them, but typing along in the REPL will be a far more valuable exercise for readers who hope to retain the information presented.
This leads us to another shortcoming of this book, which is the lack of more structured and complex examples. When I define their examples as “tiny”, with very few exceptions, I really mean it. For the first few chapters of the book the examples don’t get much larger than calculating the square root of a number through Newton’s method or adding contacts to an address book. Most of the other examples do a good job of illustrating the point they are trying to make with one, two, or just a handful of lines of code.
This is an actual sample of the kind of examples you’ll find throughout the book to illustrate many core API functions:
user=> (reduce + [1 2 3 4 5])
15
Note that this approach is didactically valid, because it isolates the function to show exactly how it works. After dozens of these functions though, you may expect larger examples to show how to integrate the use of some of these functions and data structures you’ve learned about. Such examples are seldom included. Furthermore, the book lacks any exercise for the reader. Foundational books that fail to offer many articulated examples and that lack exercises, tend to make it hard for the reader to retain the information and get some hands-on practice.
I have lots of respect for short books that get to the point and avoid wasting the reader’s time. K&R is notoriously acclaimed thanks to its clear and concise nature. However, Clojure is not C, and I feel that the 198 pages fall a little short when it comes to introducing this wonderful language to new readers. There is more to Clojure than simply surveying the language itself, even though I suspect that certain readers may appreciate this extremely narrow focus.
Overall the book is well-edited, despite the presence of minor issues. Aside from a few typos (e.g., “becauseall” on page 79), readers may find the formatting to be slightly inconsistent at times. For example in chapter 5 when presenting sequences, after the map function has been introduced, the font for the subsequent functions is substantially decreased for no apparent reason. Readers may be misled into thinking that the functions presented afterwards are somehow different from the previous ones, when in fact they’re all defined in clojure.core. In Listing 6-3, at page 103, the authors present their first “complex” example (the address book) and they do so by using, among others, doseq. This macro was not introduced before that page nor is it really explained within the example.
From a physical standpoint, this book is a rather thin and wide paperback. A small font, coupled with small margins and a wide layout, imply that the readability of the book suffers a little. The paper itself is off-white, fairly thick and slightly textured, not as pleasant to the touch as other books by Apress or most other technical publishers, even though I recognize that this is a matter of taste (some people may actually love it because of these characteristics).
With two introductory Clojure books on the market, drawing comparisons is unavoidable. Stuart Halloway’s Programming Clojure is a slightly older book (published in May 2009), which grants Practical Clojure a distinct advantage. This is not to say that Programming Clojure is obsolete, on the contrary it’s still a valid choice, but it doesn’t illustrate some of the new features that are available today. For example, in chapter 13 Practical Clojure introduces protocols and datatypes that will be available in Clojure 1.2 for the first time. Given that Halloway’s book was published more than a year ago, there was no possible way he could have included such powerful abstractions at the time.
Despite being older and less methodical than Practical Clojure, Programming Clojure tends to offer more complex examples. In the introduction of Programming Clojure you’ll see examples which Practical Clojure fails to include until much later in the book. Practical Clojure, the subject of this review, may leave you wanting for more practical examples of how all the language features fit together. Whereas Programming Clojure may leave you longing for more consistent explanations of how each part of the language works on its own.
Practical Clojure and Programming Clojure are competitors in the marketplace, but it wouldn’t be a bad idea to get ahold of both, because they complement each other quite well, in my opinion. Having to pick just one, I would probably recommend Practical Clojure, given its more consistent and up to date presentation. The sizzle offered by Programming Clojure, can be found to a much greater degree in upcoming and less introductory books, such as The Joy of Clojure. In this sense, reading Practical Clojure first followed by The Joy of Clojure, would be a solid learning path (Clojure in Action is another worthy addition, but it doesn’t replace The Joy of Clojure, which is a real gem).
In conclusion, Practical Clojure is not the Clojure equivalent of the highly praised Practical Common Lisp, from the same publisher. Reading it cover to cover and typing all the snippets included within, will not give you enough knowledge to start writing complex, idiomatic Clojure programs out of the gate.
However, if you are learning Clojure today, I do recommend this book. It’s a clear, well thought-out, concise introduction to the language that will give you a solid foundation as you go on to learn more about Clojure and Lisp in general.
My previous post about Clojure generated quite a bit of interest, so I thought I’d follow it up with something a bit more concrete. I primarily wrote this article for a friend who asked me for guidance on how to set it all up; and while this isn’t he only way to setup Clojure, I hope it will help other people who are also getting started with this great language.
As some people pointed out, setting up your Clojure environment can be slightly challenging. For example, when you install Ruby or Python, you can expect a Read-Eval-Print Loop (REPL) such as IRB or IDLE to be ready for you to use right out of the box, whereas the current version of Clojure (version 1.1) does not include an interactive top-level.
This post explains how to setup a Clojure environment step-by-step, including a working clj script (the common name for Clojure’s REPL). For this post I used the clj script contained within the Getting Started guide on Wikibooks as a base, but I built on top of it , so as to customize and improve it. In writing this post, I’ve made the assumption that you’re using a *nix environment. My instructions will be Mac OS X specific, even though it should be a breeze to adapt them for Linux, if required.
Step 1: Download and Build Clojure
The first step is to download and build the latest version of Clojure. (Please note, I used bit.ly below for code formatting reasons.)
$ cd /tmp
$ wget http://bit.ly/clojure
$ unzip clojure-1.1.0.zip
$ cd clojure-1.1.0
$ ant
This will generate several jars for us, including clojure.jar, which we’re interested in. Aside from clojure.core, we are also interested in clojure.contrib, a collection of useful libraries and functions that are often used in Clojure programs. To download and build clojure.contrib, follow these instructions:
$ cd /tmp
$ wget http://bit.ly/clojure-contrib
$ unzip clojure-contrib-1.1.0.zip
$ cd clojure-contrib-1.1.0
$ ant
This should generate a clojure-contrib.jar file. We can now proceed to copy both the clojure.jar and clojure-contrib.jar files to a convenient location. (You can copy the files with a single line, but for formatting reasons I split the operation into two copy operations).
$ mkdir -p ~/Library/Clojure/lib
$ cd ~/Library/Clojure/lib
$ cp /tmp/clojure-1.1.0/clojure.jar .
$ cp /tmp/clojure-contrib-1.1.0/clojure-contrib.jar .
Note that on Linux, you’d choose a different location (e.g., /opt/clojure/lib).
Step 2: Install rlwrap
Before we can proceed with creating a fancy clj script, we need rlwrap, which will make our interactive prompt much more friendly. On Mac OS X, if you are using MacPorts, you can simply run the following:
$ sudo port install rlwrap
Step 3: Create a clj script
Now we can finally finally create a clj script and save it as ~/Library/Clojure/clj:
#!/bin/sh
breakchars="(){}[],^%$#@\"\";:''|\\"
CLOJURE_DIR=$HOME/Library/Clojure/lib
CLOJURE_JAR=$CLOJURE_DIR/clojure.jar
CONTRIB_JAR=$CLOJURE_DIR/clojure-contrib.jar
CLOJURE_CP=$CLOJURE_JAR:$CONTRIB_JAR:$PWD
if [ -f .clojure ]; then
CLOJURE_CP=$CLOJURE_CP:`cat .clojure`
fi
if [ $# -eq 0 ]; then
exec rlwrap --remember -c -b $breakchars \
-f $HOME/.clj_completions \
java -cp $CLOJURE_CP clojure.main -i $HOME/.clojure.clj --repl
else
exec java -cp $CLOJURE_CP clojure.main -i $HOME/.clojure.clj $1 -- $@
fi
Make it executable and symlink it to a convenient place on your path (e.g., ~/bin):
$ chmod a+x ~/Library/Clojure/clj
$ ln -s ~/Library/Clojure/clj ~/bin/.
At this point we almost have a working script. Before declaring it fully functional, let’s take a look at three files this script references:
.clojure: This file is optional and is located in the same folder as the scripts you want to execute. It can be used to include further jars in your classpath.~/.clj_completions: Includes a list of Clojure functions for tab completion (courtesy of rlwrap).~/.clojure.clj: Your Clojure startup file. It’s the first Clojure file that’s run when we start the REPL or execute a script. We’ll use this to add a couple of nice functions to the default clj REPL, but you can add anything you’d like to load at startup.Step 4: Create a startup file
By default, clj will not have an exit function. You can either CTRL+D or enter a call to System/exit. This is not a big deal, but I like the idea of exiting with a simple (exit) call (as one does with Ruby and Python). Fortunately, we can add this function to the startup file. Furthermore, for learning purposes, few things beat having the source function which reveals the source code of built-in Clojure functions. This too can be added to ~/.clojure.clj:
(use 'clojure.contrib.repl-utils)
(defn exit [] (System/exit 0))
(exit) will now exit the REPL and source will reveal the source code as shown below:
user=> (source peek)
(defn peek
"For a list or queue, same as first, for a vector, same as, but much
more efficient than, last. If the collection is empty, returns nil."
[coll] (. clojure.lang.RT (peek coll)))
nil
Step 5: Create a ~/.clj_completions file
As explained in the Clojure Wikibook referenced above, you can create the .clj_completions file by running the following from the REPL or as a script (e.g., clj myscript.clj):
(def completions
(reduce concat (map (fn [p] (keys (ns-publics (find-ns p))))
'(clojure.core clojure.set clojure.xml clojure.zip))))
(with-open [f (java.io.BufferedWriter. (java.io.FileWriter. (str (System/getenv "HOME") "/.clj_completions")))]
(.write f (apply str (interleave completions (repeat "\n")))))
This will populate the file .clj_completions with a list of 500+ symbols. You can now run clj and try tab completion by inserting a few characters and then pressing tab. For example:
mbp:~ acangiano$ clj
Clojure 1.1.0
user=> (re<tab>
re-find ref-min-history repeatedly
re-groups ref-set replace
re-matcher refer replicate
re-matches refer-clojure require
re-pattern release-pending-sends reset!
re-seq rem reset-meta!
read remove resolve
read-line remove-method rest
read-string remove-ns resultset-seq
reduce remove-watch reverse
ref rename reversible?
ref-history-count rename-keys
ref-max-history repeat
There you have it, a complete setup of Clojure from source to clj. You can find information about setting up your favorite editor/IDE on the Assembla Wiki.
Note: As pointed out by Phil and other commenters elsewhere, the easiest way to get started these days is probably to install cljr, a standalone REPL and package manager. My post outlines how to do it from scratch (as you can deduct from the title), just like we used to before these tools were released. You may however consider the pre-packaged solution instead.
Lisp has had a tremendous impact on the world of programming. Even though Common Lisp and Scheme — the two main Lisp dialects — may not be considered mainstream today, several popular languages have been influenced by one or both of them.
It isn’t stretching things too much to say that both Ruby and Python can be seen as slower, easier (for beginners), object-oriented, infix Lisp dialects.
Some may say Ruby is a bad rip-off of Lisp or Smalltalk, and I admit that. But it is nicer to ordinary people. — Yukihiro “Matz” Matsumoto
Ruby and Python aren’t intimidating and remain very approachable for absolute beginners. Furthermore, their approachability is not confined to the language design itself, but transcends into the community and ecosystem that surrounds them.
I’m not here to discuss how languages like Ruby and Python managed to become more popular than major Lisp dialects nowadays. I’d rather focus on how these gentler introductions to functional programming are acting as gateway drugs to Lisp for many developers.
A community that values metaprogramming and is obsessed with the construction of DSLs (Domain Specific Languages) like the Ruby’s is, will no doubt find in Lisp a valuable ally. Plus, if you know Ruby inside and out, you should find Lisp to be easy enough to learn.
To attract Ruby developers though, Lisp has to offer something more than just a set of powerful features. You could say that Rails is enough of a reason to learn and use Ruby. But what is Lisp able to solve all that better than Ruby? I’ll answer that question by focusing on a specific dialect of Lisp, that I and continually more Ruby developers are getting into: Clojure.
It wouldn’t be fair to characterize the Lisp community as stagnant, but Clojure is definitely a welcomed dose of new blood. Clojure is a JVM-based modern Lisp designed for concurrency, which elegantly includes a set of carefully chosen features that are not easily found in mainstream languages.
In my opinion, Clojure has three main advantages over Ruby:
Clojure’s interoperability with Java resolves the issue of only having a few available libraries, which often affects new languages. It also helps in getting people to use the language within the enterprise world where Java still dominates.
Of all the “new” languages out there, I find Clojure to be the most fun, interesting and pragmatic: it’s something worth getting excited about. I don’t really care if it turns out to be the next Ruby or not, it’s a language that’s worth knowing and using. (If you haven’t tried it yet, a decent, short introductory book is the recently published Practical Clojure.)
Clojure’s popularity may even bring more attention to Lisp in general (for example, most must-read literature uses Scheme or Common Lisp). Perhaps then, it may indirectly help introduce more traditional Lisp dialects to a new generation of programmers.
As I write a series of thoughts on the pursuit of excellence in programming, I must preface my essay by asking you to ignore that I wrote these words. I invite you to evaluate the opinions and ideas presented here not ad hominem, but rather on the basis of their own merits. It would be easy to otherwise mistakenly dismiss them with the infamous question posed by Steve Jobs to a blogger: “What have you done that’s so great?”.
This is to say that I talk about the ambitious and noble goal of achieving excellence in programming, fully conscious of not having achieved said excellence. For the time being, I don’t feel like I can point my finger at something that would impress Steve Jobs (or less critical observers). I’m just a traveler on the journey of learning, with a desire to share his experiences and plans.
Two visions of intelligence
Mastering a complex discipline such as programming requires a great amount of learning over the course of several years, perhaps even decades. Maximizing one’s ability to learn is therefore an early investment that can quickly repay itself.
The biggest impact on my ability to learn was caused by a shift in the way I considered the matter of intelligence. There are mainly two ways to think about it. You can either consider intelligence to be a static, intrinsic ability or a more dynamic, cultivable characteristic of human beings.
Cognitive scientists and psychologists conclusively determined that people who perceived intelligence as a dynamic characteristic, outperformed and were more successful than people who internalized intelligence as an intrinsic, static ability.
It’s worth noting that it’s not really important whether intelligence can actually be developed through application. It’s the perception of it that forges students’ approach to learning.
This difference in perception is often conditioned by early parenting. Kids who are encouraged to work hard to achieve results and are praised on the basis of their effort, tend to develop a perception of intelligence and results as something they can work on. Other kids are conditioned to think that they are doing well because they are “smart” and that their intelligence alone will most likely lead them to success.
Society has a fascination with genius, and parents like to fancy their little ones to be several standard deviations better than the norm, but conditioning children this way has dangerous and counterproductive consequences.
Kids who are labeled and praised because of their “innate capabilities”, will often suffer from an overconfidence that will affect their ability to challenge themselves through the depths of the unknown, because they feel it would threaten their status. What if they fail? It would mean, in their eyes, that perhaps they are not the smart person they have been assumed to be all along. We all have seen such kids failing here and there, and quickly making excuses such as, “Oh, I wasn’t trying at all”.
A parent who is cultivating a kid’s interest in hard work, may be more likely to encourage their child with words such as, “It’s OK. Keep studying, and you’ll definitely do better next time”. A parent proposing a model of static intelligence, may justify their child’s failure in a given subject by concluding that “maybe you are not cut out for subject X”. [1]
When facing failure, the “static intelligence” child may crumble under the weight of his own demise, as if failure was a reflection of their intrinsic value rather than a temporary speed bump and occasion for growth. A “dynamic intelligence” child will simply try harder next time. Genius or not, excellence and mastery of any subject requires hard work and many “smart” kids fall short when the bar is raised high enough so that “smartness” alone won’t cut it anymore. This usually corresponds with the switch from high school to college.
I’m very familiar with all of this, because I was one of those kids. I was labeled by my parents and science teachers as a “genius”. Even psychologists at school, who came to help us figure out what careers we were better suited for, ended up telling me that I could pursue virtually any career (at the time I was interested in nuclear physics) and that according to their (virtually meaningless) IQ tests I would be classified as a “genius”.
Please note that the problem wasn’t so much the label. Most smart kids figure out that they are smart on their own rather quickly. The real problem was that I wasn’t taught the value of long-term intellectual effort. Effort itself was considered as being somewhat detrimental to my status. Not only was I supposed to succeed, but I supposedly had to do so without putting forth any effort (a “utopic” ambition).
One of the first example of this that I can remember is when my father “caught me studying” for a few hours the same book before a test in middle school. He told me something along the lines of “Why do you need to study? A genius like you should figure out the test without studying.”. It’s absurd, I know, and probably one of the dumbest things my — otherwise bright — father has ever told me. As a young kid though, such a statement can have a strong impact on you.
Another example has to do with Latin. My teacher was a palm reading, crazy cat lady and I had no respect for her from the start. So I didn’t pay attention in class, on top of not studying Latin at home, I set myself up for failure. When the first translation assignment came around, I got a mildly negative score. My less than professional teacher told me, “Oh I thought you were good, but I guess you are not”. Afterwards my father added to this by saying something along the lines of, “Well, don’t worry, I guess languages are not your strength.”.
Boom. That was enough for me to stop having any interest in Latin and completely ignore a subject at which I wasn’t excelling. Nobody could tell me, “You are stupid because you don’t understand Latin,” if I didn’t try at all. So I was the high school kid who did advanced Calculus stuff on his own for fun, when my classmates where struggling with Algebra, yet I pretty much sucked at Latin. (In retrospective, the thought patterns required to excel at Latin where not very different from those required to excel at math or mastering English as a second language. I suspect that had I put in some effort I could have been very good at, and actually enjoyed, the subject.)
Over the years I had to readjust this perception entirely. By falling on my face more than once, I learned that excellence is only achieved through a combination of talent and effort. The real genius may lie in the ability to put in thousands of hours of focused study and practice, in the pursuit of whatever one person is trying to learn and understand.
If people knew how hard I worked to get my mastery, it wouldn’t seem so wonderful at all. — Michelangelo
By this new definition, I was a complete idiot who had to entirely learn from scratch how to appreciate the value of effort to hone and develop his “talent” (which is nothing but a seed on its own).
Learning to learn
Being very interested in programming, computer science, mathematics, and science in general, I decided at some point that I had to entirely change my attitude towards learning if I were to master any of those disciplines. Effort was now more important than intelligence on its own, and I would feel satisfied only when doing a really good job in the pursuit of something challenging, that couldn’t be achieved by sheer “talent”.
The saddest thing in life is wasted talent, and the choices that you make will shape your life forever. — Calogero ‘C’ Anello from A Bronx Tale
In the process, I started to internalize a few principles and I dealt with issues related to both the art of learning and the art of programming.
An awful feeling
There is a cognitive bias known as the Dunning-Kruger effect [2], in which subjects who are inexperienced or less competent within a given discipline, tend to overestimate their abilities (they are in other words affected by illusory superiority [3]).
The other side of this coin is that the more you study, the more you realize how little you know and how much there is to know (a concept put forward by Socrates back in the days of Ancient Greece). This is both a pleasure and a discomfort. There is a huge amount of satisfaction in finding things out. Yet, being in doubt and fully aware of how little you know tends to be an unpleasant side effect all learners have to live with. Doubt truly is the water that’s fundamental for the growth of the flower of intellectual curiosity. [4]
My approach in this case is to embrace and dominate my ignorance and fears. Whenever there is a concept that I feel particularly ignorant about or that is way over my head, I try to tackle it as if my life depended on it. There are still countless things I’m ignorant about, but this approach as really paid off for me over the years.
If you are trying to learn a whole branch of computer science or mathematics, it will take a long time, so you may want to start first with smaller “fears” that can be mastered, at least at an introductory level, in a short amount of time. Rather than thinking, “Oh yeah, I should really learn Git”, for months, act on the thought. The essential knowledge to work with Git or Hg doesn’t take months to learn (assuming you have a need for either of these particular tools).
But you are in this for the long run, so don’t be afraid of improving your craft by studying advanced topics that require a bigger commitment in terms of time as well. There is no royal shortcut, mastery of our craft will require thousands of hours of dedicated study and practice.
Theory and practice
Masters of any intellectual discipline tend to have good working knowledge of both theoretical and practical aspects. Pursuing excellence in programming requires the study of many insightful books that will widen your view of the field and as a result necessarily improve your craft.
Working with books alone is not enough though. Programming requires writing and reading programs every day for years. That’s why I have a rule: I refuse to go to sleep if I haven’t read and written some code on a given day (this doesn’t include the code I write for work, of course). So far, this rule has had a positive impact on my ability to code.
It is a mistake to think that the practice of my art has become easy to me. I assure you, dear friend, no one has given so much care to the study of composition as I. There is scarcely a famous master in music whose works I have not frequently and diligently studied. — Mozart
Breadth Vs. Depth
As we progress in our journey towards the pursuit of excellence in programming, a question that will no doubt pester people’s minds is whether one should go for breadth of knowledge, or depth. There are countless programming languages, paradigms, methodologies, technologies, et cetera.
The truth of the matter is that if we are to become GrandMaster programmers, we cannot ignore either of them. In practice, depth has a much stronger impact in the way we construct software. It is through deep understanding that we can see the whole through the parts. There is therefore value in specializing in just a few languages and technologies, and really mastering them in-depth.
Getting things done in software development requires a certain pragmatism and proficiency with the tools at hand. There is no escaping it. Depth is therefore necessary, but not sufficient.
I find that the web is particularly good at covering the breadth aspect of things. There are always new and interesting areas I can learn about and experiment with. I don’t need a whole 600 page book to get a feeling for — or to better understand — certain technologies that are not crucial to my area of specialization.
While there are quite a few exceptions I could mention, I’d say that I tend to use the Internet as an aid for horizontal scaling of my knowledge, and books for vertical scaling. And again, more often than not, the depth level is mostly determined by the amount of time, practice and effort I put into it, rather than the media I’m using.
Where to find time
One of the objections I hear often, particularly when it comes to reading books is, “I don’t have time!”. In a few extreme cases, that may actually be true, but I think that most people severely underestimate how much time is “wasted” on a daily basis (even just surfing online).
Breaks are extremely important, and I’m not advocating any regime of incessant study. I simply know how crucial it is to be constant. It’s a marathon, not a sprint.
My second rule is: I refuse to got sleep if I haven’t read at least a chapter of a book that day. Very often I get caught up in the book I’m reading or working through, and end up getting through much more than just one chapter (it really depends on the book, of course). But for me the rule is clear: no sleep allowed until one chapter has been read every day.
Try this approach and you’ll see that reading doesn’t have to take up much time, yet doing so you can still read several books (including technical ones) every month.
We are what we repeatedly do. Excellence, then, is not an act but a habit. — Aristotle
For certain books, it is convenient to have the book in PDF format on your computer, as you switch from the book to the editor/console and back. However, generally speaking the computer tends to be quite distracting and thoughts like “I’ll just check my email quickly” can easily lead to hours spent doing something else.
For this reason I prefer to read from a paper book which is also easier on the eyes for extensive reading after a day in front of a computer screen. Even if I’m using the computer at the same time to input code, the physical presence of a book next to my laptop is enough to remind me that I shouldn’t get distracted online. (For me the depth and focus required by books is also an antidote to the re-wiring that the web tends to do to our brains. [5])
By the way, a few days ago Amazon announced a gorgeous, brand new graphite color Kindle DX [6]. I think I may pull the trigger and get it for my upcoming 30th birthday. Buying numerous paper books is expensive, and given the price of Kindle books, this move would end up being cheaper in the long run. The large e-ink display almost looks like paper and the device is not as distracting as an iPad (you read on a Kindle, and that’s it). Plus, it’s lighter than most technical books and surely takes up less space in your home.
Achieving focus
With so much going on within the programming world, distractions are easy to come by. My approach is to focus only on the given macro-task at hand. If I’m trying to learn about process calculi for example, then for the next few months my “learning time” will be ruthlessly dedicated to that subject, as if the rest of the programming ecosystem stopped in time.
Then there is focusing at a micro-task level. Learning about a given subject can always be divided into a long series of smaller steps. When I’m focusing on one such, tiny step, then everything else ceases to exist (or at least in theory).
One trick I use to achieve solemn focus on micro-tasks, whether reading code, writing code, or reading a technical book, is the use of the Pomodoro Technique [7]. In short, I use timer software [8] which alerts me when 25 minutes (a pomodoro) have passed, and gives me a 5 minute break for each pomodoro. Every 4 pomodoros, I can take a longer break.
When I first started using this technique I thought it was mostly a gimmick and I had a hard time taking breaks. I just wanted to keep going and the “tracking” seemed silly. However, I must say that it strikes an elegant balance between the desire to focus for extensive periods of time and the importance of taking regular mini-breaks.
This approach has become routine for my mind now, so even if it is just a gimmick, it’s still a good way for me to get focused and “in the zone”.
Aiming for sprezzatura
The pursuit of excellence requires a huge drive from within, and a fundamental dissatisfaction with just being good at a given discipline. I believe this is true regardless of the profession at hand.
As I progress in my journey, I’m discovering how the key is to make the pursuit of excellence a habit. This goes against my nature of being an intellect sprinter, but I’m in for the long run and I’m really learning to enjoy the, marathon like, process.
My long-term goal is to program with sprezzatura [9], where the process is so internalized and part of my subconscious that it almost looks effortless (as if the act of programming was committed to muscle memory). It will be an overnight success, 15 years in the making.
Regardless of the improvement level achieved, I will always have the joy, privilege and need to continue to learn for the betterment of myself and my craft. When there is no set destination, the journey is what really matters.
Ancora imparo. (I’m still learning.) — Michelangelo
Notes
[1] These concepts are explained, in a much more eloquent manner, in the early chapters of the excellent book, The Art of Learning by Josh Waitzkin.
[2] Dunning-Kruger Effect on Wikipedia.
[3] Illusory Superiority on Wikipedia.
[4] For more on the importance of doubt in science, check out the beautiful epilogue in What do you care what other people think? by Richard P. Feynman.
[5] For more on this phenomenon, read The Shallows by Nicholas Carr.
[6] The new Kindle DX on Amazon.
[8] Pomodoro for Mac OS X.
[9] Sprezzatura on Wikipedia.
Translations
Carlos Marcelo Cabrera translated this article into Spanish: La búsqueda de la excelencia en la programación.
This post contains the results of a Ruby shootout on Windows that I recently conducted. You can find the Mac edition, published last month, here. I was planning to have this one ready much sooner, but a couple of serious events in personal life prevented that from happening. Be sure to grab my feed or join the newsletter to avoid missing the upcoming Linux shootout.
The setup
For this shootout I included a subset of the Ruby Benchmark Suite. I opted to primarily exclude tests that were executed in fractions of a second in most VMs, focusing instead of more substantial benchmarks (several of which come from the Computer Language Benchmarks Game). The best times out of five runs are reported here for each benchmark.
All tests were run on Windows 7 x64, on an Intel Core 2 Quad Q6600 2.40 GHz, 8 GB DDR2 RAM, with two 500 GB 7200 rpm disks.
The implementations tested were:
JRuby was run with the --fast and --server optimization flags.
Disclaimer
Synthetic benchmarks cannot predict how fast your programs will be when dealing with a particular implementation. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. The values reported here should be assumed to be characteristic of server-side – and long running – processes and should be taken with a grain of salt.
The results
Please find below the execution times for the selected tests. Timeouts indicate that the execution of a single iteration for a given test took more than 60 seconds and had to be interrupted. Bold values indicate the best performance for each test.
Conclusions
Despite a couple of errors and a few timeouts, JRuby was the fastest of the lot, which can be seen as impressive if we consider that this is Windows we are talking about after all.
Ruby 1.9.1 and 1.9.2 were almost as fast as JRuby on these tests. With a few exceptions, the performances of the two 1.9 implementations were, expectedly, very similar.
JRuby, 1.9.1 and 1.9.2 were all faster than the current MRI implementation, which can be seen as a prerequisite as we move, as a community, away from Ruby 1.8. Finally, it’s worth noting that IronRuby’s performance was however in line with that of Ruby 1.8.7.
Update (July 3, 2010): The following box plot compares the various implementations for the tests for which all the implementations were successful. Only times for the largest successful input number were used in those tests where multiple input numbers were tested.

Several years ago I knew a programmer, we’ll call him Joe, who fancied himself to be a great developer. He was a senior developer at “Big Co.”, who received a large enough pay check to just as easily compensate a few junior developers.
The guy had Microsoft certifications, as expected of one in his position, and he appeared to know Visual Studio inside and out, just as you’d imagine.
What was surprising for me at the time, was that as I got to know Joe better, I slowly started to realize that the guy was absolutely clueless about programming beyond the scope of the Microsoft bubble.
He had never used Linux in his life, nor was he interested in learning about it. He wasn’t aware of tools like CVS or SVN. He didn’t seem familiar with the ideas behind unit testing or Agile methodologies; words like “refactoring” were outside of Joe’s vocabulary.
How about other programming languages or paradigms, like functional programming? Nope, nothing there either. He knew Visual Basic 6 and VB.NET, C# (the first release at the time) and a bit of C++. That’s it. And he was proud of it. As I enquired further in an attempt to figure out what he was all about, he didn’t mind admitting to a complete ignorance of what wasn’t printed on Microsoft paper.
What’s interesting is that, despite his cluelessness – due to a strict adherence to the Microsoft view of the programming world – and perhaps a lack of intellectual curiosity, the guy did manage to be somewhat adequate at his job. Not spectacular by any stretch of the imagination, but good enough to keep his well paid job.
Joe simply didn’t care about tools and techniques that fell outside of the narrow, yet mainstream, beaten Microsoft path.
Over the years I’ve met countless Joes. In fact, if I were to generalize and characterize many Microsoft developers in the early 2000s, Joe would come to mind like an unpleasant stereotype.
I tell this story not to criticize Joe or to claim that Microsoft developers suck. On the contrary, my argument is that Microsoft will be a key factor in terms of enabling functional programming to become mainstream. In fact, what Microsoft is doing is introducing .NET developers to functional programming, one piece at a time.
Just a few days ago I was reading the blog of a guy who seemed happy to discover a cool “new” function called Zip in LINQ (hey Haskell programmer over there, stop laughing, you are disrupting my article). These days following blogs by Microsofties is in fact like witnessing some sort of Renaissance, with plenty of talk about exciting features that are clearly borrowed from the functional programming community.
The evolution of C# and Visual Basic, LINQ, and more recently the inclusion of F# as a fully supported language within the .NET Framework 4.0, all indicate Microsoft’s new outlook towards functional programming. F# in particular is essentially OCaml for .NET, and has been received with open arms by the Microsoft community (as far as I can tell).
There are very few books by mainstream publishers on OCaml. On the other hand, F# already has a rich ecosystem of printed books that have been published by O’Reilly, Apress, Wrox, Manning, and likely in the near future Microsoft itself. On top of that there are titles devoted to LINQ and countless books on the recent, more functional oriented, C# and VB.
I’m not measuring the popularity of programming languages by the availability of books alone, but it’s clear to me that there are millions of Joes out there who are ready to learn these new concepts that are now being put out and promoted by Microsoft. It doesn’t matter that they’re not new or that they’re borrowed from other programming languages like Ruby, Python, LISP, ML, Haskell, etcetera.
The end result of Microsoft’s new approach is that now Joes everywhere are getting exposed to functional programming (masses of people who would otherwise be virtually shielded from the rest of the programming world).
Microsoft may no longer be the influential powerhouse it once was, but I think it is fair to acknowledge the impact it’s currently having on making functional programming, or at least some degree of it, more mainstream.
Update: Initially I used the name “Dick” (short for Richard) to identify the programmer I talked about in this story. It was meant to be a humorous double entendre, but turned out to somewhat distract certain readers from the key points that this article was actually about. As well it seems that some people felt I was coming across as being strongly against Microsoft developers (which I’m not in the least). As a result, some readers ended up being offended by this post, which was never my intention. I don’t love political correctness, but I feel that replacing the name “Dick” with “Joe” instead ends up detracting less from the heart of the article.
From the Padrino’s site:
Padrino is a ruby framework built upon the excellent Sinatra Microframework. Sinatra is a DSL for creating simple web applications in Ruby with speed and minimal effort. This framework makes it as fun and easy as possible to code increasingly advanced web applications by expanding upon Sinatra while maintaining the spirit that made it great.
The Ruby community has plenty of web frameworks at this point. Padrino — self-described as “The Elegant Ruby Web Framework” — is interesting because it’s built on top of Sinatra, it’s highly modular, quite fast, and provides a drop-in admin interface. It fits between Sinatra and a large framework like Rails.
If it wasn’t for the fact that Rails 3 is about to be released, Padrino may have had a fighting chance at acquiring a good market share within the Ruby community. Rails 3 is here though, and it too is very modular and fast. Plus, it’s hard to beat the huge ecosystem that’s already built around it.
That said, the presence of an admin interface, a la Django, and the Sinatra core are definitely inviting features. Check out their documentation and screencast, to see if you think it’s worth considering for your own web development needs.
Recently MacRuby 0.6 was released. The development team put a lot of emphasis on improving compatibility with Ruby 1.9, and the viability of MacRuby as a tool for developing Mac OS X applications. Focus on these aspects took precedence over performance, but I was still curious to see how well it performed when compared to Ruby 1.8.7 and Ruby 1.9, respectively.
This article showcases the results of a small Ruby shootout for Mac. I plan to publish a Windows one by next week, and then a week or two after that, a complete Linux shootout that will have many more implementations. Grab my feed or join the newsletter to avoid missing upcoming shootout posts.
The setup
The tests are a large subset of the Ruby Benchmark Suite. Each test was run 10 times, five to detect the best execution time, and five to detect the minimal memory consumption. All of the tests were run on Mac OS X 10.6.3, on my MacBook Pro 2.66 GHz Intel Core 2 Duo, 4 GB 1067 MHz DDR3 RAM, 320 GB 7200 rpm disk.
Stable implementations tested:
Disclaimer
Synthetic benchmarks cannot predict how fast your programs will be with one implementation or another. They provide an (entertaining) educated guess, but you shouldn’t draw overly definitive conclusions from them. Furthermore, the Ruby Benchmark Suite has many tests that don’t provide much insight when it comes to comparing implementations. They are there for legacy reasons and will probably be removed in the future. For the time being, take them with a grain of salt.
The results
Without further hesitation, here are the execution times for the tests (divided in A-L, M-Z). Timeouts indicate that the execution of a single run took more than 60 seconds and had to be interrupted. Bold values indicate the best performance for each test.


And here is the estimated memory usage:


Conclusions
MacRuby 0.6 is faster than Ruby 1.9.1 at times, but it can also be significantly slower. Overall, as things stand now, its performance appears to be between that of Ruby 1.9.1 and Ruby 1.8.7, with several outliers and a greater variance compared to those two implementations. Memory wise, MacRuby appears to be significantly more “memory hungry” than the other implementations (even though this wasn’t all that much of a surprise to me).
I’m interested in seeing how future releases that will be focused more on performance will affect these preliminary results. For the time being however, don’t let this outcome discourage you from using MacRuby 0.6, which is the first release that’s considered stable for Mac OS X development.
Download: CSV Files
PS: If you are looking for a fun and easy way to get started with MacRuby, check out ThinkCode.TV’s screencast on the subject.
Update (July 3, 2010): The following box plot compares the various implementations for the tests for which all the implementations were successful. Only times for the largest successful input number were used in those tests where multiple input numbers were tested.

The latest release of the IBM Adapter for Django now supports Django 1.2. Aside from enabling you to use the most recent version of Django, this release adds a few new goodies into the mix, that I’m sure many will appreciate.
For example, IBM’s adapter (through the underlying DBI wrapper) now uses persistent connections, which are especially helpful when dealing with Django – as it lacks connection pooling. (Of course DB2 also has the Connection Concentrator to aid in reducing the usage of server resources and improving scalability.)
Furthermore, the adapter adds support for the DECIMAL datatype, a necessary feature when dealing with money and currencies. Various enhancements and bug fixes were included too; check them out on Google Groups.
As a reminder, DB2 Express-C is an absolutely free of charge version of DB2 and it’s production ready (not a toy version). You can download it from here. Take it for a spin, experiment – chances are you’ll like it. If you need a guide to getting started, be sure to check out this free e-book by my colleagues Raul, Ian, and Rav.
ThinkCode.TV’s English site is going to be launched on April 19th. To celebrate the upcoming launch and whet your appetite, a 19 minute long screencast about solving ASCII mazes with a few lines of Python code was just released for free. This video serves to illustrate Python’s elegance and power, as well as ThinkCode.TV’s approach to screencasts and education.
In order to download the screencast, you don’t need a credit card, to provide your address or even your last name. Just head on over to this page and join the newsletter. Upon confirming your subscription, you will immediately receive an email with links to DRM-free, 720p HD files in the formats QuickTime Movie (.mov), AVI and Ogg Theora (.ogv). These videos are in English, prepared by a published serial author on the subject of Python/Django, and narrated by a native English speaker. They also include optional subtitles in the .srt format, as well as the source code which is released under the MIT license, as is customary for ThinkCode.TV to do.
I hope you enjoy the free screencast and stay tuned for the launch in three weeks.
The following is a very short guide on setting up Ruby Enterprise Edition (REE), nginx and Passenger, for serving Ruby on Rails applications on Ubuntu. It also includes a few quick and easy optimization tips.
We start with setting up REE (x64), using the .deb file provided by Phusion:
wget http://rubyforge.org/frs/download.php/66163/ruby-enterprise_1.8.7-2009.10_amd64.deb
sudo dpkg -i ruby-enterprise_1.8.7-2009.10_amd64.deb
ruby -v
In output you should see “ruby 1.8.7 (2009-06-12 patchlevel 174)…” or similar. If this is the case, good; while you are there, update RubyGems and the installed gems:
sudo gem update --system
sudo gem update
Next, you’ll need to install nginx, which is a really fast web server. The Phusion team has made it very easy to install, but if you simply follow most instructions found elsewhere, you’ll get the following error:
checking for system md library ... not found checking for system md5 library ... not found checking for OpenSSL md5 crypto library ... not found ./configure: error: the HTTP cache module requires md5 functions from OpenSSL library. You can either disable the module by using --without-http-cache option, or install the OpenSSL library in the system, or build the OpenSSL library statically from the source with nginx by using --with-http_ssl_module --with-openssl=options.
Instead, we are going to install libssl-dev first and then nginx and its Passenger module:
sudo aptitude install libssl-dev
sudo passenger-install-nginx-module
Follow the prompt and accept all the defaults (when prompted to chose between 1 and 2, pick 1).
Before I proceed with the configuration, I like to create an init script and have it boot at startup (the script itself is adapted from one provided by the excellent articles at slicehost.com):
sudo vim /etc/init.d/nginx
The content of which needs to be:
#! /bin/sh
### BEGIN INIT INFO
# Provides: nginx
# Required-Start: $all
# Required-Stop: $all
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: starts the nginx web server
# Description: starts nginx using start-stop-daemon
### END INIT INFO
PATH=/opt/nginx/sbin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/opt/nginx/sbin/nginx
NAME=nginx
DESC=nginx
test -x $DAEMON || exit 0
# Include nginx defaults if available
if [ -f /etc/default/nginx ] ; then
. /etc/default/nginx
fi
set -e
. /lib/lsb/init-functions
case "$1" in
start)
echo -n "Starting $DESC: "
start-stop-daemon --start --quiet --pidfile /opt/nginx/logs/$NAME.pid \
--exec $DAEMON -- $DAEMON_OPTS || true
echo "$NAME."
;;
stop)
echo -n "Stopping $DESC: "
start-stop-daemon --stop --quiet --pidfile /opt/nginx/logs/$NAME.pid \
--exec $DAEMON || true
echo "$NAME."
;;
restart|force-reload)
echo -n "Restarting $DESC: "
start-stop-daemon --stop --quiet --pidfile \
/opt/nginx/logs/$NAME.pid --exec $DAEMON || true
sleep 1
start-stop-daemon --start --quiet --pidfile \
/opt/nginx/logs/$NAME.pid --exec $DAEMON -- $DAEMON_OPTS || true
echo "$NAME."
;;
reload)
echo -n "Reloading $DESC configuration: "
start-stop-daemon --stop --signal HUP --quiet --pidfile /opt/nginx/logs/$NAME.pid \
--exec $DAEMON || true
echo "$NAME."
;;
status)
status_of_proc -p /opt/nginx/logs/$NAME.pid "$DAEMON" nginx && exit 0 || exit $?
;;
*)
N=/etc/init.d/$NAME
echo "Usage: $N {start|stop|restart|reload|force-reload|status}" >&2
exit 1
;;
esac
exit 0
Change its permission and have it startup at boot:
sudo chmod +x /etc/init.d/nginx
sudo /usr/sbin/update-rc.d -f nginx defaults
From now on, you’ll be able to start, stop and restart nginx with it. Start the server as follows:
sudo /etc/init.d/nginx start
Heading over to your server IP with your browser, you should see “Welcome to nginx!”. If you do, great, we can move on with the configuration of nginx for your Rails app.
Edit nginx’ configuration file:
sudo vim /opt/nginx/conf/nginx.conf
Adding a server section within the http section, as follows:
server {
listen 80;
server_name example.com;
root /somewhere/my_rails_app/public;
passenger_enabled on;
rails_spawn_method smart;
}
The server name can also be a subdomain if you wish (e.g., blog.example.com). It’s important that you point the root to your Rails’ app public directory.
The rails_spawn_method directive is very efficient, allowing Passenger to consume less memory per process and speed up the spawning process, whenever your Rails application is not affected by its limitations (for a discussion about this you can read the proper section in the official guide).
If you have lots of RAM (e.g., more than 512 MB) on your server, you may want to consider increasing you maximum pool size, with the directive passenger_max_pool_size from its default size of 6. Conversely, if you want to limit the number of processes running at any time and consume less memory on a small VPS (e.g., 128 to 256MB), you can decrease that number down to 2 (or something in that range). (Always test a bunch of configurations to find one that works for you). You can read more about this directive, in the official guide.
While you are modifying nginx’ configuration, you may also want to increase the worker processes (e.g., to 4, on a typical VPS) and add a few more tweaks (such as enabling gzip compression):
# ...
http {
passenger_root /usr/local/lib/ruby/gems/1.8/gems/passenger-2.2.5;
passenger_ruby /usr/local/bin/ruby;
include mime.types;
default_type application/octet-stream;
access_log logs/access.log;
sendfile on;
keepalive_timeout 65;
tcp_nodelay on;
gzip on;
gzip_comp_level 2;
gzip_proxied any;
server {
#...
When you are happy with the changes, save the file, and restart nginx:
sudo /etc/init.d/nginx restart
If you wish to restart Passenger in the future, without having to restart the whole web server, you can simply run the following command:
touch /somewhere/my_rails_app/tmp/restart.txt
Passenger also provides a few handy monitoring tools. Check them out:
sudo passenger-status
sudo passenger-memory-stats
That’s it, you are ready to go! I hope that you find these few notes useful.
There is major news in Rubyland today. MacRuby’s team just released their fist beta of version 0.5 (an experimental, still incomplete version of Ruby), which brings JIT, removal of the dreaded GIL (Global Interpreter Lock), native threads, GCD (Grand Central Dispatch) for multicore computing, and a whole new set of features found in the release announcement to the table.
The most important new feature is the presence of a compiler. That’s right, thanks to this release, Ruby code can now become highly optimized executable code. How awesome is that? I can sense that you’re pumped by this news, so why not head over to MacRuby.com and download the installation file for yourself? After you’ve done that, the next thing you’re going to want to do is run a small test like the following:
$ macrubyc world_domination.rb -o world_domination
Can't locate program `llc'
Oh noes! llc is a tool that ships with the LLVM (upon which MacRuby is built), however it’s not included with MacRuby’s installer (it will be in the future). But fear not my friends, there is a solution:
$ svn co -r 82747 https://llvm.org/svn/llvm-project/llvm/trunk llvm-trunk
$ cd llvm-trunk
$ ./configure
$ UNIVERSAL=1 UNIVERSAL_ARCH="i386 x86_64" ENABLE_OPTIMIZED=1 make -j2
$ sudo env UNIVERSAL=1 UNIVERSAL_ARCH="i386 x86_64" ENABLE_OPTIMIZED=1 make install
If your machine does not have 2 cores, remove the -j2 option from the fourth line or adjust the number accordingly.
The compilation phase may take a couple of centuries, depending on your machine’s speed, but it should eventually build the LLVM.
llc will be placed in your PATH, and you’ll finally be able to compile Ruby code and obtain an executable to help you carry out your world domination plans.
$ macrubyc world_domination.rb -o world_domination
$ ./world_domination
MUAHAHAHAHA!
FriendFeed, which was recently acquired by Facebook, just released an interesting piece of open source software.
Tornado is an open source version of the scalable, non-blocking web server and tools that power FriendFeed. The FriendFeed application is written using a web framework that looks a bit like web.py or Google’s webapp, but with additional tools and optimizations to take advantage of the underlying non-blocking infrastructure.
The story so far
This release generated widespread interest among the Python and open source development communities. Rightfully so. There are many reasons to like Tornado. To begin with, it’s fast — and that’s fundamental for a web server. By using nginx as a load balancer and a static file server, and running a few Tornado instances (usually one per core available on the machine) it’s possible to handle thousands upon thousands of concurrent connections on relatively modest hardware; and this isn’t just theory. Tornado has already proven its worth in the field, by allowing FriendFeed to scale graciously.
Tornado is not only a fast web server, it acts as a very lightweight application framework as well. As such, it’s an appealing alternative to well established frameworks to the growing group of developers who’d like to develop “closer to the metal” and avoid the baggage associated with full-fledged web frameworks. The two things combined make Tornado ideal for developing “real time” web services and applications.
The feedback so far hasn’t been all positive though. Criticism of the project has mainly focused on the lack of test coverage and the fact that FriendFeed has opted not to contribute to, and improve on, the existing Twisted Web project (which has similar goals). To make things worse, there were a few nonchalant comments about it as well. Performance issues and lack of ease of use were the reported motivations for starting a new project from scratch.
Dustin Sallings started working on a hybrid solution (henceforth Tornado on Twisted) that would reportedly keep the good parts that Tornado introduced, while using Twisted as its core for networking and HTTP parsing.
At this point I became naturally curious about the speed of these three web servers. Is Tornado really faster than Twisted Web? And what about Tornado on Twisted, would it be faster or slower? Let’s find out.
Benchmark results
I ran a simple Hello World app for all three web servers. All the web servers were run in standalone mode without a load balancer. I stress tested the web servers with httperf using a progressively larger amount of concurrent requests. 100,000 requests were generated for each test. The web servers were run on a desktop machine with an Intel® Core™2 Quad Processor Q6600 (8M Cache, 2.40 GHz, 1066 MHz FSB) processor and 8GB of RAM. The operating system of choice was Ubuntu 9.04 (x86_64).
Without further ado, here are the results:

As you can see Tornado turned out to be faster than the rest of the Python web servers. Handling a peak of almost 3900 req/s with a single front-end and on commodity hardware is nothing to sneer at.
Twisted Web didn’t do too bad either (max. 2703.7 req/s), but the difference in performance is noticeable. Likewise, the performance of Tornado on Twisted was virtually identical to that of Twisted Web.
There you have it. I was curious about the possible outcome and now I know. Remember, this is a report on the numbers I got on my machine, not a research paper. But I hope that you find them interesting nevertheless.
Show me the code
Tornado:
import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
import logging
from tornado.options import define, options
define("port", default=8888, help="run on the given port", type=int)
class MainHandler(tornado.web.RequestHandler):
def get(self):
self.write("Hello, world!")
def main():
tornado.options.parse_command_line()
application = tornado.web.Application([
(r"/", MainHandler),
])
http_server = tornado.httpserver.HTTPServer(application)
http_server.listen(options.port)
tornado.ioloop.IOLoop.instance().start()
if __name__ == "__main__":
main()
Twisted Web:
from twisted.internet import epollreactor
epollreactor.install()
from twisted.internet import reactor
from twisted.web import server, resource
class Simple(resource.Resource):
isLeaf = True
def render_GET(self, request):
return "Hello, world!"
site = server.Site(Simple())
reactor.listenTCP(8888, site)
reactor.run()
Tornado on Twisted:
from twisted.internet import epollreactor
epollreactor.install()
from twisted.internet import reactor
import tornado.options
import tornado.twister
import tornado.web
import logging
from tornado.options import define, options
define("port", default=8888, help="run on the given port", type=int)
class MainHandler(tornado.web.RequestHandler):
def get(self):
self.write("Hello, world!")
def main():
tornado.options.parse_command_line()
application = tornado.web.Application([
(r"/", MainHandler),
])
site = tornado.twister.TornadoSite(application)
reactor.listenTCP(options.port, site)
reactor.run()
if __name__ == "__main__":
main()
UPDATE (September 14, 2009):
An easy way to improve the performance and security of SQL queries is to replace literals with parameters. By replacing literal values with parameters, advanced relational databases will be able to compile your queries and have their execution plans cached. This saves time and precious resources when the same query (minus the actual values) is executed over and over.
Consider the following series of queries:
SELECT * FROM users WHERE karma BETWEEN 100 AND 499;
SELECT * FROM users WHERE karma BETWEEN 500 AND 999;
SELECT * FROM users WHERE karma BETWEEN 1000 AND 1999;
SELECT * FROM users WHERE karma BETWEEN 2000 AND 4999;
SELECT * FROM users WHERE karma BETWEEN 5000 AND 9999;
SELECT * FROM users WHERE karma BETWEEN 10000 AND 50000;
These each represent the same query and can be transformed into a single parameterized query:
SELECT * FROM users WHERE karma BETWEEN ? AND ?;
Trying to use clever tricks with quotes in order to inject arbitrary SQL code becomes futile. Parameters are considered values, and have no effect on the structure of the query itself.
Parameterized queries are therefore efficient and go a long way towards preventing SQL injection attacks in your applications. They have virtually no downside.
Newbie developers often ignore the existence of this feature and end up irritating seasoned DBAs who have to deal with the consequences of their incompetence. Leon Katsnelson argues that this is such an important matter, that every DBA should forward this Computerworld article to their developers. I tend to agree with how important of an issue that is.
That article provides the following example in Java:
String lastName = req.getParameter("lastName");
String query = "select * from customers where last_name = ?"
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1, lastName);
try { ResultSet results = pstmt.execute(); }
Here I’ll show you an example of how to work with parameterized queries from Ruby and Python. I’ll use the Ruby and Python drivers for DB2.
Ruby first:
require 'ibm_db'
conn = IBM_DB.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = IBM_DB.prepare(conn, query)
values = [500, 999]
IBM_DB.execute(pstmt, values)
while row = IBM_DB.fetch_array(pstmt)
puts "#{row[0]}:#{row[1]}"
end
We load the driver (use mswin32/ibm_db on Windows, and ibm_db.bundle on Mac), create a prepared statement, and then bind the two parameter values to it through the execute method. We then fetch the resultset one row at a time and print the value of the first two fields for each record. For fine-tuned control we could have used the IBM_DB::bind_param method.
The Python version is very similar:
import ibm_db
conn = ibm_db.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = ibm_db.prepare(conn, query)
values = (500, 999)
ibm_db.execute(pstmt, values)
tuple = ibm_db.fetch_tuple(pstmt)
while tuple:
print tuple[0] + ":" + tuple[1]
tuple = ibm_db.fetch_tuple(pstmt)
As you can see, working with parameterized queries is not any harder than dynamically generating SQL queries. Yet the benefits of doing so are huge.
Unfortunately, despite being a very sound choice to base an Object-Relational Mapper (ORM) on, ActiveRecord does not use parameterized queries. Even when it looks like you are passing parameters to a given method, these are actually used to dynamically form an SQL query. Of course you are still free to use parameterized queries in your Rails applications by employing the driver directly. But I really think this is something ActiveRecord should be built upon.
Luckily for Django developers, Django’s ORM uses parameterized queries, thus improving both performance and security with a single design choice. In the Python world you couldn’t get away with ignoring parameterized queries.
For those of you using Rails, all is not lost. DB2 Express-C 9.7 has a killer feature known as the Statement Concentrator, which caches similar queries allowing them to use a shared access plan. It’s not as efficient as using prepared statements in your code, but it’s the best you can do when, as in the case of ActiveRecord, you can’t use parameterized queries directly. Leon’s article explains in greater detail how this feature actually works.
This is the Python version of a post I made about Ruby a few days ago.
Now that Mac OS X 10.6 is out, it’s time to leave the world of 32 bit computing behind. The pre-installed Python interpreter will run in 64 bit mode by default, so you may need to pay attention when installing some C-based eggs.
Assuming you have DB2 Express-C installed already, the ibm_db Python egg for DB2 can easily be installed by following these simple steps:
$ sudo -s
$ export IBM_DB_LIB=/Users/<username>/sqllib/lib64
$ export IBM_DB_DIR=/Users/<username>/sqllib
$ export ARCHFLAGS="-arch x86_64"
$ easy_install ibm_db
This will install the ibm_db C driver, and the ibm_db_dbi Python module that complies to the DB-API 2.0 specification.
You can verify that the installation was successful my running the following:
$ python
>>> import ibm_db
>>>
Now, for the Django adapter, install Django first (if you haven’t done so already):
$ sudo easy_install django
The Django adapter can then be installed as follows:
$ sudo easy_install ibm_db_django
Finally, if have installed SQLAlchemy and wish to install the DB2 adapter for it, run:
$ sudo easy_install ibm_db_sa
Please let me know if you encounter any issues, I’d be glad to help you.
Now that Mac OS X 10.6 is out, it’s time to leave the world of 32 bit computing behind. The pre-installed Ruby interpreter will run in 64 bit mode by default, so you may need to pay attention when installing some C-based gems. The ibm_db Ruby gem for DB2 can easily be installed or updated to the latest available version by following these simple steps:
$ sudo -s
$ export IBM_DB_LIB=/Users/<username>/sqllib/lib64
$ export IBM_DB_INCLUDE=/Users/<username>/sqllib/include
$ export ARCHFLAGS="-arch x86_64"
$ gem install ibm_db
You can verify that the installation was successful my running the following:
$ irb
>> require 'ibm_db.bundle'
=> true
Please let me know if you encounter any issues, I’d be glad to help you.
One of the best programmers I know is selling a web application on eBay, that he’s been developing and running for the past three years. Given the starting price and considering what one lucky person or company will walk away with, I must say, it’s an amazing deal. I’m writing about his auction here so that I can help it get the proper exposure it deserves and because I think it’s an incredible bargain for anyone who is interested!

BlogBabel, the aforementioned site/web app, is a blog indexing and aggregation service that began in 2006. Amongst its features are the ability to detect and show the most popular blog discussions, weekly posts, books, videos, and even popular blog entries based on their location (through geotagging). It also features leaderboards of the most popular blogs.
Its codebase uses Python and Django, and consists of 27,359 physical lines of code (roughly equivalent to 6.46 person-years, according to sloccount). The R&D alone makes this application worthwhile to an interested party.
At this stage, BlogBabel has an Italian interface (located at it.blogbabel.com) and aggregates almost 15,000 Italian blogs and 5 million posts. Changing the interface to make it an international project that’s available in several languages, or switching to English (solely), would not be challenging in the least (they used to run a Spanish version as well, for example, but decided to discontinue it so as to focus on the Italian one).
BlogBabel has been featured in the mainstream Italian media and has had a noticeable influence on the Italian blogosphere. One could argue that it has been the yellow pages of the Italian blogosphere. Because of this, Ludovico Magnocavallo (the site’s creator) received substantial offers to buy BlogBabel in the past, but he turned them down because he wanted to continue building this site. Now however, due to personal circumstances and lack of time/resources, he’s willing to sell this application for what may amount to far less than its true value. And here’s the real bargain, the starting price, without a reserve, is 4,999 Euros. This is of course, a ridiculously low price for the value being offered. But Ludovico believes in letting the market decide.
If I had the funds lying around, I would buy it myself and gear it towards the English speaking world (in conjunction with the pre-existing Italian version). It’s a prepackaged, virtually ready-made startup with a great deal of potential both in its current state and in terms of what it could grow to become.
To recap, the auction includes:
BlogBabel has been running smoothly for three years, and is currently under-marketed. Optimizing ads, affiliates, and similar sources of revenue wouldn’t be hard at all, especially if one were to aim this site at the English speaking world.
Also, Ludovico has already implemented most of the code that’s necessary to allow users to have accounts (through OpenID), but since these “social features” are not fully implemented yet, they have not been deployed in production. A buyer could decide to disregard them or finish implementing them and roll out a technorati-like service. The winner of this auction could decide to implement support for Twitter, comments on social networks, sentiment analysis, etc, on their own. The possibilities are really limitless when you start with a solid engine and crawler, and already have a great deal of data at your fingertips.
I know Ludovico and he’s a stand-up guy. If you are interested in this great deal, you can bid here. If you have technical questions about this auction, please feel free to contact him directly through eBay.
UPDATE (September 8, 2009): Ludovico received an undisclosed offer for the site and a few years of maintenance work, so the auction for the site alone was suspended.
I finally got around to updating the Ruby and Rails book pages. The existing list was getting a bit obsolete and I didn’t like the idea of recommending old books to newcomers. I also had some interesting new entries.
Without further ado:
A few people may disagree with the choices, but I think most experienced Ruby and Rails programmers, who’ve read those books, will concur with my recommendations. I’m quite confident that these are, all things considered, some of the best books available on the subject.
A word to the publishers
As tempting as it is to collect Ruby and Rails books, these days I don’t feel I can economically justify the act of purchasing every Ruby or Rails book put out there. So if you are a publisher or an author, and you’d like for me to consider your book, you are certainly welcome to send me a review copy. I will definitely read it, but only include it on these lists if it’s either outstanding or as good as the existing ones. If it’s a programming book that’s not related to Ruby/Rails, yet is really good, I would consider reviewing it on my blog.
In a previous article I compared the performance of Ruby on Windows, built through Microsoft Visual C++ and GCC. The numbers for the MinGW version were very impressive. So the question now becomes, how does its performance compare to that of Ruby on Linux? To quote one person (Alex) who commented on the aforementioned post:
With the new mingw32 substantial speed improvements, think it makes sense now to also test at least the baseline (MRI) on Mac/Linux on the same battery of tests, so we Windows folks could get a better idea of how far behind are we yet and what the different Windows interpreters speed target shall be.
Any sort of performance improvement for something that is notoriously slow on Windows is more than welcome, but would this be enough to fill the gap between Ruby’s performance on Windows and on Linux? How much faster is Ruby on Linux? Let’s find out.
Setup
Benchmark results
The following table/image compares the performance of Ruby 1.8.6 on Windows and Linux. A light green background indicates which of the two was faster. The total times exclude tests that raised an error or were not available (N/A) for any of the four implementations, but includes timeouts (they are counted as 300 seconds to provide a lower bound). The ratio column indicates how many times faster Ruby on Linux was:

The second table/image below compares Ruby 1.9.1 on Windows and on Linux, using the same criteria as above.

Note: The totals shown are different from the ones seen in other posts since the subset of benchmarks included in the totals is different.
Conclusion
According to the geometric mean of the ratios for these tests, it appears that on average Ruby 1.8.6 on Linux is about twice as fast as Ruby 1.8.6 on Windows. Conversely, Ruby 1.9.1 on Linux is about 70% faster than the Windows version.
The Windows implementations use GCC 3.4.5 (a four year old compiler) at the moment, while I built the implementations on Ubuntu with GCC 4.3.3 (which is available by default). This helps, at least in part, to justify the performance gap. Luis Lavena, leader of the Windows port, confirmed to me that a switch to GCC 4.4.x is planned for the future. This should significantly increase performance on Windows yet again, and bump Ruby’s speed on Windows a bit closer to the speed that’s obtainable on Linux.
For the time being, switching to Ruby 1.9.1 on Windows will give you a performance that is better than what’s obtained by those who are still using Ruby 1.8.x on Linux. If it’s possible, switch.