As I write a series of thoughts on the pursuit of excellence in programming, I must preface my essay by asking you to ignore that I wrote these words. I invite you to evaluate the opinions and ideas presented here not ad hominem, but rather on the basis of their own merits. It would be easy to otherwise mistakenly dismiss them with the infamous question posed by Steve Jobs to a blogger: “What have you done that’s so great?”.
This is to say that I talk about the ambitious and noble goal of achieving excellence in programming, fully conscious of not having achieved said excellence. For the time being, I don’t feel like I can point my finger at something that would impress Steve Jobs (or less critical observers). I’m just a traveler on the journey of learning, with a desire to share his experiences and plans.
Two visions of intelligence
Mastering a complex discipline such as programming requires a great amount of learning over the course of several years, perhaps even decades. Maximizing one’s ability to learn is therefore an early investment that can quickly repay itself.
The biggest impact on my ability to learn was caused by a shift in the way I considered the matter of intelligence. There are mainly two ways to think about it. You can either consider intelligence to be a static, intrinsic ability or a more dynamic, cultivable characteristic of human beings.
Cognitive scientists and psychologists conclusively determined that people who perceived intelligence as a dynamic characteristic outperformed and were more successful than people who internalized intelligence as an intrinsic, static ability.
It’s worth noting that it’s not really important whether intelligence can actually be developed through application. It’s the perception of it that forges students’ approach to learning.
This difference in perception is often conditioned by early parenting. Kids who are encouraged to work hard to achieve results and are praised on the basis of their effort tend to develop a perception of intelligence and results as something they can work on. Other kids are conditioned to think that they are doing well because they are “smart” and that their intelligence alone will most likely lead them to success.
Society has a fascination with genius, and parents like to fancy their little ones to be several standard deviations better than the norm, but conditioning children this way has dangerous and counterproductive consequences.
Kids who are labeled and praised because of their “innate capabilities” will often suffer from an overconfidence that will affect their ability to challenge themselves through the depths of the unknown, because they feel it would threaten their status. What if they fail? It would mean, in their eyes, that perhaps they are not the smart person they have been assumed to be all along. We all have seen such kids failing here and there, and quickly making excuses such as, “Oh, I wasn’t trying at all”.
A parent who is cultivating a kid’s interest in hard work may be more likely to encourage their child with words such as, “It’s OK. Keep studying, and you’ll definitely do better next time”. A parent proposing a model of static intelligence may justify their child’s failure in a given subject by concluding that “maybe you are not cut out for subject X”. [1]
When facing failure, the “static intelligence” child may crumble under the weight of their own demise, as if failure were a reflection of their intrinsic value rather than a temporary speed bump and occasion for growth. A “dynamic intelligence” child will simply try harder next time. Genius or not, excellence and mastery of any subject require hard work, and many “smart” kids fall short when the bar is raised high enough that “smartness” alone won’t cut it anymore. This usually corresponds with the switch from high school to college.
I’m very familiar with all of this, because I was one of those kids. I was labeled by my parents and science teachers as a “genius”. Even psychologists at school, who came to help us figure out what careers we were better suited for, ended up telling me that I could pursue virtually any career (at the time I was interested in nuclear physics) and that according to their (virtually meaningless) IQ tests I would be classified as a “genius”.
Please note that the problem wasn’t so much the label. Most smart kids figure out that they are smart on their own rather quickly. The real problem was that I wasn’t taught the value of long-term intellectual effort. Effort itself was considered as being somewhat detrimental to my status. Not only was I supposed to succeed, but I supposedly had to do so without putting forth any effort (a “utopic” ambition).
One of the first examples of this that I can remember is when my father “caught me studying” the same book for a few hours before a test in middle school. He told me something along the lines of, “Why do you need to study? A genius like you should figure out the test without studying.” It’s absurd, I know, and probably one of the dumbest things my otherwise bright father has ever told me. As a young kid though, such a statement can have a strong impact on you.
Another example has to do with Latin. My teacher was a palm reading, crazy cat lady and I had no respect for her from the start. By not paying attention in class, on top of not studying Latin at home, I set myself up for failure. When the first translation assignment came around, I got a mildly negative score. My less than professional teacher told me, “Oh I thought you were good, but I guess you are not”. Afterwards my father added to this by saying something along the lines of, “Well, don’t worry, I guess languages are not your strength.”
Boom. That was enough for me to stop having any interest in Latin and completely ignore a subject at which I wasn’t excelling. Nobody could tell me, “You are stupid because you don’t understand Latin,” if I didn’t try at all. So I was the high school kid who did advanced Calculus stuff on his own for fun, while my classmates were struggling with Algebra, yet I pretty much sucked at Latin. (In retrospect, the thought patterns required to excel at Latin were not very different from those required to excel at math or to master English as a second language. I suspect that had I put in some effort I could have been very good at, and actually enjoyed, the subject.)
Over the years I had to readjust this perception entirely. By falling on my face more than once, I learned that excellence is only achieved through a combination of talent and effort. The real genius may lie in the ability to put in thousands of hours of focused study and practice, in the pursuit of whatever one person is trying to learn and understand.
If people knew how hard I worked to get my mastery, it wouldn’t seem so wonderful at all. — Michelangelo
By this new definition, I was a complete idiot who had to entirely learn from scratch how to appreciate the value of effort to hone and develop his “talent” (which is nothing but a seed on its own).
Learning to learn
Being very interested in programming, computer science, mathematics, and science in general, I decided at some point that I had to entirely change my attitude towards learning if I were to master any of those disciplines. Effort was now more important than intelligence on its own, and I would feel satisfied only when doing a really good job in the pursuit of something challenging, that couldn’t be achieved by sheer “talent”.
The saddest thing in life is wasted talent, and the choices that you make will shape your life forever. — Calogero ‘C’ Anello from A Bronx Tale
In the process, I started to internalize a few principles and I dealt with issues related to both the art of learning and the art of programming.
An awful feeling
There is a cognitive bias known as the Dunning-Kruger effect [2], in which subjects who are inexperienced or less competent within a given discipline tend to overestimate their abilities (in other words, they are affected by illusory superiority [3]).
The other side of this coin is that the more you study, the more you realize how little you know and how much there is to know (a concept put forward by Socrates back in the days of Ancient Greece). This is both a pleasure and a discomfort. There is a huge amount of satisfaction in finding things out. Yet, being in doubt and fully aware of how little you know tends to be an unpleasant side effect all learners have to live with. Doubt truly is the water that’s fundamental for the growth of the flower of intellectual curiosity. [4]
My approach in this case is to embrace and dominate my ignorance and fears. Whenever there is a concept that I feel particularly ignorant about or that is way over my head, I try to tackle it as if my life depended on it. There are still countless things I’m ignorant about, but this approach has really paid off for me over the years.
If you are trying to learn a whole branch of computer science or mathematics, it will take a long time, so you may want to start first with smaller “fears” that can be mastered, at least at an introductory level, in a short amount of time. Rather than thinking, “Oh yeah, I should really learn Git”, for months, act on the thought. The essential knowledge to work with Git or Hg doesn’t take months to learn (assuming you have a need for either of these particular tools).
But you are in this for the long run, so don’t be afraid of improving your craft by studying advanced topics that require a bigger commitment in terms of time as well. There is no royal shortcut: mastery of our craft will require thousands of hours of dedicated study and practice.
Theory and practice
Masters of any intellectual discipline tend to have good working knowledge of both theoretical and practical aspects. Pursuing excellence in programming requires the study of many insightful books that will widen your view of the field and as a result necessarily improve your craft.
Working with books alone is not enough though. Programming requires writing and reading programs every day for years. That’s why I have a rule: I refuse to go to sleep if I haven’t read and written some code on a given day (this doesn’t include the code I write for work, of course). So far, this rule has had a positive impact on my ability to code.
It is a mistake to think that the practice of my art has become easy to me. I assure you, dear friend, no one has given so much care to the study of composition as I. There is scarcely a famous master in music whose works I have not frequently and diligently studied. — Mozart
Breadth Vs. Depth
As we progress in our journey towards the pursuit of excellence in programming, a question that will no doubt pester people’s minds is whether one should go for breadth of knowledge, or depth. There are countless programming languages, paradigms, methodologies, technologies, et cetera.
The truth of the matter is that if we are to become GrandMaster programmers, we cannot ignore either of them. In practice, depth has a much stronger impact in the way we construct software. It is through deep understanding that we can see the whole through the parts. There is therefore value in specializing in just a few languages and technologies, and really mastering them in-depth.
Getting things done in software development requires a certain pragmatism and proficiency with the tools at hand. There is no escaping it. Depth is therefore necessary, but not sufficient.
I find that the web is particularly good at covering the breadth aspect of things. There are always new and interesting areas I can learn about and experiment with. I don’t need a whole 600 page book to get a feeling for — or to better understand — certain technologies that are not crucial to my area of specialization.
While there are quite a few exceptions I could mention, I’d say that I tend to use the Internet as an aid for horizontal scaling of my knowledge, and books for vertical scaling. And again, more often than not, the depth level is mostly determined by the amount of time, practice and effort I put into it, rather than the media I’m using.
Where to find time
One of the objections I hear often, particularly when it comes to reading books is, “I don’t have time!”. In a few extreme cases, that may actually be true, but I think that most people severely underestimate how much time is “wasted” on a daily basis (even just surfing online).
Breaks are extremely important, and I’m not advocating any regime of incessant study. I simply know how crucial it is to be constant. It’s a marathon, not a sprint.
My second rule is: I refuse to go to sleep if I haven’t read at least a chapter of a book that day. Very often I get caught up in the book I’m reading or working through, and end up getting through much more than just one chapter (it really depends on the book, of course). But for me the rule is clear: every day, no sleep allowed until one chapter has been read.
Try this approach and you’ll see that reading doesn’t have to take up much time, yet doing so you can still read several books (including technical ones) every month.
We are what we repeatedly do. Excellence, then, is not an act but a habit. — Aristotle
For certain books, it is convenient to have the book in PDF format on your computer, as you switch from the book to the editor/console and back. However, generally speaking the computer tends to be quite distracting and thoughts like “I’ll just check my email quickly” can easily lead to hours spent doing something else.
For this reason I prefer to read from a paper book which is also easier on the eyes for extensive reading after a day in front of a computer screen. Even if I’m using the computer at the same time to input code, the physical presence of a book next to my laptop is enough to remind me that I shouldn’t get distracted online. (For me the depth and focus required by books is also an antidote to the re-wiring that the web tends to do to our brains. [5])
By the way, a few days ago Amazon announced a gorgeous, brand new graphite color Kindle DX [6]. I think I may pull the trigger and get it for my upcoming 30th birthday. Buying numerous paper books is expensive, and given the price of Kindle books, this move would end up being cheaper in the long run. The large e-ink display almost looks like paper and the device is not as distracting as an iPad (you read on a Kindle, and that’s it). Plus, it’s lighter than most technical books and surely takes up less space in your home.
Achieving focus
With so much going on within the programming world, distractions are easy to come by. My approach is to focus only on the given macro-task at hand. If I’m trying to learn about process calculi for example, then for the next few months my “learning time” will be ruthlessly dedicated to that subject, as if the rest of the programming ecosystem stopped in time.
Then there is focusing at a micro-task level. Learning about a given subject can always be divided into a long series of smaller steps. When I’m focusing on one such tiny step, everything else ceases to exist (at least in theory).
One trick I use to achieve solemn focus on micro-tasks, whether reading code, writing code, or reading a technical book, is the use of the Pomodoro Technique [7]. In short, I use timer software [8] which alerts me when 25 minutes (a pomodoro) have passed, and gives me a 5 minute break for each pomodoro. Every 4 pomodoros, I can take a longer break.
When I first started using this technique I thought it was mostly a gimmick and I had a hard time taking breaks. I just wanted to keep going and the “tracking” seemed silly. However, I must say that it strikes an elegant balance between the desire to focus for extensive periods of time and the importance of taking regular mini-breaks.
This approach has become routine for my mind now, so even if it is just a gimmick, it’s still a good way for me to get focused and “in the zone”.
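The rhythm described above (25 minutes of work, a 5 minute break, and a longer break every 4 pomodoros) is simple enough to sketch in a few lines of Python. This is only an illustration, not the timer software mentioned in the text. The names Interval and build_schedule are made up for the example, and the 15 minute long break is an assumption, since the essay only says "a longer break". The sketch also assumes the long break replaces the short one after every fourth pomodoro.

```python
from dataclasses import dataclass
from typing import List

WORK_MINUTES = 25         # one pomodoro, as described in the text
SHORT_BREAK_MINUTES = 5   # break that follows each pomodoro
LONG_BREAK_MINUTES = 15   # assumption: the text only says "a longer break"
POMODOROS_PER_CYCLE = 4   # a longer break every 4 pomodoros


@dataclass
class Interval:
    kind: str     # "work", "short break", or "long break"
    minutes: int


def build_schedule(pomodoros: int) -> List[Interval]:
    """Return the work/break sequence for the given number of pomodoros."""
    schedule = []
    for n in range(1, pomodoros + 1):
        schedule.append(Interval("work", WORK_MINUTES))
        # Every fourth pomodoro earns the longer break instead of the short one.
        if n % POMODOROS_PER_CYCLE == 0:
            schedule.append(Interval("long break", LONG_BREAK_MINUTES))
        else:
            schedule.append(Interval("short break", SHORT_BREAK_MINUTES))
    return schedule


if __name__ == "__main__":
    for step in build_schedule(4):
        print(f"{step.kind:>11}: {step.minutes} min")
```

Calling build_schedule(8) would describe roughly a full morning of study: eight work intervals with two long breaks.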
Aiming for sprezzatura
The pursuit of excellence requires a huge drive from within, and a fundamental dissatisfaction with just being good at a given discipline. I believe this is true regardless of the profession at hand.
As I progress in my journey, I’m discovering how the key is to make the pursuit of excellence a habit. This goes against my nature of being an intellectual sprinter, but I’m in for the long run and I’m really learning to enjoy the marathon-like process.
My long-term goal is to program with sprezzatura [9], where the process is so internalized and part of my subconscious that it almost looks effortless (as if the act of programming was committed to muscle memory). It will be an overnight success, 15 years in the making.
Regardless of the improvement level achieved, I will always have the joy, privilege and need to continue to learn for the betterment of myself and my craft. When there is no set destination, the journey is what really matters.
Ancora imparo. (I’m still learning.) — Michelangelo
Notes
[1] These concepts are explained, in a much more eloquent manner, in the early chapters of the excellent book, The Art of Learning by Josh Waitzkin.
[2] Dunning-Kruger Effect on Wikipedia.
[3] Illusory Superiority on Wikipedia.
[4] For more on the importance of doubt in science, check out the beautiful epilogue in What do you care what other people think? by Richard P. Feynman.
[5] For more on this phenomenon, read The Shallows by Nicholas Carr.
[6] The new Kindle DX on Amazon.
[8] Pomodoro for Mac OS X.
[9] Sprezzatura on Wikipedia.
Translations
Carlos Marcelo Cabrera translated this article into Spanish: La búsqueda de la excelencia en la programación.
From Padrino’s site:
Padrino is a ruby framework built upon the excellent Sinatra Microframework. Sinatra is a DSL for creating simple web applications in Ruby with speed and minimal effort. This framework makes it as fun and easy as possible to code increasingly advanced web applications by expanding upon Sinatra while maintaining the spirit that made it great.
The Ruby community has plenty of web frameworks at this point. Padrino — self-described as “The Elegant Ruby Web Framework” — is interesting because it’s built on top of Sinatra, it’s highly modular, quite fast, and provides a drop-in admin interface. It fits between Sinatra and a large framework like Rails.
If it weren’t for the fact that Rails 3 is about to be released, Padrino might have had a fighting chance at acquiring a good market share within the Ruby community. Rails 3 is here though, and it too is very modular and fast. Plus, it’s hard to beat the huge ecosystem that’s already built around it.
That said, the presence of an admin interface, a la Django, and the Sinatra core are definitely inviting features. Check out their documentation and screencast, to see if you think it’s worth considering for your own web development needs.
Rails 3 is a major upgrade; using it almost feels like working with an entirely new framework. Porting existing applications and acquiring the skills required to build new ones entails a significant amount of effort. You could scout the net for bits and pieces of information, but that would be time consuming and possibly frustrating. Thankfully there are resources available that have done the work for you, so you don’t have to waste time trying to figure out what’s new.
In this post, I’d like to point out a couple of resources that I think complement each other well, and focus on how to upgrade applications, as opposed to simply providing you with a shopping list of features.
The first one is Upgrading applications to Rails 3, a screencast that was just released by ThinkCode.TV. This screencast is almost an hour long and shows you how to port a real world web application from Rails 2 to Rails 3. As such, it can be very useful if you have existing code that you’d like to port over to Rails 3. The author has ported a few large applications to Rails 3, so he has solid experience with it. I’m biased of course, but I feel it’s well worth $8.99. (Today only, use the coupon RAILS3 to purchase this Rails screencast for just $5.99.)

The second resource is the Rails 3 Upgrade Handbook by Jeremy McAnally. It’s a beautiful PDF that succinctly explains what’s new in Rails 3, as well as how to upgrade your applications to the new edition of the framework. At 10c per page ($12 for 120 pages), it too is worth the money in my opinion.

Regardless of whether you end up buying these resources or not, I sure hope you have extensive test coverage for your existing Rails 2 applications. In my experience this is a must, because porting complex applications to Rails 3 without solid test support is a definite challenge. Nevertheless, I feel that this major upgrade is truly worth it. Rails 3 really brings Rails to a whole new level and we, as a community, should be proud and excited about what lies ahead of us.
“What programming language should I study next? What framework?” I occasionally receive emails from readers, young and not so young alike, asking me for guidance about such matters. “Use the right tool for the job” is the correct answer, but it’s cheap advice when there is a plethora of tools seemingly right for the job. For most people these days the job at hand is of course web application development.
Should they study Ruby and Ruby on Rails? Or Python and Django? How about C# 4.0 and ASP.NET MVC? Maybe CakePHP? Java and Stripes? And how about more exotic choices like Clojure and Compojure or Scala and Lift?
With very few exceptions, in 2010, it’s hard to choose a combination of semi-popular technologies that couldn’t do the job. Does it really make a huge difference if you choose to study Ruby on Rails or Django? In all honesty, despite all the existing differences, it doesn’t really matter. As long as you become proficient with one of these tools, you will be adequately equipped to approach most web development tasks. Your experience as a server-side developer will be the bottleneck, not your framework of choice.
The real reason why I get asked these questions though, is that these people are mostly looking for a silver bullet, a language-framework combo that will magically allow them to create fantastic web applications in a matter of weeks. They are often after a shortcut, but there is no royal path to web programming.
When I think about the future of programming languages, I envision Babel, not people speaking Esperanto. We are destined to live in a technological world where there will be many valid server-side options, which are similar yet different enough to justify their own existence and that of their respective communities.
There won’t be a programming language to rule them all, but I believe one language will continue to be the lingua franca of the web. In that sense, it’s the most important programming language today and I think its relevance will only continue to grow in the future. I’m talking of course about JavaScript.
Today JavaScript is king when it comes to client-side web programming. It took us a while to reach this point. In the collective mind, JavaScript was considered a poor language used by amateurs to create annoying web pages. Today (thanks to AJAX, among other factors) it’s a language that’s appreciated by many professionals and used by the vast majority of web developers. Whether you program web applications in Ruby, Python, Perl, PHP, C#, or something else, you’ll deal with JavaScript (it’s the greatest common denominator of the web development community). I know of very few professional web developers who lack a cursory knowledge of the language (or its cousin, ActionScript).
Over the past few years the browser has become the single most important application on users’ computers. This, in turn, sealed JavaScript’s fate for the foreseeable future. Despite its many flaws, JavaScript is a powerful and elegant language that has advanced features which are blatantly missing from “full blown” languages like Java. Programmers have come to realize its power and usefulness within the browser. Beautiful JavaScript frameworks like jQuery, YUI, and more recently SproutCore and Cappuccino (Objective-J), have showcased what it’s possible to accomplish with this language. And with HTML5 coming closer to reality, there will be an ever greater emphasis on DOM scripting and less reliance, when feasible, on RIA plugins.
If, generally speaking, JavaScript is a solid and powerful language that most web developers need to know anyway, why can’t we develop in JavaScript server-side as well? And while we’re at it, maybe use it for desktop applications too? It would seem rational to capitalize on the benefits of having a huge percentage of programmers use the same language for both client and server-side programming. (If an update to the language is required to clean it up a little, let’s do that.) Why shouldn’t we be able to run js myscript.js, outside of a browser, and obtain the result of the computation in output? There is no inherent reason why JavaScript needs to be tied to the browser.
Thankfully times are changing and concrete answers to those rhetorical questions are emerging. The V8 JavaScript Engine is a project started by Google that provides us with a standalone shell to execute scripts and try out code in a basic REPL (read-eval-print loop). It’s the same engine embedded into Google Chrome, and as such, it’s a fast implementation as well.
Another great effort that’s headed in the same direction, and builds on top of V8 is Node.js, an evented I/O framework. You can think of it as Tornado, Twisted or EventMachine, simplified for server-side JavaScript. Node doesn’t require as much knowledge about event loops and non-blocking I/O, and the look and feel of such callbacks is very reminiscent of the type of AJAX code we’ve all seen before. Node can easily be used as a basic, ultra fast web server, to which one can delegate I/O callbacks for scalability and efficiency.
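To make the idea of evented, non-blocking I/O a little more concrete, here is a minimal sketch. It is not Node code: it uses Python's standard asyncio module, which belongs to the same family as the Tornado and Twisted libraries mentioned above. The fake_io helper and the 0.2 second delays are made up for illustration, and asyncio.sleep stands in for a real network or disk operation.

```python
import asyncio
import time


async def fake_io(name: str, delay: float) -> str:
    """Hypothetical stand-in for a network or disk operation."""
    # Awaiting hands control back to the event loop instead of blocking it,
    # so other pending operations can make progress in the meantime.
    await asyncio.sleep(delay)
    return f"{name} done"


async def serve_three_requests() -> list:
    # The three waits overlap on a single thread; gather() returns the
    # results in the order the operations were passed in.
    return await asyncio.gather(
        fake_io("request-1", 0.2),
        fake_io("request-2", 0.2),
        fake_io("request-3", 0.2),
    )


def run_demo():
    start = time.perf_counter()
    results = asyncio.run(serve_three_requests())
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(results, f"{elapsed:.2f}s")
```

Because the waits overlap, the whole run should take roughly as long as the slowest operation rather than the sum of all three, which is the property that makes a single event loop attractive for I/O-heavy servers.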
Recently Heroku announced beta support for Node [1]. It’s a risk on their part, but one worth taking in my opinion. If nothing else, at the very least, Rails developers deploying on Heroku will have the option to integrate Node to increase scalability and performance.
But Node (which embeds the V8 engine) has a lot more potential than just that. The ultimate goal is to become a self-contained solution which would allow one to develop and deploy server-side JavaScript code in production mode.
Node is just a prominent example of the impact of the CommonJS project/movement, which is aimed at making JavaScript available outside of the browser (on servers and desktops). There is in fact an ecosystem of new .js libraries that are meant to be used with server-side JavaScript (this is likely to grow over time).
What we really need is a lightweight web framework that integrates server and client-side JavaScript well. This would have game changing potential (think Rails back in 2004). Developers have grown accustomed to a high level of abstraction when it comes to web development though, so there are a couple of possibilities here: either Node will become that framework or someone will create such a framework (perhaps on top of Node). Whoever does that will hold a piece of the future and a golden ticket in their hands.
[1] For a terrific demo of a Cappuccino + Node application deployed on Heroku, check out GitHub Issues.
This is a tiny post to let you know that IBM just released version 2.5.0 of the IBM_DB gem with support for the upcoming Rails 3. That’s what I call both proactive and a true testament to IBM’s commitment towards DB2 on Rails.
Aside from providing a working adapter and driver before the new framework release is even out, this release has a few improvements and fixes, such as getting rid of a minor bug related to prepared statements and has_many associations.
Finally, ibm_db 2.5 improves upon Unicode integration with support for any encoding format that’s permitted by Ruby 1.9.

Recently Matt Aimonetti wrote an insightful article about Rails and the Enterprise. In it he identifies five core Enterprise application needs:
Matt then proceeds to illustrate how Rails does a good job with regard to most of these points, despite a few existing challenges.
Among these challenges, I can clearly see the following:
There isn’t a 1-800-RAILS number; the community may be great, but there are not exactly yearly contracts in place. Furthermore, the author of the Ruby driver for, say, Oracle doesn’t owe you a thing. He may or may not be there for you when you need something to be fixed quickly.
XML support is less than ideal (but it is improving).
Integration with the Enterprise world is not easy, due in part to less than stellar SOAP support (but that is also improving).
If you’re taking advantage of ActiveRecord and an Enterprise database like Oracle, your DBA isn’t likely to be happy that you aren’t using prepared statements.
You may think these are small points, and in the startup world they generally are. However, in the Enterprise world they do make the difference between adoption and niche.
One thing that Matt forgot to mention is DB2, which should be the poster child for how Rails can be Enterprise ready. And as a bonus, you get to throw around IBM’s name (which is synonymous with Enterprise) because Rails is both supported by, and used within, IBM.
Let’s address each point above with DB2 on Rails in mind.
IBM is the only database vendor to provide a Ruby driver and ActiveRecord/Rails adapter for its databases. This means that you have a team that’s accountable when things don’t work as they’re supposed to. This team is accountable, regardless of whether you have a contract with IBM or not; it’s their job, not a hobby. This involvement with Rails dates back to 2006, with continuing releases and improvements ever since. IBM’s optional yearly 24/7 support contracts (e.g., for less than $3000 a year per server with DB2 Express-C) include support for DB2 on Rails as well.
DB2 supports native storage and querying of XML documents and data (plus it’s fast). This technology is known as pureXML.
DB2 can both consume and serve SOAP web services. This lets DB2 do the integration for you.
DB2 on Rails supports parameterized queries. (While on the subject of queries, if you are using JRuby you can take advantage of pureQuery, which is an IBM created Enterprise solution that’s aimed at making your queries fast, reliable, manageable and easy to debug.)
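For readers who haven't met parameterized queries before, the idea can be sketched as follows. This is not DB2 or the IBM_DB adapter: it uses Python's built-in sqlite3 module as a stand-in database, and the users table and the find_user_id helper are made up for the example. The point is only that the SQL text stays constant while the values are bound separately.

```python
import sqlite3

# Stand-in database for illustration only (not DB2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("alice",), ("bob",)],
)


def find_user_id(connection, name):
    """Look up a user id with a parameterized query (hypothetical helper)."""
    # The ? placeholder keeps the SQL text identical from call to call;
    # the value is bound separately and is never spliced into the SQL string.
    row = connection.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    print(find_user_id(conn, "bob"))
    # A hostile value is treated as plain data, so it simply matches nothing.
    print(find_user_id(conn, "x' OR '1'='1"))
```

Since the statement text never changes, a database is free to parse and plan it once and reuse that work across calls, which is typically what a DBA is hoping for when asking for prepared statements.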
If you are trying to introduce Rails into your Enterprise job, chances are that DB2 will already be present within the company infrastructure. If not, you can use DB2 Express-C which is entirely free — thus making it easier to introduce than an expensive solution.
IBM is one of the most trusted brands on the market today, as it has been for decades now. Banks and Enterprise companies the world over trust DB2 with their most critical data. One way for the Rails community to increase the adoption of Rails in the Enterprise is to acknowledge and embrace the great pair that is Rails and DB2.
The latest release of the IBM Adapter for Django now supports Django 1.2. Aside from enabling you to use the most recent version of Django, this release adds a few new goodies into the mix, that I’m sure many will appreciate.
For example, IBM’s adapter (through the underlying DBI wrapper) now uses persistent connections, which are especially helpful when dealing with Django – as it lacks connection pooling. (Of course DB2 also has the Connection Concentrator to aid in reducing the usage of server resources and improving scalability.)
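The general idea behind a persistent connection can be sketched in a few lines. This is not how IBM's adapter or its DBI wrapper is implemented: the sketch uses Python's built-in sqlite3 module as a stand-in, and get_connection and handle_request are hypothetical names. It only shows a connection being opened once and reused, rather than reopened for every request.

```python
import sqlite3

_connection = None  # hypothetical module-level cache for the open connection


def get_connection(path=":memory:"):
    """Open the connection on first use, then keep handing back the same one."""
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(path)
    return _connection


def handle_request(n):
    """Pretend request handler that reuses the persistent connection."""
    conn = get_connection()
    # No connect/disconnect cost is paid here after the first request.
    return conn.execute("SELECT ? * 2", (n,)).fetchone()[0]


if __name__ == "__main__":
    print(handle_request(21))
    print(get_connection() is get_connection())
```

A real adapter also has to cope with stale or dropped connections, which this sketch deliberately ignores.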
Furthermore, the adapter adds support for the DECIMAL datatype, a necessary feature when dealing with money and currencies. Various enhancements and bug fixes were included too; check them out on Google Groups.
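A quick Python illustration of why a DECIMAL type matters for money: binary floating point can't represent most decimal fractions exactly, while a decimal type can. The snippet uses only Python's standard decimal module, and the prices are made up.

```python
from decimal import Decimal

# Binary floats accumulate representation error on decimal fractions.
float_total = 0.1 + 0.2
print(float_total == 0.3)                # False

# Decimal values built from strings keep the exact decimal digits.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))  # True
print(decimal_total)                     # 0.30
```

A DECIMAL column gives the database the same exactness, so an adapter that maps it to a decimal type (rather than a float) avoids silently introducing rounding errors into monetary values.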
As a reminder, DB2 Express-C is an absolutely free of charge version of DB2 and it’s production ready (not a toy version). You can download it from here. Take it for a spin, experiment – chances are you’ll like it. If you need a guide to getting started, be sure to check out this free e-book by my colleagues Raul, Ian, and Rav.
The usability of web forms is a subject that has been discussed extensively, and one which is supported by a large body of literature (1, 2, 3, 4). The consensus is that getting web forms right is much harder than it may initially seem. One aspect that particularly annoys me is the way most developers implement passwords and their validation.
Despite the emergence of single sign-on systems like OpenID, most users are still affected by so-called password fatigue, due to the effort required to memorize a number of different passwords for several services.
For a variety of reasons, users end up taking a dangerous shortcut: they reuse the same password (or small group of passwords) for everything. This approach is as secure as the weakest site you signed up with 1, which generally means it’s very insecure. A better alternative, and the approach I take, is to use a secure password manager instead. At this point it would be very challenging for me to work on Mac OS X, without having 1Password installed.
The reason I mention 1Password is that it has advanced features that generate randomized, secure passwords when you sign up for a new site. Having that tool at my disposal further highlights the shortcomings of most web forms with regard to handling passwords. In fact, I often find myself changing settings in the password generator, just to satisfy the arbitrary rules defined by each individual form.
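As a rough illustration of what such a generator does (this is not how 1Password itself works; the function name, default length and alphabet are my own assumptions), here is a short Python sketch built on the standard secrets module:

```python
import secrets
import string

# Hypothetical default alphabet: letters, digits and punctuation.
DEFAULT_ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length=20, alphabet=DEFAULT_ALPHABET):
    """Return a random password drawn from a cryptographically secure source."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(alphabet) for _ in range(length))

# A long password using the full alphabet.
print(generate_password())
# The same generator, restricted to satisfy a form that rejects symbols.
print(generate_password(12, string.ascii_letters + string.digits))
```

The second call shows the kind of setting change described above: every arbitrary restriction a form imposes shrinks the alphabet or the length, and with it the strength of the password.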

In light of the aforementioned considerations, here are a few suggestions for login implementers:
The majority of sites don’t follow these seemingly obvious guidelines. As a result, I often catch myself wondering if the password I’m about to submit will go through or not. That’s actually what motivated me to write this short article. I’m certain that some may disagree with a few of my points, while others may want to add more. Please feel free to do so in the comment section below.
1 Your client itself could be compromised of course (think about keyloggers, for example).
2 Technically speaking, no password system is ever truly secure. But as far as login systems go, these rules should be a good compromise between the need for security and the desire to avoid unnecessary frustration on the part of your users.
The API development team just released a major version of the ibm_db gem. Detailed installation instructions are available on RubyForge (PDF). Among several improvements, there are three particularly newsworthy features:
As we approach the release of Rails 3, supporting Ruby 1.9 is becoming more of a necessity. Likewise, the so called “One-Click installer” on Windows has been replaced by a current project that uses mingw32, which offers a much needed performance boost on Windows. Having a mingw32 compatible gem is starting to become a requirement for many of our Windows users.
Finally, DB2 is now the only database that supports prepared statements in ActiveRecord (without changing any of the application’s code). This has important performance benefits, as I explained in my article Improve the security and performance of DB2 Ruby on Rails applications using parameterized queries, which was published today by developerWorks.
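For readers unfamiliar with the idea, here is a minimal sketch of a parameterized query. It uses Python's built-in sqlite3 module purely as a stand-in (it is not DB2, the ibm_db gem, or ActiveRecord), and the table and function names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

def find_user(connection, name):
    # The ? placeholder sends the value separately from the SQL text,
    # so the value is never parsed as SQL and the statement text stays
    # identical from call to call.
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    )
    return cursor.fetchall()

print(find_user(conn, "alice"))         # [(1, 'alice')]
print(find_user(conn, "x' OR '1'='1"))  # []
```

Because the statement text never changes, a database can prepare it once and reuse the plan, and an injection attempt like the second call is treated as an ordinary (non-matching) string.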
The following is a very short guide on setting up Ruby Enterprise Edition (REE), nginx and Passenger, for serving Ruby on Rails applications on Ubuntu. It also includes a few quick and easy optimization tips.
We start with setting up REE (x64), using the .deb file provided by Phusion:
wget http://rubyforge.org/frs/download.php/66163/ruby-enterprise_1.8.7-2009.10_amd64.deb
sudo dpkg -i ruby-enterprise_1.8.7-2009.10_amd64.deb
ruby -v
In the output you should see “ruby 1.8.7 (2009-06-12 patchlevel 174)…” or similar. If this is the case, good; while you are there, update RubyGems and the installed gems:
sudo gem update --system
sudo gem update
Next, you’ll need to install nginx, which is a really fast web server. The Phusion team has made it very easy to install, but if you simply follow most instructions found elsewhere, you’ll get the following error:
checking for system md library ... not found checking for system md5 library ... not found checking for OpenSSL md5 crypto library ... not found ./configure: error: the HTTP cache module requires md5 functions from OpenSSL library. You can either disable the module by using --without-http-cache option, or install the OpenSSL library in the system, or build the OpenSSL library statically from the source with nginx by using --with-http_ssl_module --with-openssl=options.
Instead, we are going to install libssl-dev first and then nginx and its Passenger module:
sudo aptitude install libssl-dev
sudo passenger-install-nginx-module
Follow the prompts and accept all the defaults (when prompted to choose between 1 and 2, pick 1).
Before I proceed with the configuration, I like to create an init script and have it boot at startup (the script itself is adapted from one provided by the excellent articles at slicehost.com):
sudo vim /etc/init.d/nginx
The content of which needs to be:
#! /bin/sh
### BEGIN INIT INFO
# Provides: nginx
# Required-Start: $all
# Required-Stop: $all
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: starts the nginx web server
# Description: starts nginx using start-stop-daemon
### END INIT INFO
PATH=/opt/nginx/sbin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/opt/nginx/sbin/nginx
NAME=nginx
DESC=nginx
test -x $DAEMON || exit 0
# Include nginx defaults if available
if [ -f /etc/default/nginx ] ; then
    . /etc/default/nginx
fi
set -e
. /lib/lsb/init-functions
case "$1" in
    start)
        echo -n "Starting $DESC: "
        start-stop-daemon --start --quiet --pidfile /opt/nginx/logs/$NAME.pid \
            --exec $DAEMON -- $DAEMON_OPTS || true
        echo "$NAME."
        ;;
    stop)
        echo -n "Stopping $DESC: "
        start-stop-daemon --stop --quiet --pidfile /opt/nginx/logs/$NAME.pid \
            --exec $DAEMON || true
        echo "$NAME."
        ;;
    restart|force-reload)
        echo -n "Restarting $DESC: "
        start-stop-daemon --stop --quiet --pidfile \
            /opt/nginx/logs/$NAME.pid --exec $DAEMON || true
        sleep 1
        start-stop-daemon --start --quiet --pidfile \
            /opt/nginx/logs/$NAME.pid --exec $DAEMON -- $DAEMON_OPTS || true
        echo "$NAME."
        ;;
    reload)
        echo -n "Reloading $DESC configuration: "
        start-stop-daemon --stop --signal HUP --quiet --pidfile /opt/nginx/logs/$NAME.pid \
            --exec $DAEMON || true
        echo "$NAME."
        ;;
    status)
        status_of_proc -p /opt/nginx/logs/$NAME.pid "$DAEMON" nginx && exit 0 || exit $?
        ;;
    *)
        N=/etc/init.d/$NAME
        echo "Usage: $N {start|stop|restart|reload|force-reload|status}" >&2
        exit 1
        ;;
esac
exit 0
Change its permission and have it startup at boot:
sudo chmod +x /etc/init.d/nginx
sudo /usr/sbin/update-rc.d -f nginx defaults
From now on, you’ll be able to start, stop and restart nginx with it. Start the server as follows:
sudo /etc/init.d/nginx start
Heading over to your server IP with your browser, you should see “Welcome to nginx!”. If you do, great, we can move on with the configuration of nginx for your Rails app.
Edit nginx’s configuration file:
sudo vim /opt/nginx/conf/nginx.conf
Add a server section within the http section, as follows:
server {
    listen 80;
    server_name example.com;
    root /somewhere/my_rails_app/public;
    passenger_enabled on;
    rails_spawn_method smart;
}
The server name can also be a subdomain if you wish (e.g., blog.example.com). It’s important that you point the root to your Rails app’s public directory.
The smart spawn method (set here with the rails_spawn_method directive) is very efficient: it lets Passenger consume less memory per process and speeds up spawning, provided your Rails application is not affected by its limitations (for a discussion of these, read the relevant section of the official guide).
If you have lots of RAM (e.g., more than 512 MB) on your server, you may want to consider increasing your maximum pool size from its default of 6 with the passenger_max_pool_size directive. Conversely, if you want to limit the number of processes running at any time and consume less memory on a small VPS (e.g., 128 to 256 MB), you can decrease that number to 2 (or something in that range). Always test several configurations to find one that works for you. You can read more about this directive in the official guide.
While you are modifying nginx’s configuration, you may also want to increase the number of worker processes (e.g., to 4, on a typical VPS) and add a few more tweaks (such as enabling gzip compression):
# ...
http {
    passenger_root /usr/local/lib/ruby/gems/1.8/gems/passenger-2.2.5;
    passenger_ruby /usr/local/bin/ruby;
    include mime.types;
    default_type application/octet-stream;
    access_log logs/access.log;
    sendfile on;
    keepalive_timeout 65;
    tcp_nodelay on;
    gzip on;
    gzip_comp_level 2;
    gzip_proxied any;
    server {
        #...
When you are happy with the changes, save the file, and restart nginx:
sudo /etc/init.d/nginx restart
If you wish to restart Passenger in the future, without having to restart the whole web server, you can simply run the following command:
touch /somewhere/my_rails_app/tmp/restart.txt
Passenger also provides a few handy monitoring tools. Check them out:
sudo passenger-status
sudo passenger-memory-stats
That’s it, you are ready to go! I hope that you find these few notes useful.
This is the Python version of a post I made about Ruby a few days ago.
Now that Mac OS X 10.6 is out, it’s time to leave the world of 32 bit computing behind. The pre-installed Python interpreter will run in 64 bit mode by default, so you may need to pay attention when installing some C-based eggs.
Assuming you have DB2 Express-C installed already, the ibm_db Python egg for DB2 can easily be installed by following these simple steps:
$ sudo -s
$ export IBM_DB_LIB=/Users/<username>/sqllib/lib64
$ export IBM_DB_DIR=/Users/<username>/sqllib
$ export ARCHFLAGS="-arch x86_64"
$ easy_install ibm_db
This will install the ibm_db C driver, and the ibm_db_dbi Python module that complies with the DB-API 2.0 specification.
You can verify that the installation was successful by running the following:
$ python
>>> import ibm_db
>>>
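If you prefer a scripted check over an interactive session, the sketch below reports whether the two modules can be found without actually importing them. It assumes a modern Python 3 interpreter (the post itself predates it), and the helper name module_available is my own:

```python
import importlib.util

def module_available(name):
    """Return True if the named top-level module can be located for import."""
    return importlib.util.find_spec(name) is not None

# Report on the C driver and the DB-API wrapper mentioned above.
for name in ("ibm_db", "ibm_db_dbi"):
    status = "found" if module_available(name) else "missing"
    print(f"{name}: {status}")
```

This only confirms that the modules are on the import path; a failed load of the underlying DB2 libraries would still surface only when you actually import ibm_db.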
Now, for the Django adapter, install Django first (if you haven’t done so already):
$ sudo easy_install django
The Django adapter can then be installed as follows:
$ sudo easy_install ibm_db_django
Finally, if you have installed SQLAlchemy and wish to install the DB2 adapter for it, run:
$ sudo easy_install ibm_db_sa
Please let me know if you encounter any issues, I’d be glad to help you.
Now that Mac OS X 10.6 is out, it’s time to leave the world of 32 bit computing behind. The pre-installed Ruby interpreter will run in 64 bit mode by default, so you may need to pay attention when installing some C-based gems. The ibm_db Ruby gem for DB2 can easily be installed or updated to the latest available version by following these simple steps:
$ sudo -s
$ export IBM_DB_LIB=/Users/<username>/sqllib/lib64
$ export IBM_DB_INCLUDE=/Users/<username>/sqllib/include
$ export ARCHFLAGS="-arch x86_64"
$ gem install ibm_db
You can verify that the installation was successful by running the following:
$ irb
>> require 'ibm_db.bundle'
=> true
Please let me know if you encounter any issues, I’d be glad to help you.
One of the best programmers I know is selling on eBay a web application that he’s been developing and running for the past three years. Given the starting price and considering what one lucky person or company will walk away with, I must say, it’s an amazing deal. I’m writing about his auction here so that I can help it get the proper exposure it deserves and because I think it’s an incredible bargain for anyone who is interested!

BlogBabel, the aforementioned site/web app, is a blog indexing and aggregation service that began in 2006. Amongst its features are the ability to detect and show the most popular blog discussions, weekly posts, books, videos, and even popular blog entries based on their location (through geotagging). It also features leaderboards of the most popular blogs.
Its codebase uses Python and Django, and consists of 27,359 physical lines of code (roughly equivalent to 6.46 person-years, according to sloccount). The R&D alone makes this application worthwhile to an interested party.
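For the curious, the person-years figure can be approximated by hand. The sketch below assumes sloccount's default Basic COCOMO model in organic mode (effort in person-months = 2.4 × KSLOC^1.05); the function name is my own, and I have not checked which options were used for the figure quoted above:

```python
def cocomo_person_years(physical_sloc):
    """Basic COCOMO (organic mode) effort estimate, in person-years."""
    ksloc = physical_sloc / 1000.0
    # Assumed default coefficients: 2.4 and 1.05.
    person_months = 2.4 * ksloc ** 1.05
    return person_months / 12.0

print(round(cocomo_person_years(27359), 2))
```

Under those assumptions the result lands very close to the estimate reported in the listing.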
At this stage, BlogBabel has an Italian interface (located at it.blogbabel.com) and aggregates almost 15,000 Italian blogs and 5 million posts. Changing the interface to make it an international project that’s available in several languages, or switching to English (solely), would not be challenging in the least (they used to run a Spanish version as well, for example, but decided to discontinue it so as to focus on the Italian one).
BlogBabel has been featured in the mainstream Italian media and has had a noticeable influence on the Italian blogosphere. One could argue that it has been the yellow pages of the Italian blogosphere. Because of this, Ludovico Magnocavallo (the site’s creator) received substantial offers to buy BlogBabel in the past, but he turned them down because he wanted to continue building this site. Now however, due to personal circumstances and lack of time/resources, he’s willing to sell this application for what may amount to far less than its true value. And here’s the real bargain, the starting price, without a reserve, is 4,999 Euros. This is of course, a ridiculously low price for the value being offered. But Ludovico believes in letting the market decide.
If I had the funds lying around, I would buy it myself and gear it towards the English speaking world (in conjunction with the pre-existing Italian version). It’s a prepackaged, virtually ready-made startup with a great deal of potential both in its current state and in terms of what it could grow to become.
To recap, the auction includes:
BlogBabel has been running smoothly for three years, and is currently under-marketed. Optimizing ads, affiliates, and similar sources of revenue wouldn’t be hard at all, especially if one were to aim this site at the English speaking world.
Also, Ludovico has already implemented most of the code that’s necessary to allow users to have accounts (through OpenID), but since these “social features” are not fully implemented yet, they have not been deployed in production. A buyer could decide to disregard them or finish implementing them and roll out a technorati-like service. The winner of this auction could decide to implement support for Twitter, comments on social networks, sentiment analysis, etc, on their own. The possibilities are really limitless when you start with a solid engine and crawler, and already have a great deal of data at your fingertips.
I know Ludovico and he’s a stand-up guy. If you are interested in this great deal, you can bid here. If you have technical questions about this auction, please feel free to contact him directly through eBay.
UPDATE (September 8, 2009): Ludovico received an undisclosed offer for the site and a few years of maintenance work, so the auction for the site alone was suspended.
I finally got around to updating the Ruby and Rails book pages. The existing list was getting a bit obsolete and I didn’t like the idea of recommending old books to newcomers. I also had some interesting new entries.
Without further ado:
A few people may disagree with the choices, but I think most experienced Ruby and Rails programmers, who’ve read those books, will concur with my recommendations. I’m quite confident that these are, all things considered, some of the best books available on the subject.
A word to the publishers
As tempting as it is to collect Ruby and Rails books, these days I don’t feel I can economically justify the act of purchasing every Ruby or Rails book put out there. So if you are a publisher or an author, and you’d like for me to consider your book, you are certainly welcome to send me a review copy. I will definitely read it, but only include it on these lists if it’s either outstanding or as good as the existing ones. If it’s a programming book that’s not related to Ruby/Rails, yet is really good, I would consider reviewing it on my blog.
I’m glad to announce that the API team has just released version 1.0.2 of the adapter for Django. And on my birthday to boot, what a nice present. This version extends its support to the recently released Django 1.1, as well as incorporating the feedback that was received earlier on.
(For installation instructions, take a look at the README file.)
IBM confirms its commitment to support Python and Django, and gives Django well deserved credentials in environments where having IBM’s support counts. Django is becoming an increasingly mature web framework with the potential to do well within the Enterprise world. Having support for DB2 will surely help.
The next step will be working with the Django team to bake DB2 support directly into Django’s releases. The code for the adapter is released under a liberal OSI-compliant license that is compatible with Django’s own BSD, and the API team is more than willing to work on the development and support of the adapter should it become part of Django. We love Django and ponies. Let’s make this happen, guys.
In a recent blog entry, Charles Nutter argues about the importance of JRuby for Ruby’s adoption within the Enterprise. Or, in his own words:
The idea of “Enterprise Ruby” has become less repellant since Dave Thomas’s infamous keynote at RailsConf 2006. There are a lot of large, lumbering organizations out there that have yet to adopt any of the newer agile language/framework combinations, and Rails has most definitely led the way. I personally believe that in order for Ruby to become more than just a nice language with a great community, it needs to gain adoption in those organizations, and it needs to do it damn quickly. JRuby is by far the best way for that to happen.
He has a very good point. Working for IBM (it doesn’t get much more Enterprise than that) I can testify to the number of colleagues and partners who ask me questions like, “Can I interface Rails with Java?”, “Can I deploy it with WebSphere?” or “How can I generate a Rails WAR file?”. The answers to these and similar questions are all found in JRuby.
A couple of years ago I “toured” Canada, speaking at a few internal IBM conferences. The vast majority of my attendees were experienced Java developers who were doing business consulting for IBM’s clients. They were all very enthusiastic about my presentation on Ruby and Rails. It was a break from J2EE’s complexity. These people were genuinely excited about the prospect of using Rails when doing client work.
Mid-conference, one attendee said to me, “This is cool, but they’ll never let us use this stuff”. And that’s when I reached for the JRuby slides. The mood in the room suddenly shifted. These developers started to think “OK, this could actually work”. At the end of my speech, most of the questions I received had to do with JRuby.
As I mentioned during that series of conferences, “JRuby can be your gateway to introducing Rails into your workplace”. Many people within the Enterprise world don’t have an option. It’s either a JVM-based solution or they have to give up on Rails altogether.
JRuby is not only attractive to Ruby fans who’d like to use Ruby/Rails in certain work environments, it’s also appealing to those who are looking for an alternative to Java as a language. Here is where we could hit the jackpot in terms of Ruby’s adoption. There are countless Java programmers in the world. Convincing even just a fraction of them to switch would be enough to drastically increase the size of our community.
As Charles mentioned in his post, people can now pick between Scala, Clojure, Groovy, JRuby and Jython. I believe that the choice developers ultimately make boils down to three key usability aspects:
Charles’ team has been focusing on the right things. If I can be permitted one criticism though, it would be to avoid responding to every post that praises a competing implementation. Openly fighting against other implementations can backfire and looks unprofessional. I understand the desire to set the record straight and to be competitive, but there is no reason to constantly point out that “these implementations are not done” every time an early project shows some form of promise or progress. Otherwise, it’s easy to come across as someone whose “heart turned black as coal, and who finds himself wishing bad luck towards other implementations”.
Today JRuby is an Enterprise-friendly alternative to Ruby MRI/KRI; and Charles is right, JRuby is important for Ruby’s future. It would however be wrong to assume that JRuby is the only sort of future for Ruby and that C/C++ based implementations are becoming irrelevant. Ruby has never been a zero-sum game. Plurality is a substantial part of what the Ruby ecosystem is all about.
Finally, let me conclude by congratulating the JRuby team, who have just been hired by Engine Yard. I think this could be a very strategic move for both JRuby and Engine Yard.
Django’s development server is capable of serving static (media) files thanks to the view django.views.static.serve. Popular web servers like Apache, Lighttpd or NGINX are much faster though, and as such should be used in production mode. Our goal is to bypass Django and let Apache (or other valid alternatives) directly serve static files like images, videos, CSS, JavaScript files, and so on, for us.
Generally speaking, for performance reasons, it’s advised that you have two different webservers serving your dynamic requests and static files. In practice, for smaller sites, people often opt to simply use one webserver. In this article, I’ll discuss how to serve the static files within your Django project, through Apache.
The first thing we need to do is distinguish between development and production mode. We can do so by simply specifying DEBUG = True (development), or DEBUG = False (production) within our settings.py file.
settings.py may include (among others) the following declarations:
import os
# Absolute path to the project directory
BASE_PATH = os.path.dirname(os.path.abspath(__file__))
# Main URL for the project
BASE_URL = 'http://example.org'
DEBUG = False
# Absolute path to the directory that holds media
MEDIA_ROOT = '%s/media/' % BASE_PATH
# URL that handles the media served from MEDIA_ROOT
MEDIA_URL = '%s/site_media/' % BASE_URL
# URL prefix for admin media -- CSS, JavaScript and images.
ADMIN_MEDIA_PREFIX = "%sadmin/" % MEDIA_URL
*PATH constants indicate paths on your filesystem (e.g., /home/myuser/projects/myproject), while *URL constants indicate the actual URL needed to reach a given page or file.
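To make the string interpolation concrete, here is what these settings evaluate to when the project lives in the example directory mentioned above. The hard-coded BASE_PATH is an assumption for illustration only; in a real settings.py it is derived from __file__:

```python
# Hypothetical project location, hard-coded only for this illustration.
BASE_PATH = "/home/myuser/projects/myproject"
BASE_URL = "http://example.org"

MEDIA_ROOT = "%s/media/" % BASE_PATH
MEDIA_URL = "%s/site_media/" % BASE_URL
ADMIN_MEDIA_PREFIX = "%sadmin/" % MEDIA_URL

print(MEDIA_ROOT)          # /home/myuser/projects/myproject/media/
print(MEDIA_URL)           # http://example.org/site_media/
print(ADMIN_MEDIA_PREFIX)  # http://example.org/site_media/admin/
```

Note that ADMIN_MEDIA_PREFIX relies on MEDIA_URL ending with a slash; without it the two path segments would run together.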
Notice that it’s not unusual to have a /site_media URL that corresponds to a /media folder. In the example above, I opted to separate regular media files for the project from the standard ones that ship with Django for the admin section. To do this, all we have to do is create a symbolic link as follows:
ln -s /usr/lib/python2.5/site-packages/django/contrib/admin/media /path/to/myproject/media/admin
When you’re in development mode, and DEBUG = True, you want to let Django serve your static files. This can be done by adding the following snippet (or similar) to your urls.py:
# At the top of urls.py, if not already imported:
from django.conf import settings
if settings.DEBUG:
    urlpatterns += patterns('',
        (r'^site_media/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.MEDIA_ROOT}),
    )
In production mode, the code contained within the if clause will not be executed as we’ve set DEBUG to False within settings.py.
From the Django side of things, we are good. We now need to instruct Apache. Within your virtual host file, you can specify something along the lines of:
<VirtualHost *:80>
    #...
    SetHandler python-program
    PythonHandler django.core.handlers.modpython
    SetEnv DJANGO_SETTINGS_MODULE myproject.settings
    PythonDebug On
    PythonAutoReload Off
    PythonPath "['/usr/lib/python2.5/site-packages/django', '/path/to/myproject'] + sys.path"
    #...
    Alias /site_media "/path/to/myproject/media"
    <Location "/site_media">
        SetHandler None
    </Location>
</VirtualHost>
The first group of declarations essentially tells Apache to use mod_python to handle any incoming requests. However, we don’t want Django to deal with static files, so the second group of declarations maps the /site_media URL to the actual media directory on the server and tells Apache to treat it as static content (with SetHandler None), de facto bypassing Django.
This is a great day for those of us who love DB2, as DB2 Express-C 9.7 has just been released. As mentioned before, this is the best DB2 ever, and an extremely important release.
To learn more about what’s new in this release, please check out the recording of our latest webinar:
If you run Linux, Unix or Windows, download it while it’s hot.
DB2 9.7 on the Cloud
Another great aspect of this release is that for the first time ever, DB2 has been released both as a product and as a deployment on the Cloud. If you pop over to RightScale, you can get a trial account for free and should see DB2 Express-C 9.7 on both CentOS and Ubuntu within the partner catalog. RightScale has been an amazing partner and they really do wonders to simplify Cloud Computing. In ten minutes time you can be up and running on the Cloud, thanks to the templates provided.
DB2 support for Django
But the good times don’t stop there, we are also announcing the first official release of the Django adapter for DB2. It sounded crazy when I first proposed the idea within IBM back in 2006, but now it’s a reality.
You can download the .tar.gz archive from the Google Code homepage for the project, or simply by clicking here. This version fully supports the Django 1.0.2 API. For instructions on how to install it, please read the Getting started with the IBM DB Django adapter guide. The current version supports DB2 for Linux, Unix, Windows and Mac OS X, version 8.2 or higher (9.5 FP2 or higher for Mac OS X). In the future, IBM Cloudscape, Apache Derby, Informix (IDS) and both System i & z/OS will be supported.
ibm_db gem updated to 1.1
I’ll conclude this DB2-centric post with a smaller, but still interesting announcement. The ibm_db gem has been updated to version 1.1. This release includes support for ActiveRecord’s QueryCache mechanism, enhanced support for BigInt (and BigSerial), support for rename_column (requires DB2 9.7), parametrization of the timestamp datatype (requires DB2 9.7), and a few fixes and performance enhancements as well. It is recommended that you upgrade to this version.
Counting rows is a ubiquitous operation on the web, so much so that it’s often overused. Regardless of misuse, there is no denying that the performance of counting operations has an impact on most applications. In this post I’ll discuss my findings about the performance of DB2 9.5 and MySQL 5.1 regarding counting records.
For those of you who are not into science fiction, let me clarify that the odd title of this post is a tongue-in-cheek reference to the great novel, Do Androids Dream of Electric Sheep?.
I connected to the database, created the table, imported the data and benchmarked counting operations using ActiveRecord in a standalone script. Here is the code I used:
#!/usr/bin/env ruby
require "rubygems"
require "active_record"
require 'benchmark'
ActiveRecord::Base.establish_connection(
  :adapter => :mysql,
  :username => "myuser",
  :password => "mypass",
  :database => "mydb")
ActiveRecord::Schema.define do
  create_table :people, :force => true do |t|
    t.string :name, :null => false
    t.string :fbid, :null => false
    t.string :gender
    t.string :profession
  end
end
class Person < ActiveRecord::Base
end
# This can be sped up by performing an import instead
Person.transaction do
  File.open("person.tsv").each_line do |line|
    line = line.split(/\t/)
    p = Person.new
    p.name = line[0]
    p.fbid = line[1]
    p.gender = line[6]
    p.profession = line[17]
    p.save!
  end
end
n = 100
Benchmark.bm(26) do |x|
  x.report("Count all:") { n.times { Person.count } }
  x.report("Count profession:") { n.times { Person.count(:profession) } }
  x.report("Count females:") do
    n.times { Person.count(:conditions => "gender = 'Female'") }
  end
  x.report("Count males w/ profession:") do
    n.times { Person.count(:profession, :conditions => "gender = 'Male'") }
  end
end
Please note that importing records in a huge transaction containing hundreds of thousands of INSERT operations is far from the most efficient way to import. Massive imports of data using the load/import facilities provided by each database is the way to go (also see the ar-extensions plugin). The lengthy import wasn’t benchmarked here though, so it has no bearing on the results in this article.
person.tsv is a 92.7 MB tab separated values file that contains 875,857 records from the Freebase project (in my file I removed the header line, leaving only records).
For those who are not familiar with ActiveRecord, the queries executed behind the scenes are (in order):
SELECT count(*) AS count_all FROM people
SELECT count(people.profession) AS count_profession FROM people
SELECT count(*) AS count_all FROM people WHERE (gender = 'Female')
SELECT count(people.profession) AS count_profession FROM people WHERE (gender = 'Male')
While the table definition (for MySQL) is:
CREATE TABLE `people` (
  `id` int(11) DEFAULT NULL auto_increment PRIMARY KEY,
  `name` varchar(255) NOT NULL,
  `fbid` varchar(255) NOT NULL,
  `gender` varchar(255),
  `profession` varchar(255)
) ENGINE=InnoDB
Both the queries and the table definition can easily be verified by enabling logging with:
ActiveRecord::Base.logger = Logger.new(STDOUT)
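One detail worth spelling out is that count(*) counts every row, while count(people.profession) counts only the rows where profession is not NULL. The sketch below shows the four queries against a tiny made-up table, using Python's built-in sqlite3 as a stand-in (not DB2 or MySQL); the four sample rows and the scalar helper are my own invention:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, gender TEXT, profession TEXT)"
)
# Made-up sample rows; two of them have no profession (NULL).
conn.executemany(
    "INSERT INTO people (name, gender, profession) VALUES (?, ?, ?)",
    [
        ("Ann", "Female", "Engineer"),
        ("Bea", "Female", None),
        ("Carl", "Male", "Writer"),
        ("Dan", "Male", None),
    ],
)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

count_all = scalar("SELECT count(*) FROM people")
count_profession = scalar("SELECT count(people.profession) FROM people")
count_females = scalar("SELECT count(*) FROM people WHERE (gender = 'Female')")
count_males_with_profession = scalar(
    "SELECT count(people.profession) FROM people WHERE (gender = 'Male')"
)

print(count_all, count_profession, count_females, count_males_with_profession)
# 4 2 2 1
```

This is why the benchmark labels the last query "males w/ profession": the column argument silently filters out rows with a NULL profession.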
Without much further ado, here are the times I obtained on my last generation MacBook Pro 2.66 GHz with 4 GB DDR3 RAM, and 320 GB @ 7200 rpm hard disk, running Mac OS X Leopard:
MySQL:
Count all: 42.467522
Count profession: 52.130935
Count females: 54.575469
Count males w/ profession: 64.046631
DB2:
Count all: 5.818485
Count profession: 7.714391
Count females: 8.556377
Count males w/ profession: 9.656739
Or in graph form:
That’s an impressive difference. To be exact, in this example DB2 was roughly 6.4 to 7.3 times faster than MySQL. In the case of COUNT(*), DB2 counted almost a million records in about 58 milliseconds per query, or in about the blink of an eye according to Wolfram Alpha.
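The ratios can be recomputed directly from the timings listed above. This short Python sketch only does the arithmetic; the dictionary names are my own, and the numbers are copied from the benchmark output:

```python
# Total seconds for n = 100 runs of each query, copied from the results above.
mysql = {
    "Count all": 42.467522,
    "Count profession": 52.130935,
    "Count females": 54.575469,
    "Count males w/ profession": 64.046631,
}
db2 = {
    "Count all": 5.818485,
    "Count profession": 7.714391,
    "Count females": 8.556377,
    "Count males w/ profession": 9.656739,
}
n = 100

# Speedup of DB2 over MySQL for each query.
speedups = {label: mysql[label] / db2[label] for label in mysql}
for label, ratio in speedups.items():
    print(f"{label}: {ratio:.2f}x")

# Average time of a single COUNT(*) on DB2, in milliseconds.
per_query_ms = db2["Count all"] / n * 1000
print(f"COUNT(*) on DB2: {per_query_ms:.0f} ms per query")
```

The per-query figure is simply the total time divided by the 100 iterations, which is where the 58 milliseconds comes from.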
For those who are skeptical, please note that DB2 was not manually fine-tuned in any way. The client codepage was set to 1252 to allow Greek letters, and the log size was increased to permit such a huge transaction during the import. That’s it, no optimizations were attempted. This is DB2 Express-C out of the box. It looks like smart androids count electric sheep with DB2 after all.
The advantages of DB2 over MySQL when dealing with a massive volume of traffic are well known (and not limited to performance either), but DB2 can dramatically improve performance even for your average web application. And DB2 9.7, which will be released this month, increases the performance and the ability to tune itself to the available resources and required workload even further. If you’d like to try DB2 Express-C for yourself, you can download it here. It doesn’t cost you a dime to obtain and can be used for development, testing and production absolutely free of charge.
In an attempt to satisfy our need for identity and belonging, we desperately try to wear as many labels as possible, and to a certain extent labels are a necessity. When people ask you what you do for a living, it’s far easier to reply “I’m a computer programmer” than to try and explain the plurality and complexity of the exact criteria of your job.
The problem with labels is that they can place you in a box, at times greatly limiting who and what you are. So while it’s okay to use labels to efficiently communicate with other people, it’s important not to fall into the trap of taking them too seriously, thus letting them become who you are – or are not.
It’s not the label per se, but rather our perception of what our identification with a given role implies. If I identify myself too strongly as a “rubyist” I may not be inclined to recognize the good that is found elsewhere in other programming languages, or worse still, reject such good in an attempt to defend the choice I opted to identify myself with. This inclination is the basis of many of the “religious wars” you see online.
I sometimes find myself in the odd predicament of limiting myself because of some label or assumption of what “a person like me” can and cannot do. In such instances, though, I’m reminded of a few stories about courageous individuals who went beyond labels, above the layer of conventionality, breaking what common sense would have considered a “difficult to challenge” limit. I’m reminded of blind people who took on photography and managed to be successful at it, or of a black kid of Kenyan origins who managed to become the President of the United States of America. But there is one story in particular that always gets me: the story of Django Reinhardt, after whom the popular Python web framework was named.
Django was a Gypsy jazz guitarist who was severely injured in a fire when he was eighteen. As a result of this accident his right leg was paralyzed and the third and fourth fingers on his left hand were severely burned. Doctors recommended amputating his leg and were pretty darn sure that he would never play guitar again due to the extensive damage to his hand. Django refused the amputation though and left the hospital as soon as he could. Within a year he was able to walk again, albeit with the aid of a cane. Even more surprisingly, despite being “disabled” in his left hand, he persisted through the pain to practice his beloved instrument. He went on to reinvent the conventional approach to guitar playing by performing solos with the use of only two fingers, using his half-paralyzed fingers for chord work. Today Django is considered one of the most influential guitarists of the 20th century.
I’ve learned to consciously fight the urge to limit myself. Whatever labels you feel may be cutting your potential short or holding you back, I encourage you to break free and rise above them. Does doing so mean you’ll reinvent the way a musical instrument is played, reshape the course of history or become a hero in your field? Perhaps, but even if it doesn’t, your own life stands to become richer and freer because you decided not to live within the confines of a label.