Like many, I don’t use TextMate just for coding. All of my posts are first drafted in my trusty editor before being published. One of the problems that I had, and that others probably face too, is the less than smooth process of publishing properly highlighted code in posts and HTML pages. A few solutions exist, including embedding gist snippets, using “Create HTML from Document” in TextMate, or adopting JavaScript libraries or WP plugins. But when it comes to highlighting code, for me Pygments is simply unbeatable.
Pygments is a Python library but ships as a command line tool as well. However, switching between TextMate and the command line is not as convenient as I’d like. So on the weekend I pulled out my big sharp razor and started yak shaving. The result of that brief session is a hack that delivers the integration of TextMate and Pygments, so that code can be easily converted to HTML in order to beautifully present it.
First, let’s see how I use it. When I select a snippet of Ruby code in TextMate and press ⌃⌥1, the selection is transformed into the proper HTML. ⌃⌥2 is for Python snippets, ⌃⌥3 for any other language, and ⌃⌥4 also for any language but with the option of adding line numbers. In practice, this means that I use 1 and 2 most of the time and these shortcuts are easy enough to remember. Note that this is not necessarily the best arrangement, but it works well for me. I could, if so inclined, associate all 4 commands with the same shortcut and be prompted by a menu every time this combination is pressed, obtaining something along the lines of the image shown below:

Should I ever forget these 4 shortcuts, I can take a quick look at the Text bundle menu shown below. I placed these commands under the Text menu, since they are globally available for textual formats, whether I’m composing HTML, Textile, Markdown or ReST; but this is entirely arbitrary and I suspect that many would consider the HTML menu instead or place a “Convert to HTML” entry in the menu of the specific language.

Ruby and Python deserve their own command because they are the languages whose code I publish the most, but pressing ⌃⌥3 (or 4) prompts a long list of languages to choose from as shown below (the image is cut to reduce its length):

The following are a series of steps that you can take to reproduce the same results as mine. The HTML required to present the code nicely in this section was generated from within TextMate. In other words, I’m eating my own dog food.
Step 1: If you haven’t done so already, install Pygments. You can get it from the official site.
Step 2: Within TextMate click on the menu entry: Bundles -> Bundle Editor -> Show Bundle Editor and click on the triangle to open up Text in the left pane.
Step 3: Click on the +- button in the lower left corner of the window and select New Command, then name the command Pygmentize Ruby (assuming that you want a command for Ruby).
Step 4: Ensure that the options for Save, Input, Output and Activation match those shown below:
Step 5: Fill the Command(s) text area with the following code:
#!/usr/bin/env python
import os
import sys
from pygments import highlight
from pygments.lexers import RubyLexer
from pygments.formatters import HtmlFormatter
try:
    code = os.environ['TM_SELECTED_TEXT']
except KeyError:
    sys.exit()
formatter = HtmlFormatter()
print highlight(code, RubyLexer(), formatter)
Step 6: Repeat the process for Pygmentize Python, Pygmentize… and Pygmentize with line numbers… but select a different Activation key equivalent (replace 1 with 2, 3 and 4, respectively).
The command code for Pygmentize Python is as follows:
#!/usr/bin/env python
import os
import sys
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
try:
    code = os.environ['TM_SELECTED_TEXT']
except KeyError:
    sys.exit()
formatter = HtmlFormatter()
print highlight(code, PythonLexer(), formatter)
For Pygmentize… use the following:
#!/usr/bin/env python
import os
import sys
from commands import getoutput
from pygments import highlight
from pygments.lexers import get_all_lexers, get_lexer_by_name
from pygments.formatters import HtmlFormatter
try:
    code = os.environ['TM_SELECTED_TEXT']
except KeyError:
    sys.exit()
# Quoted, comma-separated list of each lexer's first alias, for AppleScript.
available_languages = ", ".join(sorted('"'+lex[1][0]+'"' for lex in get_all_lexers()))
# Ask TextMate to show the list and capture the selected language.
chosen_language = getoutput("""echo $(osascript <<'AS'
tell app "TextMate"
activate
choose from list { %(languages)s } \
with title "Pick a language" \
with prompt "Select a language"
end tell
AS)""" % {'languages':available_languages})
# Bring TextMate back to the front once the dialog is dismissed.
os.system("osascript -e 'tell app \"TextMate\" to activate' &>/dev/null &")
lexer = get_lexer_by_name(chosen_language.lower())
formatter = HtmlFormatter() # linenos=False
print highlight(code, lexer, formatter)
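The dialog step is the least obvious part of the script, so here it is in isolation. This is only a sketch under a few assumptions: it targets Python 3 rather than the Python 2 used above, the function names build_chooser_script and parse_choice are hypothetical, and it builds and interprets strings without calling osascript, so it can be tried on any platform. It also covers the case where the dialog is cancelled, in which AppleScript's choose from list returns false.

```python
def build_chooser_script(languages):
    """Return AppleScript source that asks TextMate to show a language list."""
    # AppleScript list items are double-quoted strings separated by commas.
    items = ", ".join('"%s"' % name for name in sorted(languages))
    return (
        'tell app "TextMate"\n'
        "activate\n"
        "choose from list { %s } "
        'with title "Pick a language" '
        'with prompt "Select a language"\n'
        "end tell"
    ) % items


def parse_choice(output):
    """Return the chosen lexer alias, or None if the dialog was cancelled."""
    choice = output.strip().lower()
    # osascript prints "false" when the user presses Cancel.
    if not choice or choice == "false":
        return None
    return choice


script = build_chooser_script(["ruby", "python", "c"])
print(script)
print(parse_choice("Ruby\n"))
print(parse_choice("false\n"))
```

Checking for None before calling get_lexer_by_name() would let the command exit quietly instead of failing when no language is picked.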
And finally for Pygmentize with line numbers… use the almost identical script below:
#!/usr/bin/env python
import os
import sys
from commands import getoutput
from pygments import highlight
from pygments.lexers import get_all_lexers, get_lexer_by_name
from pygments.formatters import HtmlFormatter
try:
    code = os.environ['TM_SELECTED_TEXT']
except KeyError:
    sys.exit()
# Quoted, comma-separated list of each lexer's first alias, for AppleScript.
available_languages = ", ".join(sorted('"'+lex[1][0]+'"' for lex in get_all_lexers()))
# Ask TextMate to show the list and capture the selected language.
chosen_language = getoutput("""echo $(osascript <<'AS'
tell app "TextMate"
activate
choose from list { %(languages)s } \
with title "Pick a language" \
with prompt "Select a language"
end tell
AS)""" % {'languages':available_languages})
# Bring TextMate back to the front once the dialog is dismissed.
os.system("osascript -e 'tell app \"TextMate\" to activate' &>/dev/null &")
lexer = get_lexer_by_name(chosen_language.lower())
# The only difference from the previous script: emit line numbers.
formatter = HtmlFormatter(linenos=True)
print highlight(code, lexer, formatter)
Step 7: Click on Text and, by dragging and dropping, arrange the menu to include the Pygmentize commands as shown below:
Step 8: At this point everything should work, whether you invoke the commands through a keyboard shortcut or through the Text menu. However, you will need to upload and include a Pygments stylesheet from within your site. To generate a stylesheet run the following from the command line:
pygmentize -S default -f html > pygmentize.css
In the above command, default is the name of the style. For example, the Python code you see in this article is styled with the style pastie (because I globally adopted that stylesheet for this site). For a comparison of the available styles check out this demo page.
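The same stylesheet can also be produced from Python, which may be handy if you want to script it. This is a minimal sketch, assuming Pygments is installed and a Python 3 interpreter is used: get_style_defs() is the formatter method that returns the CSS rules, and the .highlight selector is passed so that the rules are scoped to the wrapper class that HtmlFormatter uses by default.

```python
from pygments.formatters import HtmlFormatter

# "default" is the style name; swap in another one (e.g. "pastie") as needed.
formatter = HtmlFormatter(style="default")

# Scope every rule to the ".highlight" wrapper emitted by HtmlFormatter.
css = formatter.get_style_defs(".highlight")

# Print the rules; redirect the output to a .css file to save them.
print(css)
```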
Step 9: ????
Step 10: Profit!
I hope these hacked together commands can be useful to others. Feel free to customize them and improve upon them as it suits your needs.
UPDATE: I made a Pygments TextMate Bundle out of this.
A couple of weeks ago Django 1.0 was finally released. In the software world version numbers can be rather arbitrary, but this announcement electrified the usually quiet community. Hiding behind the 1.0 label there are thousands of bug fixes, code refactoring of crucial components, compatibility with Jython 2.5, and the addition of impressive features such as GeoDjango which adds GIS capabilities to the framework.
Above all, the release of a 1.0 version implies that we can now rely on a stable API. So far most Django developers adopted the trunk version of the framework in production, because a lot of the desirable new features were not available in 0.96. The introduction of Django 1.0 is a fundamental stepping stone, and one I’m sure will lead to even more developers giving Django a chance. It will also facilitate the adoption of the framework within the enterprise, which is rarely keen on the idea of working with the edge version of a 0.x framework.
Matz is often quoted as saying: “Rails is the killer app for Ruby”. There is no doubt that likewise, Django is the killer app for Python. As Django matures and gains further worldwide acceptance, employing Python for web development will become an increasingly common reality, repeating to a certain extent what Rails did for Ruby.
Admittedly, the parallel doesn’t fit perfectly as Ruby on Rails has an overwhelming market share amongst Ruby frameworks, while in the Python community there are a few big contenders (and Zope was the first widely adopted Python framework). But it’s clear to me that Django will bring the popularity of Python on the web to a whole new level, and together with Rails, they will be the two major frameworks for developing web applications, at least in the open source field.
These two frameworks will continue to benefit from this popularity, and indirectly so will their respective languages and communities. Finding a Ruby or a Python job outside of the realm of Silicon Valley startups will no longer be seen as winning the job lottery, but as the norm. And we’ll suddenly realize that the paradigm shift of the first decade of the 21st century in the programming world didn’t come from the adoption of multi-core processors, but rather from a focus on web applications and the consequent rise to fame of MVC frameworks like Rails and Django, whose existence is only possible thanks to open source, dynamic and highly expressive languages such as Ruby and Python.
After several months of keeping it under wraps, I’m happy to officially announce my own web framework to the world. It’s called Ruby on Crack and will be released by RailsConf 2008. The name of the framework was chosen because I wanted to push the idea of a complete break from the existing Ruby frameworks, a clear cut, if you will.
Rails is great, don’t get me wrong, but it’s very opinionated. If you need to get things done in a different way, coding in Rails can become a burden. More reasonable from this viewpoint is Merb, but it’s still somewhat too close to Rails. Ruby on Crack is very different. A programmer using Crack no longer has strong opinions or is constrained within the rules defined by the framework. The only guiding principle is FUD (Fast Useful Development), but you get to decide your own specific style of web development. The framework is very modular and each component is connected to another one through a unified API called PIPE.
According to my preliminary tests, using Crack makes you 5 to 10 times more productive than using Rails. Not only that, but speed wise, you’re running much faster with Crack. Ruby on Crack ships with two extremely fast web servers, as opposed to WEBrick with Rails, called Purebred and Fat. Purebred handles requests by spawning new threads, while Fat uses an event based approach. From what I’ve seen so far, they easily outperform any other existing webservers, to the point that I was able to serve a sample app at a rate of 10,000 requests per second on commodity hardware.
Part of the speed boost that Crack can give you is due to the highly efficient, thread-safe and powerful ORM called Freebase. Even Datamapper is really slow compared to Freebase. On average, Freebase smokes ActiveRecord (it’s 4 to 5 times faster) and it can take advantage of advanced database features which are not supported in ActiveRecord, Datamapper, Og or Sequel.
I don’t want to reveal too much right now, but Ruby on Crack is truly a revolutionary approach to web development and makes you value the true power and colorful nature of Ruby. All of this will be explained in detail in future posts, and the code will be released into the wild for all to enjoy within the next couple of months. It took a lot of effort to get my team to work on Crack, but the results are rather satisfying. Don’t just take my word for it though, here are a few testimonials from prominent figures in the community who’ve had a chance to use the closed beta of Ruby on Crack:
“Those who use Crack can really appreciate the beauty of Ruby.” — Matz
“Fuck You” — DHH
“Incredible framework. I plan to publish a book about it ASAP, and I’m sure that it’ll be the first in a long series of books that I’ll write while using Crack.” — Dave Thomas
“Crack is what we really needed at Engine Yard.” — Ezra Zygmuntowicz
“Wow, what an eye opener. Crack made web programming fun again.” — Obie Fernandez
“You ass#$!@ m0#@#!@&%$” — Zed Shaw
“Once you have tried Crack for the first time, you realize how addictive it is. I simply can’t go back to Rails anymore.” — April Pesce
“Web 3.0 will belong to those on Crack.” — Tim O’Reilly
“If there is something that the Ruby community got right it’s Crack. I wish Python programmers were on Crack too!” — Guido van Rossum
“This is truly innovative and a godsend for startups. Give us about 6 years and we’ll have Arc on Crack too.”- Paul Graham
Django seems to have reached its tipping point, that critical mass which will enable its momentum to skyrocket. Getting here took a while though, partially because of a lack of hype and partially due to Rails’ very prominent presence in the market. Now this well deserving framework has finally begun to be widely adopted and considered as a valid alternative to Rails for agile web development. Why do I care about what other people are going to use? I care because I’m deeply passionate about technology that works and that keeps things as simple as possible, as such forms of innovation always should. Regardless of their adoption, promotion, and the differences in their approaches, both Django and Rails have a lot of substance at their core and can greatly simplify and improve the way a web developer’s creative process flows.
Stating that Django has reached its tipping point is a bold claim, but I can present some evidence to back it up. I will use Rails and Ruby as a comparison for Django and Python – but don’t construe this as a race between the two frameworks. Rails is still the most popular and will probably continue to be for a long time. I’m only comparing the numbers to get an idea of where Django stands right now.
Visiting irc.freenode.net, I’ve noticed that the Python (#python) channel is often more populated than the Ruby one (#ruby-lang), and the same goes for Django (#django) and Ruby on Rails (#rubyonrails). For example, right now I see 517 members for Python and 354 for Ruby, 382 for Django and 298 for Rails. Django and Python consistently have more hackers in their chats than Rails and Ruby. This doesn’t say too much, given that the average developer doesn’t hang out on irc, but it’s still somewhat indicative of Django’s growing community.
Moving to newsgroups/Google Groups, things start to change a little. As I write this, there are 12,457 subscribers for comp.lang.python and only 6,935 for comp.lang.ruby (with 1,857 members in ruby-talk-google). “Django users” has 8,178 members versus the 13,355 of “Ruby on Rails: Talk”. So far this month, the Django group has had 1,244 messages versus the 2,890 of the Rails one. By looking at these numbers, without any pretense of being too scientific in our comparative methods, we get the impression that the Rails community is almost twice as big as the Django one, which sounds about right. On the other hand we also get that the Python community is larger than the Ruby one (confirmed also by the irc results above). In looking at these numbers, Rails also has the advantage of being the most used Ruby framework by far. In Python-land, Turbogears (3,303 members), Pylons (1,333 members) and good old Zope split the pie too, even though Django remains the most popular choice. Guido van Rossum’s blessing for Django was just the icing on the cake.
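As a quick sanity check on the “almost twice as big” impression, the figures quoted above can be turned into ratios. This is just back-of-the-envelope arithmetic on the numbers in this post, written for Python 3; the labels are my own shorthand.

```python
# (larger community, smaller community) counts quoted in the text above.
figures = {
    "Rails vs Django group members": (13355, 8178),
    "Rails vs Django messages this month": (2890, 1244),
    "comp.lang.python vs comp.lang.ruby subscribers": (12457, 6935),
    "#python vs #ruby-lang IRC members": (517, 354),
    "#django vs #rubyonrails IRC members": (382, 298),
}

# Ratio of the first count to the second for each pair.
ratios = {label: first / second for label, (first, second) in figures.items()}

for label, ratio in ratios.items():
    print("%s: %.2f" % (label, ratio))
```

By these numbers the Rails group is roughly 1.6 times the size of the Django one by membership and about 2.3 times by monthly messages, so “almost twice” sits between the two.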
By observing the TIOBE Index, we see that Python is in 7th position versus the 10th position where we find Ruby. Perhaps more interestingly, Python has had a +0.70% delta since last March, while Ruby has had a -0.11% one. Again, this is certainly not an exact science, my friends. TIOBE’s accuracy is often disputed for good reason, but I think it’s still an indicative factor.
Speaking of less than entirely reliable things, Alexa (django vs rails), Compete, and Google Trends (yes, rails is a very generic term) all confirm the anecdotal evidence that Rails is still far more popular. That said, the values start to be at least somewhat comparable.
In my opinion the strongest indicators of Django’s increasing popularity come from the publishing world. You’ll see many books in print for a given topic, only if their publishers believe that there is a large enough market for them. In 2007 the following two books were published: Professional Python Frameworks: Web 2.0 Programming with Django and Turbogears and The Definitive Guide to Django: Web Development Done Right (available for free online). 2008 has only just started and already there’s been one Django title published (Sams Teach Yourself Django in 24 Hours), with two further titles lined up: Practical Django Projects and Python Web Development with Django (which I’m currently reviewing for Pearson, as it’s in the process of being written – and I must say, I think it’s going to be a very good one).
That makes 5 books on Django announced to date, with more lined up to be released this year, I’m sure. There are now many books in print that cover Rails (my recommendations here), but the sudden spurt of Django books reminds me of Rails a couple of years ago and will surely help widen Django’s popularity. Watch closely because things will move fast in Django-land.
Working with Python is nice. Just like Ruby, it usually doesn’t get in the way of my thought process and it comes “with batteries included”. Let’s consider the small task of printing a list of the N most frequent words within a given file:
from string import punctuation
from operator import itemgetter
N = 10
words = {}
words_gen = (word.strip(punctuation).lower() for line in open("test.txt")
                                             for word in line.split())
for word in words_gen:
    words[word] = words.get(word, 0) + 1
top_words = sorted(words.iteritems(), key=itemgetter(1), reverse=True)[:N]
for word, frequency in top_words:
    print "%s: %d" % (word, frequency)
I won’t provide a step by step explanation of what I believe is already rather understandable. There are however a few tricky points worth explaining for those who are not too familiar with the language. First and foremost, I love using Generator Expressions because they are lazily evaluated and have a math-like readability. They are just a very convenient way of creating generator objects. Notice how in the snippet I favor them over the option of reading the whole file into a string by chaining the read() method onto the open() call. Processing the file lazily, one line at a time, results in a significant performance improvement for large files. Generator Expressions and List Comprehensions are extremely useful language features which are inherited from the world of functional programming, and I’m glad that Python fully embraces them.
In the third for loop we count words and add them and their respective frequencies to the words dictionary (similar to a Ruby Hash). Notice how the method get() enabled us to specify a default value before incrementing the counter, in case the given key didn’t exist yet (which means that the word we were adding hadn’t been encountered before). We pass operator.itemgetter() as a keyword argument (another nice Python feature) to the sorted() function. itemgetter() returns a callable object that fetches the given item(s) from its operand which, in our case, essentially means that we can tell sorted() to sort based on the value of the dictionary’s items (the frequency of the words) rather than based on the keys (the words themselves).
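To make the roles of get() and itemgetter() concrete, here is a minimal sketch of the same counting logic applied to a small in-memory list of lines. It assumes a modern Python 3 interpreter rather than the Python 2 used in the snippet above, so items() replaces iteritems() and print is a function. The helper name count_words and the sample text are made up for illustration.

```python
from operator import itemgetter
from string import punctuation


def count_words(lines):
    """Return a dict mapping each lowercased word to its frequency."""
    words = {}
    # Generator expression: words are produced lazily, one line at a time.
    words_gen = (word.strip(punctuation).lower()
                 for line in lines
                 for word in line.split())
    for word in words_gen:
        # get() supplies 0 the first time a word is seen.
        words[word] = words.get(word, 0) + 1
    return words


sample = ["The cat saw the dog.", "The dog ran!"]
counts = count_words(sample)

# itemgetter(1) picks the second element of each (word, count) pair,
# so sorted() orders by frequency rather than by the word itself.
top_words = sorted(counts.items(), key=itemgetter(1), reverse=True)[:2]

for word, frequency in top_words:
    print("%s: %d" % (word, frequency))
```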
Unfortunately there is a problem with this code. It will correctly sort the most popular words in the file, but equally represented words won’t be alphabetically ordered. Given that we specified a reverse order for the sorted() function, we could simply pass it key=itemgetter(1, 0) to order (in descending order) by value first and by key second. But let’s be realistic. In most cases, you want keys whose values are equal to be alphabetically ordered (in ascending order). With a few changes to the code, this can be easily achieved:
from string import punctuation
def sort_items(x, y):
    """Sort by value first, and by key (reverted) second."""
    return cmp(x[1], y[1]) or cmp(y[0], x[0])
N = 10
words = {}
words_gen = (word.strip(punctuation).lower() for line in open("test.txt")
                                             for word in line.split())
for word in words_gen:
    words[word] = words.get(word, 0) + 1
top_words = sorted(words.iteritems(), cmp=sort_items, reverse=True)[:N]
for word, frequency in top_words:
    print "%s: %d" % (word, frequency)
Previously we specified which key to use for sorting, while in this case we have a much greater degree of control. By defining the function sort_items() and passing a reference to it as the cmp argument of sorted(), we get to define how the comparison amongst the items of the dictionary should be carried out. The function that we defined at the beginning of the script will return -1, 0 or 1, depending on how the two key-value pairs compare. The returned value is cmp(x[1], y[1]) or cmp(y[0], x[0]). This may seem complicated but the trick is rather easy. The first part compares the frequencies of the two words and returns 1 or -1 if one is greater than the other. If they are equal, the expression to the left of the or will be 0, which Python treats as false, therefore the expression on the right of the or will be returned. On the right we compare the keys (the words), but swap the arguments x and y to cancel out the effect of the reversed ordering requested from sorted().
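Python 3 removed both the built-in cmp() and the cmp argument of sorted(), so the comparison function needs a small adapter there. The following is a minimal sketch assuming Python 3: functools.cmp_to_key wraps the comparison function so it can be passed as key, and the local cmp helper is a stand-in that I define for the removed built-in. The sample dictionary is invented.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Stand-in for Python 2's built-in cmp(): returns -1, 0 or 1."""
    return (a > b) - (a < b)


def sort_items(x, y):
    """Sort by value first, and by key (reverted) second."""
    return cmp(x[1], y[1]) or cmp(y[0], x[0])


words = {"pear": 2, "apple": 2, "fig": 5, "kiwi": 1}

# cmp_to_key() turns the two-argument comparison into a key function.
top_words = sorted(words.items(), key=cmp_to_key(sort_items), reverse=True)

for word, frequency in top_words:
    print("%s: %d" % (word, frequency))
```

With this data, fig comes first, apple and pear share a count of 2 and appear alphabetically, and kiwi comes last.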
Finally, for those who prefer to use a lambda expression, rather than to define a function, we can write the following:
from string import punctuation
N = 10
words = {}
words_gen = (word.strip(punctuation).lower() for line in open("test.txt")
                                             for word in line.split())
for word in words_gen:
    words[word] = words.get(word, 0) + 1
top_words = sorted(words.iteritems(),
                   cmp=lambda x, y: cmp(x[1], y[1]) or cmp(y[0], x[0]),
                   reverse=True)[:N]
for word, frequency in top_words:
    print "%s: %d" % (word, frequency)
Or simplified further by getting rid of reverse=True and using key rather than cmp:
from string import punctuation
N = 10
words = {}
words_gen = (word.strip(punctuation).lower() for line in open("test.txt")
                                             for word in line.split())
for word in words_gen:
    words[word] = words.get(word, 0) + 1
top_words = sorted(words.iteritems(),
                   key=lambda(word, count): (-count, word))[:N]
for word, frequency in top_words:
    print "%s: %d" % (word, frequency)
Please bear in mind that the code makes a few assumptions so as to keep things simple. As it stands, the script would consider “l’amore” as a single word, and an accidental lack of spaces wouldn’t be accounted for (e.g. “word.Another” would be a single word too). The replace() method can be used to address these sorts of special cases.
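As a rough illustration of the replace() idea, the sketch below swaps apostrophes and full stops for spaces before splitting, then applies the (-count, word) key. It assumes Python 3, where tuple-unpacking lambdas such as lambda(word, count) are no longer allowed. The set of characters to replace is my own choice and is far from a complete tokenizer.

```python
from string import punctuation

# Characters treated as word separators; an illustrative choice only.
SEPARATORS = ("'", "\u2019", ".")


def split_words(line):
    """Split a line into lowercased words, breaking on SEPARATORS too."""
    for separator in SEPARATORS:
        line = line.replace(separator, " ")
    stripped = (word.strip(punctuation).lower() for word in line.split())
    # Skip tokens that were nothing but punctuation.
    return [word for word in stripped if word]


words = {}
for line in ["L'amore vince.Sempre", "amore, amore!"]:
    for word in split_words(line):
        words[word] = words.get(word, 0) + 1

# Highest count first; alphabetical order among equal counts.
top_words = sorted(words.items(), key=lambda item: (-item[1], item[0]))

for word, frequency in top_words:
    print("%s: %d" % (word, frequency))
```

The trade-off is that treating every full stop as a separator also splits things like decimal numbers and abbreviations, so the list of separators is worth adapting to the text at hand.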
Sure, this was a rather trivial example, born from an IPython session, but I think it shows off Python’s expressiveness and flexibility when dealing with problems that, approached in some other languages, would be much more error prone and verbose. Batteries included indeed.
Keen observers amongst my readers may have noticed that I’ve subtly renamed my blog. It used to be “Zen and the Art of Ruby Programming” and now it just reads “Zen and the Art of Programming”. Perhaps you also noticed that my Ruby logo has been replaced with a cuter one created for the Snakes and Rubies conference, which was held about two years ago at DePaul University (by the way, I don’t know who the creator of that great logo is, but really hope he/she doesn’t mind me using it. If anyone knows who designed it, please let me know so that I can fully give the right person credit for it).
For several months now I’ve been pondering this decision before finally taking the name change plunge. Don’t worry, I’m not going all good ol’ Zed on you. I like Ruby as much as I ever did, I think that in the past four years we have come a long way and I see a bright future for Ruby. My decision has little to do with Ruby or Rails. No Ruby, it’s not you, it’s me.
There are a few reasons behind my decision:
What’s in a name? A lot, I think. While there may be a slight loss of “branding” and some negative effects when it comes to SEO, I don’t really mind, and it’s a risk that I’m willing to take. I want to explore the possibility of “Zen and the Art of Programming”, even if it proves to be a mistake.
Rails has been a blessing and a curse for the Ruby community. It brought sudden popularity to the language with all the consequences, good and bad, that usually result from exponential growth. On one hand, it gave many developers the chance to appreciate the design of the Ruby language based on its own merit. On the other hand though, it’s been a cash cow that’s changed the community forever by attracting all kinds of attention. Rails has become the poster child for Ruby, blurring the distinction between the Ruby and Rails communities. A large number of web programmers got to experience the “ease of use” and beauty of Ruby development, but it also clearly exposed Ruby’s implementation shortcomings. Rails enabled Ruby to go from a relatively unknown programming language to a mainstream one within the space of just a couple of years. Languages are made to be used, so overall, most would agree that the advantages for Ruby, brought forth by Rails’ success, far outweigh the negative aspects. In the words of Matz, Ruby’s creator:
Rails is the killer app for Ruby.
— Yukihiro Matsumoto
Whatever your take on the subject is, in this article I argue that Ruby on Rails is actually the best thing that ever happened to Python. Rails is a successful Ruby framework, so some may think that it would convince people to switch from Python to Ruby. In other words, naively, one could think of Rails as a Python exterminator, at least as far as web development goes (which is a big deal nowadays). This couldn’t be further from the truth. This year is going to be a great one in terms of the adoption of Python, and I think that Rails has had a positive influence in this regard.
Independently of Rails and Ruby, over the past few years Python has had a good deal of penetration within the market. It wasn’t Java, of course, but as far as dynamic languages go, Python was well represented in many companies and carved out a respectable niche even within the enterprise world. What Rails did was to promote and popularize the usage of MVC web frameworks written in dynamic languages. What could have been considered an unusual choice in the pre-Rails era is now viewed by most as the right way of developing web applications. The marketing abilities of the Rails community benefited the Python one, in the sense that they made the usage of dynamic languages for serious web projects a very acceptable and even well warranted choice.
Rails got the message out that MVC web frameworks and “scripting” languages could be very productive and much nicer to work with. Django and other web frameworks are indirectly taking advantage of this. A couple of years ago people were asking me whether it was better to adopt Rails or stick with Java for a given project. Nowadays, most emails and requests that I receive are about whether it makes sense to adopt Rails and Ruby or if it would be more sound to use Django and Python.
Even Sun initially hired Ruby hackers to work on JRuby, and has only recently announced that two Pythonistas will work on Jython full-time. Yes, they are late to the party and should have done this years ago, but I think that Rails’ popularity and hype led them to consider Ruby first before eventually realizing that they should have done the same thing for Python. Rails has been, at least in part, a catalyst for Python’s success.
Don’t let this confuse you though. Python (and Django) are able to benefit from all the interest geared towards dynamic languages, only because they are technically excellent and make a strong case on their own. Their communities are much less about marketing and more about substance, in my opinion. I understand those who go from Ruby to Python, but there are far fewer motivations in favor of a switch from Python to Ruby. The reason for this is that, in a way, Python is currently an answer to Ruby’s MRI shortcomings. When I speak about Ruby’s shortcomings I always refer to the implementation and not to the design of the language, which is a well balanced and coherent mix of paradigms and features.
Things will change with Rubinius (and perhaps JRuby), but as it stands right now, Ruby’s implementation has critical limitations that affect the development of non-toy projects and its adoption within the enterprise world. Speed, I/O processing, threads, garbage collection and troubling deployment when compared to other languages, are serious issues. It doesn’t mean that you can’t use it, just that the limitations that are in place will make your life more difficult on certain projects.
In a way, Python is the only acceptable implementation of Ruby for certain values of Ruby. The two languages have different approaches and quite a few distinctions which make them both unique. Overall though there are plenty of similarities that make developing in one or the other a somewhat comparable mental process and coding experience. Sure, Python doesn’t promote the functional paradigm as much, and it’s less implicit/magical than Ruby (read import this). This could be a good thing when it comes to large and enterprise projects, but all in all the two are not that different from each other. They’re Capelli d’angelo and Spaghettini, as my fellow countryman Alex Martelli eloquently put it.
As much as I may like Ruby more as far as language design goes, not only does Python boast a very solid implementation, it has several advantages over Ruby that go beyond the interpreter. I’ve found that Python has an incredible amount of rock solid, high quality libraries that perform very well. Not all of them of course, but most are well coded, maintained and documented. In Rubyland we can’t claim the same levels of good reusable code. I use both and I see a big difference. Lingering for a moment on the subject of documentation, Python has a wealth of tutorials, guides and even entire books available for free online. Learning Python from these, without spending a cent, is a walk in the park. Ruby on the other hand has good books in print, but a long way to go as far as free documentation goes. And this is true for both Ruby and Rails.
The community attitude is much different too. The Python and Django communities generally keep a low profile, following the “shut up and show them the code” mantra. They do have some marketing issues, but they can’t be accused of hyping their technology the way Rails has, at least in part. For example, Django is a very mature framework that’s several years old, and yet it still hasn’t been tagged as a 1.0 version. If Twisted Matrix were implemented in Ruby it would be advertised as the second coming of Christ.
Jokes aside, I feel that the Python community has a very good attitude which is in no way altered by the Django one, because it happens to share the same traits. There are no exclusive private clubs or the feeling of experiencing a technological “gold rush”, even though the community is in no way smaller. This may seem like a minor point, but for many the maturity, pragmatism and attitude of the community is a big selling point for Python/Django.
I’ll refrain from indulging in a full length comparison of Rails and Django. But I must briefly mention that Django takes the cake when it comes to creating applications that do web publishing (for obvious background reasons). Despite attempts to put Photoshop online, I feel that most of the web is still about publishing and interacting with published data. This makes Django a good choice in most cases. Of course, should you develop a web application that intends to replace a desktop app, Rails will most likely have the edge there.
So what does this mean for me personally? I’ll use them both, as I’m a firm believer in using the right tool for the right job. I’ll give you an example. Stacktrace.it is a small revolution in the Italian IT publishing world. A few months ago I contacted about 30 people from amongst the best Italian hackers and IT professionals and I proposed that together we create a site in which we’d promote and influence software development and IT in general in Italy (even if I live in Canada myself). The site has been going very well and has attracted a group of highly technical people who follow and contribute regularly with high quality articles. The code was based on Luambo, an open source blog engine yet to be officially packaged and promoted to the world. What was remarkable about it was that most of the new features that were requested in our private mailing list ended up in the code in less than five minutes. An unbelievable level of productivity. Even our designer, who was not familiar with the language or the framework, was easily able to work on the project. Guess what? Despite loving Ruby and Rails, when it came to deciding on the framework and language to be used for that project, I strongly pushed for the adoption of Django and Python, and not Ruby on Rails. It was simply the right tool.
A while ago I informally announced IBM’s intention to develop an SQLAlchemy adapter for DB2 and Informix IDS. Today, I’m happy to inform you that we have a first working release for DB2 on Linux, Unix and Windows (LUW). Support for Informix IDS is next (almost done), and after that, it will be System i and z/OS’ turn.
This release will surely excite those Pythonistas who can appreciate DB2 for what it is: one of the most powerful data servers in the world. Which, in its Express-C version, also happens to be gratis (“free as in free beer”). But there is more to it than just that.
IBM has in fact created a project on Google Code, for supporting Python development with IBM Data Servers. Aside from downloads and SVN access, this gives the project a nice public bug tracker which was missing up until this point. A Google Group was also created in order to have an easy to follow support mailing list, and I invite you to join it now.
With the switch to Google Code, there was also an update to the Python drivers (now version 0.2.5), which contain a few improvements and a bug fix for the egg that wasn’t working properly on Linux.
The project currently hosts the following components:
Please use the driver and/or the adapter for SQLAlchemy and let us know if you encounter any issues or have any feedback about it.
This how-to is essentially the same as my previous one, only this time I’ve provided step-by-step instructions for installing Django with PostgreSQL on Ubuntu 7.10.
First and foremost, we are going to install Django from its svn repository, as opposed to obtaining the 0.96 release archive. The reason for this is that the trunk version implements a few new features. The development code is also rather stable and used by most people in production mode, even for sites like the Washington Post.
Install Subversion
sudo apt-get install subversion
Checkout Django
svn co http://code.djangoproject.com/svn/django/trunk django_trunk
Tell Python where Django is
Ubuntu already ships with Python 2.5.1, thus you won’t have to install it. You can verify this by running python in your shell (use exit() to get out of the python shell). What you need to do is inform Python about the location of your django_trunk directory. To do this create the following file:
/usr/lib/python2.5/site-packages/django.pth
Within this file, place only one line containing the path to your django_trunk folder. In my case, this is:
/home/antonio/django_trunk
Of course, change it to the full path location of the directory on your filesystem.
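To see why a one-line .pth file is enough, here is a minimal sketch of the mechanism using Python's standard site module. It assumes a current Python 3 rather than the 2.5.1 used in this post, and the temporary directories and the add_via_pth helper are hypothetical stand-ins for site-packages and django_trunk.

```python
import os
import site
import sys
import tempfile


def add_via_pth(site_dir, target_dir, pth_name="demo.pth"):
    """Write a one-line .pth file into site_dir and let `site` process it."""
    pth_path = os.path.join(site_dir, pth_name)
    with open(pth_path, "w") as handle:
        handle.write(target_dir + "\n")
    # addsitedir() scans site_dir for *.pth files and appends every listed
    # path that exists on disk to sys.path.
    site.addsitedir(site_dir)
    return pth_path


if __name__ == "__main__":
    fake_site = tempfile.mkdtemp()      # stands in for site-packages
    fake_checkout = tempfile.mkdtemp()  # stands in for django_trunk
    add_via_pth(fake_site, fake_checkout)
    print(fake_checkout in sys.path)
```

At interpreter startup Python does the same scan on the real site-packages directory, which is why no call is needed in the setup described above.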
Add django-admin.py to your PATH
The bin directory within the django folder (which is inside django_trunk itself) contains several management utilities. We therefore need to add the following to the PATH (again, change it to your own location):
/home/antonio/django_trunk/django/bin
How you go about doing this depends on the shell you are using, and I’m assuming you are able to export a shell variable on your own. If you are using the bash shell (as I do), you could export it in .bashrc. Alternatively, you could just create a symlink to the utility django-admin.py in /usr/bin, but I recommend the former approach.
Install PostgreSQL and Psycopg2
sudo apt-get install postgresql pgadmin3 python-psycopg2
This will install PostgreSQL 8.2.5, PgAdmin III and the driver Psycopg2 for you. Most people at this point will ask, what’s the default password for PostgreSQL on Ubuntu? You can use the following instructions to set the password for the user postgres both in Ubuntu and within PostgreSQL:
sudo su -
passwd postgres
su postgres
psql template1
The last instruction should open the psql shell, where you can run the following:
ALTER USER postgres WITH ENCRYPTED PASSWORD 'mypassword';
Verify the installation
You should be all set now, but let’s verify this right away. Open the shell and run the following instructions inside the python shell (start off with the python command).
>>> import django
>>> print django.VERSION
(0, 97, 'pre')
>>> import psycopg2
>>> psycopg2.apilevel
'2.0'
Leave the python shell by running exit(), then verify that django-admin.py is in your path:
django-admin.py
Type 'django-admin.py help' for usage.
If you obtain a similar output for all three of them, you are really set.
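The three manual checks can also be bundled into one script. The following is only a sketch, assuming a current Python 3 (shutil.which does not exist in 2.5); check_environment is a hypothetical helper name, and the module and tool names passed to it are the ones used in this how-to.

```python
import importlib.util
import shutil


def check_environment(modules, tools):
    """Return (missing_modules, missing_tools) without importing anything."""
    # find_spec() returns None when a top-level module cannot be located.
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    # which() returns None when an executable is not found on the PATH.
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return missing_modules, missing_tools


if __name__ == "__main__":
    missing_modules, missing_tools = check_environment(
        ["django", "psycopg2"], ["django-admin.py"]
    )
    print("missing modules:", missing_modules or "none")
    print("missing tools:", missing_tools or "none")
```

Two empty lists mean the .pth file, the driver and the PATH entry are all in place.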
Where to go from here
Now that Django is installed, you can go read the Django Book 1.0 that’s available for free online. Something equally well done and useful is really missing from the Rails community. Above all, experiment: Django (and programming in general) is learnt by doing. The Definitive Guide to Django: Web Development Done Right is also available for purchase in its dead-tree version, which just came out. It’s cheap and it’s already a best seller on Amazon. Despite the availability of a free version online, I like having paper versions of tech books so that I can read without staring at the monitor. Furthermore, I feel like rewarding the authors (who are also the framework creators), while encouraging publishing companies that are willing to allow authors to make their books available for free on the web. Well done guys!
Installing Django on Mac OS X Leopard is supposed to be very straightforward, but if you are new to it, you may encounter a few puzzling questions and, in the case of MySQL, even a couple of headaches. I’m writing about this for the benefit of those of you who may attempt and struggle with this feat. MacPorts is not required for this how-to.
First and foremost, we are going to install Django from its svn repository, as opposed to obtaining the 0.96 release archive. The reason for this is that the trunk version implements a few new features. The development code is also rather stable and used by most people in production mode, even for sites like the Washington Post.
Checkout Django
svn co http://code.djangoproject.com/svn/django/trunk django_trunk
Tell Python where Django is
Mac OS X 10.5 already ships with Python 2.5.1, thus you won’t have to install it. You can verify this by running python in the Terminal (use exit() to get out of the python shell). What you need to do is inform Python about the location of your django_trunk directory. To do this create the following file:
/Library/Python/2.5/site-packages/django.pth
Within this file, place only one line containing the path to your django_trunk folder. In my case, this is:
/Users/Antonio/Code/django_trunk
Of course, change it to the full path location of the directory on your filesystem.
Add django-admin.py to your PATH
The bin directory within the django folder (which is inside django_trunk itself) contains several management utilities. We therefore need to add the following to the PATH (again, change it to your own location):
/Users/Antonio/Code/django_trunk/django/bin
How you go about doing this depends on the shell you are using, and I’m assuming you are able to export a shell variable on your own. If you are using the bash shell (as I do), then you should have a .profile file in your home directory. Alternatively, you could just create a symlink to the utility django-admin.py in /usr/bin, but I recommend the former approach.
Grab and install MySQL
I would normally recommend PostgreSQL, at least until we have DB2 on Mac, but I realize that many of you use and prefer MySQL, which also seems to be the only one that requires special instructions due to a few installation issues when trying to get MySQL and Python to work together. You can install MySQL by grabbing and running one of the packages that are available on the official site. Choose the one for x86 and Mac OS X 10.4.
Install the MySQLdb driver
Get MySQL-python-1.2.2.tar.gz from SourceForge. Please follow these exact instructions because the source code won’t compile out of the box and will give you the following error when trying to build it:
/usr/include/sys/types.h:92: error: duplicate 'unsigned'
/usr/include/sys/types.h:92: error: two or more data types
in declaration specifiers
error: Setup script exited with error: command 'gcc' failed
Run the following:
tar xvfz MySQL-python-1.2.2.tar.gz
cd MySQL-python-1.2.2
At this point, edit the _mysql.c file and comment out lines 37, 38 and 39 as follows:
//#ifndef uint
//#define uint unsigned int
//#endif
Now, from the MySQL-python-1.2.2 folder run:
python setup.py build
sudo python setup.py install
If you still get an error (and only in that case) you’ll need to edit the site.cfg file within the same folder and set threadsafe = False, before running the two commands above once again.
If, instead, you don’t receive an error but you see warnings about files not required on this architecture, don’t be concerned about them. The last step required is to create a symbolic link with the following command:
sudo ln -s /usr/local/mysql/lib/ /usr/local/mysql/lib/mysql
All these adjustments are required because we are building and installing the driver on Mac and not on Linux.
Verify the installation
You should be all set now, but let’s verify this right away. Open Terminal and run the following commands in the python shell (start this with the python command).
Verify that MySQLdb is correctly installed:
>>> import MySQLdb
>>> MySQLdb.apilevel
'2.0'
Now, verify that Django is working:
>>> import django
>>> print django.VERSION
(0, 97, 'pre')
Leave the python shell by running exit(), then verify that django-admin.py is in your path:
django-admin.py
Type 'django-admin.py help' for usage.
If you obtain a similar output for all three of them, you are really set to write the next YouTube.
Where to go from here
Now that Django is installed, you can go read the Django Book 1.0 that’s available for free online. Something equally well done and useful is really missing from the Rails community. Above all, experiment: Django (and programming in general) is learnt by doing. The Definitive Guide to Django: Web Development Done Right is also available for purchase in its dead-tree version, which just came out. It’s cheap and it’s already a best seller on Amazon. Despite the availability of a free version online, I like having paper versions of tech books so that I can read without staring at the monitor. Furthermore, I feel like rewarding the authors (who are also the framework creators), while encouraging publishing companies that are willing to allow authors to make their books available for free on the web. Well done guys!
For a month or so I’ve been running (or should I say rolling?) my ads with Adroll.com. I want to briefly introduce this cool niche network ad service to other programmers, so that we stop receiving travel ads whenever we talk about Rails.
Traditionally, website and blog owners put Google Ads up on their sites. In most cases, unless you are uber-popular, Google gives you peanuts. This is particularly true for technical sites where the audience is very web-savvy and rarely clicks on ads. On top of that, and rightfully so, many programmers use Adblock Plus and won’t see your ads either, unlike the viewing public of more general sites. Granted, ads are there mostly to recoup expenses for the server and, in my case, they aren’t expected to turn a profit, but obtaining a decent eCPM with a programming blog is not an easy task for anyone. There is a big difference between the eCPM obtainable when running Google Ads and when serving ads for a sponsor’s campaign. Sponsors’ campaigns, whether click or impression based, can give you a lot of cash to buy all those cool books and even a nice treat. The problem is that unless your site is popular, you won’t be of any interest to advertisement agencies who work on behalf of big companies. They are looking for millions of impressions and chances are that your programming blog won’t ever reach anything remotely close to that. So you are stuck with ridiculously low eCPMs from Google, which serves thousands upon thousands of impressions for less than a breakfast at Starbucks.
This is where Adroll kicks in. Adroll isn’t too well known yet, but they are growing steadily and are “revolutionary” in their own way. Adroll allows advertisers to buy ad campaigns on your website. Nothing new there, I can hear you say. The innovation lies in the fact that with Adroll you can create or join topical communities of sites. When you start to have a group of, say, a few dozen car sites and blogs, the number of pageviews and unique visitors becomes large enough to interest the agencies who work on behalf of BMW or Mercedes, for example. And that’s when the real dough arrives. Sure, your site won’t be making a huge deal of money if it’s small, but the earnings per thousand clicks or impressions are going to be far greater than what Google usually offers. And more importantly, ads will be relevant to the topic of your site and more likely to interest your visitors. With Adroll, you can also set the minimum eCPM that you require from advertisers, therefore excluding campaigns that pay too little.
Not only that, but you can let your site participate in different topical communities, therefore increasing your chances of being part of a campaign from a good advertiser. If all this wasn’t neat enough, their service is simply perfect for managing your existing Google ads. You can even use it to rotate your typical ad services (Google, Yahoo, etc…) or your in-house ads, and Adroll won’t make a cent off you. You will still get paid 100% from Google and these other services, while Adroll just serves their ads and doesn’t enter into the monetary picture at all. For example, right now, as I wait for a few of you to join me, I’ve been using it to serve Google ads in two spots, and “advertise here” in a third spot. Adroll provides you with detailed and up to date statistics and also shows advertisers the most common keywords used to reach your site.
Here is an illustration and their own description (from their About page) of their services:

“Problems we solve for advertisers:
Problems we solve for online publishers:
Long story short, I was able to obtain 50 invites for their private beta. These are going fast, so grab one for your programming blog or site now. I also created a Programmers Ad Community, which I invite all interested programming and development bloggers (and site owners) to join.
You simply need to do the following:
For those who are curious, this San Francisco company is powered by Python and proudly uses Twisted Matrix. Their team has hardcore hackers like my friend Valentino Volonghi and SQLAlchemy’s creator Mike Bayer. Those are just two of the smart cookies who have been hired by this company. And in fact, Adroll works very well, which is rare for a beta product. I strongly believe that they will succeed in their intent of changing the ad market by providing relevant ads to niche content providers.
Zooppa
Since I’m on the unusual (for this site) topic of ads, I’d like to bring another cool startup, called Zooppa, to your attention. This site is innovative and very popular in Italy and Europe, but it has been completely ignored by US tech bloggers, perhaps because it doesn’t burn millions of VC dollars and it’s not based in Silicon Valley or on this side of the Atlantic. Techcrunch should really talk about this one.
Regardless, take a look at this site and you’ll be hooked. It works in the following way: an advertiser asks the community to create a video, print or radio ad for their product. The creative process starts and the participants’ work is voted on. The winner(s) gets the money and gets to see their work used by big companies like HTC, Citroen, Fineco, et cetera. This is good for big companies because each video, including those that don’t win, gives the product a lot of exposure. Each clip boosts the brand’s image. It’s also good for creative people, who get rewarded with hard cash for their work, and non-professional filmmakers have, for the first time, a shot at working for a big company. For us spectators, the site is quite entertaining and there are plenty of funny clips to watch. I wonder how long it’ll take before YouTube adopts something similar?
Zenbits are posts which include a variety of interesting subjects that I’d like to talk about briefly, without writing a post for each of them.
A few hours ago Rails 2.0 was finally (quietly) released. Unfortunately if you try ‘gem update’ or ‘gem install rails’ you will get the following error:
ERROR: Error installing rails:
rails requires activeresource (= 2.0.0)
To solve this problem, assuming you are installing, simply run:
$ gem install rails --source http://gems.rubyonrails.org
For details about this new release wait for the official announcement by DHH and his team.
On the subject of announcements, the IBM_DB Python driver for DB2 version 0.2.0 was released. This includes a Python egg for Linux and (finally) for Windows. You can download both of them from Cheeseshop.
A few days ago NetBeans 6.0 was released. Its support for Ruby and for Rails is stellar. Its editor seems to be refined to provide developers with a comfortable environment for programming Ruby and Rails applications in. The code auto-completion (with documentation on the fly) alone makes it extremely valuable. From what I’ve seen so far it’s a solid, well thought out IDE that sets the bar high when it comes to the world of Ruby/Rails editors. Now we need Aptana IDE to implement similar features, for those of us who use and prefer (at least on Windows and Linux) an Eclipse based IDE. Between NetBeans’ support for Ruby and the active development of JRuby, one can only conclude that Sun is very serious about Ruby and that they really “get it”. We can wish for the same kind of commitment from Microsoft, but so far I get the impression that projects like IronRuby are seen by Microsoft as little more than pet projects just like IronPython is. But I’d be happy if my first impression was to be proved wrong. That said, Microsoft is receiving a huge wake up call from their research division, as shown by excellent videos which cover non-mainstream and research topics as well. They’ve also proved this by incorporating advanced features from research languages in C# 3.0. We’ll see how it goes, but it looks like there might be some hope after all.
Speaking of videos, I recommend a fantastic interview with E.W.Dijkstra, recorded a few years ago. It’s called Discipline in Thought and deals with the subject of the nature of programming. I highly suggest that you watch part 1, part 2 and part 3. After that, you can dig further by reading some of his manuscripts in this archive.
On a different topic, I’d like to thank everyone who commented and posted about my Ruby shootout. We made the front page of, among others, Del.icio.us. Its popularity is important to me, because it gives the proper exposure to these projects and their authors and debunks the myth that we are all happy with Ruby’s status quo in terms of speed. The next run will add extra benchmarks (in order to provide less of an advantage to Ruby 1.9). Performance is not everything, but it can be an important aspect. I like Charles Oliver Nutter’s (of JRuby) approach:
If you run across benchmarks of any kind that show JRuby running slower than Ruby 1.8.x, we’d appreciate you filing them as bugs.
That’s the right attitude: it shows serious commitment in terms of resolving this issue. Kudos to him and his team. As far as commitment goes, I can’t praise Engine Yard enough, as they’ve just hired two excellent hackers (Ryan “zenspider” Davis and Eric “drbrain” Hodel) to work full time on Rubinius along with Evan Phoenix (who started the project in the first place). From January onward, Engine Yard will also pay Wilson “Defiler” Bilkovich and Brian “brixen” Ford to do work on Rubinius. That’s a ridiculously high IQ potential to have working on Rubinius. We can only expect great results and undoubtedly say that Engine Yard really gets it.
Finally, for those of you who requested it, please find here the results of my benchmarks in Excel and PDF format.
Update
The released (through rubygems) but not announced Rails 2.0 has now been upgraded to 2.0.1, and that’s what you’d get if you ran ‘gem install rails’ or ‘gem update’. The error reported above still exists, so you can update by specifying the source as mentioned in this post.
Update 2
For those of you who didn’t believe in my “scoop”, here is the official announcement with all the glorious details. Awesome!
Update 3
The gems should be properly propagated now, so that error shouldn’t be there anymore.
My post about Ruby 1.9’s impressive improvement over Ruby 1.8.6 created quite an echo within the developer community. Sure, the headline was an attention grabber, just like this one is, but in a matter of a few hours, there were all sorts of blog entries with variants in many languages, more than 200 comments on Reddit, and fifty comments on my own blog. There were, however, also a few misconceptions. It was great though, because such a simple post generated a lot of discussion amongst developers, with some insightful arguments taking place, and besides, it almost created a new meme with the whole “holy shmoly” thing. Fun as that certainly was, let’s try to summarize and clarify a few points.
First and foremost, for those who stopped at the title of the article and didn’t read on, I made it very clear in several places that I didn’t even begin to predict that Ruby 1.9 will be faster than Python 2.5.1 when it comes to real world applications. I ran a simple micro-benchmark where this just happened to be the case. Chances are that Python will have the edge in many instances, especially if we consider that it has several optimized libraries which may still be missing or suboptimal in Ruby. Within the scope of the recursive Fibonacci’s test, which essentially stress-tests method/function calling, Ruby 1.9 seems to be more than 13 times faster than Ruby 1.8.6, but within the Ruby community it’s well known that this improvement factor is not very often replicated in other micro-benchmarks or actual applications.
Ruby is improving and its progress is impressive, so let’s drive this point home and try not to speculate too much. Someone also pointed out Ruby’s lack of Unicode support and, in other cases, its disadvantages compared to Python. That’s missing the point: the article isn’t aimed at comparing the two or claiming that one language is better than the other. The essence of the message was, and remains, that Ruby’s main interpreter, which was typically fairly slow, will soon be replaced by Yarv, which will drastically improve Ruby’s speed, to the point where perhaps it will be comparable with the not-too-fast but acceptable CPython. And yes, even for Python there are psyco and pypy, both of which would change the outcome of my test. With this out of the way, let’s move on.
Dima Dogadaylo and my friend Valentino Volonghi both blogged about a much faster version (they employed the use of generators) of my Python snippet. Many other people in this blog and on Reddit proposed faster algorithms too. That’s all well and fine, but it really compares apples and oranges. If we switch from a naive recursive algorithm to an iterative one, even Ruby 1.8.6 will be faster and able to compute the task in a few moments. The computationally expensive and inefficient recursive algorithm should be used by those who want to compare other languages with my results, otherwise the comparison will be meaningless. Using a fast language to implement a slow algorithm is always going to be slower than using a slow language with an efficient algorithm, for N sufficiently large.
The Fibonacci function is mathematically defined as follows:
F(0) = 0, F(1) = 1, and F(n) = F(n-1) + F(n-2) for n > 1
In the intentionally naive and recursive algorithm I adopted in my original post, I essentially wrote the mathematical formula in Ruby and Python syntax, and then executed it for n in a range that goes from 0 up to 35. This is very inefficient, because the tree-recursive process generated in computing F(n) grows exponentially. We have:
F(0) = 0
F(1) = 1
F(2) = F(1) + F(0)
F(3) = F(2) + F(1) = F(1) + F(0) + F(1)
F(4) = F(3) + F(2) = F(1) + F(0) + F(1) + F(1) + F(0)
F(5) = F(4) + F(3) = F(1) + F(0) + F(1) + F(1) + F(0) + F(1) + F(0) + F(1)
F(6) = F(5) + F(4) = F(1) + F(0) + F(1) + F(1) + F(0) + F(1) + F(0) + F(1) + F(1) + F(0) + F(1) + F(1) + F(0)
…
You can see where this is going. The recurrence relation generated by the algorithm implies that we execute 3*F(n) - 2 lines of code (for n nontrivial). When n=35, this is a huge number, since F(35) = 9227465. Now you can see why the algorithm really stress-tests function calling and why Ruby 1.8.6 chokes. If we want to evaluate the computational complexity of the algorithm in terms of big-O notation, we have O(phi^n), where phi is the Golden Ratio and is equal to (sqrt(5) + 1)/2. That’s exponential. Valentino, Dima and many others came up with variants of iterative or tail-recursive algorithms, whose complexity is linear (O(n)). Using memoization (the technique of keeping a cache of the previous calculations) or a simple iterative loop changes the algorithm’s efficiency. The difference between my original algorithm and these others is humongous and there is no point in comparing them as a means of evaluating the runtime speed of a given language. At that point we could very well use Binet’s formula, or Edsger Dijkstra’s algorithm or even 2×2 matrix exponentiation, but we’d not be proving any point there. If you want to learn more about algorithms (a necessity for any serious programmer), I strongly suggest the introductory textbook “Introduction to Algorithms”.
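To make the gap concrete, here is a small sketch that puts the naive, iterative and memoized variants side by side. It assumes a current Python 3 (not the 2.5.1 used in the benchmark), and the function names are mine. Note that it counts function calls rather than lines of code, so its numbers differ from the figure quoted above, although they grow at the same phi^n rate.

```python
from functools import lru_cache


def fib_naive(n, counter):
    """Tree-recursive Fibonacci; counter[0] tallies every call made."""
    counter[0] += 1
    if n == 0 or n == 1:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


def fib_iterative(n):
    """Linear-time loop: exactly n additions."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recursive definition, but each F(k) is computed only once."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


if __name__ == "__main__":
    for n in (10, 20, 25):
        calls = [0]
        value = fib_naive(n, calls)
        print("n=%d F(n)=%d naive calls=%d loop steps=%d" % (n, value, calls[0], n))
```

In this sketch the naive version makes 2*F(n+1) - 1 calls to compute F(n), while the loop needs n steps, which is why swapping the algorithm says nothing about the speed of the language.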
Don Stewart, a real celebrity in the Haskell community, has replied to my post with two articles that essentially illustrate a couple of points that are not new to many people: 1) Haskell is fast, much faster than Python and Ruby, 2) Haskell’s ability to take advantage of multiple cores by following parallelism hints placed in the code for the compiler, is just plain awesome and easy on programmers. Don did use old versions of Ruby and Python, but I appreciate his response a lot, because he kept the same algorithm in place. He didn’t bait and switch, using one of the many fast implementations available on the Haskell wiki. His fair comparison showed, despite the very limited scope of the test, what kind of performance we can expect from this functional language’s main compiler (GHC).
As I said in the past, Haskell really is a language worth getting excited about. But it’s not all about performance and the trend of increasingly multicored CPUs. So I’m glad that we have both Ruby and Haskell with their strengths and weaknesses. While Ruby 1.9 will hopefully give us a runtime that’s seriously fast enough™ in most circumstances, it’d be nice if in Ruby’s future there were features that allowed us to take advantage of multiple cores, just like Haskell does without cumbersome code modifications.
Amongst the other replies there was a mix of everything, including Assembly and LOLcode, but I’d like to point out the post by a lisper, who took the Haskell vs Lisp approach in “Dude, your quad-cores have been smoking way too much Haskell!”. He runs the following code, first for n=45 and then for n=4500:
(defun printfib-trec (start end)
  (format t "n=~D => ~D~%" start (fib-trec start))
  (if (< start end)
      (printfib-trec (+ start 1) end)))

(defun fib-trec (n)
  "Tail-recursive Fibonacci number function"
  (labels ((calc-fib (n a b)
             (if (= n 0)
                 a
                 (calc-fib (- n 1) b (+ a b)))))
    (calc-fib n 0 1)))

(time (printfib-trec 0 4500))
On my machine this runs in 3.291 seconds. Algorithm 101, guys. Quick question for my readers: how can an algorithm that is supposed to be O(phi^n) execute in 3 seconds for n=4500? Simple: it’s that blogger who is being naive, and not the algorithm that he adopted. If you pay attention you can see that he’s trying to compare the linear O(n) tail-recursive implementation of Fibonacci in Lisp with the naive recursive one in Haskell, and from this he concludes “Oops. Sorry Haskell…”. Slow down, cowboy! You want to compare Lisp with Haskell? Let’s do a fair comparison then. Let’s keep the same algorithm for both and use n=45, shall we?
Here is my naive/recursive Lisp version:
(defun printfib (start end)
  (format t "n=~D => ~D~%" start (fib start))
  (if (< start end)
      (printfib (+ start 1) end)))

(defun fib (n)
  "Naive-recursive Fibonacci number function"
  (if (or (= n 0) (= n 1))
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

(time (printfib 0 45))
And here is the Haskell one:
import Control.Monad
import Text.Printf

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

main = forM_ [0..45] $ \i ->
    printf "n=%d => %d\n" i (fib i)
On my MacBook Pro (Intel Core 2 Duo 2.2 GHz and 2 GB of RAM), Lisp (well, sbcl, which supposedly uses both cores, though there is no documented proof of this) took 259.743 seconds. See the difference between O(n) and O(phi^n)? Try n=4500 with this algorithm and the sun will have burned out before the computation is finished. Haskell used only 1 core and took 77.779 seconds. Hmmm, Haskell was 3.3 times faster than Lisp without even parallelizing it.
Just out of curiosity, let’s try again with Don’s code, which still implements the same algorithm (whose complexity is O(phi^n)), but which introduces parallelism hints for the compiler. Remember, this is being done out of curiosity: we have already established that for this particular micro-benchmark Haskell leaves Lisp in the dust.
import Control.Parallel
import Control.Monad
import Text.Printf

cutoff = 35

fib' :: Int -> Integer
fib' 0 = 0
fib' 1 = 1
fib' n = fib' (n-1) + fib' (n-2)

fib :: Int -> Integer
fib n | n < cutoff = fib' n
      | otherwise  = r `par` (l `pseq` l + r)
  where
    l = fib (n-1)
    r = fib (n-2)

main = forM_ [0..45] $ \i ->
    printf "n=%d => %d\n" i (fib i)
Now that Haskell’s program uses two cores (assuming that Lisp does too, which is unlikely), Haskell runs it in 52.248 seconds versus Lisp’s 259.743 seconds. That’s about 5 times faster than Lisp. Does this prove that Haskell is faster than Lisp in general (or to be more exact, that GHC is faster than sbcl on Mac OS X 10.5)? Nope, guys, it’s just a silly micro-benchmark after all. But damn, the temptation to say “Oops. Sorry Lisp…” was too strong.
Alright, the title of this post is a tad sensational sounding, I know, and it’s in part aimed at messing with my many Pythonista friends. We Rubyists have been teased for a long time, due to the slowness of the main Ruby interpreter. Well, it looks like with Ruby 1.9, it’ll be payback time. Just out of curiosity I decided to run a single benchmark (you can hardly call it that) to see how Ruby 1.9 had improved over the current stable version (1.8.6). I wasn’t planning to make a post about it. It was one of those tests that you do at 3 AM in an irb session when you feel you’ve made your daily peace with your actual workload for the night. When I saw the results though, my jaw dropped. I had to blog about this one.
I ran a recursive Fibonacci function, just to stress test a bit of recursion and method calling, and while I was at it, I decided to compare it with Python too. The test was run on Mac OS X 10.5 with my MacBook Pro (Core 2 Duo 2.2 GHz and 2 GB of memory). It’s a single test (which is obviously not a real world example, as you would use an iterative version of the function if it were), and unlike with real programs, it doesn’t stress many features of the language. At least for now, there is no reasonable evidence to conclude that Ruby 1.9 – which will be released for this coming Christmas – will actually be faster than Python 2.5.1 in the majority of situations, but hear me out and check out these very surprising results.
The Ruby code:
def fib(n)
  if n == 0 || n == 1
    n
  else
    fib(n-1) + fib(n-2)
  end
end

36.times do |i|
  puts "n=#{i} => #{fib(i)}"
end
And the Python equivalent:
def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

for i in range(36):
    print "n=%d => %d" % (i, fib(i))
Running the snippets above, I got the following results:
Ruby 1.8.6: 158.869s
Python 2.5.1: 31.507s
Ruby 1.9.0: 11.934s
Ehm, hold on a second! Did Ruby just go from 159 seconds down to 12? Koichi Sasada, do you have an Amazon Wishlist? I was expecting a decent improvement, as I’ve been playing with 1.9 every now and then for a long time – so I knew it was faster – but I was blown away when I timed the latest version from trunk (even if it’s a really silly example that’s being tested). Granted, Python is not the fastest language out there, but Ruby 1.9 was still able to execute the script roughly 2.6 times as fast. It’s unbelievable.
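To put numbers on those ratios, here is a small sketch that divides the timings listed above. The dictionary and the speedup helper are hypothetical names used only for this illustration, and the code uses Python 3 syntax. With these figures, Ruby 1.9.0 comes out about 13 times faster than Ruby 1.8.6 and about 2.6 times faster than Python 2.5.1.

```python
# Timings in seconds, copied from the results listed above.
timings = {
    "Ruby 1.8.6": 158.869,
    "Python 2.5.1": 31.507,
    "Ruby 1.9.0": 11.934,
}


def speedup(slower, faster):
    """Return how many times faster `faster` ran compared to `slower`."""
    return timings[slower] / timings[faster]


if __name__ == "__main__":
    print("Ruby 1.9.0 vs Ruby 1.8.6:   %.1fx" % speedup("Ruby 1.8.6", "Ruby 1.9.0"))
    print("Ruby 1.9.0 vs Python 2.5.1: %.1fx" % speedup("Python 2.5.1", "Ruby 1.9.0"))
```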
Now it’ll be very interesting to run a series of algorithmically equivalent tests for Ruby and Python, and to see just when exactly Ruby 1.9 manages to knock Python out of the water – and where Python still has the edge. If I manage to find some time, I will report the results in this blog. But for now, I’ll say just… wow!
IBM provides the community with, among others, Ruby and Python open source drivers for DB2 (more exactly IBM databases). Ruby has a gem that packages the Rails adapter for DB2 and its prerequisite driver. As a result, the easiest way to get the Ruby driver for DB2 is to install the ibm_db gem through rubygems. The Python driver is instead currently provided as a tar.gz archive of source code. In both cases, on Linux, the installation builds the binary from source. This procedure is supposed to be very straightforward and user-friendly, and as long as you’re aware of the prerequisites and a few important steps, you can be up and running in no time. Unfortunately, if you aren’t aware of these things, as often happens with Linux, you may end up spending a good deal of time trying to figure out what’s wrong with your environment and setup procedure. This short – largely step-by-step – guide aims to resolve this, by providing you with clear instructions for setting up both the Ruby and Python drivers, respectively, for DB2 on Linux. The instructions below are tailored for Ubuntu 7.10 and its variants (including for example Kubuntu 7.10, 32 and 64 bit), but the same principles can be applied to other distros as well.
Prerequisites
Depending on which of the two drivers interests you, you will need to have Ruby or Python installed, along with a modern version of DB2 (e.g. 9.1.2 or 9.5). Please note that if you are still using DB2 Express-C 9.1, FixPack 2 or greater is required, so make sure that you grab the latest FixPack, FP4. For everyone else, you can get DB2 Express-C 9.5 from the official site for free. Please also note that if you were to run the DB2 9.5 setup on (K|X)Ubuntu 7.10 out of the box, you’d get an error similar to the one below.
ERROR:
The required library file libstdc++.so.5 is not found on the system.
ERROR:
The required library file libaio.so.1 is not found on the system.
Check the following web site for the up-to-date system requirements
of IBM DB2 9.5
http://www.ibm.com/software/data/db2/udb/sysreqs.html
http://www.software.ibm.com/data/db2/linux/validate
/home/antonio/Desktop/exp/db2/linux/install/../bin/db2usrinf:
error while loading shared libraries: libstdc++.so.5:
cannot open shared object file: No such file or directory
[: 609: 0: unexpected operator
/home/antonio/Desktop/exp/db2/linux/install/../bin/db2langdir:
error while loading shared libraries: libstdc++.so.5:
cannot open shared object file: No such file or directory
/home/antonio/Desktop/exp/db2/linux/install/../bin/db2langdir:
error while loading shared libraries: libstdc++.so.5:
cannot open shared object file: No such file or directory
DBI1055E The message file db2install.cat cannot be found.
Explanation: The message file required by this
script is missing from the system; it may have been
deleted or the database products may have been loaded
incorrectly.
User Response: Verify that the product option containing
the message file is installed correctly. If there are
verification errors; reinstall the product option.
To prevent this, please install DB2 with its prerequisites:
$ sudo apt-get install libstdc++5
$ sudo apt-get install libaio-dev
$ sudo ./db2setup
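If you’d rather check up front whether the two libraries named in the error message can be loaded, a rough sketch like the one below may help. It is not part of the DB2 tooling: missing_libraries is a made-up helper that simply asks the dynamic loader, through Python’s standard ctypes module, to open each library by name. It assumes Python 3 syntax.

```python
import ctypes


def missing_libraries(names):
    """Return the subset of `names` that the dynamic loader cannot load."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            # The loader could not find or open this shared object.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two sonames come straight from the installer's error message.
    for name in missing_libraries(["libstdc++.so.5", "libaio.so.1"]):
        print("missing: %s" % name)
```

An empty output suggests both libraries are in place; any name printed is one the setup is likely to complain about.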
When the DB2 Setup Wizard prompts you for the type of installation requested, ensure that you select “custom” and then, when prompted with the “Features” screen a couple of clicks later, select “Base application development tools” under the section “Application Development Tools” (the check box should switch from gray to white and be marked off). You will need these for building the Ruby and Python drivers during the installation. You can of course install them later, by running the setup again and choosing the “Work with existing” button in the launchpad, but if you’re installing from scratch, it’s easier to do it right the first time.
You can install Ruby or Python any way you prefer, but on Ubuntu (with the Universe repository enabled) you can install the required essential compiler tools (remember, on Linux, unlike Windows, the driver binaries are built from source), Ruby and Rubygems by running:
$ sudo apt-get install build-essential
$ sudo apt-get install ruby-full rubygems
If you are interested in Python, this comes already pre-installed on Ubuntu. Not all variations of Ubuntu however have the python2.5-dev package installed (I believe Kubuntu does), so just to be on the safe side, if you want the Python driver to be installed, get this development package by running:
$ sudo apt-get install python2.5-dev
Installing the Ruby driver for DB2
Now that you’ve ensured that your system has the proper requirements installed, the Ruby driver installation should be quite straightforward, thanks to the gem packaging system. Assuming you are installing it with an arbitrary user account (as opposed to the db2inst1 account), you will need to run the following commands, which will also take care of letting the compiler know where the current DB2 instance is located:
$ . /home/db2inst1/sqllib/db2profile
$ export IBM_DB_DIR=/home/db2inst1/sqllib
$ export IBM_DB_LIB=/home/db2inst1/sqllib/lib
$ sudo gem update
$ sudo gem install ibm_db --include-dependencies
The last command should prompt you with a few options. Please select the latest version of the ibm_db gem with a “(ruby)” next to it (usually option 1), since you are building on-the-fly rather than deploying a Win32 binary.
Installing the Python driver for DB2
In order to install the Python driver, you will need to grab ibm_db.tar.gz, which contains the 0.1.0 version of the source code. Don’t be afraid of the version number though: despite being at a beta level, it’s a pretty solid driver which has benefited a lot from the maturity of the IBM API used by the PHP and Ruby ones (from which the Python driver was ported). Once you’ve extracted the archive into a folder, enter that folder from the shell and run the following commands (do not worry about the several warnings that appear during compilation).
$ . /home/db2inst1/sqllib/db2profile
$ export IBM_DB_DIR=/home/db2inst1/sqllib
$ export IBM_DB_LIB=/home/db2inst1/sqllib/lib
$ sudo python setup.py build
$ sudo python setup.py install
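Since a failed build is usually down to the two environment variables exported above, a quick pre-flight check along these lines can save some head scratching. This is only a sketch: check_db2_env is a hypothetical helper, not something shipped with the driver, and it is written in Python 3 syntax. It merely verifies that IBM_DB_DIR and IBM_DB_LIB are set and point to existing directories.

```python
import os


def check_db2_env(environ, required=("IBM_DB_DIR", "IBM_DB_LIB")):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for var in required:
        path = environ.get(var)
        if not path:
            problems.append("%s is not set" % var)
        elif not os.path.isdir(path):
            problems.append("%s points to a missing directory: %s" % (var, path))
    return problems


if __name__ == "__main__":
    for problem in check_db2_env(os.environ):
        print(problem)
```

Taking the environment as a parameter, rather than reading os.environ inside the function, keeps the check easy to try out with a plain dictionary.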
Connecting to the database
Now that at least one of the two drivers is installed, run a quick check to verify that the setup went fine and that you can connect to a database from Ruby or from Python. You can use any database, but if you are new to DB2, you may want to use the sample database called SAMPLE. To create it from your shell run the following:
$ sudo su db2inst1
$ bash
$ db2sampl
Entering a couple of exit commands allows you to leave the Bash shell first, followed by the instance user db2inst1’s environment. Since you’re now back in the shell as a regular or arbitrary user (not as db2inst1), you’ll need to source the db2profile first, exactly as you’d do if you had just opened a new shell. You may want to consider inserting the following instruction in your shell profile as well if you plan to use the driver regularly.
$ . /home/db2inst1/sqllib/db2profile
Having performed this step, Python users can just run the python command to start the interactive shell, while Ruby users will have to require rubygems as well by running:
$ irb -rubygems
If you’d like to have this set in the profile of your shell as well, you can insert within it the command:
$ export RUBYOPT=rubygems
Ruby users can at this point run the following script interactively (insert one line at a time in irb):
require 'ibm_db'
conn = IBM_DB::connect("sample","db2inst1","mypassword")
sql = "SELECT * FROM SALES"
stmt = IBM_DB::exec(conn, sql)
while (row = IBM_DB::fetch_assoc(stmt))
  p row
end
The output should be a list of hashes, one for each record. Unlike in Python, in Ruby if the connection fails, the IBM_DB::connect method will just return false rather than raise an actual error. The same is true for the IBM_DB::exec method. In such cases, you can run IBM_DB::conn_errormsg and IBM_DB::stmt_errormsg to gather further information on what caused the problem.
For those using the Python driver, you can establish a successful connection and retrieve a record from the Sales table by running the snippet below. The output will be a dictionary whose keys are the names of the columns in the table.
import ibm_db
conn = ibm_db.connect("sample","db2inst1","mypassword")
sql = "SELECT * FROM SALES"
stmt = ibm_db.exec_immediate(conn, sql)
print ibm_db.fetch_assoc(stmt)
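The snippet above prints a single record. To walk through all of them, as the Ruby loop does, something like the following sketch should work. It assumes that ibm_db.fetch_assoc returns a false value once there are no rows left, mirroring the Ruby driver. The iter_rows helper is a made-up name, written in Python 3 syntax, and it takes the fetch function as a parameter so that it can be tried without a database.

```python
def iter_rows(fetch, stmt):
    """Yield rows from `stmt` until `fetch` returns a false value."""
    while True:
        row = fetch(stmt)
        if not row:
            # Assumption: a false value signals that no rows are left.
            break
        yield row


# Intended usage with the driver (not run here, it needs a DB2 connection):
#   for row in iter_rows(ibm_db.fetch_assoc, stmt):
#       print(row)
```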
Naturally, there is much more to the usage of this IBM API which is common amongst a few languages, but the essentials of working with it need to be part of a different guide. While I go about writing that, feel free to take a look at the PHP and DB2 reference which documents a lot of shared functionalities and naming conventions.
I’m glad to inform you that the beta version of the Python and DB2 (IBM databases to be more exact) driver and DBI wrapper has been released in the Python Package Index. You can download the source for version 0.1.0 from here. This includes two components:
The DBI wrapper utilizes the IBM-defined API driver, but you can also use the feature-rich API independently without the DBI wrapper.
I plan to provide a few examples about the usage of the IBM-specific API. The Python driver is almost identical to the Ruby one, hence I may create a joint post for both languages. Now go try it and have fun. If you have any questions, feel free to comment below or send an email to the address opendev@us.ibm.com.
Python 3000’s first alpha release was made available last week and, as was to be expected, it gathered a lot of interest from the development community. With Python on everyone’s lips, let’s talk about Python in this post as well.
Python and DB2
Back in March we gathered an overwhelming amount of feedback regarding IBM’s interest in developing a reliable driver and adapter for Python and Django, just like we did for Ruby and RoR. We considered your feedback and took action. I’m glad to be able to “leak” some news to you: our team in the States has been working hard on an advanced driver whose initial development is about to be completed. A closed beta is expected to take place relatively soon, but not immediately.
The Python DB2 driver will be the stepping stone and we have decided to proceed with the creation of an adapter for SQLAlchemy first, and for the default Django ORM next. Enabling SQLAlchemy and DB2 means going further than just Django. It means automatically providing support for any framework that builds on top of it. The development of these adapters should not require too much time. Building a modern, fast and reliable driver was our number one priority and the most challenging part, but now that that component is almost complete, the rest should be smooth sailing from here on out.
Pythonistas’ reactions
In the last few weeks I’ve received many emails with requests about Python and DB2. I have yet to finish replying to everyone, so I thought I would proceed with a public update. Some of these emails are very interesting because they come from very well-prepared programmers who are willing to help and have good ideas for creating plugins that exploit the unique features of DB2 like pureXML, DB2 Spatial Extender, and so on. Many people see the big potential of having DB2 working with Python, SQLAlchemy and Django, and are really looking forward to it.
The Rails community seems to be more oriented towards MySQL while the Django/Python one leans towards the historically more feature-rich PostgreSQL. I wonder if this difference explains the wider interest amongst pythonistas in seeing a best-selling, high-quality database like DB2 become available.
DB2 Express-C
As a reminder, for those of you who may not be in the loop, DB2 Express-C is an awesome free version of DB2 that doesn’t pose any limits on your database size, connections or users. You can run it on any server with up to 4GB of RAM and 4 CPU cores (the limit is in the license; the code is essentially the same as that of the more expensive versions). You also get native XML storage and querying, high performance and endless scalability. It won’t cost you a cent, but should you require it, you can optionally purchase one year of support and get High Availability Disaster Recovery as well at a very competitive price per server.
There is a bigger surprise in the making, and this should be available in a month or so. However I’m absolutely not able to talk about it for the time being, so I’ll have to leave you hanging. As usual, it will be covered in this blog in due time, and it’s sure to please many people, so please feel free to subscribe to this blog by feed or by email if you haven’t done so already.