<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Zen and the Art of Programming &#187; Quick Tips</title>
	<atom:link href="http://programmingzen.com/category/quick-tips/feed/" rel="self" type="application/rss+xml" />
	<link>http://programmingzen.com</link>
	<description>By Antonio Cangiano, Software Engineer &#38; Technical Evangelist at IBM</description>
	<lastBuildDate>Wed, 21 Jul 2010 22:12:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Who is accessing your Gmail account?</title>
		<link>http://programmingzen.com/2010/06/15/who-is-accessing-your-gmail-account/</link>
		<comments>http://programmingzen.com/2010/06/15/who-is-accessing-your-gmail-account/#comments</comments>
		<pubDate>Tue, 15 Jun 2010 04:56:26 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Featured Article]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Quick Tips]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=1192</guid>
		<description><![CDATA[The Gmail team recently introduced a new feature (in the footer) that enables account holders to verify the latest login activities on their account. I routinely check mine and the results are usually boring, reminding me I check my email way too often (and I do so mostly via browser, through my Canadian IP). An [...]]]></description>
			<content:encoded><![CDATA[<p>The Gmail team recently introduced a new feature (in the footer) that enables account holders to verify the latest login activities on their account. I routinely check mine and the results are usually boring, reminding me I check my email way too often (and I do so mostly via browser, through my Canadian IP).</p>
<p><strong>An unwelcome surprise</strong></p>
<p>If you don&#8217;t check yours regularly, you should (my version of Google Apps doesn&#8217;t have this feature though). In fact, tonight during a routine check, I discovered an unwelcome surprise: an entry that didn&#8217;t belong. The following screenshot shows the recent activity on my account (with some information blanked out):</p>
<p align="center"><img src="http://antoniocangiano.com/wp-content/uploads/2010/06/gmail-activity.png" alt="My Gmail activity window" /></p>
<p>See that US, IMAP line? That wasn&#8217;t me. So did someone manage to access my account? Or was it a web application that I authorized? Before panicking, I decided to look into whatever information I could gather about that IP.</p>
<p>It turns out that it&#8217;s the IP of a server hosted by Slicehost (RackSpace), but I couldn&#8217;t find any website running on that IP address (173.203.211.51). To make things more interesting, I found two people (<a href="http://www.google.com/support/forum/p/Google%20Mail/thread?tid=2a1a5c7cd2665946&#038;hl=de">one German</a>, <a href="http://d.hatena.ne.jp/PIcOz/20100608/1276022682">one Japanese</a>) complaining online about the same IP address and IMAP access to their Gmail accounts.</p>
<p>Was my account hacked into? I have a hard time believing that someone actually managed to log in by guessing my password, which was as secure as a password can be. I haven&#8217;t used my laptop on an unsecured WiFi network. I use a Mac and am very cautious about what I install, so I doubt I have a keylogger installed or anything of that nature. Using 1Password, I&#8217;m even immune to the so-called &#8220;tab napping&#8221; attacks.</p>
<p><strong>Possible culprits</strong></p>
<p>Assuming that this is not a misunderstanding and some SaaS application I authorized is not in fact using that server to perform a legitimate action, I think it&#8217;s likely that someone managed to get in through a vulnerability or backdoor in one such application.</p>
<p>I&#8217;m not pointing fingers here, nor accusing anyone, but it is interesting to find such an occurrence happening so shortly after granting the aforementioned authorizations. The websites I granted access to were:</p>
<ul>
<li>Zoho Discussions (24 hours before the suspected intrusion happened)</li>
<li>Trendly (3 days before the intrusion)</li>
<li>Etacts (a few weeks before the intrusion)</li>
</ul>
<p>It&#8217;s worth mentioning that in the past Etacts had scared the crap out of me with their American IP showing up in the recent activity list. However, a lookup has always shown the questionable IP to belong to them.</p>
<p>Do any of these services intentionally use the server with IP 173.203.211.51? Since I&#8217;m not the only one who suspects a violation from this IP, it would be interesting to hear what Slicehost has to say about it. Perhaps they know whether it&#8217;s a legitimate or illegitimate use of their server.</p>
<p><strong>How to deal with an email intrusion</strong></p>
<p>The perception of being intruded upon, whether it&#8217;s real or just a scare, is definitely not pleasant. Just in case the same happens to you, here is what I did to deal with the situation:</p>
<ul>
<li>I verified that there were no messages sent on my behalf.</li>
<li>I checked that there weren&#8217;t any new filters that would forward emails to a possible malicious user.</li>
<li>I verified that there weren&#8217;t any forwards and ensured that forwarding was disabled.</li>
<li>POP3 was already disabled, and I have now disabled IMAP as well.</li>
<li>I revoked access to my Google account for all listed web applications.</li>
<li>I changed my password to another humongous one on a different computer, with a brand new installation of Linux, directly wired to my DSL modem (bypassing the whole wireless infrastructure I set up at home).</li>
<li>I will, soon enough, format my Mac (I&#8217;ve been planning a DBAN wipe, plus a brand new installation for a while either way).</li>
<li>I will continue to monitor my account activity.</li>
</ul>
<p>This is the kind of information I felt it necessary to share, even if this turns out to be a false alarm. I highly suggest that you keep an eye on your Gmail account activity and, if you find something suspicious, act accordingly.</p>
<p><strong>UPDATE (June 17, 2010):</strong> Please read <a href="http://antoniocangiano.com/2010/06/17/follow-up-to-my-gmail-third-party-access-post/">my follow up post</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2010/06/15/who-is-accessing-your-gmail-account/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>Setup Ruby Enterprise Edition, nginx and Passenger (aka mod_rails) on Ubuntu</title>
		<link>http://programmingzen.com/2009/11/20/setup-ruby-enterprise-edition-nginx-and-passenger-aka-mod_rails-on-ubuntu/</link>
		<comments>http://programmingzen.com/2009/11/20/setup-ruby-enterprise-edition-nginx-and-passenger-aka-mod_rails-on-ubuntu/#comments</comments>
		<pubDate>Fri, 20 Nov 2009 19:17:26 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Quick Tips]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Ruby on Rails]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=1117</guid>
		<description><![CDATA[The following is a very short guide on setting up Ruby Enterprise Edition (REE), nginx and Passenger, for serving Ruby on Rails applications on Ubuntu. It also includes a few quick and easy optimization tips. We start with setting up REE (x64), using the .deb file provided by Phusion: wget http://rubyforge.org/frs/download.php/66163/ruby-enterprise_1.8.7-2009.10_amd64.deb sudo dpkg -i ruby-enterprise_1.8.7-2009.10_amd64.deb [...]]]></description>
			<content:encoded><![CDATA[<p>The following is a very short guide on setting up Ruby Enterprise Edition (REE), nginx and Passenger, for serving Ruby on Rails applications on Ubuntu. It also includes a few quick and easy optimization tips.</p>
<p>We start with setting up REE (x64), using the .deb file provided by Phusion:</p>
<div class="highlight">
<pre>wget http://rubyforge.org/frs/download.php/66163/ruby-enterprise_1.8.7-2009.10_amd64.deb
sudo dpkg -i ruby-enterprise_1.8.7-2009.10_amd64.deb
ruby -v
</pre>
</div>
<p>In the output you should see &#8220;ruby 1.8.7 (2009-06-12 patchlevel 174)&#8230;&#8221; or similar. If this is the case, good; while you are there, update RubyGems and the installed gems:</p>
<div class="highlight">
<pre>sudo gem update --system
sudo gem update
</pre>
</div>
<p>Next, you&#8217;ll need to install nginx, which is a really fast web server. The <a href="http://phusion.nl/about.html">Phusion</a> team has made it very easy to install, but if you simply follow most instructions found elsewhere, you&#8217;ll get the following error:</p>
<pre>checking for system md library ... not found
checking for system md5 library ... not found
checking for OpenSSL md5 crypto library ... not found

./configure: error: the HTTP cache module requires md5 functions
from OpenSSL library.  You can either disable the module by using
--without-http-cache option, or install the OpenSSL library in the system,
or build the OpenSSL library statically from the source with nginx by using
--with-http_ssl_module --with-openssl=&lt;path&gt; options.</pre>
<p>Instead, we are going to install libssl-dev first and then nginx and its Passenger module:</p>
<div class="highlight">
<pre>sudo aptitude install libssl-dev
sudo passenger-install-nginx-module
</pre>
</div>
<p>Follow the prompts and accept all the defaults (when prompted to choose between 1 and 2, pick 1).</p>
<p>Before I proceed with the configuration, I like to create an init script and have it boot at startup (the script itself is adapted from one provided by the excellent <a href="http://articles.slicehost.com">articles at slicehost.com</a>):</p>
<div class="highlight">
<pre>sudo vim /etc/init.d/nginx
</pre>
</div>
<p>The content of which needs to be:</p>
<div class="highlight">
<pre><span class="c">#! /bin/sh</span>

<span class="c">### BEGIN INIT INFO</span>
<span class="c"># Provides:          nginx</span>
<span class="c"># Required-Start:    $all</span>
<span class="c"># Required-Stop:     $all</span>
<span class="c"># Default-Start:     2 3 4 5</span>
<span class="c"># Default-Stop:      0 1 6</span>
<span class="c"># Short-Description: starts the nginx web server</span>
<span class="c"># Description:       starts nginx using start-stop-daemon</span>
<span class="c">### END INIT INFO</span>

<span class="nv">PATH</span><span class="o">=</span>/opt/nginx/sbin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
<span class="nv">DAEMON</span><span class="o">=</span>/opt/nginx/sbin/nginx
<span class="nv">NAME</span><span class="o">=</span>nginx
<span class="nv">DESC</span><span class="o">=</span>nginx

<span class="nb">test</span> -x <span class="nv">$DAEMON</span> <span class="o">||</span> <span class="nb">exit </span>0

<span class="c"># Include nginx defaults if available</span>
<span class="k">if</span> <span class="o">[</span> -f /etc/default/nginx <span class="o">]</span> ; <span class="k">then</span>
    . /etc/default/nginx
<span class="k">fi</span>

<span class="nb">set</span> -e

. /lib/lsb/init-functions

<span class="k">case</span> <span class="s2">&quot;$1&quot;</span> in
  start<span class="o">)</span>
    <span class="nb">echo</span> -n <span class="s2">&quot;Starting $DESC: &quot;</span>
    start-stop-daemon --start --quiet --pidfile /opt/nginx/logs/<span class="nv">$NAME</span>.pid <span class="se">\</span>
        --exec <span class="nv">$DAEMON</span> -- <span class="nv">$DAEMON_OPTS</span> <span class="o">||</span> <span class="nb">true</span>
<span class="nb">    echo</span> <span class="s2">&quot;$NAME.&quot;</span>
    ;;
  stop<span class="o">)</span>
    <span class="nb">echo</span> -n <span class="s2">&quot;Stopping $DESC: &quot;</span>
    start-stop-daemon --stop --quiet --pidfile /opt/nginx/logs/<span class="nv">$NAME</span>.pid <span class="se">\</span>
        --exec <span class="nv">$DAEMON</span> <span class="o">||</span> <span class="nb">true</span>
<span class="nb">    echo</span> <span class="s2">&quot;$NAME.&quot;</span>
    ;;
  restart|force-reload<span class="o">)</span>
    <span class="nb">echo</span> -n <span class="s2">&quot;Restarting $DESC: &quot;</span>
    start-stop-daemon --stop --quiet --pidfile <span class="se">\</span>
        /opt/nginx/logs/<span class="nv">$NAME</span>.pid --exec <span class="nv">$DAEMON</span> <span class="o">||</span> <span class="nb">true</span>
<span class="nb">    </span>sleep 1
    start-stop-daemon --start --quiet --pidfile <span class="se">\</span>
        /opt/nginx/logs/<span class="nv">$NAME</span>.pid --exec <span class="nv">$DAEMON</span> -- <span class="nv">$DAEMON_OPTS</span> <span class="o">||</span> <span class="nb">true</span>
<span class="nb">    echo</span> <span class="s2">&quot;$NAME.&quot;</span>
    ;;
  reload<span class="o">)</span>
      <span class="nb">echo</span> -n <span class="s2">&quot;Reloading $DESC configuration: &quot;</span>
      start-stop-daemon --stop --signal HUP --quiet --pidfile /opt/nginx/logs/<span class="nv">$NAME</span>.pid <span class="se">\</span>
          --exec <span class="nv">$DAEMON</span> <span class="o">||</span> <span class="nb">true</span>
<span class="nb">      echo</span> <span class="s2">&quot;$NAME.&quot;</span>
      ;;
  status<span class="o">)</span>
      status_of_proc -p /opt/nginx/logs/<span class="nv">$NAME</span>.pid <span class="s2">&quot;$DAEMON&quot;</span> nginx <span class="o">&amp;&amp;</span> <span class="nb">exit </span>0 <span class="o">||</span> <span class="nb">exit</span> <span class="nv">$?</span>
      ;;
  *<span class="o">)</span>
    <span class="nv">N</span><span class="o">=</span>/etc/init.d/<span class="nv">$NAME</span>
    <span class="nb">echo</span> <span class="s2">&quot;Usage: $N {start|stop|restart|reload|force-reload|status}&quot;</span> &gt;&amp;2
    <span class="nb">exit </span>1
    ;;
<span class="k">esac</span>

<span class="nb">exit </span>0
</pre>
</div>
<p>Make it executable and have it start at boot:</p>
<div class="highlight">
<pre>sudo chmod +x /etc/init.d/nginx
sudo /usr/sbin/update-rc.d -f nginx defaults
</pre>
</div>
<p>From now on, you&#8217;ll be able to start, stop and restart nginx with it. Start the server as follows:</p>
<div class="highlight">
<pre>sudo /etc/init.d/nginx start
</pre>
</div>
<p>Heading over to your server IP with your browser, you should see &#8220;Welcome to nginx!&#8221;. If you do, great, we can move on with the configuration of nginx for your Rails app.</p>
<p>Edit nginx&#8217; configuration file:</p>
<div class="highlight">
<pre>sudo vim /opt/nginx/conf/nginx.conf
</pre>
</div>
<p>Add a server section within the http section, as follows:</p>
<div class="highlight">
<pre>    server <span class="o">{</span>
        listen 80;
        server_name example.com;
        root /somewhere/my_rails_app/public;
        passenger_enabled on;
        rails_spawn_method smart;
    <span class="o">}</span>
</pre>
</div>
<p>The server name can also be a subdomain if you wish (e.g., blog.example.com). It&#8217;s important that you point the root to your Rails app&#8217;s public directory.</p>
<p>Setting the rails_spawn_method directive to smart is very efficient, allowing Passenger to consume less memory per process and spawn processes faster, provided your Rails application is not affected by its limitations (for a discussion of these, you can read the relevant <a href="http://www.modrails.com/documentation/Users%20guide.html#_the_smart_spawning_method">section in the official guide</a>).</p>
<p>If you have lots of RAM (e.g., more than 512 MB) on your server, you may want to consider increasing your maximum pool size from its default of 6 with the passenger_max_pool_size directive. Conversely, if you want to limit the number of processes running at any time and consume less memory on a small VPS (e.g., 128 to 256 MB), you can decrease that number to 2 (or something in that range). Always test several configurations to find one that works for you. You can read more about this directive <a href="http://modrails.com/documentation/Users%20guide%20Nginx.html#PassengerMaxPoolSize">in the official guide</a>.</p>
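<p>As a rough sketch of the small-VPS case, the directive sits in the http section alongside the other Passenger settings. The value 2 is only an assumed starting point taken from the range mentioned above, not a recommendation; measure memory usage before settling on it:</p>
<div class="highlight">
<pre>http <span class="o">{</span>
    <span class="c"># ...</span>
    <span class="c"># Assumed value for a 128-256 MB VPS; the default is 6</span>
    passenger_max_pool_size 2;
    <span class="c"># ...</span>
<span class="o">}</span>
</pre>
</div>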
<p>While you are modifying nginx&#8217; configuration, you may also want to increase the worker processes (e.g., to 4, on a typical VPS) and add a few more tweaks (such as enabling gzip compression):</p>
<div class="highlight">
<pre><span class="c"># ...</span>
http <span class="o">{</span>
    passenger_root /usr/local/lib/ruby/gems/1.8/gems/passenger-2.2.5;
    passenger_ruby /usr/local/bin/ruby;

    include       mime.types;
    default_type  application/octet-stream;

    access_log  logs/access.log;

    sendfile        on;
    keepalive_timeout  65;
    tcp_nodelay on;

    gzip on;
    gzip_comp_level 2;
    gzip_proxied any;   

    server <span class="o">{</span>
    <span class="c">#...</span>
</pre>
</div>
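<p>Note that the worker process count is set at the top level of nginx.conf, outside the http section, so it falls in the part elided by the first comment above. A minimal sketch, assuming the figure of 4 suggested for a typical VPS (adjust it to your own server):</p>
<div class="highlight">
<pre><span class="c"># Top of nginx.conf, before the http block</span>
worker_processes  4;
</pre>
</div>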
<p>When you are happy with the changes, save the file, and restart nginx:</p>
<div class="highlight">
<pre>sudo /etc/init.d/nginx restart
</pre>
</div>
<p>If you wish to restart Passenger in the future, without having to restart the whole web server, you can simply run the following command:</p>
<div class="highlight">
<pre>touch /somewhere/my_rails_app/tmp/restart.txt
</pre>
</div>
<p>Passenger also provides a few handy monitoring tools. Check them out:</p>
<div class="highlight">
<pre>sudo passenger-status
sudo passenger-memory-stats
</pre>
</div>
<p>That&#8217;s it, you are ready to go! I hope that you find these few notes useful.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2009/11/20/setup-ruby-enterprise-edition-nginx-and-passenger-aka-mod_rails-on-ubuntu/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Add code highlighting to your Google Waves</title>
		<link>http://programmingzen.com/2009/10/19/add-code-highlighting-to-your-google-waves/</link>
		<comments>http://programmingzen.com/2009/10/19/add-code-highlighting-to-your-google-waves/#comments</comments>
		<pubDate>Tue, 20 Oct 2009 00:53:46 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Quick Tips]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=1110</guid>
		<description><![CDATA[Google Wave is still rough around the edges, but it has a lot of potential in terms of becoming a great collaboration tool. As a developer, your first question will probably be: &#8220;How do I add code highlighting to my waves?&#8221;. The answer is straightforward, however not very easy to find if you google it. [...]]]></description>
			<content:encoded><![CDATA[<p>Google Wave is still rough around the edges, but it has a lot of potential in terms of becoming a great collaboration tool. As a developer, your first question will probably be: &#8220;How do I add code highlighting to my waves?&#8221; The answer is straightforward, but not very easy to find if you google it. I hope this post will help fellow developers who are experimenting with Google Wave.</p>
<p>The following steps are required to obtain syntax highlighting for your code:</p>
<ol>
<li>Create a new wave and add the Syntaxy robot to your wave. Use the wave address: <strong>kasyntaxy@appspot.com</strong>.</li>
<li><strong>Reply</strong> to your first message or within it, thereby creating a reply (called &#8220;blip&#8221; in Google lingo).</li>
<li><strong>Specify your code&#8217;s language</strong>, prefixing the name with a hash and exclamation mark, like #!python or #!ruby.</li>
</ol>
<p>At this point, as you type the code in your blip it will be highlighted by the Syntaxy bot as shown in the picture below:</p>
<p align="center"><img src="http://antoniocangiano.com/wp-content/uploads/2009/10/wave-syntaxy.png" alt="Highlight code on Google Wave" /></p>
<p>More advanced automatic syntax highlighting bots will probably appear as Google Wave progresses, but this one should do the trick for now. On a side note, if you copy and paste code from Xcode, the code formatting will be kept in your waves and blips without the need for bots.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2009/10/19/add-code-highlighting-to-your-google-waves/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Getting MacRuby&#8217;s compiler to work</title>
		<link>http://programmingzen.com/2009/10/08/getting-macrubys-compiler-to-work/</link>
		<comments>http://programmingzen.com/2009/10/08/getting-macrubys-compiler-to-work/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 20:45:13 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Mac]]></category>
		<category><![CDATA[Quick Tips]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=1107</guid>
		<description><![CDATA[There is major news in Rubyland today. MacRuby&#8217;s team just released their first beta of version 0.5 (an experimental, still incomplete version of Ruby), which brings JIT, removal of the dreaded GIL (Global Interpreter Lock), native threads, GCD (Grand Central Dispatch) for multicore computing, and a whole new set of features found in the release [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://antoniocangiano.com/wp-content/uploads/2009/10/macruby_logo.png" align="right" alt="MacRuby's logo" />There is major news in Rubyland today. MacRuby&#8217;s team just released their first beta of version 0.5 (an experimental, still incomplete version of Ruby), which brings JIT, removal of the dreaded GIL (Global Interpreter Lock), native threads, GCD (Grand Central Dispatch) for multicore computing, and a whole new set of features found in <a href="http://www.macruby.org/blog/2009/10/07/macruby05b1.html">the release announcement</a> to the table.</p>
<p>The most important new feature is the presence of a compiler. That&#8217;s right, thanks to this release, Ruby code can now become highly optimized executable code. How awesome is that? I can sense that you&#8217;re pumped by this news, so why not head over to MacRuby.org and <a href="http://www.macruby.org/downloads.html">download the installation file</a> for yourself? After you&#8217;ve done that, the next thing you&#8217;re going to want to do is run a small test like the following:</p>
<div class="highlight">
<pre><span class="nv">$ </span>macrubyc world_domination.rb -o world_domination
<span class="s1">Can&#39;t locate program `llc&#39;</span>
</pre>
</div>
<p>Oh noes! llc is a tool that ships with the LLVM (upon which MacRuby is built), however it&#8217;s not included with MacRuby&#8217;s installer (it will be in the future). But fear not my friends, there is a solution:</p>
<div class="highlight">
<pre><span class="nv">$ </span>svn co -r 82747 https://llvm.org/svn/llvm-project/llvm/trunk llvm-trunk
<span class="nv">$ </span><span class="nb">cd </span>llvm-trunk
<span class="nv">$ </span>./configure
<span class="nv">$ UNIVERSAL</span><span class="o">=</span>1 <span class="nv">UNIVERSAL_ARCH</span><span class="o">=</span><span class="s2">&quot;i386 x86_64&quot;</span> <span class="nv">ENABLE_OPTIMIZED</span><span class="o">=</span>1 make -j2
<span class="nv">$ </span>sudo env <span class="nv">UNIVERSAL</span><span class="o">=</span>1 <span class="nv">UNIVERSAL_ARCH</span><span class="o">=</span><span class="s2">&quot;i386 x86_64&quot;</span> <span class="nv">ENABLE_OPTIMIZED</span><span class="o">=</span>1 make install
</pre>
</div>
<p>If your machine does not have 2 cores, remove the -j2 option from the fourth line or adjust the number accordingly.</p>
<p>The compilation phase may take a couple of centuries, depending on your machine&#8217;s speed, but it should eventually build the LLVM. <img src='http://programmingzen.com/wp-includes/images/smilies/icon_razz.gif' alt=':-P' class='wp-smiley' />  llc will be placed in your PATH, and you&#8217;ll finally be able to compile Ruby code and obtain an executable to help you carry out your world domination plans.</p>
<div class="highlight">
<pre><span class="nv">$ </span>macrubyc world_domination.rb -o world_domination
<span class="nv">$ </span>./world_domination
MUAHAHAHAHA!
</pre>
</div>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2009/10/08/getting-macrubys-compiler-to-work/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Serving Django Static Files through Apache</title>
		<link>http://programmingzen.com/2009/07/22/serving-django-static-files-through-apache/</link>
		<comments>http://programmingzen.com/2009/07/22/serving-django-static-files-through-apache/#comments</comments>
		<pubDate>Thu, 23 Jul 2009 03:27:28 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Django]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Quick Tips]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=860</guid>
		<description><![CDATA[Django&#8217;s development server is capable of serving static (media) files thanks to the view django.views.static.serve. Popular web servers like Apache, Lighttpd or NGINX are much faster though, and as such should be used in production mode. Our goal is to bypass Django and let Apache (or other valid alternatives) directly serve static files like images, [...]]]></description>
			<content:encoded><![CDATA[<p>Django&#8217;s development server is capable of serving static (media) files thanks to the view <span class="n">django</span><span class="o">.</span><span class="n">views</span><span class="o">.</span><span class="n">static</span><span class="o">.</span><span class="n">serve</span>. Popular web servers like Apache, Lighttpd or NGINX are much faster though, and as such should be used in production mode. Our goal is to bypass Django and let Apache (or other valid alternatives) directly serve static files like images, videos, CSS, JavaScript files, and so on, for us.</p>
<p>Generally speaking, for performance reasons, it&#8217;s advised that you have two different webservers serving your dynamic requests and static files. In practice, for smaller sites, people often opt to simply use one webserver. In this article, I&#8217;ll discuss how to serve the static files within your Django project, through Apache.</p>
<p>The first thing we need to do is distinguish between development and production mode. We can do so by simply specifying DEBUG = True (development), or DEBUG = False (production) within our settings.py file.</p>
<p>settings.py may include (among others) the following declarations:</p>
<div class="highlight">
<pre><span class="kn">import</span> <span class="nn">os</span>

<span class="c"># Absolute path to the project directory</span>
<span class="n">BASE_PATH</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">abspath</span><span class="p">(</span><span class="n">__file__</span><span class="p">))</span>

<span class="c"># Main URL for the project</span>
<span class="n">BASE_URL</span> <span class="o">=</span> <span class="s">&#39;http://example.org&#39;</span>

<span class="n">DEBUG</span> <span class="o">=</span> <span class="bp">False</span>

<span class="c"># Absolute path to the directory that holds media</span>
<span class="n">MEDIA_ROOT</span> <span class="o">=</span> <span class="s">&#39;</span><span class="si">%s</span><span class="s">/media/&#39;</span> <span class="o">%</span> <span class="n">BASE_PATH</span>

<span class="c"># URL that handles the media served from MEDIA_ROOT</span>
<span class="n">MEDIA_URL</span> <span class="o">=</span> <span class="s">&#39;</span><span class="si">%s</span><span class="s">/site_media/&#39;</span> <span class="o">%</span> <span class="n">BASE_URL</span>

<span class="c"># URL prefix for admin media -- CSS, JavaScript and images.</span>
<span class="n">ADMIN_MEDIA_PREFIX</span> <span class="o">=</span> <span class="s">&quot;</span><span class="si">%s</span><span class="s">admin/&quot;</span> <span class="o">%</span> <span class="n">MEDIA_URL</span>
</pre>
</div>
<p>*PATH constants indicate paths on your filesystem (e.g., /home/myuser/projects/myproject), while *URL constants indicate the actual URL needed to reach a given page or file.</p>
<p>Notice that it&#8217;s not unusual to have a /site_media URL that corresponds to a /media folder. In the example above, I opted to separate regular media files for the project from the standard ones that ship with Django for the admin section. To do this, all we have to do is create a symbolic link as follows:</p>
<div class="highlight">
<pre>ln -s /usr/lib/python2.5/site-packages/django/contrib/admin/media /path/to/myproject/media/admin
</pre>
</div>
<p>When you&#8217;re in development mode, and DEBUG = True, you want to let Django serve your static files. This can be done by adding the following snippet (or similar) to your urls.py:</p>
<div class="highlight">
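<pre><span class="kn">from</span> <span class="nn">django.conf</span> <span class="kn">import</span> <span class="n">settings</span>

<span class="k">if</span> <span class="n">settings</span><span class="o">.</span><span class="n">DEBUG</span><span class="p">:</span>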
    <span class="n">urlpatterns</span> <span class="o">+=</span> <span class="n">patterns</span><span class="p">(</span><span class="s">&#39;&#39;</span><span class="p">,</span>
        <span class="p">(</span><span class="s">r&#39;^site_media/(?P&lt;path&gt;.*)$&#39;</span><span class="p">,</span> <span class="s">&#39;django.views.static.serve&#39;</span><span class="p">,</span> <span class="p">{</span><span class="s">&#39;document_root&#39;</span><span class="p">:</span> <span class="n">settings</span><span class="o">.</span><span class="n">MEDIA_ROOT</span><span class="p">}),</span>
    <span class="p">)</span>
</pre>
</div>
<p>In production mode, the code contained within the if clause will not be executed as we&#8217;ve set DEBUG to False within settings.py.</p>
<p>From the Django side of things, we are good. We now need to instruct Apache. Within your virtual host file, you can specify something along the lines of:</p>
<div class="highlight">
<pre><span class="nt">&lt;VirtualHost</span> <span class="s">*:80</span><span class="nt">&gt;</span>

  <span class="c">#...</span>

  <span class="nb">SetHandler</span> python-program
  <span class="nb">PythonHandler</span> django.core.handlers.modpython
  <span class="nb">SetEnv</span> DJANGO_SETTINGS_MODULE myproject.settings
  <span class="nb">PythonDebug</span> <span class="k">On</span>
  <span class="nb">PythonAutoReload</span> <span class="k">Off</span>
  <span class="nb">PythonPath</span> <span class="s2">&quot;[&#39;/usr/lib/python2.5/site-packages/django&#39;, &#39;/path/to/myproject&#39;] + sys.path&quot;</span>

  <span class="c">#...</span>

  <span class="nb">Alias</span> <span class="sx">/site_media</span> <span class="s2">&quot;/path/to/myproject/media&quot;</span>
  <span class="nt">&lt;Location</span> <span class="s">&quot;/site_media&quot;</span><span class="nt">&gt;</span>
    <span class="nb">SetHandler</span> <span class="k">None</span>
  <span class="nt">&lt;/Location&gt;</span>
<span class="nt">&lt;/VirtualHost&gt;</span>
</pre>
</div>
<p>The first group of declarations essentially tells Apache to use mod_python to handle any incoming requests. However, we don&#8217;t want Django to deal with static files, so the second group of declarations maps the /site_media URL to the actual media directory on the server and tells Apache to treat it as static content (with SetHandler None), effectively bypassing Django.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2009/07/22/serving-django-static-files-through-apache/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Memoization in Ruby and Python</title>
		<link>http://programmingzen.com/2009/05/18/memoization-in-ruby-and-python/</link>
		<comments>http://programmingzen.com/2009/05/18/memoization-in-ruby-and-python/#comments</comments>
		<pubDate>Tue, 19 May 2009 03:59:06 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Quick Tips]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=771</guid>
		<description><![CDATA[Wikipedia defines memoization as &#8220;an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously-processed inputs.&#8221;. This typically means caching the returning value of a function in a dictionary of sorts using the parameters passed to the function as a key. This is done [...]]]></description>
			<content:encoded><![CDATA[<p>Wikipedia defines memoization as &#8220;an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously-processed inputs.&#8221; This typically means caching the return value of a function in a dictionary of sorts, using the parameters passed to the function as the key. This is done so that the cached value can be reused immediately, without calculating it again, when the function is invoked with the same arguments. Even though we are trading space for time, it is often invaluable for speeding up certain recursive functions and when dealing with dynamic programming, where intermediate calls are often repeated many times.</p>
<p>Using memoization in Ruby is very easy thanks to the memoize gem. The first step to getting started is therefore to install it:</p>
<div class="highlight">
<pre><span class="nv">$ </span>sudo gem install memoize
Successfully installed memoize-1.2.3
1 gem installed
Installing ri documentation for memoize-1.2.3...
Installing RDoc documentation for memoize-1.2.3...
</pre>
</div>
<p>Now we can use the memoize method as illustrated in the example below:</p>
<div class="highlight">
<pre><span class="nb">require</span> <span class="s1">&#39;rubygems&#39;</span>
<span class="nb">require</span> <span class="s1">&#39;memoize&#39;</span>
<span class="nb">require</span> <span class="s1">&#39;benchmark&#39;</span>
<span class="kp">include</span> <span class="no">Memoize</span>

<span class="k">def</span> <span class="nf">fib</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
  <span class="k">return</span> <span class="n">n</span> <span class="k">if</span> <span class="n">n</span> <span class="o">&lt;</span> <span class="mi">2</span>
  <span class="n">fib</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fib</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">)</span>
<span class="k">end</span>

<span class="no">Benchmark</span><span class="o">.</span><span class="n">bm</span><span class="p">(</span><span class="mi">15</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">b</span><span class="o">|</span>
  <span class="n">b</span><span class="o">.</span><span class="n">report</span><span class="p">(</span><span class="s2">&quot;Regular fib:&quot;</span><span class="p">)</span> <span class="p">{</span> <span class="n">fib</span><span class="p">(</span><span class="mi">35</span><span class="p">)</span> <span class="p">}</span>
  <span class="n">b</span><span class="o">.</span><span class="n">report</span><span class="p">(</span><span class="s2">&quot;Memoized fib:&quot;</span><span class="p">)</span> <span class="p">{</span> <span class="n">memoize</span><span class="p">(</span><span class="ss">:fib</span><span class="p">);</span> <span class="n">fib</span><span class="p">(</span><span class="mi">35</span><span class="p">)}</span>
<span class="k">end</span>
</pre>
</div>
<p>In the first block we simply invoke fib(35), while in the second one we first invoke the method memoize(:fib) to memoize the method fib. Running this code on my machine prints the following:</p>
<div class="highlight">
<pre>                     user     system      total        real
Regular fib:    55.230000   0.160000  55.390000 <span class="o">(</span> 55.819205<span class="o">)</span>
Memoized fib:    0.000000   0.000000   0.000000 <span class="o">(</span>  0.001305<span class="o">)</span>
</pre>
</div>
<p>We went from almost a minute of run time to near-instantaneous execution. Optionally, we could even pass a file location to memoize, which would then use marshaling to dump the cached values to disk and load them back.</p>
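<p>The idea of persisting the cache to disk is not specific to Ruby. As a rough sketch only (this is not how the memoize gem is implemented, and the name <code>memoize_to_file</code> is made up for illustration), a Python decorator could mirror its cache to a file with pickle. The sketch assumes Python 3 and rewrites the whole file on every cache miss, which is simple rather than fast:</p>

```python
import os
import pickle
import tempfile

def memoize_to_file(path):
    """Hypothetical helper: cache results in memory and mirror them to `path`."""
    def decorator(function):
        # Load a previously saved cache, if one exists.
        if os.path.exists(path):
            with open(path, "rb") as f:
                cache = pickle.load(f)
        else:
            cache = {}

        def decorated_function(*args):
            if args not in cache:
                cache[args] = function(*args)
                # Rewrite the whole file on every miss: simple, not fast.
                with open(path, "wb") as f:
                    pickle.dump(cache, f)
            return cache[args]
        return decorated_function
    return decorator

# A throwaway location for the example; any writable path would do.
cache_path = os.path.join(tempfile.mkdtemp(), "fib.cache")

@memoize_to_file(cache_path)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))
```

<p>On a second run pointing at the same file, the cache would be loaded up front and no value would need to be recomputed.</p>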
<p>For Python we can write a simple decorator that behaves in a similar manner. In its simplest form it can be implemented as follows:</p>
<div class="highlight">
<pre><span class="c"># memoize.py</span>

<span class="k">def</span> <span class="nf">memoize</span><span class="p">(</span><span class="n">function</span><span class="p">):</span>
    <span class="n">cache</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">def</span> <span class="nf">decorated_function</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">):</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="k">return</span> <span class="n">cache</span><span class="p">[</span><span class="n">args</span><span class="p">]</span>
        <span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
            <span class="n">val</span> <span class="o">=</span> <span class="n">function</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">)</span>
            <span class="n">cache</span><span class="p">[</span><span class="n">args</span><span class="p">]</span> <span class="o">=</span> <span class="n">val</span>
            <span class="k">return</span> <span class="n">val</span>
    <span class="k">return</span> <span class="n">decorated_function</span>
</pre>
</div>
<p>Or, avoiding the cost of raising and catching an exception on every cache miss:</p>
<div class="highlight">
<pre><span class="c"># memoize.py</span>

<span class="k">def</span> <span class="nf">memoize</span><span class="p">(</span><span class="n">function</span><span class="p">):</span>
    <span class="n">cache</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">def</span> <span class="nf">decorated_function</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">args</span> <span class="ow">in</span> <span class="n">cache</span><span class="p">:</span>
            <span class="k">return</span> <span class="n">cache</span><span class="p">[</span><span class="n">args</span><span class="p">]</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">val</span> <span class="o">=</span> <span class="n">function</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">)</span>
            <span class="n">cache</span><span class="p">[</span><span class="n">args</span><span class="p">]</span> <span class="o">=</span> <span class="n">val</span>
            <span class="k">return</span> <span class="n">val</span>
    <span class="k">return</span> <span class="n">decorated_function</span>
</pre>
</div>
<p>When the memoized function is invoked, we look in the cache to see if an entry for the given arguments already exists. If it does, we immediately return that value. If not, we call the function, cache the result and return it.</p>
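<p>For readers on Python 3.2 or later (an assumption, since the timings in this article come from Python 2.5), the standard library ships a decorator with the same lookup-then-compute behavior, <code>functools.lru_cache</code>. A minimal sketch:</p>

```python
import functools

# maxsize=None means the cache is unbounded and never evicts entries.
@functools.lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(35))
# cache_info() reports hits, misses and the current cache size.
print(fib.cache_info())
```

<p>Like the hand-written decorator, it keys the cache on the arguments, so they must be hashable.</p>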
<p>Truth be told, the limit of this approach lies in the fact that since we are using a dictionary, only hashable objects (in practice, immutable ones such as numbers, strings and tuples) can be used as keys. For example, we can pass a tuple as an argument, but not a list. For the example within this article, this approach will suffice, but to take advantage of memoization when using arguments that are mutable, you may want to consider the approach described in <a href="http://code.activestate.com/recipes/466320/">this recipe</a>.</p>
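<p>To give an idea of what handling mutable arguments can look like, here is a minimal sketch that serializes the arguments with pickle and uses the resulting bytes as the dictionary key. It is my own illustration (the name <code>memoize_any</code> is hypothetical) and not necessarily what the linked recipe does. It assumes Python 3 and arguments that can be pickled:</p>

```python
import pickle

def memoize_any(function):
    cache = {}
    def decorated_function(*args, **kwargs):
        # Serialize the arguments into bytes, which are always hashable.
        key = pickle.dumps((args, sorted(kwargs.items())))
        if key not in cache:
            cache[key] = function(*args, **kwargs)
        return cache[key]
    return decorated_function

calls = []  # records how many times the real function runs

@memoize_any
def total(numbers):
    calls.append(1)
    return sum(numbers)

print(total([1, 2, 3]))  # computed
print(total([1, 2, 3]))  # served from the cache
print(len(calls))
```

<p>The trade-off is that every call pays for the serialization, and two arguments that compare equal may still pickle differently (dictionaries built in a different order, for instance), which results in a harmless extra cache miss.</p>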
<p>We can now rewrite the Ruby example above in Python as follows:</p>
<div class="highlight">
<pre><span class="kn">import</span> <span class="nn">timeit</span>
<span class="kn">from</span> <span class="nn">memoize</span> <span class="kn">import</span> <span class="n">memoize</span>

<span class="k">def</span> <span class="nf">fib1</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">n</span> <span class="o">&lt;</span> <span class="mf">2</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">n</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">fib1</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mf">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fib1</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mf">2</span><span class="p">)</span>

<span class="nd">@memoize</span>
<span class="k">def</span> <span class="nf">fib2</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">n</span> <span class="o">&lt;</span> <span class="mf">2</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">n</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">fib2</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mf">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fib2</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mf">2</span><span class="p">)</span>	

<span class="n">t1</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">Timer</span><span class="p">(</span><span class="s">&quot;fib1(35)&quot;</span><span class="p">,</span> <span class="s">&quot;from __main__ import fib1&quot;</span><span class="p">)</span>
<span class="k">print</span> <span class="n">t1</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="mf">1</span><span class="p">)</span>
<span class="n">t2</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">Timer</span><span class="p">(</span><span class="s">&quot;fib2(35)&quot;</span><span class="p">,</span> <span class="s">&quot;from __main__ import fib2&quot;</span><span class="p">)</span>
<span class="k">print</span> <span class="n">t2</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="mf">1</span><span class="p">)</span>
</pre>
</div>
<p>Running this code on my machine prints the following:</p>
<div class="highlight">
<pre>9.32223105431
0.000314950942993
</pre>
</div>
<p>In Python 2.5&#8217;s case, by employing memoization we went from more than nine seconds of run time to an essentially instantaneous result.</p>
<p>Granted, we don&#8217;t write Fibonacci applications for a living, but the benefits and principles behind these examples still stand and can be applied to everyday programming whenever the opportunity, and above all the need, arises.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2009/05/18/memoization-in-ruby-and-python/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Resolving the gray window when running db2setup</title>
		<link>http://programmingzen.com/2008/11/27/resolving-the-gray-window-when-running-db2setup/</link>
		<comments>http://programmingzen.com/2008/11/27/resolving-the-gray-window-when-running-db2setup/#comments</comments>
		<pubDate>Thu, 27 Nov 2008 21:54:17 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[DB2]]></category>
		<category><![CDATA[Quick Tips]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=386</guid>
		<description><![CDATA[You drank the Kool-Aid and downloaded the awesomeness which is DB2 Express-C. Good job! Next you proceed to install it on Linux with sudo ./db2setup and boom, instead of a launchpad all you see is a gray window. Now what? This problem is a known Java bug (resolved in Java 6) that shows up on [...]]]></description>
			<content:encoded><![CDATA[<p>You drank the Kool-Aid and downloaded the awesomeness which is <a href="http://www.ibm.com/software/data/db2/express/download.html?S_CMP=ECDDWW01&#038;S_TACT=ACDB201">DB2 Express-C</a>. Good job! Next you proceed to install it on Linux with <code>sudo ./db2setup</code> and boom, instead of a launchpad all you see is a gray window. Now what?</p>
<p>This problem is a known Java bug (resolved in Java 6) that shows up on Linux distros where Compiz effects are enabled. For example, this problem manifests itself in recent Ubuntu releases, including 8.10, where Compiz is enabled by default.</p>
<p>There are a couple of easy ways to solve this problem though. The first is to temporarily disable these effects during the installation and turn them back on when you&#8217;ve finished installing. In Ubuntu, you can do this by clicking on the <strong>Appearance</strong> menu, <strong>Visual Effects</strong> tab and then selecting <strong>None</strong>. The second method is to run <code>export AWT_TOOLKIT=MToolkit</code> before running <code>sudo ./db2setup</code>.</p>
<p>A new setup is in the works to solve this issue, but for the time being, you can use the workarounds above to install DB2 Express-C on Linux.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2008/11/27/resolving-the-gray-window-when-running-db2setup/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Integrating TextMate and Pygments</title>
		<link>http://programmingzen.com/2008/10/27/integrating-textmate-and-pygments/</link>
		<comments>http://programmingzen.com/2008/10/27/integrating-textmate-and-pygments/#comments</comments>
		<pubDate>Mon, 27 Oct 2008 21:34:25 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Mac]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Quick Tips]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=300</guid>
		<description><![CDATA[Like many, I don&#8217;t use TextMate just for coding. All of my posts are first drafted in my trusty editor before being published. One of the problems that I had, and that others probably face too, is the less than smooth process of publishing properly highlighted code in posts and HTML pages. A few solutions [...]]]></description>
			<content:encoded><![CDATA[<p>Like many, I don&#8217;t use TextMate just for coding. All of my posts are first drafted in my trusty editor before being published. One of the problems that I had, and that others probably face too, is the less than smooth process of publishing properly highlighted code in posts and HTML pages. A few solutions exist, including embedding <a href="http://gist.github.com/">gist snippets</a>, using &#8220;Create HTML from Document&#8221; in TextMate, or adopting JavaScript libraries or WP plugins. But when it comes to highlighting code, for me <a href="http://pygments.org/">Pygments</a> is simply unbeatable.</p>
<p>Pygments is a Python library but ships as a command line tool as well. However, switching between TextMate and the command line is not as convenient as I&#8217;d like. So on the weekend I pulled out my big sharp razor and started <a href="http://en.wiktionary.org/wiki/yak_shaving">yak shaving</a>. The result of that brief session is a hack that delivers the integration of TextMate and Pygments, so that code can be easily converted to HTML in order to beautifully present it.</p>
<p>First, let&#8217;s see how I use it. When I select a snippet of Ruby code in TextMate and press &#x2303;&#x2325;1, the selection is transformed into the proper HTML. &#x2303;&#x2325;2 is for Python snippets, &#x2303;&#x2325;3 for any other language, and &#x2303;&#x2325;4 for any language as well but with the option of adding line numbers. In practice, this means that I use 1 and 2 most of the time and these shortcuts are easy enough to remember. Note that this is not necessarily the best arrangement, but it works well for me. I could, if so inclined, associate all 4 commands with the same shortcut and be prompted by a menu every time this combination is pressed, obtaining something along the lines of the image shown below:</p>
<p align="center"><img src="http://antoniocangiano.com/wp-content/uploads/2008/10/small_menu.png" alt="A possible prompt menu for Pygments"/></p>
<p>Should I ever forget these 4 shortcuts, I can take a quick look at the Text bundle menu shown below. I placed these commands under the Text menu, since they are globally available for textual formats, whether I&#8217;m composing HTML, Textile, Markdown or ReST; but this is entirely arbitrary and I suspect that many would consider the HTML menu instead or place a &#8220;Convert to HTML&#8221; entry in the menu of the specific language.</p>
<p align="center"><img src="http://antoniocangiano.com/wp-content/uploads/2008/10/text_menu.png" alt="The Text menu"/></p>
<p>Ruby and Python deserve their own command because they are the languages whose code I publish the most, but pressing &#x2303;&#x2325;3 (or 4) prompts a long list of languages to choose from as shown below (the image is cut to reduce its length):</p>
<p align="center"><img src="http://antoniocangiano.com/wp-content/uploads/2008/10/select_language.png" alt="The select a language dialog"/></p>
<p>The following are a series of steps that you can take to reproduce the same results as mine. The HTML required to present the code nicely in this section was generated from within TextMate. In other words, I&#8217;m eating my own dog food.</p>
<p><strong>Step 1</strong>: If you haven&#8217;t done so already, install Pygments. You can get it from the <a href="http://pygments.org/download/">official site</a>.</p>
<p><strong>Step 2</strong>: Within TextMate click on the menu entry: <strong>Bundles</strong> -> <strong>Bundle Editor</strong> -> <strong>Show Bundle Editor</strong> and click on the triangle to open up <strong>Text</strong> in the left pane.</p>
<p><strong>Step 3</strong>: Click on the <strong>+-</strong> button in the lower left corner of the window and select <strong>New Command</strong>, then name the command <strong>Pygmentize Ruby</strong> (assuming that you want a command for Ruby).</p>
<p><strong>Step 4</strong>: Ensure that the options for <strong>Save</strong>, <strong>Input</strong>, <strong>Output</strong> and <strong>Activation</strong> match those shown below (click to enlarge):</p>
<p align="center"><a href="http://antoniocangiano.com/wp-content/uploads/2008/10/pygmentize_ruby_command.png"><img src="http://antoniocangiano.com/wp-content/uploads/2008/10/pygmentize_ruby_command-300x212.png" alt="" title="The Pygmentize Ruby command" width="300" height="212" /></a></p>
<p><strong>Step 5</strong>: Fill the <strong>Command(s)</strong> text area with the following code:</p>
<div class="highlight">
<pre><span class="c">#!/usr/bin/env python</span>

<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">from</span> <span class="nn">pygments</span> <span class="kn">import</span> <span class="n">highlight</span>
<span class="kn">from</span> <span class="nn">pygments.lexers</span> <span class="kn">import</span> <span class="n">RubyLexer</span>
<span class="kn">from</span> <span class="nn">pygments.formatters</span> <span class="kn">import</span> <span class="n">HtmlFormatter</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">code</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s">&#39;TM_SELECTED_TEXT&#39;</span><span class="p">]</span>
<span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">()</span>

<span class="n">formatter</span> <span class="o">=</span> <span class="n">HtmlFormatter</span><span class="p">()</span>
<span class="k">print</span> <span class="n">highlight</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="n">RubyLexer</span><span class="p">(),</span> <span class="n">formatter</span><span class="p">)</span>
</pre>
</div>
<p><strong>Step 6</strong>: Repeat the process for <strong>Pygmentize Python</strong>, <strong>Pygmentize&#8230;</strong> and <strong>Pygmentize with line numbers&#8230;</strong> but select a different <strong>Activation key equivalent</strong> (replace 1 with 2, 3 and 4, respectively).</p>
<p>The command code for <strong>Pygmentize Python</strong> is as follows:</p>
<div class="highlight">
<pre><span class="c">#!/usr/bin/env python</span>

<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">from</span> <span class="nn">pygments</span> <span class="kn">import</span> <span class="n">highlight</span>
<span class="kn">from</span> <span class="nn">pygments.lexers</span> <span class="kn">import</span> <span class="n">PythonLexer</span>
<span class="kn">from</span> <span class="nn">pygments.formatters</span> <span class="kn">import</span> <span class="n">HtmlFormatter</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">code</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s">&#39;TM_SELECTED_TEXT&#39;</span><span class="p">]</span>
<span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">()</span>

<span class="n">formatter</span> <span class="o">=</span> <span class="n">HtmlFormatter</span><span class="p">()</span>
<span class="k">print</span> <span class="n">highlight</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="n">PythonLexer</span><span class="p">(),</span> <span class="n">formatter</span><span class="p">)</span>
</pre>
</div>
<p>For <strong>Pygmentize&#8230;</strong> use the following:</p>
<div class="highlight">
<pre><span class="c">#!/usr/bin/env python</span>

<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">from</span> <span class="nn">commands</span> <span class="kn">import</span> <span class="n">getoutput</span>
<span class="kn">from</span> <span class="nn">pygments</span> <span class="kn">import</span> <span class="n">highlight</span>
<span class="kn">from</span> <span class="nn">pygments.lexers</span> <span class="kn">import</span> <span class="n">get_all_lexers</span><span class="p">,</span> <span class="n">get_lexer_by_name</span>
<span class="kn">from</span> <span class="nn">pygments.formatters</span> <span class="kn">import</span> <span class="n">HtmlFormatter</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">code</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s">&#39;TM_SELECTED_TEXT&#39;</span><span class="p">]</span>
<span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">()</span>

<span class="n">available_languages</span> <span class="o">=</span> <span class="s">&quot;, &quot;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">sorted</span><span class="p">(</span><span class="s">&#39;&quot;&#39;</span><span class="o">+</span><span class="n">lex</span><span class="p">[</span><span class="mf">1</span><span class="p">][</span><span class="mf">0</span><span class="p">]</span><span class="o">+</span><span class="s">&#39;&quot;&#39;</span> <span class="k">for</span> <span class="n">lex</span> <span class="ow">in</span> <span class="n">get_all_lexers</span><span class="p">()))</span>
<span class="n">chosen_language</span> <span class="o">=</span> <span class="n">getoutput</span><span class="p">(</span><span class="s">&quot;&quot;&quot;echo $(osascript &lt;&lt;&#39;AS&#39;</span>
<span class="s">    tell app &quot;TextMate&quot;</span>
<span class="s">        activate</span>
<span class="s">        choose from list { </span><span class="si">%(languages)s</span><span class="s"> } </span><span class="se">\</span>
<span class="s">            with title &quot;Pick a language&quot; </span><span class="se">\</span>
<span class="s">            with prompt &quot;Select a language&quot;</span>
<span class="s">    end tell</span>
<span class="s">AS)&quot;&quot;&quot;</span> <span class="o">%</span> <span class="p">{</span><span class="s">&#39;languages&#39;</span><span class="p">:</span><span class="n">available_languages</span><span class="p">})</span>
<span class="n">os</span><span class="o">.</span><span class="n">system</span><span class="p">(</span><span class="s">&quot;osascript -e &#39;tell app </span><span class="se">\&quot;</span><span class="s">TextMate</span><span class="se">\&quot;</span><span class="s"> to activate&#39; &amp;&gt;/dev/null &amp;&quot;</span><span class="p">)</span>

<span class="n">lexer</span> <span class="o">=</span> <span class="n">get_lexer_by_name</span><span class="p">(</span><span class="n">chosen_language</span><span class="o">.</span><span class="n">lower</span><span class="p">())</span>
<span class="n">formatter</span> <span class="o">=</span> <span class="n">HtmlFormatter</span><span class="p">()</span> <span class="c"># linenos=False</span>
<span class="k">print</span> <span class="n">highlight</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="n">lexer</span><span class="p">,</span> <span class="n">formatter</span><span class="p">)</span>
</pre>
</div>
<p>And finally for <strong>Pygmentize with line numbers&#8230;</strong> use the almost identical script below:</p>
<div class="highlight">
<pre><span class="c">#!/usr/bin/env python</span>

<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">from</span> <span class="nn">commands</span> <span class="kn">import</span> <span class="n">getoutput</span>
<span class="kn">from</span> <span class="nn">pygments</span> <span class="kn">import</span> <span class="n">highlight</span>
<span class="kn">from</span> <span class="nn">pygments.lexers</span> <span class="kn">import</span> <span class="n">get_all_lexers</span><span class="p">,</span> <span class="n">get_lexer_by_name</span>
<span class="kn">from</span> <span class="nn">pygments.formatters</span> <span class="kn">import</span> <span class="n">HtmlFormatter</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">code</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s">&#39;TM_SELECTED_TEXT&#39;</span><span class="p">]</span>
<span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">()</span>

<span class="n">available_languages</span> <span class="o">=</span> <span class="s">&quot;, &quot;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">sorted</span><span class="p">(</span><span class="s">&#39;&quot;&#39;</span><span class="o">+</span><span class="n">lex</span><span class="p">[</span><span class="mf">1</span><span class="p">][</span><span class="mf">0</span><span class="p">]</span><span class="o">+</span><span class="s">&#39;&quot;&#39;</span> <span class="k">for</span> <span class="n">lex</span> <span class="ow">in</span> <span class="n">get_all_lexers</span><span class="p">()))</span>
<span class="n">chosen_language</span> <span class="o">=</span> <span class="n">getoutput</span><span class="p">(</span><span class="s">&quot;&quot;&quot;echo $(osascript &lt;&lt;&#39;AS&#39;</span>
<span class="s">    tell app &quot;TextMate&quot;</span>
<span class="s">        activate</span>
<span class="s">        choose from list { </span><span class="si">%(languages)s</span><span class="s"> } </span><span class="se">\</span>
<span class="s">            with title &quot;Pick a language&quot; </span><span class="se">\</span>
<span class="s">            with prompt &quot;Select a language&quot;</span>
<span class="s">    end tell</span>
<span class="s">AS)&quot;&quot;&quot;</span> <span class="o">%</span> <span class="p">{</span><span class="s">&#39;languages&#39;</span><span class="p">:</span><span class="n">available_languages</span><span class="p">})</span>
<span class="n">os</span><span class="o">.</span><span class="n">system</span><span class="p">(</span><span class="s">&quot;osascript -e &#39;tell app </span><span class="se">\&quot;</span><span class="s">TextMate</span><span class="se">\&quot;</span><span class="s"> to activate&#39; &amp;&gt;/dev/null &amp;&quot;</span><span class="p">)</span>

<span class="n">lexer</span> <span class="o">=</span> <span class="n">get_lexer_by_name</span><span class="p">(</span><span class="n">chosen_language</span><span class="o">.</span><span class="n">lower</span><span class="p">())</span>
<span class="n">formatter</span> <span class="o">=</span> <span class="n">HtmlFormatter</span><span class="p">(</span><span class="n">linenos</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="k">print</span> <span class="n">highlight</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="n">lexer</span><span class="p">,</span> <span class="n">formatter</span><span class="p">)</span>
</pre>
</div>
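<p>One rough edge in the two language-picker scripts: if you cancel the dialog, AppleScript&#8217;s <code>choose from list</code> returns <code>false</code>, so <code>chosen_language</code> ends up being the string &#8220;false&#8221; and <code>get_lexer_by_name</code> raises an exception. A possible guard is sketched below. The helper name <code>pick_lexer</code> is hypothetical, and falling back to plain text for an unknown name is just one reasonable choice:</p>

```python
from pygments.lexers import TextLexer, get_lexer_by_name
from pygments.util import ClassNotFound

def pick_lexer(chosen_language):
    """Hypothetical helper: map the dialog's output to a lexer, or None."""
    name = chosen_language.strip().lower()
    # osascript prints "false" when the dialog is cancelled.
    if not name or name == "false":
        return None
    try:
        return get_lexer_by_name(name)
    except ClassNotFound:
        # Unknown name: fall back to a lexer that applies no highlighting.
        return TextLexer()

print(pick_lexer("false"))
print(pick_lexer("Python").name)
```

<p>In the commands above you could then call <code>sys.exit()</code> when the helper returns <code>None</code>, exactly as the scripts already do when nothing is selected.</p>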
<p><strong>Step 7</strong>: Click on <strong>Text</strong> and by dragging and dropping arrange the menu to include the Pygmentize commands as shown below (click to enlarge):</p>
<p align="center"><a href="http://antoniocangiano.com/wp-content/uploads/2008/10/bundle_editor_text1.png"><img src="http://antoniocangiano.com/wp-content/uploads/2008/10/bundle_editor_text1-300x212.png" alt="Editing the Text menu" title="Editing the Text menu" width="300" height="212" /></a></p>
<p><strong>Step 8</strong>: At this point everything should work, whether you invoke the commands through a keyboard shortcut or through the <strong>Text</strong> menu. However, you will need to upload and include a Pygments stylesheet from within your site. To generate a stylesheet run the following from the command line:</p>
<div class="highlight">
<pre>pygmentize -S default -f html &gt; pygmentize.css
</pre>
</div>
<p>In the above command, <em>default</em> is the name of the style. For example, the Python code you see in this article is styled with the style <em>pastie</em> (because I globally adopted that stylesheet for this site). For a comparison of the available styles check out this <a href="http://pygments.org/demo/1116/?style=pastie">demo page</a>.</p>
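<p>If you prefer to stay in Python, the same stylesheet can be produced through the library itself. The sketch below assumes the <em>pastie</em> style and the <code>.highlight</code> selector, which matches the CSS class that <code>HtmlFormatter</code> puts on its wrapping <code>div</code> by default:</p>

```python
from pygments.formatters import HtmlFormatter

# get_style_defs() returns the CSS rules for the chosen style as a string,
# with every rule prefixed by the selector passed in.
css = HtmlFormatter(style="pastie").get_style_defs(".highlight")

print(css)
```

<p>Redirecting the output of this script to a file gives you a stylesheet you can upload to your site.</p>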
<p><strong>Step 9</strong>: ????</p>
<p><strong>Step 10</strong>: Profit!</p>
<p>I hope these hacked-together commands can be useful to others. Feel free to customize and improve them to suit your needs.</p>
<p><strong>UPDATE</strong>: I made a <a href="http://antoniocangiano.com/2008/10/28/pygments-textmate-bundle/">Pygments TextMate Bundle</a> out of this.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2008/10/27/integrating-textmate-and-pygments/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>This Week in Ruby (May 29, 2008)</title>
		<link>http://programmingzen.com/2008/05/29/this-week-in-ruby-may-29-2008/</link>
		<comments>http://programmingzen.com/2008/05/29/this-week-in-ruby-may-29-2008/#comments</comments>
		<pubDate>Thu, 29 May 2008 23:54:44 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[DB2]]></category>
		<category><![CDATA[Mac]]></category>
		<category><![CDATA[Merb]]></category>
		<category><![CDATA[Quick Tips]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Ruby on Rails]]></category>
		<category><![CDATA[This Week in Ruby]]></category>
		<category><![CDATA[datamapper]]></category>
		<category><![CDATA[ironruby]]></category>
		<category><![CDATA[mack]]></category>
		<category><![CDATA[mod_rails]]></category>
		<category><![CDATA[purexml]]></category>
		<category><![CDATA[railsconf]]></category>
		<category><![CDATA[screencast]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/?p=183</guid>
		<description><![CDATA[This is the 9th episode of This Week in Ruby; please consider subscribing to my feed so as not to miss any weekly installments. Ruby Two days ago JRuby 1.1.2 was released. Amongst several bug fixes and improvements, this release is characterized by a focus on performance. Startup time, threading, method calling and YAML symbol [...]]]></description>
			<content:encoded><![CDATA[<p><em>This is the 9th episode of This Week in Ruby; please consider <a href="http://feeds.feedburner.com/ZenAndTheArtOfRubyProgramming">subscribing to my feed</a> so as not to miss any weekly installments.</em></p>
<p><strong>Ruby</strong></p>
<p>Two days ago <a href="http://docs.codehaus.org/display/JRUBY/2008/05/27/JRuby+1.1.2+Released">JRuby 1.1.2</a> was released. Amongst several bug fixes and improvements, this release is characterized by a focus on performance. Startup time, threading, method calling and YAML symbol parsing have all been drastically improved.</p>
<p>Huw Collingbourne of <a href="http://www.sapphiresteel.com">SapphireSteel</a>, <a href="http://www.sapphiresteel.com/The-Book-Of-Ruby-free-in-depth">has announced</a> that he&#8217;ll be releasing a complete book on Ruby, chapter by chapter, free of charge online. After reading <a href="http://www.sapphiresteel.com/The-Book-Of-Ruby">the first chapter,</a> I can attest that it&#8217;s excellent. Keep an eye on it, as new chapters get added.</p>
<p>The Pragmatic Programmers put out a series of screencasts for sale. The most relevant series for Ruby programmers is <a href="http://pragprog.com/screencasts/v-rbar/everyday-active-record">Everyday Active Record</a>. The first two episodes (half an hour each) are out and can be purchased for just $5 apiece. The preview &mdash; and Ryan Bates&#8217;s reputation &mdash; lead me to believe that they are entirely worth their very reasonable sticker price. Speaking of screencasts, a new one about merb-slices was released <a href="http://merbunity.com/screencasts/1" title="17.9MB">on Merbunity</a>; check it out if you&#8217;re into Merb.</p>
<p>There were two important releases last week, <a href="http://www.mackframework.com/2008/05/21/release-055/">Mack 0.5.5</a> &mdash; which features a new rendering engine with support for Haml and Markaby &mdash; and <a href="http://datamapper.org/">DataMapper 0.9</a>, a major reworking of the <span class="caps">ORM</span>. A third release, which is perhaps just as welcome, came from _why, who included <a href="http://hackety.org/2008/05/22/theImageBlockAtTheBottomOfShoes.html">a few graphical improvements</a> for Shoes, his <span class="caps">GUI</span> application toolkit. Definitely neat stuff, which I invite you to take a look at if you&#8217;re working on a Mac.</p>
<p>Peter Cooper published <a href="http://www.rubyinside.com/21-ruby-tricks-902.html">21 Ruby Tricks You Should Be Using In Your Own Code</a>. You probably already know at least the most common ones, but they&#8217;re quick and fun, so if you haven&#8217;t checked out the post yet, take a moment and do so. Other must-read tutorials and articles were <a href="http://redartisan.com/2008/5/18/dtrace-ruby">Ruby &#38;&#38; DTrace!</a> (really neat results), <a href="http://www.igvita.com/2008/05/27/ruby-eventmachine-the-speed-demon/">Ruby EventMachine &#8211; The Speed Demon</a> by one of my favorite Ruby bloggers, and <a href="http://benchcoach.com/papers/scraping">Will&#8217;s Guide to Mashing-up Remote Databases using Page Scraping</a>.</p>
<p>In a post made a couple of days ago, Robert Fischer opened up a can of worms by bringing up the issue of Ruby and <span class="caps">XML</span> libraries. As most of you know, <span class="caps">REXML</span> is far from being issue-free (performance <em>in primis</em>), and in <a href="http://enfranchisedmind.com/blog/2008/05/27/the-status-of-rubys-libxml/">The Status of Ruby&#8217;s libxml</a> Robert reveals that the author of LibXml Ruby is unable to actively pursue the development of his extension. This issue concerns me, but if I&#8217;m working with databases, I prefer to take advantage of <a href="http://www-306.ibm.com/software/data/db2/express/download.html"><span class="caps">DB2</span> Express-C</a>&#8217;s fantastic <a href="http://www.redbooks.ibm.com/abstracts/sg247315.html?Open">pureXML</a> features, which give me the sort of speed, flexibility and stability that I won&#8217;t find in a Ruby library anytime soon.</p>
<p>Before highlighting some of the news from Rails-land, I wanted to inform you that a new version of <a href="http://antoniocangiano.com/2007/12/03/the-great-ruby-shootout/">The Great Ruby Shootout</a> will surface in June, as I intend to test a couple of special new entries.</p>
<p><strong>Rails</strong></p>
<p>Today, <a href="http://en.oreilly.com/rails2008/public/content/home">RailsConf 2008</a> started and it certainly stands a great chance of being dubbed an exhilarating event. A few people asked whether they could meet me there, but unfortunately I couldn&#8217;t make it. Chances are that you&#8217;re reading this post from RailsConf. If that&#8217;s the case, say hi for me and don&#8217;t forget to visit the nice fellas from <a href="http://engineyard.com">Engine Yard</a>, <a href="http://en.oreilly.com/rails2008/public/schedule/detail/4345">Morph</a> (my sponsor), <a href="http://phusion.nl">Phusion</a> and <a href="http://en.oreilly.com/rails2008/public/schedule/detail/4351">GemStone</a>. Oh, and feel free to pass around the URL of this entry. <img src='http://programmingzen.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>Rails 2.1 <span class="caps">RC1</span> is out, so you&#8217;ll find this article on <a href="http://blog.assaydepot.com/2008/5/23/upgrade-to-rails-2-1-0_rc1">upgrading to Rails 2.1.0_RC1</a> useful. Fabio Akita released a new version of his popular tutorials, Rolling with Rails 2.1 (<a href="http://www.akitaonrails.com/2008/5/25/rolling-with-rails-2-1-the-first-full-tutorial-part-1">part 1</a> and <a href="http://www.akitaonrails.com/2008/5/26/rolling-with-rails-2-1-the-first-full-tutorial-part-2">part 2</a>). And if you are looking for an advanced authentication/authorization system for Rails 2, take a gander at <a href="http://lockdown.rubyforge.org/">Lockdown on RubyForge</a>.</p>
<p>My friends at SeeSaw implemented a series of <a href="http://code.google.com/p/rails-widgets/">Rails Widgets</a> which can easily be installed as a Rails plugin. Feel free to use them and/or contribute, in order to add further support for simplifying and reusing common UI elements. Speaking of shiny things, check out this <a href="http://azizash.deviantart.com/art/Ruby-on-Rails-icon-pack-81755219">Ruby on Rails icon pack</a>; very pleasing to the eye, in my opinion.</p>
<p>RubyInside published a list of <a href="http://www.rubyinside.com/28_mod_rails_and_passenger_resources-899.html">28 mod_rails / Passenger Resources To Help You Deploy Rails Applications Faster</a>. As <span class="caps">DHH</span> forecasted, &#8220;this could definitely become very popular, very fast <img src='http://programmingzen.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> &#8221;.</p>
<p>New Relic released their RPM solution for monitoring and improving the performance of Rails applications to the general public. You can <a href="http://newrelic.com/get-RPM.html">get it here</a>.</p>
<p>And finally, some great news just came in: <a href="http://twitter.com/john_lam/statuses/822070470">IronRuby is running unmodified Rails</a>. &#8220;Excellent&#8221; (said in Montgomery Burns&#8217; voice, complete with characteristic hand gesture).</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2008/05/29/this-week-in-ruby-may-29-2008/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Using Python to detect the most frequent words in a file</title>
		<link>http://programmingzen.com/2008/03/18/use-python-to-detect-the-most-frequent-words-in-a-file/</link>
		<comments>http://programmingzen.com/2008/03/18/use-python-to-detect-the-most-frequent-words-in-a-file/#comments</comments>
		<pubDate>Tue, 18 Mar 2008 05:32:35 +0000</pubDate>
		<dc:creator>Antonio Cangiano</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Quick Tips]]></category>

		<guid isPermaLink="false">http://antoniocangiano.com/2008/03/18/use-python-to-detect-the-most-frequent-words-in-a-file/</guid>
		<description><![CDATA[Working with Python is nice. Just like Ruby, it usually doesn&#8217;t get in the way of my thought process and it comes &#8220;with batteries included&#8221;. Let&#8217;s consider the small task of printing a list of the N most frequent words within a given file: from string import punctuation from operator import itemgetter N = 10 [...]]]></description>
			<content:encoded><![CDATA[<p>Working with Python is nice. Just like Ruby, it usually doesn&#8217;t get in the way of my thought process and it comes &#8220;with batteries included&#8221;. Let&#8217;s consider the small task of printing a list of the N most frequent words within a given file:</p>
<div class="highlight">
<pre><span class="k">from</span> <span class="nn">string</span> <span class="k">import</span> <span class="n">punctuation</span>
<span class="k">from</span> <span class="nn">operator</span> <span class="k">import</span> <span class="n">itemgetter</span>

<span class="n">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">words</span> <span class="o">=</span> <span class="p">{}</span>

<span class="n">words_gen</span> <span class="o">=</span> <span class="p">(</span><span class="n">word</span><span class="o">.</span><span class="n">strip</span><span class="p">(</span><span class="n">punctuation</span><span class="p">)</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="nb">open</span><span class="p">(</span><span class="s">&quot;test.txt&quot;</span><span class="p">)</span>
                                             <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">())</span>

<span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">words_gen</span><span class="p">:</span>
    <span class="n">words</span><span class="p">[</span><span class="n">word</span><span class="p">]</span> <span class="o">=</span> <span class="n">words</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span>

<span class="n">top_words</span> <span class="o">=</span> <span class="n">sorted</span><span class="p">(</span><span class="n">words</span><span class="o">.</span><span class="n">iteritems</span><span class="p">(),</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="n">reverse</span><span class="o">=</span><span class="bp">True</span><span class="p">)[:</span><span class="n">N</span><span class="p">]</span>

<span class="k">for</span> <span class="n">word</span><span class="p">,</span> <span class="n">frequency</span> <span class="ow">in</span> <span class="n">top_words</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">&quot;</span><span class="si">%s</span><span class="s">: </span><span class="si">%d</span><span class="s">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">frequency</span><span class="p">)</span>
</pre>
</div>
<p>I won&#8217;t provide a step-by-step explanation of what I believe is already rather understandable. There are, however, a few points worth making for those who are not too familiar with the language. First and foremost, I love using generator expressions because they are lazily evaluated and have a math-like readability. They are a very convenient way of creating generator objects. Notice how in the snippet I favor them over reading the whole file into a single string by chaining the <code>read()</code> method onto the <code>open()</code> call. Iterating lazily means that only one line needs to be held in memory at a time, which makes a significant difference for large files. Generator expressions and list comprehensions are extremely useful language features inherited from the world of functional programming, and I&#8217;m glad that Python fully embraces them.</p>
<p>In the <code>for</code> loop that follows the generator expression, we count words and add them and their respective frequencies to the <code>words</code> dictionary (similar to a Ruby Hash). Notice how the method <code>get()</code> lets us specify a default value before incrementing the counter, in case the given key doesn&#8217;t exist yet (which means that the word we are adding hasn&#8217;t been encountered before). We pass the result of <code>operator.itemgetter()</code> as the <code>key</code> keyword argument (another nice Python feature) to the <code>sorted()</code> function. <code>itemgetter()</code> returns a callable object that fetches the given item(s) from its operand which, in our case, essentially means that we can tell <code>sorted()</code> to sort based on the value of the dictionary&#8217;s items (the frequency of the words) rather than based on the keys (the words themselves).</p>
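<p>To make the two helpers concrete, here is a minimal sketch that uses a made-up list of words (the sample data and the variable names are illustrative assumptions, not part of the script). It uses <code>items()</code> instead of <code>iteritems()</code> and calls <code>print</code> with parentheses, so it should run on both Python 2 and Python 3:</p>

```python
from operator import itemgetter

# Hypothetical sample input, used only for illustration.
sample = ["pear", "apple", "pear", "fig", "pear", "apple"]

counts = {}
for word in sample:
    # get() returns 0 when the word has not been seen yet.
    counts[word] = counts.get(word, 0) + 1

# itemgetter(1) builds a callable that returns element 1 of its argument,
# which for a (word, count) pair is the count.
get_count = itemgetter(1)

by_frequency = sorted(counts.items(), key=get_count, reverse=True)

for word, frequency in by_frequency:
    print("%s: %d" % (word, frequency))
```

<p>The sample was chosen so that no two words share a frequency, which keeps the resulting order unambiguous.</p>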
<p>Unfortunately, there is a problem with the script as written. It correctly sorts the words by frequency, but words with the same frequency won&#8217;t be alphabetically ordered. Given that we specified a reverse order for the <code>sorted()</code> function, we could simply pass it <code>key=itemgetter(1, 0)</code> to order (in descending order) by value first and by key second. But let&#8217;s be realistic. In most cases, you want keys whose values are equal to be ordered alphabetically (in ascending order). With a few changes to the code, this can be easily achieved:</p>
<div class="highlight">
<pre><span class="k">from</span> <span class="nn">string</span> <span class="k">import</span> <span class="n">punctuation</span>

<span class="k">def</span> <span class="nf">sort_items</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Sort by value first, and by key (reversed) second.&quot;&quot;&quot;</span>
    <span class="k">return</span> <span class="nb">cmp</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="ow">or</span> <span class="nb">cmp</span><span class="p">(</span><span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>

<span class="n">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">words</span> <span class="o">=</span> <span class="p">{}</span>

<span class="n">words_gen</span> <span class="o">=</span> <span class="p">(</span><span class="n">word</span><span class="o">.</span><span class="n">strip</span><span class="p">(</span><span class="n">punctuation</span><span class="p">)</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="nb">open</span><span class="p">(</span><span class="s">&quot;test.txt&quot;</span><span class="p">)</span>
                                             <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">())</span>

<span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">words_gen</span><span class="p">:</span>
    <span class="n">words</span><span class="p">[</span><span class="n">word</span><span class="p">]</span> <span class="o">=</span> <span class="n">words</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span>

<span class="n">top_words</span> <span class="o">=</span> <span class="n">sorted</span><span class="p">(</span><span class="n">words</span><span class="o">.</span><span class="n">iteritems</span><span class="p">(),</span> <span class="nb">cmp</span><span class="o">=</span><span class="n">sort_items</span><span class="p">,</span> <span class="n">reverse</span><span class="o">=</span><span class="bp">True</span><span class="p">)[:</span><span class="n">N</span><span class="p">]</span>

<span class="k">for</span> <span class="n">word</span><span class="p">,</span> <span class="n">frequency</span> <span class="ow">in</span> <span class="n">top_words</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">&quot;</span><span class="si">%s</span><span class="s">: </span><span class="si">%d</span><span class="s">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">frequency</span><span class="p">)</span>
</pre>
</div>
<p>Previously we specified which &#8220;key&#8221; to use for sorting, while in this case we have a much greater deal of control. By defining the function <code>sort_items()</code> and passing it as the <code>cmp</code> argument of <code>sorted()</code>, we get to define how the comparison amongst the items of the dictionary should be carried out. The function that we defined at the beginning of the script returns -1, 0 or 1, depending on how the two key-value pairs compare. The returned value is <code>cmp(x[1], y[1]) or cmp(y[0], x[0])</code>. This may seem complicated, but the trick is rather easy. The first part compares the frequencies of the two words and returns 1 or -1 if one is greater than the other. If they are equal, the expression to the left of the <code>or</code> will be 0, which Python treats as false, so the expression on the right of the <code>or</code> will be returned. On the right we compare the keys (the words), but swap the order of the arguments <code>y</code> and <code>x</code> to cancel out the reversed ordering requested in <code>sorted()</code>.</p>
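<p>The effect of the comparison function is easiest to see on a few hand-picked pairs. The sketch below is an illustration under stated assumptions: the sample pairs are made up, and because the built-in <code>cmp()</code> and the <code>cmp</code> argument of <code>sorted()</code> exist only in Python 2, it defines its own <code>compare()</code> helper and wraps the comparison with <code>functools.cmp_to_key()</code> (available since Python 2.7) so that it also runs on Python 3:</p>

```python
from functools import cmp_to_key

def compare(a, b):
    """Stand-in for Python 2's built-in cmp(): returns -1, 0 or 1."""
    return (a > b) - (a < b)

def sort_items(x, y):
    """Sort by value first, and by key (reversed) second."""
    return compare(x[1], y[1]) or compare(y[0], x[0])

# Hypothetical (word, frequency) pairs with a three-way tie on frequency.
pairs = [("pear", 2), ("apple", 2), ("fig", 5), ("kiwi", 2)]

# cmp_to_key() adapts the two-argument comparison to the key protocol.
ordered = sorted(pairs, key=cmp_to_key(sort_items), reverse=True)

for word, frequency in ordered:
    print("%s: %d" % (word, frequency))
```

<p>With this input the most frequent word comes first, and the three words tied on frequency follow in ascending alphabetical order, even though <code>reverse=True</code> is in effect.</p>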
<p>Finally, for those who prefer to use a lambda expression, rather than to define a function, we can write the following:</p>
<div class="highlight">
<pre><span class="k">from</span> <span class="nn">string</span> <span class="k">import</span> <span class="n">punctuation</span>

<span class="n">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">words</span> <span class="o">=</span> <span class="p">{}</span>

<span class="n">words_gen</span> <span class="o">=</span> <span class="p">(</span><span class="n">word</span><span class="o">.</span><span class="n">strip</span><span class="p">(</span><span class="n">punctuation</span><span class="p">)</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="nb">open</span><span class="p">(</span><span class="s">&quot;test.txt&quot;</span><span class="p">)</span>
                                             <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">())</span>

<span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">words_gen</span><span class="p">:</span>
    <span class="n">words</span><span class="p">[</span><span class="n">word</span><span class="p">]</span> <span class="o">=</span> <span class="n">words</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span>

<span class="n">top_words</span> <span class="o">=</span> <span class="n">sorted</span><span class="p">(</span><span class="n">words</span><span class="o">.</span><span class="n">iteritems</span><span class="p">(),</span>
                   <span class="nb">cmp</span><span class="o">=</span><span class="k">lambda</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="nb">cmp</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">y</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="ow">or</span> <span class="nb">cmp</span><span class="p">(</span><span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]),</span>
                   <span class="n">reverse</span><span class="o">=</span><span class="bp">True</span><span class="p">)[:</span><span class="n">N</span><span class="p">]</span>

<span class="k">for</span> <span class="n">word</span><span class="p">,</span> <span class="n">frequency</span> <span class="ow">in</span> <span class="n">top_words</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">&quot;</span><span class="si">%s</span><span class="s">: </span><span class="si">%d</span><span class="s">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">frequency</span><span class="p">)</span>
</pre>
</div>
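<p>Or simplified further by getting rid of <code>reverse=True</code> and using <code>key</code> rather than <code>cmp</code>. Negating the count sorts the frequencies in descending order, while the words themselves stay in ascending order:</p>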
<div class="highlight">
<pre><span class="k">from</span> <span class="nn">string</span> <span class="k">import</span> <span class="n">punctuation</span>

<span class="n">N</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">words</span> <span class="o">=</span> <span class="p">{}</span>

<span class="n">words_gen</span> <span class="o">=</span> <span class="p">(</span><span class="n">word</span><span class="o">.</span><span class="n">strip</span><span class="p">(</span><span class="n">punctuation</span><span class="p">)</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="nb">open</span><span class="p">(</span><span class="s">&quot;test.txt&quot;</span><span class="p">)</span>
                                             <span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">())</span>

<span class="k">for</span> <span class="n">word</span> <span class="ow">in</span> <span class="n">words_gen</span><span class="p">:</span>
    <span class="n">words</span><span class="p">[</span><span class="n">word</span><span class="p">]</span> <span class="o">=</span> <span class="n">words</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span>

<span class="n">top_words</span> <span class="o">=</span> <span class="n">sorted</span><span class="p">(</span><span class="n">words</span><span class="o">.</span><span class="n">iteritems</span><span class="p">(),</span>
                   <span class="n">key</span><span class="o">=</span><span class="k">lambda</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">count</span><span class="p">):</span> <span class="p">(</span><span class="o">-</span><span class="n">count</span><span class="p">,</span> <span class="n">word</span><span class="p">))[:</span><span class="n">N</span><span class="p">]</span> 

<span class="k">for</span> <span class="n">word</span><span class="p">,</span> <span class="n">frequency</span> <span class="ow">in</span> <span class="n">top_words</span><span class="p">:</span>
    <span class="k">print</span> <span class="s">&quot;</span><span class="si">%s</span><span class="s">: </span><span class="si">%d</span><span class="s">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">frequency</span><span class="p">)</span>
</pre>
</div>
<p>Please bear in mind that the code makes a few assumptions so as to keep things simple. As it stands, the script would consider &#8220;l&#8217;amore&#8221; as a single word, and an accidental lack of spaces wouldn&#8217;t be accounted for (e.g. &#8220;word.Another&#8221; would be a single word too). The <code>replace()</code> method can be used to address these sorts of special cases.</p>
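<p>As a rough sketch of that idea (the helper name <code>split_words()</code> and the sample sentence are assumptions made for illustration), every punctuation character can be replaced with a space before splitting, so that apostrophes and missing spaces no longer glue words together:</p>

```python
from string import punctuation

def split_words(line):
    """Hypothetical helper: split a line on whitespace and punctuation."""
    for char in punctuation:
        # Turn each punctuation character into a word boundary.
        line = line.replace(char, " ")
    return line.lower().split()

print(split_words("L'amore vince.Sempre, no?"))
```

<p>The trade-off is that contractions such as &#8220;don&#8217;t&#8221; and hyphenated words get split as well, which may or may not be what you want.</p>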
<p>Sure, this was a rather trivial example, born from an IPython session, but I think it shows off Python&#8217;s expressiveness and flexibility when dealing with problems that, approached in some other languages, would be much more error-prone and verbose. Batteries included indeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://programmingzen.com/2008/03/18/use-python-to-detect-the-most-frequent-words-in-a-file/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>
