Memoization in Ruby and Python

Wikipedia defines memoization as “an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously-processed inputs.”. This typically means caching the returning value of a function in a dictionary of sorts using the parameters passed to the function as a key. This is done in order to reuse that returning value immediately without calculating it again, when the function is invoked with the same arguments. Even though we are trading space for time, it is often invaluable for speeding up certain recursive functions and when dealing with dynamic programming where intermediate calls are often repeated many times.

Using memoization in Ruby is very easy thanks to the memoize gem. The first step to getting started is therefore to install it:

$ sudo gem install memoize
Successfully installed memoize-1.2.3
1 gem installed
Installing ri documentation for memoize-1.2.3...
Installing RDoc documentation for memoize-1.2.3...

Now we can use the memoize method as illustrated in the example below:

require 'rubygems'
require 'memoize'
require 'benchmark'
include Memoize

def fib(n)
  return n if n < 2
  fib(n-1) + fib(n-2)
end

Benchmark.bm(15) do |b|
  b.report("Regular fib:") { fib(35) }
  b.report("Memoized fib:") { memoize(:fib); fib(35)}
end

In the first block we simply invoke fib(35), while in the second one we first invoke the method memoize(:fib) to memoize the method fib. Running this code on my machine prints the following:

                     user     system      total        real
Regular fib:    55.230000   0.160000  55.390000 ( 55.819205)
Memoized fib:    0.000000   0.000000   0.000000 (  0.001305)

We went from almost a minute of run time to an instantaneous execution. Optionally we could even pass a file location to the function memoize and this would use marshaling to dump and load the cached values on/from disk.

For Python we can write a simple decorator that behaves in a similar manner. In its simplest form it can be implemented as follows:

# memoize.py

def memoize(function):
    cache = {}
    def decorated_function(*args):
        try:
            return cache[args]
        except KeyError:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function

Or more efficiently:

# memoize.py

def memoize(function):
    cache = {}
    def decorated_function(*args):
        if args in cache:
            return cache[args]
        else:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function

When the memoized function has been invoked, we look in the cache to see if an entry for the given arguments already exist. If it does, we immediately return that value. If not, we call the function, cache the results and return its returning value.

Truth be told, the limit of this approach lies in the fact that since we are using a dictionary, only immutable objects can be used as keys. For example, we can use a tuple but are not allowed to have a list as a parameter. For the example within this article, this approach will suffice, but to take advantage of memoization when using arguments that are mutable, you may want to consider the approach described in this recipe.

We can now rewrite the Ruby example above in Python as follows:

import timeit
from memoize import memoize

def fib1(n):
    if n < 2:
        return n
    else:
        return fib1(n-1) + fib1(n-2)
   
@memoize
def fib2(n):
    if n < 2:
        return n
    else:
        return fib2(n-1) + fib2(n-2)	

t1 = timeit.Timer("fib1(35)", "from __main__ import fib1")
print t1.timeit(1)
t2 = timeit.Timer("fib2(35)", "from __main__ import fib2")
print t2.timeit(1)

Running this code on my machine prints the following:

9.32223105431
0.000314950942993

In Python 2.5’s case by employing memoization we went from more than nine seconds of run time to an instantaneous result.

Granted we don’t write Fibonacci applications for a living, but the benefits and principles behind these examples still stand and can be applied to everyday programming whenever the opportunity, and above all the need, arises.

Prev Article Next Article

About The Author

Antonio Cangiano

15 Comments

Torsten Grust May 19, 2009

Hi Antonio,

nice write-up. In the Python example, you appear to invoke the memoized fib2() variant on the argument 10 (instead of 35). Unfair advantage! 😉

Cheers,
–Torsten

Loading...

Reply
Antonio Cangiano May 19, 2009

Hehe, good spotting Torsten. That was a typo due to my experimenting with several input sizes. It’s fixed now. 🙂

Loading...

Reply
david May 19, 2009

cool!

I’ve benchmarked ruby 1.9 and python 2.6, the results:

$ ruby fib.rb
user system total real
Regular fib: 42.210000 5.270000 47.480000 ( 48.118022)
Memoized fib: 0.000000 0.000000 0.000000 ( 0.001342)

$ /opt/ruby-enterprise/bin/ruby fib.rb
user system total real
Regular fib: 27.930000 0.050000 27.980000 ( 28.085893)
Memoized fib: 0.000000 0.000000 0.000000 ( 0.000714)

$ ruby1.9 fib.rb
user system total real
Regular fib: 7.450000 0.010000 7.460000 ( 7.564142)
Memoized fib: 0.000000 0.000000 0.000000 ( 0.000578)

$ python2.5 fib.py
16.7434659004
0.000143051147461

$ python2.6 fib.py
16.5563249588
0.000123023986816

Loading...

Reply
adit May 19, 2009

great tips

what if the methods doesn’t have params/inputs …does method like that can be memoized too ?

Loading...

Reply
david May 19, 2009

and python3 (for the record):

$ python3 fib.py
20.1024479866
0.000137090682983

Loading...

Reply
Clemens Kofler May 19, 2009

To put some shameless self promotion here: I’ve written a pretty extensive article on the topic of memoization. You can find it here: http://www.railway.at/articles/2008/09/20/a-guide-to-memoization.

Loading...

Reply
Bram May 19, 2009

JRuby 1.2.0 (ruby 1.8.6 patchlevel 287) results:

user system total real
Regular fib: 7.955000 0.000000 7.955000 ( 7.955000)
Memoized fib: 0.004000 0.000000 0.004000 ( 0.004000)

Loading...

Reply
Marco May 19, 2009

mmm but how to use it for a method that is coming from a library that we use ?
is there a way to decorate a method in an import statement or something like that ?
cheers
Marco
p.s. I’m interested in the python implementation.

Loading...

Reply
ajuc May 19, 2009

adit: memoization isn’t magic – it works only on functions that always return the same value for the same parameters.

So memoization will work for function without parameters, but only if this function always return the same value 🙂 so it is not very useful.

BTW it works in such way – each call to function first checks if the function was called with the same arguments before, if it was, it returns the value it returned last time, if not, it calculates the value normally, and then keeps the value associated with parameters it was called.

So it is simple caching, and can be only used on “pure” (as in functional programming) functions.

Loading...

Reply
Antonio Cangiano May 19, 2009

Marco, you could declare a function that wraps an imported function, and memoize that one.

Loading...

Reply
Neil Cook May 19, 2009

I have to disagree with ajuc that memoizing parameter-less function return values is not very useful. Whenever the cost of calculating the return value is high, and there is a possibility that the function may be called more than once, it is useful.

An example would be a result from a complex database query.

Loading...

Reply
Ivan Baldin May 20, 2009

You can speed python example even more if you replace “try/except” construct in your decorator with with “if args in cache”. I had ~40% speed increase on python 3 and ~95% on python 2.6.

Loading...

Reply
Antonio Cangiano May 20, 2009

That’s interesting, Ivan. That was my very first approach too but I didn’t like the idea of accessing the hash twice for each hit in the cache. Exception handling must be so expensive that it’s still faster to do the double lookup. I modified the post to include this alternative as well.

Loading...

Reply

Michael Foord May 21, 2009

You can avoid the exception and the double lookup with code like:

def memoize(function):
    cache = {}
    def decorated_function(*args):
        missing = object()
        val = cache.get(args, missing)
        if not val is missing:
            return val
        else:
            val = function(*args)
            cache[args] = val
            return val
    return decorated_function

As a micro-optimisation you could make both cache and missing keyword arguments to the inner function. Lookup of local variables is faster. Not benchmarked any of this…

Rob Reid February 5, 2010

I did benchmark your suggestion, Michael, and it’s a little slower:

0.000112s (Antonio’s 2nd version)
0.000155s (your version)

But if you remove missing, the time goes down to 0.000099s:

def memoize2(function):
cache = {}
def decorated_function(*args):
val = cache.get(args)
if not val:
val = function(*args)
cache[args] = val
return val
return decorated_function

“if args in cache” seems safer, though. If function(*args) == None, it will always be reevaluated for args. Functions can also return object(), although that is admittedly less likely.

Loading...

Reply
Pingback: Principles of Computing course on Coursera. My notes. | Arseny’s Blog November 11, 2015

Memoization in Ruby and Python

Related

About The Author

Antonio Cangiano

15 Comments

Leave a ReplyCancel reply

Get more stuff like this

Share this:

Related

About The Author

Antonio Cangiano

15 Comments

Leave a ReplyCancel reply