Meditations on programming, startups, and technology

Python, Django and DB2: we need your input!

This article is obsolete. Please refer to the following articles for up-to-date instructions: Ruby/Rails and DB2 | Python/Django and DB2. Thank you!

I may be biased by the fact that I’m part of the DB2 team, but I think that DB2 Express – C is a kick ass offering for developers who are interested in using an enterprise level database without spending a cent.

Before joining IBM, I too thought DB2 was mostly something for giant companies like Wal-Mart or for mainframe legacy databases that have held information since the seventies. Well, things change, and they change very fast and many of those assumptions are just plain myths.

DB2 Express – C was born from a few people’s desire to bring the power of one of the most advanced databases in the world to developers for free. We had to give in to some small compromises, because IBM is obviously not a non-profit company but… we actually managed to compromise in a way that won’t negatively affect developers too much (shhh, don’t tell anyone ;-)).

DB2 Express – C is entirely free and the code base is no different from the commercial version. This is a rather different approach from other vendors, whose “express” versions are often crippled by design. There is no database size limit, no limit on the number of instances or databases and no restrictions on the number of users or connections. The DB2 Express – C license requires only that it be used on a Linux or Windows server, with a maximum of 4 cores (32 or 64 bit) and 4 GB of memory. That’s it. This is a pretty generous license, given that most developers, startups and even medium-sized businesses won’t need anything more than 4 cores with 4 GB of RAM for their data servers.

I think it’s a very valid offer and probably one of the best propositions available for developers. DB2 is one of the fastest databases in the world, and if you outgrow these server boundaries then you are probably already making some serious money and paying for commercial licenses won’t be an issue. :) There is a wealth of information to get you started and a helpful support forum.

This is all dandy and cool, but developers need good drivers and good APIs to access the database from their favorite programming language or framework. This is why there have been many efforts to support Ruby and Ruby on Rails with a native driver and a vendor supported adapter.

“But hey, what about Python?” many people asked. Well, I was one of the first to ask ‘What about Python and Django?’ myself. My team is striving to innovate and embrace the developers’ community, therefore such ideas are immediately considered no matter what it takes. Just to give you an idea, in my team we have a guy who looks like Jimi Hendrix and happily brings his 17" MacBook Pro to work (I’m so jealous about both his Mac and the fact that he still has all his hair :-P).

Then there is the guy who’s visited hundreds of universities around the world, and has more miles under his belt than a rock star.
It’s a cool team, trust me, so in this environment, when we see an opportunity to make a concrete step for the community, we dive right in.

For a while I’ve been pushing and promoting the idea (within IBM) of a vendor supported Python driver and Django adapter. It looks like the time has come to start considering this seriously and to allocate appropriate resources for it. And I need your help.

I need your feedback and help to collect good ideas, in order for us to create the best driver API that we can. The Python Database API Specification v2.0 – PEP 249 is our starting point, but if you have specific requirements, thoughts or good ideas, please do not hesitate to contact me at my last name or by commenting on this blog. There are no dates or estimated times as we are just in the brainstorming phase, but now is the time for you to give us your input. Take this as a friendly petition and request for feedback, and feel free to contact me even if you are just interested in seeing IBM’s official driver and adapter for Python, Django and IBM databases (DB2 LUW, IDS, etc…).
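To make the PEP 249 starting point concrete, here is a minimal sketch of the interface such a driver would be expected to expose. Since no official DB2 driver exists yet, it uses the standard library's `sqlite3` module as a stand-in; the `demo` function, the `frameworks` table, and the `dsn` argument are purely illustrative assumptions, not part of any IBM API.

```python
import sqlite3  # stand-in: any PEP 249 module exposes the same basic interface


def demo(dbapi=sqlite3, dsn=":memory:"):
    """Exercise the core PEP 249 surface: connect, cursor, execute, fetch, commit."""
    conn = dbapi.connect(dsn)
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE frameworks (name VARCHAR(32), language VARCHAR(32))")
        # sqlite3 uses the 'qmark' paramstyle; another driver may declare a
        # different one in its module-level `paramstyle` attribute.
        cur.executemany(
            "INSERT INTO frameworks (name, language) VALUES (?, ?)",
            [("Django", "Python"), ("Rails", "Ruby")],
        )
        conn.commit()
        cur.execute("SELECT name FROM frameworks WHERE language = ?", ("Python",))
        return cur.fetchall()
    finally:
        conn.close()


rows = demo()
# Module-level globals required by the specification
print(sqlite3.apilevel, sqlite3.paramstyle, rows)
```

The point of the sketch is that application code written against these calls should need little more than a different `connect()` argument and placeholder style to move between compliant drivers.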

I thank you in advance and ask you to get the message out there, in the Python and Django communities.

DISCLAIMER: this post expresses my opinions and does not necessarily represent IBM’s positions, strategies or opinions, particularly with regard to my colleague’s hair :).

39 Responses to “Python, Django and DB2: we need your input!”

  1. Jonathan says:

    I don’t know python well enough to suggest anything but db2 express-c appears to be a good deal. I must find some time to try it out.

  2. dbt says:

    did you intentionally exclude OpenBSD, NetBSD, FreeBSD and MacOS X?

  3. @dbt

    Unfortunately *BSD and Mac OS X are not supported at the moment.

  4. Masklinn says:

    I think that a DBAPI2-compliant driver would be a very good idea indeed.

    But as far as Django, even though I do love and use it, I think that you should work on SQLAlchemy support before considering Django.

    Work is being done to use SQLAlchemy in Django anyway, so you could very well kill two birds with one stone.

  5. anonymous says:

    My suggestion: don’t target the Django adapter/ORM specifically, target SQLAlchemy so that _all_ web frameworks (Django, Pylons, TG, etc.) can use db2 express-c. The first step would be to create a DB-API compatible library, then code the SQLA adapter (see SQLA’s Oracle adapter for example).

  6. Kimutaku says:

    Hey Antonio, by looking at the ruby driver, the approach seems to be using C bindings (complemented with a kind of “db2 client” I suppose).

    Do you know if another approach would be considered? I’m thinking of “type 4” drivers for java. It’s way cool in java that you just drop a couple of jars, point to the appropriate url and the app just works. No client needed. That could even help in case of some deployments with a db2 server on linux/windows and django/whatever on another platform.

    On the other hand, I assume you know about the python db-sig mailing list (which seems active and healthy) and the pydb2 project.

    Anyway, I hope we see an official python-db2 driver sometime soon. You know, the *racle side seems pretty active in the python world, although the existing drivers are unofficial.

  7. @Jonathan

    It is indeed a good deal, and the best part is that there is no catch.

    @Masklinn, @anonymous

    I definitely think that implementing an adapter for SQL Alchemy as early as possible is a sound idea.


    I think the approach that we had in mind was to rely on the DB2 Call Level Interface (CLI) rather than having something similar to the JDBC type 4 driver. However based also on the feedback received, we’ll consider any sensible approach.

    Thanks for all the feedback so far, keep it coming. :)

  8. Vinay Gupta says:

    You might want to talk to the guy who wrote the Python / MySQL driver. He’s very good at what he does, nice to work with (I had some issues and he was very helpful) and it might greatly cut down the learning curve.

  9. @Vinay Gupta

    I won’t be personally writing the driver or the adapter but there will be a dedicated team working on this. Thank you very much for the suggestion though. :)

  10. tonetheman says:

    Making db express -c slightly easier to download… like actually having the download available instead of the verification might help a bit. [just a suggestion]

  11. David Preece says:

    Make it go on OSX. Lots of us develop on MacBooks y’know.

  12. JazzMan says:

    My requests are:

    # support for SQL Alchemy
    # support for pure XML
    # a nicer control center because I hate java guis

    Thank you for taking in consideration my suggestions.

  13. anonymous2 says:

    some of us are still at PowerBooks, so if PPC version exists… :)

  14. Ed Singleton says:

    I’d like to second some kind of Mac support. Wouldn’t need to be production quality, it’s only for development, but if I can’t develop with it I can’t use it.

  15. Jeff says:

    Ed why don’t you run DB2 through virtualization? It works fine (:

  16. Daego says:

    Please make sure you include binary drivers for Win32. The MySQLdb doesn’t, for instance, which makes it a royal pain for those of us stuck developing through Windows.

  17. brad clements says:

    I agree with comment 5. Start with a plain DB adapter, then consider supporting an ORM if you have the resources.

    In the case of SQLAlchemy, the ORM level work is mostly adapting the SQL dialect and reflection to DB2 specifics.

    Regarding a “type 4” style driver, I prefer performance over installation ease. So I think CLI is the way to go.

    Regarding installation ease, easy_install and PyPI have really helped a lot in this area. Having a DB adapter that can be installed via easy_install shouldn’t be too difficult and would address the issues of those calling for a “type 4” implementation.

    Overall, I have clients interested in DB2 who would be candidates for enterprise level installs in the future. But we just cannot make the leap to DB2 without a “reliable and supported” python DB2 adapter. I am not saying that pydb2 is unreliable, but it does have the appearance of being “unsupported”, so it’s a non-starter for us.

    Strange how people are using this Python/DB thread to comment about BSD and MacOS support. :-(

  18. Coliform says:

    DB2 is a bloated beast, with client applications written in Java (!). It takes a long time to install. It uses a ton of memory.

    I simply can’t imagine a scenario where I’d choose the proprietary DB2 over PostgreSQL.

  19. Andy Dustman says:

    @Vinay: Yeah, I am that guy.

    @Daego: There are up-to-date MySQLdb packages for Windows at the SourceForge project; I just can’t build them myself, so I can’t control when they come out.

    Some general observations:

    I have used DB2 with Django, but before you get all excited, this was primarily a MySQL app, and the DB2 stuff was to periodically synchronize some data out of a legacy database. This was not using any of the Django database backends; I just used DB-API to get the data I wanted and then stored in MySQL with the Django ORM.
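    The periodic synchronization described above can be sketched roughly as follows. This is only an illustration under stated assumptions: both ends are in-memory `sqlite3` databases standing in for the legacy DB2 source and the application database, the target is written through plain DB-API rather than the Django ORM, and the `customers` table and `sync_rows` helper are hypothetical names.

```python
import sqlite3


def sync_rows(source, target, batch_size=500):
    """Copy (id, name) rows from source to target in batches; return rows copied."""
    src_cur = source.cursor()
    src_cur.execute("SELECT id, name FROM customers ORDER BY id")
    copied = 0
    while True:
        batch = src_cur.fetchmany(batch_size)
        if not batch:
            break
        # INSERT OR REPLACE is SQLite-specific upsert syntax, used here so that
        # repeated runs refresh existing rows instead of failing on duplicates.
        target.cursor().executemany(
            "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)", batch
        )
        copied += len(batch)
    target.commit()
    return copied


# Stand-ins for the legacy source and the application database
legacy = sqlite3.connect(":memory:")
app_db = sqlite3.connect(":memory:")
for db in (legacy, app_db):
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
legacy.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)
legacy.commit()

first_run = sync_rows(legacy, app_db, batch_size=2)
second_run = sync_rows(legacy, app_db, batch_size=2)  # safe to repeat
synced = app_db.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```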

    Speed: I can’t complain about DB2’s speed, but in my case, the server is a big IBM mainframe.

    Size: DB2 is HUGE, about 700 MB of RPMs, so expect to use 1 GB of disk space.

    I am using mx.ODBC for the DB-API adapter.

    Writing a database backend for Django is trivial. You could take just about any of the existing backends that is based on a DB-API adapter and convert it easily. Introspection may take a bit more work.

  20. Dave K. says:

    Coliform, seriously man, why are you using a blog like this to spread FUD? Last time I installed DB2 on a laptop, the installation went smoothly and quickly. Have you even tried DB2? Besides, how is your comment contributing to this discussion?

    Antonio, given that a team will work on this, you should tell them to start with the driver (the PEP 249 is fine). Then they should write an SQL Alchemy adapter to cover all frameworks, and only then once you’ve got that in place, write the ORM adapter for Django, in my opinion.

  21. David Rushby says:

    I agree with others. Start with a Python DB API-compliant module. If it seems like a good idea, write a module with a more DB2-friendly API, then implement a DB API compatibility layer on top of it.

    Please make sure to implement a system of callbacks to allow the client programmer to conveniently override the default DB->Python and Python->DB type conversions.

    Several existing Python DB API modules support such a facility. KInterbasDB is a “documented example”:

    It supports dynamically redefining the type conversions at the Connection, Cursor, and column-within-Cursor levels. The type translators have a hierarchical relationship, so:

    – if no custom translator for the specific field being converted is defined at the column-within-Cursor level, the Cursor level is checked

    – if none at the Cursor level, the Connection level is checked.

    – if none at any level, use a sensible default conversion
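    The lookup order described in the list above can be sketched as follows. This is a hypothetical illustration of the idea only, not KInterbasDB's actual API: the `resolve` function, the `DEFAULTS` table, and the per-level dictionaries keyed by type name are all assumptions made for the example.

```python
from decimal import Decimal

# Fallback conversions used when no level defines a translator (assumed values)
DEFAULTS = {"DECIMAL": float, "VARCHAR": str}


def resolve(type_name, column=None, cursor=None, connection=None):
    """Return the translator for type_name, checking the most specific level first."""
    for level in (column, cursor, connection):
        if level and type_name in level:
            return level[type_name]
    # No custom translator anywhere: use a default, or pass the value through
    return DEFAULTS.get(type_name, lambda raw: raw)


# Each level is a plain mapping from a DB type name to a conversion callable
connection_level = {"DECIMAL": Decimal}
cursor_level = {}
column_level = {"DECIMAL": lambda raw: round(float(raw), 2)}

# Connection-level override applies when cursor and column define nothing
from_connection = resolve("DECIMAL", cursor=cursor_level, connection=connection_level)

# Column-level override wins over the connection-level one
from_column = resolve(
    "DECIMAL", column=column_level, cursor=cursor_level, connection=connection_level
)
```

    Keeping each level as an independent mapping means an override can be added or removed without touching the levels above it.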

    Also, distributed transaction support is quite valuable. Please don’t leave it out, if you have the programmer-hours to implement support for it.

  22. Derek says:

    I would prefer a “type-4” type driver that doesn’t require any other DB2 components to be installed.

  23. Coliform says:

    No FUD (ironic of you to use that term in defense of IBM, about whom the term was invented), merely my own experience.

    I notice that you don’t answer the substance of my comment: why would anyone ever choose the proprietary DB2 over the excellent and free PostgreSQL? This is an opportunity for anyone to explain why DB2 would be an appropriate choice for any project, low-budget or otherwise. You might even convince me.

  24. João Marcus says:

    Coliform: Because one could need to integrate with existing apps? Apps that use DB2? That’s a damn good reason.

  25. Kimutaku says:

    Coliform: I suggest you to open a topic in “Ask Reddit” about that. Here, it sounds trollish and off topic.

  26. A DBA says:

    I don’t feel as though this is the right place to discuss this topic, and I apologize for using this space for an off-topic reply, but I feel compelled to give you my two cents’ worth, Coliform. Granted, I like PostgreSQL a lot, and I feel that it is often a good choice, but there are reasons why I would use DB2 over PostgreSQL, including the following:

    – DB2 is significantly faster than PostgreSQL.
    – DB2’s scalability is incredible, it especially shines when your data volumes keep growing (clustering across hundreds of servers? No problem).
    – DB2 has awesome replication features and these are completely missing from PostgreSQL.
    – DB2 is also a hybrid engine, meaning that I can store XML natively at full speed without having to parse XML documents with slow libraries in my application code.

    Again, PostgreSQL is nice (and I mean that with sincerity), but given that I can get DB2 for free, I prefer to use a solution which I simply cannot outgrow. In other words, if we left the Express-C boundaries, the performance and scalability requirements would be so high that I would prefer to pay for something like DB2 rather than use PostgreSQL, in order to handle and secure my data as reliably and safely as possible.

  27. Dan Sickles says:

    A little more on the XML support. DB2 uses a new hierarchical on-disk structure to store XML. No parsing, shredding or blobbing. You can mix and match SQL and XQuery in the same query. Very powerful. I’d like to see the native XML leveraged for a document-oriented app server like django. ORM + OXM? But get the DB adapter and SQLAlchemy support done first.

  28. Richard says:

    I like DB2 Express-C’s license a lot, and I played with the product when it first came out, using a test server with a PHP interface. It was a little hard to set up, but I suspect some of the rough spots have been sanded down by now. I would take it for another spin if it had interfaces for Python (meeting DB-API spec 2), which I want to use for larger web applications, and for Django, which I want to start using when it gets to 1.0. SQLAlchemy? It sounds cool and I’d use the sqlalchemy branch of Django if it ever gets done, but development of that branch seems to be stalled. So, to me, the Django interface is more important than the SQLAlchemy interface.

  29. Dick Davies says:

    No Solaris support? It’s supported by DB2 8.3 (and on x64 for DB2 9), so why does the express version have a different list of supported platforms?

    I can’t understand why the platform support is so sketchy for DB2 – different versions/releases seem to require totally different Oses. Once you start trying to layer something on top (DB2 Content Manager), the matrix of supported platforms becomes completely baffling.

    God forbid we ever have to do an OS upgrade.

  30. Garry says:

    DB2 Express – C is a great offer Antonio.

  31. Mateo says:

    Do you have any information on DB2 Replication w/encryption? Do you know if DB2 Replication provides a feature that encrypts its data that’s going over the wire?

  32. Angela says:

    In relation to your question Mateo, DB2 replication cannot replicate data that is encrypted. DB2 replication cannot replicate the following data types under any circumstances:
    – LOB columns from non-DB2 relational sources
    – Any column on which any of the following procedures is defined:

  33. Seems like this blog lacks a mention of OpenBSD and the like

  34. DB2 express is something ive been playing with recently and it appears to be very good

  35. Claus Conrad says:

    Hello Antonio,

    I just found your article. DB2 Express-C sounds indeed like a good deal. I wondered how your work on the Python interface is going? Didn’t find anything official for download yet.

    Not related to Python, but you seem knowledgeable on this – if you deploy DB2 Express-C on a server with more CPUs or RAM than the limits allow, will it take advantage of 2×2 CPUs and 4 GB or won’t it run at all?


  36. Hi Claus,

    DB2 Express-C will run and take advantage of a more powerful machine. The limit is not technical, but rather a licensing related one.

    Development of the driver/adapter is not carried out by me personally, but is done by a dedicated team. At the moment, there are no further updates available.


  37. Adi Azar says:

    You can not replicate encrypted data when you use DB2, as far as i know.


  38. this is a useful link regarding DB2 replication with encrypted data:

  39. Adi Azar says:

    Thanks for link, but it does not give sufficient details.

Copyright © 2005-2014 Antonio Cangiano. All rights reserved.