An easy way to improve the performance and security of SQL queries is to replace literals with parameters. By replacing literal values with parameters, advanced relational databases will be able to compile your queries and have their execution plans cached. This saves time and precious resources when the same query (minus the actual values) is executed over and over.
Consider the following series of queries:
SELECT * FROM users WHERE karma BETWEEN 100 AND 499;
SELECT * FROM users WHERE karma BETWEEN 500 AND 999;
SELECT * FROM users WHERE karma BETWEEN 1000 AND 1999;
SELECT * FROM users WHERE karma BETWEEN 2000 AND 4999;
SELECT * FROM users WHERE karma BETWEEN 5000 AND 9999;
SELECT * FROM users WHERE karma BETWEEN 10000 AND 50000;
These each represent the same query and can be transformed into a single parameterized query:
SELECT * FROM users WHERE karma BETWEEN ? AND ?;
Trying to use clever tricks with quotes in order to inject arbitrary SQL code becomes futile. Parameters are considered values, and have no effect on the structure of the query itself.
Parameterized queries are therefore efficient and go a long way towards preventing SQL injection attacks in your applications. They have virtually no downside.
Newbie developers often ignore the existence of this feature and end up irritating seasoned DBAs who have to deal with the consequences of their incompetence. Leon Katsnelson argues that this is such an important matter, that every DBA should forward this Computerworld article to their developers. I tend to agree with how important of an issue that is.
That article provides the following example in Java:
String lastName = req.getParameter("lastName");
String query = "select * from customers where last_name = ?"
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1, lastName);
try { ResultSet results = pstmt.execute(); }
Here I’ll show you an example of how to work with parameterized queries from Ruby and Python. I’ll use the Ruby and Python drivers for DB2.
Ruby first:
require 'ibm_db'
conn = IBM_DB.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = IBM_DB.prepare(conn, query)
values = [500, 999]
IBM_DB.execute(pstmt, values)
while row = IBM_DB.fetch_array(pstmt)
puts "#{row[0]}:#{row[1]}"
end
We load the driver (use mswin32/ibm_db
on Windows, and ibm_db.bundle
on Mac), create a prepared statement, and then bind the two parameter values to it through the execute
method. We then fetch the resultset one row at a time and print the value of the first two fields for each record. For fine-tuned control we could have used the IBM_DB::bind_param method.
The Python version is very similar:
import ibm_db
conn = ibm_db.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = ibm_db.prepare(conn, query)
values = (500, 999)
ibm_db.execute(pstmt, values)
tuple = ibm_db.fetch_tuple(pstmt)
while tuple:
print tuple[0] + ":" + tuple[1]
tuple = ibm_db.fetch_tuple(pstmt)
As you can see, working with parameterized queries is not any harder than dynamically generating SQL queries. Yet the benefits of doing so are huge.
Unfortunately, despite being a very sound choice to base an Object-Relational Mapper (ORM) on, ActiveRecord does not use parameterized queries. Even when it looks like you are passing parameters to a given method, these are actually used to dynamically form an SQL query. Of course you are still free to use parameterized queries in your Rails applications by employing the driver directly. But I really think this is something ActiveRecord should be built upon.
Luckily for Django developers, Django’s ORM uses parameterized queries, thus improving both performance and security with a single design choice. In the Python world you couldn’t get away with ignoring parameterized queries.
For those of you using Rails, all is not lost. DB2 Express-C 9.7 has a killer feature known as the Statement Concentrator, which caches similar queries allowing them to use a shared access plan. It’s not as efficient as using prepared statements in your code, but it’s the best you can do when, as in the case of ActiveRecord, you can’t use parameterized queries directly. Leon’s article explains in greater detail how this feature actually works.
Get more stuff like this
Subscribe to my mailing list to receive similar updates about programming.
Thank you for subscribing. Please check your email to confirm your subscription.
Something went wrong.
In general, I absolutely agree with the above. Where the RDBMS not only stores the parsed query but the query plan, however, one should probably be aware that queries such as the “BETWEEN” one can have unexpected adverse effects.
Consider the case where all karma values lie between 1000 and 5000 and further that karma is indexed. A SELECT for karma between 1999 and 2001 will almost certainly use the index, being suitably selective. Looking for all karmas between 1100 and 4900 would probably be best done with a table scan, but would be unexpectedly slower in the case that the query plan was re-used from the first query.
It’s a relatively rare situation but I’m sensitive to it: the scars are still healing after several years ago!
Excellent post. Paramaterizing queries can be enormous beneficial in terms of both speed and security.
The article “The Curse and Blessing of Dynamic SQL ” at http://www.sommarskog.se/dynamic_sql.html
makes this point with examples and details for T-SQL in Microsoft SQL Server.
I think so hibernate have inbuilt support for parameterized query. So hibernate is less prone to sql injection.
Ankur,
you are right and wrong at the sametime. Blind faith can lead to problems 🙂
e.g.
String street = getStreetFromUser();
Query query = session.createQuery(“from Address a where a.street='” + street + “‘”);
see – http://www.owasp.org/index.php/Interpreter_Injection#ORM_Injection