[Tutor] Python and Apache (OT) [use prepared statements!]
Danny Yoo
dyoo@hkn.eecs.berkeley.edu
Fri Apr 25 17:33:01 2003
On Fri, 25 Apr 2003, Timothy M. Brauch wrote:
> Python seems to be getting a bad rap in a web-based addition to the book
> "Apache: The Definitive Guide, 3rd ed.", Peter Laurie.
> <http://www.onlamp.com/pub/a/apache/2003/04/10/apacheandpython.html>
Hi Timothy,
It may just be that he was asked to write the article in a hurry --- a lot
of the tone of his article implies that he had not talked with other
Python folks while preparing the article. He's making a lot of
assumptions about the language that only time and experience will fix.
It sounds like he's getting slightly roasted in the feedback section of
his article, so we won't get into subjective details: let's concentrate on
the objective things he talks about in his article.
A few things he complains about are non-issues. For example, here's one
complaint he makes about documentation:
"""(You can also download the manuals. In the world of freeware, where no
one is paid to do the tiresome work of presenting user-friendly material,
you get over 700 files without, as far as I could see, an index. But,
hey!)"""
It's true that the documentation still does need some work, but this
particular complaint is on pretty shaky ground. There are several indices
in the documentation. If we're interested in a list of modules, we can
look at:
http://www.python.org/doc/current/modindex.html
And if we're interested in a function in the Library Reference, there's an
extensive index at:
http://www.python.org/doc/current/lib/genindex.html
His program example uses quite a few deprecated and private modules that
should not be imported directly; in particular, he uses '_mysql' and
'regex' --- he should really be using the 'MySQLdb' module frontend, as
well as the 're' regular expression module. And it's really odd that he's
not using the 'cgi' module to do form parameter parsing. Why do it by
hand if the standard library does it better?
Let's take a chunk of his code, and rewrite it using MySQLdb so that it's
clearer. Hmmm... let's see...
### original code
a="select xname,sname from people where xname='"+c+"'"
print a,"<BR>"
db.query(a)
r=db.store_result()
while(1):
a=r.fetch_row()
if(len(a)):
print "Christian name:",a[0][0], "Surname:\
",a[0][1],"<BR>"
else:
break
###
Yikes. There are some major problems with this code, especially on the
first line:
a="select xname,sname from people where xname='"+c+"'"
When we're working with databases, we should always try to use "prepared
statement" syntax if it's possible to do so: the database handler should
be responsible for doing quotation for us, not us. This is such an
important thing to know that we should talk about it more.
There's a strong possibility that 'c' can itself contain quotation marks,
in which case the xname's contents will leak right into our SQL. One
example of such a name is:
O'Reilly
*cough*
For a SELECT statement, it might not be too bad, but if this had been an
UPDATE or DELETE, imagine what can happen if we're doing something like:
"delete from people where name = '" + c + "'
If someone feeds in:
c = "' or 1 = 1"
the resulting SQL statement will completely fry the 'people' table. This
sort of stuff is especially relevant to CGI code, where we do need to be
careful how we handle data from the outside world.
Here's the rewritten code, using prepared statement syntax:
### Assume 'conn' is a database connection conforming to DB API 2.0
print a, "<BR>"
cursor = conn.cursor()
cursor.execute("select xname,sname from people where xname=%s", c)
for name, surname in cursor.fetchall():
print "Christian name:", name, "Surname:", surname, "<br>"
###
My main impression of the article is that he wrote it in isolation from
the rest of the Python community. If he had properly peer reviewed it,
the article would have been more accurate.
Hope this helps!