String substitution VS proper mysql escaping

John Nagle nagle at animats.com
Thu Aug 19 16:50:43 EDT 2010


On 8/18/2010 2:50 AM, Cameron Simpson wrote:
> On 18Aug2010 12:07, Nik Gr<nikos.the.gr33k at gmail.com>  wrote:
> |  Στις 18/8/2010 7:31 πμ, ο/η Cameron Simpson έγραψε:
> |>On 17Aug2010 20:15, Νίκος<nikos.the.gr33k at gmail.com>   wrote:
> |>| ===============================
> |>| cursor.execute( ''' SELECT host, hits, date FROM visitors WHERE page =
> |>| '%s' ORDER BY date DESC ''' % (page) )
> |>| ===============================
> |>|
> |>| Someone told me NOT to do string substitution ("%") on SQL statements
> |>| and to let MySQLdb do it
> |>| for me, with proper escaping like the following
> |>|
> |>| ===============================
> |>| cursor.execute('''SELECT host, hits, date FROM visitors WHERE page=%s
> |>|         ORDER BY date DESC''', (page,))
> |>| ===============================
> |>|
> |>| The difference is that if some external source can control "page",
> |>| and
> |>| they put in a value like
> |>|         100 ; DELETE FROM visitors; SELECT * FROM visitors
> |>| i will be losing my database table data.
> |>
> |>That other difference is that the mysql dialect support knows how to
> |>correctly escape a string for insertion into an SQL statement. You may
> |>not, or may forget to pre-escape the string, etc. Using the MySQLdb
> |>stuff do it for you is reliable and robust.
> |
> | Can you please tell me what escaping means by giving me an example
> | of what is escaped and whats isn't?
>
> In your plain substitution example above:
>
>    cursor.execute( ''' SELECT host, hits, date FROM visitors WHERE page =
>    '%s' ORDER BY date DESC ''' % (page) )
>
> Supposing page is the string "100". This will produce the SQL statement:
>
>    SELECT host, hits, date FROM visitors WHERE page = '100' ORDER BY date DESC
>
> which looks ok. But suppose page was the string:
>
>    bill o'reilly
>
> Then your SQL statement looks like this:
>
>    SELECT host, hits, date FROM visitors WHERE page = 'bill o'reilly' ORDER BY date DESC
>
> To the SQL engine this looks like the string "bill o" followed by an SQL
> instruction named "reilly", and then the opening quote for another string.
> Invalid SQL.
>
> The procedure used to avoid this problem (to insert an _arbitrary_
> string into the SQL statement) is to "escape" problematic characters in
> strings when placing them into SQL statements. In this case, the quote
> character in the string is the SQL "end string" character. Therefore the
> string must be modified in the SQL statement to be correctly expressed.
>
> IIRC, SQL uses the quote doubling convention for strings, so this:
>
>    SELECT host, hits, date FROM visitors WHERE page = 'bill o''reilly' ORDER BY date DESC
>
> is how one would write the literal SQL for that.
>
> The MySQLdb library will do this and a host of other equivalent things
> automatically and correctly and consistently when you pass page as a
> parameter to the execute() method, needing no special attention or
> detailed syntactic knowledge on your part when you write your program.

    Right.  There's much mystery about this, but it's really simple.
All MySQLdb is doing for

	cursor.execute(sqlstring, values)

is

	quotedstr = sqlstring % map(MySQLdb.escape_string, values)
	cursor.execute(quotedstr)

It just applies "MySQLdb.escape_string to each arg.

If PHP did that, we'd have far fewer "SQL injection attacks".

    Most programs shouldn't use "MySQLdb.escape", and should let
the built-in call do it.  It's useful, though, if you're constructing
a data file for LOAD DATA to do a bulk database load.  The
same escaping works for LOAD DATA INFILE with the default
input format.

					John Nagle



More information about the Python-list mailing list