Why Is Escaping Data Considered So Magical?
Owen Jacobson
angrybaldguy at gmail.com
Thu Jun 24 22:43:26 EDT 2010
On 2010-06-24 21:02:48 -0400, Roy Smith said:
> In article <i00t2k$l07$1 at lust.ihug.co.nz>,
> Lawrence D'Oliveiro <ldo at geek-central.gen.new_zealand> wrote:
>
>> I construct ad-hoc queries all the time. It really isn’t that hard to do
>> safely. All you have to do is read the documentation
>
> I get worried when people talk about how easy it is to do something
> safely.
First: I agree with this. While it's definitely possible to correctly
escape a given SQL dialect under controlled conditions, it's not at all
easy to get it right, and the real world is even more unfriendly than
most people expect. Furthermore there's no reason to do it that way:
Python's DB API spec effectively requires that placeholder parameters
of *some* kind exist. Even if you feel the need to construct SQL, you
can construct it with parameters almost as easily as you can construct
it with the values baked in.
With that said...
> 2) Somebody runs your application with a different PYTHONPATH, which
> causes a different (i.e. malicious) urllib module to get loaded, which
> makes urllib.quote() do something you didn't expect.
Someone who can manipulate PYTHONPATH or otherwise add code to the
runtime environment is already in a position to hose your database,
independently of escaping-related issues. It's up to the sysadmin or
user to ensure that their environment is sane, and it's on their head
if they add broken code to a program's runtime environment.
Lawrence D'Oliveiro wrote:
> I'€™ve done this sort of thing for MySQL, for HTML and JavaScript (in both
> Python and JavaScript itself), and for Bash. It’s not hard to verify you’ve
> done it correctly. It lets you easily create table-updating code like the
> following, which makes it so easy to update the code to track changes in the
> database structure:
>
> sql.cursor.execute \
> (
> "update items set "
> +
> ", ".join
> (
> tuple
> (
> "%(name)s = %(value)s"
> %
> {
> "name" : field[0],
> "value" : SQLString(Params.getvalue
> (
> "%s[%s]" % (field[1],
> urllib.quote(modify_id))
> ))
> }
> for field in
> (
> ("class_name", "modify_class"),
> ("make", "modify_make"),
> ("model", "modify_model"),
> ("details", "modify_details"),
> ("serial_nr", "modify_serial"),
> ("inventory_nr", "modify_invent"),
> ("when_purchased", "modify_when_purchased"),
> ... you get the idea ...
> ("location_name", "modify_location"),
> ("comment", "modify_comment"),
> )
> )
> +
> (
> "last_modified = %d" % int(time.time()),
> )
> )
> +
> " where inventory_nr = %s" % SQLString(modify_id)
> )
Why would I write this when SQLAlchemy, even without using its ORM
features, can do it for me? It even uses the placeholder-generating
strategy I mentioned above, where possible.
Finally, it's worth noting that MySQL is (almost) the only mainstream
database that uses escaping for parameterization. PostgreSQL, SQL
Server, Oracle, DB2, and most other databases support parameters
natively in their communication protocols: parameters aren't injected
into the query string, but are sent separately and processed separately
within the DBMS. This neatly avoids encoding-related and
quoting-related problems entirely, and it means the type of the
parameter can be preserved if it's useful.
-o
More information about the Python-list
mailing list