[DB-SIG] In praise of pyformat
carsten at uniqsys.com
Sun Aug 12 03:45:21 CEST 2007
On Sat, 2007-08-11 at 18:33 -0400, Mike Meyer wrote:
> On Sat, 11 Aug 2007 17:10:34 -0400 Carsten Haese <carsten at uniqsys.com> wrote:
> > The iron-clad, all-encompassing, golden rule is this: If something looks
> > like a parameter marker and occurs in a place where a parameter marker
> > is syntactically allowed, it's a parameter marker. Anything else isn't.
> Ok, I'll accept that that's iron-clad and all-encompassing. It's also
> I suspect it also changes depending on the underlying
Not so much, especially if you restrict yourself to qmark-style (the SQL
standard) for cross-database compatibility. No SQL compliant database
will allow question marks outside of quotation marks or apostrophes for
anything other than parameter markers.
> > And in case that's not clear, parameter markers are not allowed inside
> > string literals or delimited column names.
> In every dbapi 2 module for every available database? Or just most of
In any correct DB-API implementation for any SQL compliant database. In
SQL's grammar, <dynamic parameter specification> and <literal> are
disjoint productions, so you'll never find one inside the other.
Here are the pertinent productions in BNF:
<value specification> ::=
| <general value specification>
<general value specification> ::=
| <dynamic parameter specification>
| <variable specification>
<dynamic parameter specification> ::= <question mark>
> > The basic rule above still holds true, because you can't have a
> > parameter placeholder inside a datetime literal or in a table name. The
> > parser just uses a heuristic because writing a full SQL parser would be
> > insane.
> Let's see if I've got this: the exact rule is that you only do binding
> where a parameter is allowed. Figuring out where a parameter is
> allowed requires a large enough portion of a full SQL parser that
> writing one would be insane. So doing what I'm asking - providing a
> precise rule - would be insane. I think the key word is "insane".
You're mixing two different levels. One is the definition, one is the
practical implementation. Implementing the definition to the letter is
insane, which is why the practical implementation uses a heuristic that
provides the same result.
> > To execute a query where table names and column names are variable, you
> > should use string formatting yourself to build the structure of the
> > query, placing parameter markers into the structure where necessary, and
> > then using parameter binding to supply the actual values.
> But isn't one of the arguments against pyformat/format is that they
> lead to people doing string formatting to build the query, which is a
> bad thing because, unless done very carefully, they leave you exposed
> to all kinds of data injection attacks? If the other styles wind up
> *requiring* you to build the query with string formatting, how can
> they possibly be considered superior?
It forces the application programmer to understand and appreciate the
difference between USER-SUPPLIED VALUES and SYNTAX ELEMENTS in a
database query. User-supplied values should *always* be provided via
parameter binding. Syntax elements should *never* (and in standard SQL,
can't) be provided via parameter binding.
In real database applications, the syntax elements of queries vary
rarely, and should never come from user input. That's why requiring the
application developer to put together those parts with string formatting
String formatting and parameter binding are two different tools for two
different tasks, and that's why parameter styles that distinguish the
two are superior.
More information about the DB-SIG