[DB-SIG] In praise of pyformat
Paul Boddie
paul at boddie.org.uk
Sun Aug 12 02:30:45 CEST 2007
On Sunday 12 August 2007 00:33, Mike Meyer wrote:
> On Sat, 11 Aug 2007 17:10:34 -0400 Carsten Haese <carsten at uniqsys.com>
wrote:
> >
> > The iron-clad, all-encompassing, golden rule is this: If something looks
> > like a parameter marker and occurs in a place where a parameter marker
> > is syntactically allowed, it's a parameter marker. Anything else isn't.
>
> Ok, I'll accept that that's iron-clad and all-encompassing. It's also
> circular. I suspect it also changes depending on the underlying
> database.
How is the above circular? If we have something, for example "?", which looks
like a parameter marker (because we've defined "?" to be a parameter marker),
and it appears in a place where such parameter markers are syntactically
allowed/possible/recognised (see the SQL specifications available on the
Internet), then our software notes that "?" in that place is a parameter
marker. Otherwise, the occurrence of "?" just forms part of some other
sequence of characters and is not considered to be such a parameter marker by
our software.
> > And in case that's not clear, parameter markers are not allowed inside
> > string literals or delimited column names.
>
> In every dbapi 2 module for every available database? Or just most of
> them?
By an appropriate choice of marker in conjunction with the standards.
[...]
> Let's see if I've got this: the exact rule is that you only do binding
> where a parameter is allowed. Figuring out where a parameter is
> allowed requires a large enough portion of a full SQL parser that
> writing one would be insane. So doing what I'm asking - providing a
> precise rule - would be insane. I think the key word is "insane".
The standard doesn't permit parameter usage in various places because it is
not a simple string substitution that occurs. So, the first party to follow
the standard must be the application developer, although they may be lucky
with some database systems and modules if misusing parameters and knowing
that the module/database just does a substitution.
However, a database module or library which converts from a standard marker to
a non-standard one, for example, would need to know where to find the genuine
occurrences of the standard marker. That's the job of something like
sqlliterals, and you shouldn't need a full parser for this in many cases,
thanks to the design of the SQL grammar. Really, the aim of software like
sqlliterals would be to preserve the correctness of converted queries - not
necessarily to forbid incorrect syntax - so one might attempt to use a
parameter to substitute column names, and it would be the job of the database
system to complain.
[...]
> > The fact that the query works with string literals for table names and
> > column names is mildly surprising. That must be an Sqlite peculiarity.
> > It's definitely not standard SQL.
>
> sqlite allows you to use either single and double quotes for string
> literals and identifier delimiters.
So it doesn't follow SQL-92, at least.
[...]
> > To execute a query where table names and column names are variable, you
> > should use string formatting yourself to build the structure of the
> > query, placing parameter markers into the structure where necessary, and
> > then using parameter binding to supply the actual values.
>
> But isn't one of the arguments against pyformat/format is that they
> lead to people doing string formatting to build the query, which is a
> bad thing because, unless done very carefully, they leave you exposed
> to all kinds of data injection attacks? If the other styles wind up
> *requiring* you to build the query with string formatting, how can
> they possibly be considered superior?
Because for a lot of applications, people don't need to build up queries from
parts, or if they do, they can do it without mixing in user-supplied values.
Distinguishing parameter usage from string substitution reduces confusion for
beginners who don't tend to do such advanced stuff, and it makes them more
aware of the issues when they finally start doing so.
Paul
More information about the DB-SIG
mailing list