[DB-SIG] API 3.0 limiting paramstyle to ['named', 'qmark'] is okay. ('format' is not desirable)

Daniele Varrazzo daniele.varrazzo at gmail.com
Fri May 17 17:50:40 CEST 2013


Sorry, originally sent this message only to M.-A.L.

On Fri, May 17, 2013 at 11:42 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> psychopg2 uses the 'format' paramstyle and while I agree that
> it has issues, I think the existing code base using
> it is large enough that we cannot easily remove that
> paramstyle :-(
>
> I'd love to hear what Daniele thinks about this.

Of course switching psycopg to named/qmark placeholders means
immediately destroying the psycopg compatibility with 100% of the
applications written using it. I cannot imagine earning any love from
our user base for this choice. Ditto for the mysql driver and I guess
others.

Reading back the upstream thread I also see proposal to leave support
for all the current placeholders but mandate drivers to implement
qmark and named too. I think this is only going to complicate the
implementation of the drivers and of the programs using it. Even
making paramstyle a connection or cursor property there will always be
the case of a connection passed to library functions written for the
previous param style. Every function taking a connection as input
should check what is the paramstyle used and dynamically generate a
query, instead of using a constant string. Or get into a pattern of
storing the paramstyle, switch to a known one, run the query and go
back to the original one.

Also note that the paramstyle indication leaves unspecified how to
disambiguate the placeholder "magic" character. How do I hardcode a
"?" in a query for a qmark driver? "\?" or "??". The specs don't say
it, and afaik drivers may have solved this problem in different ways,
which doesn't make interoperability magically happen. Likewise how do
you escape the colon in the named format? For extra fun, postgres has
a widely used "::" cast operator, applying which would become
:value::::integer. Or :value\:\:integer (to be written in a raw
string, otherwise it better be :value\\:\\:integer). At least the
format/pyformat style borrow a well defined escape syntax from the
python spec. Also, what characters is the placeholder name made of?
What is the placeholder name in :fun() or in :array[3]? And in
":variété"? What in :value3? And is :3value a numeric or a named
placeholder? In my opinion everything but format/pyformat just too
underspecified to be really sound.

Also note that the original reason different paramstyles exist is that
different client libraries use different placeholders (qmark for
sqlite, named for oracle...) and using them the Python module can
avoid conversions, just passing-through the query. What if the dbapi
decides to define a placeholder syntax that happens to be different
from the underlying driver? A subtle conversion would now be required.
In this line of thought please note that the parameter required by the
postgres libraries is $1, $2, ... not supported by the dbapi. I'm not
advocating for their introduction: I just want to point out that qmark
exists because it was handy for sqlite and named exists because it was
handy for oracle: making them mandatory for drivers that natively
support other placeholder formats means taking an implementation
detail of a random database and make a mandatory conversion for all
the others.

While at it, I also want to point out that the entire idea of the
"paramstyle" module attribute is not sufficient to describe the
behaviour of the drivers that implement more than one paramstyle, such
as the proposed qmark/named or the widespread format/pyformat.

My (provocative) proposal for the dbapi3 is to drop the paramstyle
attribute altogether and suggest the driver users to read the manual
to know how to pass parameters to the queries.

If that's too provocative I think the Python dbapi should mandate the
%s and %(name)s formats because they are the only ones to have well
defined syntax and escaping rule and are well known to every python
developer. All the other formats are implementation details of the
underlying drivers, so it would be pointless to standardize on them.
At which point the paramstyle again becomes redundant and can be
dropped. Also note that it would be easy for a driver to convert %s
and %(name)s into ? or :1 or :name applying a regexp; not so much the
contrary (because of the above ambiguities).

The absolutely worse decision that could be made defining an API is to
mandate one format but allow something else too: this would definitely
not help interoperability and would create confusion even in
environments where a single database and a single driver for it is
used. If a driver wanted to expose a pass-through cursor using
whatever format the underlying driver takes it should be so outside
the DBAPI. As an example, psycopg would use a different cursor
subclass to allow $1 parameters, but this is strictly a consequence of
psycopg architecture and may easily be different for other drivers.

-- Daniele


More information about the DB-SIG mailing list