[DB-SIG] API 3.0 limiting paramstyle to ['named', 'qmark'] is okay. ('format' is not desirable)

Michael Bayer mike_mp at zzzcomputing.com
Fri May 17 18:22:58 CEST 2013


On May 17, 2013, at 11:50 AM, Daniele Varrazzo <daniele.varrazzo at gmail.com> wrote:

> Sorry, originally sent this message only to M.-A.L.
> 
> On Fri, May 17, 2013 at 11:42 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> psychopg2 uses the 'format' paramstyle and while I agree that
>> it has issues, I think the existing code base using
>> it is large enough that we cannot easily remove that
>> paramstyle :-(
>> 
>> I'd love to hear what Daniele thinks about this.
> 
> Of course switching psycopg to named/qmark placeholders means
> immediately destroying the psycopg compatibility with 100% of the
> applications written using it. I cannot imagine earning any love from
> our user base for this choice. Ditto for the mysql driver and I guess
> others.

we're talking here about DBAPI3.  psycopg2 would offer this as a new name within the module namespace.  The existing DBAPI2 system would remain unaffected.

> 
> Reading back the upstream thread I also see proposal to leave support
> for all the current placeholders but mandate drivers to implement
> qmark and named too. I think this is only going to complicate the
> implementation of the drivers and of the programs using it.

Drivers written against the DBAPI3 spec should only support qmark and named.   Existing DBAPI2 drivers can remain supporting whatever they do now.


> Even
> making paramstyle a connection or cursor property there will always be
> the case of a connection passed to library functions written for the
> previous param style.

> Every function taking a connection as input
> should check what is the paramstyle used and dynamically generate a
> query, instead of using a constant string. Or get into a pattern of
> storing the paramstyle, switch to a known one, run the query and go
> back to the original one.

I agree, making paramstyle a settable attribute is not helping anyone.



> Also note that the paramstyle indication leaves unspecified how to
> disambiguate the placeholder "magic" character. How do I hardcode a
> "?" in a query for a qmark driver? "\?" or "??". The specs don't say
> it, and afaik drivers may have solved this problem in different ways,
> which doesn't make interoperability magically happen.

this would be great, I already have to struggle quite a bit with percent signs among drivers that accept "format" vs. those that don't vs. those that sometimes do.


> Likewise how do
> you escape the colon in the named format? For extra fun, postgres has
> a widely used "::" cast operator, applying which would become
> :value::::integer. Or :value\:\:integer (to be written in a raw
> string, otherwise it better be :value\\:\\:integer).

We should probably look to see what JDBC does as it supports both ? and :named.

> At least the
> format/pyformat style borrow a well defined escape syntax from the
> python spec.

The major issue with format/pyformat is that too many DBAPIs take the easy way out and just run their string through Python's "%" operator.   This causes DBAPIs that use the format styles to behave differently from those that don't.   In particular, a problem I have with MySQL-python and *maybe* recent versions of psycopg2 (would have to check again) is that the behavior changes whether or not I pass parameters to cursor.execute(). The same statement that includes a "%" sign as part of the string must be passed without escaping when no list/dictionary is passed as the second argument, but when a list/dictionary is passed, the parsing changes and now the % sign has to be escaped normally.   This is an undocumented behavior that's directly a side effect of the DBAPI saying, "oh there's parameters, let's call "stmt % params", vs. not.

 I'd prefer every DBAPI use extremely similar parsing routines, so that escaping behavior and all of that is completely consistent.


> Also, what characters is the placeholder name made of?
> What is the placeholder name in :fun() or in :array[3]? And in
> ":variété"? What in :value3? And is :3value a numeric or a named
> placeholder? In my opinion everything but format/pyformat just too
> underspecified to be really sound.

see JDBC

> 
> Also note that the original reason different paramstyles exist is that
> different client libraries use different placeholders (qmark for
> sqlite, named for oracle...) and using them the Python module can
> avoid conversions, just passing-through the query. What if the dbapi
> decides to define a placeholder syntax that happens to be different
> from the underlying driver? A subtle conversion would now be required.

having six parameter styles is extremely inconvenient, and there is no DBAPI I'm aware of that doesn't end up doing some kind of conversion in any case.


> In this line of thought please note that the parameter required by the
> postgres libraries is $1, $2, ... not supported by the dbapi. I'm not
> advocating for their introduction: I just want to point out that qmark
> exists because it was handy for sqlite and named exists because it was
> handy for oracle: making them mandatory for drivers that natively
> support other placeholder formats means taking an implementation
> detail of a random database and make a mandatory conversion for all
> the others.

qmark and colon are widely accepted standard formats I've seen in use since the 90's


> 
> While at it, I also want to point out that the entire idea of the
> "paramstyle" module attribute is not sufficient to describe the
> behaviour of the drivers that implement more than one paramstyle, such
> as the proposed qmark/named or the widespread format/pyformat.

a DBAPI3 that supports only qmark and named, based on the type of collection passed to execute(), no longer needs a ".paramstyle" attribute at all.


> 
> My (provocative) proposal for the dbapi3 is to drop the paramstyle
> attribute altogether and suggest the driver users to read the manual
> to know how to pass parameters to the queries.

Agree on drop the paramstyle, disagree on "read the manual" - the enormous issue with DBAPI2 is how all the drivers make up their own conventions and behaviors for what should be entirely consistent systems.   I shouldn't have to read any DBAPI2 manual to be able to use a particular DBAPI in plain "vanilla" mode - the spec should be all that's needed, unless I would like to use specific features offered by that DBAPI.

> 
> The absolutely worse decision that could be made defining an API is to
> mandate one format but allow something else too: this would definitely
> not help interoperability and would create confusion even in
> environments where a single database and a single driver for it is
> used.

agree, I'd prefer fewer "choices" in DBAPI, and it's all the "optional" things allowed now that turns it into the wild west.  I think DBAPI3 should be presented as a new module within DBAPIs, using the format "from <package> import dbapi3", presenting a "connect()" function in dbapi3, and only dbapi3 behaviors present in that namespace.

It would be absolutely great if I could install any of 20 different DBAPIs, and for each one call upon "from <name> import dbapi3; dbapi3.connect()" in exactly the same way.  




More information about the DB-SIG mailing list