[DB-SIG] API 3.0 limiting paramstyle to ['named', 'qmark'] is okay. ('format' is not desirable)

M.-A. Lemburg mal at egenix.com
Fri May 17 18:57:07 CEST 2013

On 17.05.2013 17:50, Daniele Varrazzo wrote:
> Sorry, originally sent this message only to M.-A.L.
> On Fri, May 17, 2013 at 11:42 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> psychopg2 uses the 'format' paramstyle and while I agree that
>> it has issues, I think the existing code base using
>> it is large enough that we cannot easily remove that
>> paramstyle :-(
>> I'd love to hear what Daniele thinks about this.
> Of course switching psycopg to named/qmark placeholders means
> immediately destroying the psycopg compatibility with 100% of the
> applications written using it. I cannot imagine earning any love from
> our user base for this choice. Ditto for the mysql driver and I guess
> others.

Thought so :-)

So I think we all agree that the existing paramstyles cannot
simply be dropped.

This leaves the option of making things
easier for new users and existing users who write new code, by
allowing them to choose which paramstyle they prefer instead of
having the database modules mandate the style.

> Reading back the upstream thread I also see proposal to leave support
> for all the current placeholders but mandate drivers to implement
> qmark and named too. I think this is only going to complicate the
> implementation of the drivers and of the programs using it.

Why is that ?

> Even
> making paramstyle a connection or cursor property there will always be
> the case of a connection passed to library functions written for the
> previous param style. Every function taking a connection as input
> should check what is the paramstyle used and dynamically generate a
> query, instead of using a constant string. Or get into a pattern of
> storing the paramstyle, switch to a known one, run the query and go
> back to the original one.

The .paramstyle on the connection is only used as default value
for the cursor.paramstyle. Functions that take a connection
as argument and don't know which .paramstyle is being used
by the application would then do:

cursor = connection.cursor()
cursor.paramstyle = 'named'

This also make the intention very clear in the function
for someone who's reading the code.

> Also note that the paramstyle indication leaves unspecified how to
> disambiguate the placeholder "magic" character. How do I hardcode a
> "?" in a query for a qmark driver? "\?" or "??". The specs don't say
> it, and afaik drivers may have solved this problem in different ways,
> which doesn't make interoperability magically happen. 

You normally don't have to escape '?' in SQL statements. It is
allowed inside SQL string literals, but I've not seen it used
outside those literals for anything other than a binding
parameter marker.

> Likewise how do
> you escape the colon in the named format?

The format usually is defined as ":[a-z0-9]+", with the additional
requirement that the marker is not preceded by a colon, e.g.
"::integer" is not a valid marker.

> For extra fun, postgres has
> a widely used "::" cast operator, applying which would become
> :value::::integer. Or :value\:\:integer (to be written in a raw
> string, otherwise it better be :value\\:\\:integer). At least the
> format/pyformat style borrow a well defined escape syntax from the
> python spec. Also, what characters is the placeholder name made of?
> What is the placeholder name in :fun() or in :array[3]? And in
> ":variété"? What in :value3? And is :3value a numeric or a named
> placeholder? In my opinion everything but format/pyformat just too
> underspecified to be really sound.

True, we would have to clarify the definitions of those parameter
markers a bit. They are currently defined by the database driver
or backend and only serve an informational purpose.

> Also note that the original reason different paramstyles exist is that
> different client libraries use different placeholders (qmark for
> sqlite, named for oracle...) and using them the Python module can
> avoid conversions, just passing-through the query.

Aside: qmark is the ODBC standard and also used by MS SQL Server,
Sybase and DB2 as native paramstyle. named is the native style
used by Oracle.

You're right, that the original motivation was to avoid
having to parse/convert the SQL statements in the database

However, given that it's not all that hard to write such
conversion routines, I think we should reconsider the case
and make things easier for the user.

> What if the dbapi
> decides to define a placeholder syntax that happens to be different
> from the underlying driver? A subtle conversion would now be required.
> In this line of thought please note that the parameter required by the
> postgres libraries is $1, $2, ... not supported by the dbapi. I'm not
> advocating for their introduction: I just want to point out that qmark
> exists because it was handy for sqlite and named exists because it was
> handy for oracle: making them mandatory for drivers that natively
> support other placeholder formats means taking an implementation
> detail of a random database and make a mandatory conversion for all
> the others.

We'd only make the support for 'named' and 'qmark' mandatory.
Database modules would still be free to continue to also offer
'format' or 'pyformat' or even other paramstyles such as the
$n one.

> While at it, I also want to point out that the entire idea of the
> "paramstyle" module attribute is not sufficient to describe the
> behaviour of the drivers that implement more than one paramstyle, such
> as the proposed qmark/named or the widespread format/pyformat.

The module scope variable provides the default for all
connections created with the module, so the
sequence goes like this:

module.paramstyle -> connection.paramstyle -> cursor.paramstyle

> My (provocative) proposal for the dbapi3 is to drop the paramstyle
> attribute altogether and suggest the driver users to read the manual
> to know how to pass parameters to the queries.
> If that's too provocative I think the Python dbapi should mandate the
> %s and %(name)s formats because they are the only ones to have well
> defined syntax and escaping rule and are well known to every python
> developer. All the other formats are implementation details of the
> underlying drivers, so it would be pointless to standardize on them.
> At which point the paramstyle again becomes redundant and can be
> dropped. Also note that it would be easy for a driver to convert %s
> and %(name)s into ? or :1 or :name applying a regexp; not so much the
> contrary (because of the above ambiguities).

Well, I guess you're biased on this one :-)

We've already had the discussion and the poll on the paramstyle
markers and consensus was to focus on qmark and named.

> The absolutely worse decision that could be made defining an API is to
> mandate one format but allow something else too: this would definitely
> not help interoperability and would create confusion even in
> environments where a single database and a single driver for it is
> used. If a driver wanted to expose a pass-through cursor using
> whatever format the underlying driver takes it should be so outside
> the DBAPI. As an example, psycopg would use a different cursor
> subclass to allow $1 parameters, but this is strictly a consequence of
> psycopg architecture and may easily be different for other drivers.

I don't understand this one. How would we make things more
difficult for users if we mandate two of the already available
paramstyles ?

I think it would greatly simplify the use of the DB-API, since
applications could then be written to use a single paramstyle,
even when using multiple database modules.

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, May 17 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
2013-05-07: Released mxODBC Zope DA 2.1.2 ...     http://egenix.com/go46
2013-05-06: Released mxODBC 3.2.3 ...             http://egenix.com/go45

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

More information about the DB-SIG mailing list