[DB-SIG] Experiences with DB-API2.0

24 Jun 2002 13:54:21 -0000

Andy Dustman <andy@dustman.net> wrote:
>
>On Fri, 2002-06-21 at 15:31, M.-A. Lemburg wrote:
>> Some examples which almost always need some way of tweaking:
>> 
>> * auto-increment or sequencing

Compare MySQL and PostgreSQL for example. To a point, though, I can imagine 
that it is possible to implement a procedure or an extension in MySQL to give 
PostgreSQL-style sequences, and it might be possible to implement something in 
PostgreSQL to support self-incrementing columns, although I'd want to stay well 
away from either activity. Even when considering portability between databases 
that favour sequences (Oracle, PostgreSQL) the syntax used to access sequences 
differs.

I suppose that the best approach, where possible, is to modify the database 
environment as much as possible to suit the Python applications. If that means 
some stored procedure implementation, then I would consider that the best 
solution, as opposed to writing lots of Python-based abstraction code.

>> * blobs and other more esoteric data types like images and Unicode

JDBC does at least provide some support for this.

>> * for MySQL and non-SQL databases: sub-selects, joins, views, transactions

See below for pertinent remarks from Mr Dustman.

>> * count()
>> * select distinct

I suppose different behaviour can be experienced here, but also see below.

>> * locking
>> * transaction isolation
>> * database, table and user creation
>> * various limits on data types
>> * syntax for defining data types
>> * names of data types

So-called administrative functions are arguably somewhat easier to factor out 
of most applications, although the point below also addresses this somewhat. 
Certainly, the average SELECT, INSERT or UPDATE isn't going to get into such 
details, and where type issues arise, surely JDBC has something to teach us.

>> and probably a dozen more.
>
>I think this highlights that there are two separate portability issues.
>
>1) Ability to execute queries on the database and return results
>
>2) The queries themselves

This is exactly the point I was referring to. Even JDBC doesn't do too much 
about (2) despite enforcing certain standard syntax for things like stored 
procedure invocation.

>DB-API does a good job, IMHO, on point 1, once you get a connection
>object. If we were doing another version, I'd standardize connect by
>ditching connection *strings* for connection *parameters* (host, user,
>passwd are pretty universal) and keyword arguments for anything else.

Arguably, everything should be a keyword argument. People keep going on 
about "explicit being better than implicit", so let's enforce that.

As for the paramstyle, I would suggest standardising on one style just like 
JDBC. Some people might miss their favourite style, but the benefits of not 
having to encode an extra abstraction layer over every statement invocation are 
just too great. Besides, I'm tempted to think that the use of %s as a 
paramstyle confuses beginners - they might start to think of parameters as 
being equivalent to string substitutions, which they are not.

>DB-API doesn't do much for point 2, nor was it ever intended to. It
>doesn't even require that SQL be passed to .execute. That, perhaps,
>should be a job for the SQL-API, which doesn't exist, but could be an
>additional layer on top of DB-API. Indeed, this could be a fairly
>standard wrapper/proxy class for a connection object which is extended
>as needed with mixins for various implementations (DB-API modules) which
>actually constructs SQL statements.

This is much harder than doing the relatively minor improvements described 
above, and I would argue that if someone really wants their queries to work 
everywhere, then they should pay special attention to that themselves 
(configuring their database systems as much as possible, as stated above). 
However, this view should *not* obstruct the other work from going ahead, 
especially since a number of people have expressed their discontent with the 
current situation.

Paul