[DB-SIG] Next Version [was Re: Preparing statement API]

M.-A. Lemburg mal@lemburg.com
Sun, 30 Jan 2000 22:18:41 +0100

Stuart 'Zen' Bishop wrote:
> On Sat, 29 Jan 2000, M.-A. Lemburg wrote:
> > While this would help, it is not needed. All you have to do is
> > check whether the connection object has the .rollback() method
> > and if it does, whether it works:
> >
> > def haveTransactions(conn):
> >     if not hasattr(conn,'rollback'):
> >       return 0
> >     try:
> >       conn.rollback()
> >     except:
> >       return 0
> >     else:
> >       return 1
> >
> > The function has to be called on newly created connection objects
> > to not cause unwanted .rollbacks().
> I've realized the obvious that faking autocommit mode in a higher level
> abstraction layer will double the number of calls to the RDBMS. I think
> there is definitly a need to a conn.autocommit() method for drivers that
> support it.

Just my 2cents: auto commit is a *bad thing* and a pretty evil
invention of ODBC. While it does make writing ODBC drivers
simpler (ones which don't support transactions that is), it
is potentially dangerous at times, e.g. take a crashing
program: there is no way to recover from errors because the
database has no way of knowing which data is valid and which is
not. No commercial application handling "mission critical" (I 
love that term ;-) data would ever want to run in auto-commit
Note that the DB API 2.0 explicitly state that
Note that if the database supports an
auto-commit feature, this must be initially off. An interface method may be
provided to turn it back on. 

> > Such a higher level layer written in Python would indeed be nice.
> > It should ideally focus on an OO-style representation of DB
> > access similar to the RogueWave OO-layer on top of JDBC... well,IMHO
> > anyway ;-)
> I was thinking of just building on the existing Driver design.
> I feel the design of the driver API would provide an excellent basis
> for an abstraction layer. I've experimented with OO access to RDBMS tables,
> and found in many cases (the ones I was working on at the time), it creates
> more problems than it solves.

The main advantage is that it offers more flexibility when it
comes to dealing with columns of different types. Anyway, this
is just my view of things, so it might well not capture the
needs of others.
> > Perhaps someone could summarize these features in Perl's DBI or
> > provide pointers ?! I, for one, don't have any experience with Perl's
> > DBI.
> I've attached the DBI docs to this email for people who want a read.
> Its probably a few months out of date.
> Below is a summary/comparison/general mess.

Thanks for the summary. Comments follow...
> What PerlDBI has that Python DB doesn't (translated to pseudo-python).
> I've ignored the sections that are irrelevant (we thankfully don't have
> to worry about Perl's error handling and references):
>     con.quote('It's an ex-parrot') == "'It''s an ex-parrot'"
>     con.quote(None) == "NULL"
>     con.quote(dbi.Date()) == "TO_DATE('1999-04-01 12:53','YYYY-MM-DD HH:MM')"
>     etc.
>         This method allows us to dynamically build SQL, without worrying
>         about the specifics of our RDBMS quoting. Actually, I don't think
>         PerlDBI does the last, but the Python version could as the PyDB
>         handles dates. Really useful for when you are building table joins
>         and selection criteria from command line arguments, HTML form input
>         etc. They also allow con.quote(arga,type), but I don't know how
>         useful this would be in python - perhaps some obscure case where
>         python cannot easily cast a datatype but the RDBMS can.

Quoting is generally not needed since you pass in the data
via bound parameters. The DB interface does the needed quoting
(if any) without user intervention.
>     con.autocommit(1)
>     mode = con.autocommit()
>         As discussed above. An interesting point from the docs - 'Drivers
>         should always default to AutoCommit mode. (An unfortunate choice
>         forced on the DBI by ODBC and JDBC conventions.)'. I see no
>         reason why we can't do it properly :-)

See above... all drivers should default to auto-commit mode
being *disabled*.
>     cur.prepare('select * from parrots')
>     sth.execute()
>         PerlDBI allows you to explicitly prepare your statement. PyDB
>         neatly sidesteps this issue, but it is non obvious to people
>         who come from a DBI/ODBC/JDBC background. A small snippet
>         of code should be added to the PyDB API to demonstrate the reuse:
>         q = 'select * from parrots where dead = ?'
>         cur = con.Connect(...)
>         cur.execute(q,('T',))
>         print cur.fetchall()
>         cur.execute(q,('F',))  # Reuses same RDBMS cursor if possible.
>                                # Major performance gain on some RDBMS
>         print cur.getchall()
>         q2 = 'select * from parrots where %s = ?' % dead
>         cur.execute(q2,('T',)) # Note that a new RDBMS cursor will be created
>                                # because a different string instance is passed.
>                                # String equality is not enough.

Hmm, the spec is a little unclear about this: it should of course
say that "equality is enough". mxODBC first checks for identity
and then for equality, BTW.

>         print cur.fetchall()
>     cur.fetch_mapping() = {'COLA': 'A Value','COLB','B Value'}
>         Also a fetchall variant. This would live in an abstraction layer,
>         not in the driver layer.

>     DBI.available_drivers()
>         List of available drivers. Leave this for version 2 of the abstraction
>         layer, as it would mean it couldn't support PyDB 1 or PyDB 2 drivers.
>     DBI.data_sources(driver)
>         List of available data sources. 'Note that many drivers have no way
>         of knowing what data sources might be available for it and thus,
>         typically, return an empty or incomplete list.'. Might be worth
>         defining the call for those few drivers that can support it.
>     cur.longlength = 2 * 1024 * 1024
>     cur.truncok = 1

The truncok attribute is interesting... I would suggest to have
a .dontreportwarnings attribute (or something along those lines)
on both cursor and connection objects (or some other way to turn off
reporting of warnings via exceptions). There are sometimes troubles
with ODBC functions returning warnings and mxODBC having to
raise a Warning exception: since some sequences cannot be completed
because of the warning exceptions, there is considerable loss
of functionality results for some DB backends (MS SQL Server being
a prominent example). mxODBC currently solves this by simply ignoring
some warnings...
>         Specify the maximum size of a blob returned from the database,
>         and if an exception should be raised if the data would need to
>         be truncated. Does any interface handle BLOBS nicely? I've never
>         used them myself, as they just seem to painful.

These two are only relevant for interfaces to DB managers
such as ODBC managers which arbitrate DB requests to multiple
ODBC drivers. A single driver usually only handles one data source.
> Things Python DB has that PerlDBI doesnt:
>     Better date handling
>         PerlDBI claims Y2K compliance because it knows nothing about dates
>         (although some drivers contain some date handling ability).

>     Generic functions for input/output optimization.
>         PyDB provides the arraysize() and setXXXsize() methods to allow
>         the programmer to specify hints to the RDBMS. These hints are
>         driver specific in PerlDBI.
>     Generic function for calling stored procedures.
>         PerlDBI requires you to do this though the standard cursor
>         interface, which makes it a pain to obtain stored function output.
>     cur.nextset() method
>         PerlDBI doesn't know about multiple result sets (and neither do
>         I for that matter...)
>     Driver.threadsafety
>         There is no generic way of interigating a PerlDBI driver for
>         thread safety - if you want to write generic DB code, it needs
>         to be single threaded.
> An interesting bit that I think relates to Digital Creations issues
> (I wasn't paying too much attention at the time...):

Not sure what you're talking about here... what issues were there ?
> Autocommit
>     [...]
>     * Database in which a transaction must be explicitly started
>          For these database the intention is to have them act
>          like databases in which a transaction is always active
>          (as described above).
>          To do this the DBI driver will automatically begin a
>          transaction when AutoCommit is turned off (from the
>          default on state) and will automatically begin another
>          transaction after a the commit entry elsewhere in this
>          documentor the rollback entry elsewhere in this
>          document.
>          In this way, the application does not have to treat
>          these databases as a special case.

We should also have a similar paragraph in the DB API spec.

Just curious: In what state would a DB be if no transaction
was explicitly started ?

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/