DB API 2.0 and transactions

Magnus Lycka lycka at carmen.se
Fri Jun 10 11:42:23 CEST 2005


You might have spotted a fairly nasty bug there!

Christopher J. Bottaro wrote:
> Hi,
> Why is there no support for explicit transactions in the DB API?  I mean
> like transaction() to start the trans and commit() and rollback() would end
> the trans or something.

To quote from Date & Darwen "A Guide to the SQL Standard, 4th ed.":

"An SQL-transaction is initiated when the relevant SQL-agent executes
a 'transaction-initiating' SQL statement (...) and the SQL-agent does
not already have an SQL transaction in progress. Note, therefore, that
(...) SQL-transactions can't be nested. Note too that transaction
initiation is always implicit--there is no explicit 'BEGIN TRANSACTION'
statement."

The Python DB-API standard matches the SQL standard, and that seems
reasonable.

> This gets weird when using the Python DB API to interact with Postgres
> because a transaction gets started in 3 places:  connection, commit,
> rollback.

That's not how it's supposed to work! Are you sure that you don't
implicitly start transactions by SELECTs etc?

PostgreSQL violates the SQL standards by running in autocommit mode
unless you explicitly perform its non-standard BEGIN command. If you
are right about the behaviour you describe, the PostgreSQL binding
for Python that you use may have taken the easy route, and performs
a "BEGIN" on connect and after every commit or rollback.

If so, this is a serious bug, and should be reported as one. The correct
thing to do is to insert the BEGIN just before the first SQL statement
that is affecting transactions. Of course, this means that the binding
needs to keep track of transaction state, and this makes it a little
bit more complicated. You'd need something like this in the binding:

class connection:
     def __init__(...):
         ...
         self.inTxn = False

     def commit(...):
         ...
         self.inTxn = False

     def rollback(...):
         ...
         self.inTxn = False

     def execute(...):
         ...
         if not self.inTxn:
             perform the BEGIN command against the backend
             self.inTxn = True
         ...

Actually, this isn't perfect either, because not all SQL commands
(should) initate transactions, but it's a lot closer to what we want.

This bug has implications far beyond timestamps. Imagine two transaction
running with isolation level set to e.g. serializable. Transaction A
updates the AMOUNT column in various rows of table X, and transaction
B calculates the sum of all AMOUNTS.

Lets say they run over time like this, with | marking begin and >
commit (N.B. ASCII art, you need a fixed font):

...|--A-->.......|--A-->........
...........|-B->.........|-B->..

This works as expected...but imagine transactions implicitly
begin too early:

|-----A-->|---------A-->|-------
|------------B->|----------B->|-

This will cause the aggregations in B to show "delayed" results.
Not at all what one might expect...


For more about isolation levels, see e.g. here:
http://pgsqld.active-venture.com/transaction-iso.html



More information about the Python-list mailing list