[Persistence-sig] "Straw Man" transaction API

Jim Fulton jim@zope.com
Mon, 15 Jul 2002 09:59:50 -0400


This is an interesting proposal. I'll be interested to see
more discussion on it. It appears to shift responsibility for
management of individual object changes further into the resource
managers, which is fine.

I'm a little fuzzy on participants that write data to other participants.
The notion that they flush data on begin_savepoint feels a little
brittle to me. If the participant they flush to does any significant work
on begin_savepoint, and happens to receive that message first, then the
flushed data could arrive after that work is done and cause problems.
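
To make the ordering concern concrete, here is a minimal sketch. The
Store and BufferingManager classes, their write() method, and the
run_savepoint() helper are hypothetical names invented for
illustration; only the begin_savepoint/end_savepoint messages come
from the proposal. The store is assumed to do its "significant work"
(freezing the savepoint contents) in begin_savepoint.

```python
class Store:
    """Hypothetical root participant that does real work in begin_savepoint."""

    def __init__(self):
        self.pending = []      # writes received so far
        self.snapshot = None   # what the store believes the savepoint holds

    def write(self, item):
        self.pending.append(item)

    def begin_savepoint(self, txn):
        # Assumed "significant work": freeze the savepoint contents now.
        self.snapshot = list(self.pending)

    def end_savepoint(self, txn):
        pass


class BufferingManager:
    """Hypothetical participant that buffers writes destined for a Store."""

    def __init__(self, store):
        self.store = store
        self.buffer = []

    def write(self, item):
        self.buffer.append(item)

    def begin_savepoint(self, txn):
        # Flush buffered writes to the underlying participant.
        for item in self.buffer:
            self.store.write(item)
        self.buffer = []

    def end_savepoint(self, txn):
        pass


def run_savepoint(participants):
    """Deliver the savepoint messages in the given (arbitrary) order."""
    for p in participants:
        p.begin_savepoint(None)
    for p in participants:
        p.end_savepoint(None)


def demo(store_first):
    store = Store()
    manager = BufferingManager(store)
    manager.write("obj1")
    order = [store, manager] if store_first else [manager, store]
    run_savepoint(order)
    return store.snapshot
```

With the manager notified first, the snapshot contains "obj1"; with
the store notified first, the snapshot is empty. In this sketch the
problem goes away if the store defers its work to end_savepoint, which
seems to be what the proposal intends for database connections.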

Is the transaction info cleared at transaction boundaries?

Jim

Phillip J. Eby wrote:
> Since it's been pretty quiet here, apart from the BOF discussion, I 
> thought I'd draft up a transaction/participant API to stir up some 
> debate.  I did a little research on JTA and related protocols in Java, 
> and found that JTA is actually pretty pitiful in comparison to the rich 
> model already offered by ZODB.  Also, the DBAPI doesn't really offer a 
> way to get at multi-phase commit protocols, but perhaps if we get a nice 
> Python transaction API together, we can encourage such access be made 
> available in DBAPI 3.0.
> 
> My goals for the straw man were to support the functionality of ZODB 
> transactions, but without any ZODB-specific baggage in the API, to 
> decouple the management of dirty objects, writes, etc. from the 
> co-ordination of the transaction itself, and to support a richer model 
> of what a "transaction participant" is, including the ability to nest or 
> chain storage mechanisms together to an arbitrary depth.  Backward 
> compatibility in the API or the transaction coordination messages was 
> explicitly not a goal.
> 
> Anyway, here it is, for all of you to pick apart or set fire to, like 
> the straw man it is.  I ask only that you read the whole thing before 
> you light up your flamethrowers.  :)
> 
> 
> """'Straw Man' Transaction Interfaces"""
> 
> class Transaction:
> 
>     """Manages transaction lifecycle, participants, and metadata.
> 
>     There is no predefined number of transactions that may exist, or
>     what they are associated with.  Depending on the application
>     model, there may be one per application, one per transaction, one
>     per incoming connection (in server applications), or some other
>     number.  The transaction package should, however, offer an API for
>     managing per-thread (or per-app, if threads aren't being used)
>     transactions, since this will probably be the most common usage
>     scenario."""
> 
>     # The basic transaction lifecycle
> 
>     def begin(self, **info):
>         """Begin a transaction.  Raise TransactionInProgress if
>         already begun.  Any keyword arguments are passed on to the
>         setInfo() method.  (See below.)"""
> 
>     def commit(self):
>         """Commit the transaction, or raise NoTransaction if not in
>         progress."""
> 
>     def abort(self):
>         """Abort the transaction, or raise NoTransaction if not in
>         progress."""
> 
> 
>     # Managing participants
> 
>     def subscribe(self, participant):
>         """Add 'participant' to the set of objects that will receive
>         transaction messages.  Note that no particular ordering of
>         participants should be assumed.  If the transaction is already
>         active, 'participant' will receive a 'begin_txn()' message. If
>         a commit or savepoint is in progress, 'participant' may also
>         receive other messages to "catch it up" to the other
>         participants.  However, if the commit or savepoint has already
>         progressed too far for the new participant to join in, a
>         TransactionInProgress error will be raised.
> 
>         Note: this is not ZODB!  Participants remain subscribed until
>         they unsubscribe, or until the transaction object is
>         de-allocated!"""
> 
>     def unsubscribe(self, participant):
>         """Remove 'participant' from the set of objects that will
>         receive transaction messages.  It can only be called when a
>         transaction is not in progress, or in response to
>         begin/commit/abort_txn() messages received by the
>         unsubscribing participant.  Otherwise, TransactionInProgress
>         will be raised."""
> 
> 
>     # Getting/setting information about a transaction
> 
>     def isActive(self):
>         """Return True if transaction is in progress."""
> 
>     def getTimestamp(self):
>         """Return the time that the transaction began, in time.time()
>         format, or None if no transaction in progress."""
> 
>     def setInfo(self, **args):
>         """Update the transaction's metadata dictionary with the
>         supplied keyword arguments.  This can be used to record
>         information such as a description of the transaction, the user
>         who performed it, etc. Note that the transaction itself does
>         nothing with this information. Transaction participants will
>         need to retrieve the information with 'getInfo()' and record
>         it at the appropriate point during the transaction."""
> 
>     def getInfo(self):
>         """Return a copy of the transaction's metadata dictionary"""
> 
> 
>     # "Sub-transaction" support
> 
>     def savepoint(self):
>         """Request a write to stable storage, and mark a savepoint for
>         possible partial rollback via 'revert()'.  This will most
>         often be used simply to suggest a good time for in-memory data
>         to be written out.  But it can also be used in conjunction
>         with revert() to provide a single-level 'nested transaction',
>         if all participants support reverting to a savepoint."""
> 
>     def revert(self):
>         """Request a rollback to the last savepoint.  If no savepoint
>         has occurred in this transaction, this is implemented via an
>         abort(), followed by a begin(), keeping the same metadata.  If
>         a savepoint has occurred, this will raise
>         CannotRevertException unless all transaction participants
>         support reverting to a savepoint."""
> 
> 
> 
> class Participant:
>     """Participant in a transaction; may be a resource manager, a
>     transactional cache, or just a logging/monitoring object.
> 
>     Event sequence is approximately as follows:
> 
>         begin_txn
>         ( ( begin_savepoint end_savepoint ) | revert ) *
>         ( begin_commit vote_commit commit_txn ) | abort_txn
> 
>     In other words, every transaction begins with begin_txn, and ends
>     with either commit_txn or abort_txn.  A commit_txn will always be
>     preceded by begin_commit and vote_commit.  An abort_txn may occur
>     at *any* point following begin_txn, and aborts the transaction.
>     begin/end_savepoint pairs and revert() messages may occur any time
>     between begin_txn and begin_commit, as long as abort_txn hasn't
>     happened.
> 
>     Generally speaking, participants fall into a few broad categories:
> 
>     * Database connections
> 
>     * Resource managers that write data to another participant, e.g. a
>       storage manager writing to a database connection
> 
>     * Resource managers that manage their own storage transactions,
>       e.g. ZODB Database/Storage objects, a filesystem-based queue, etc.
> 
>     * Objects which don't manage any transactional resources, but need to
>       know what's happening with a transaction, in order to log it.
> 
>     Each kind of participant will typically use different messages to
>     achieve their goals.  Resource managers that use other
>     participants for storage, for example, won't care much about
>     begin_txn() and vote_commit(), while a resource manager that
>     manages direct storage will care about vote_commit() very deeply!
> 
>     Resource managers that use other participants for storage, but
>     buffer writes to the other participant, will need to pay close
>     attention to the begin_savepoint() and begin_commit() messages.
>     Specifically, they must flush all pending writes to the
>     participant that handles their storage, and enter a
>     "write-through" mode, where any further writes are flushed
>     immediately to the underlying participant.  This is to ensure that
>     all writes are written to the "root participant" for those writes,
>     by the time end_savepoint() or vote_commit() is issued.
> 
>     By following this algorithm, any number of participants may be
>     chained together, such as a persistence manager that writes to an
>     XML document, which is persisted in a database table, which is
>     persisted in a disk file.  The persistence manager, the XML
>     document, the database table, and the disk file would all be
>     participants, but only the disk file would actually use
>     vote_commit() and commit_txn() to handle a commit.  All of the
>     other participants would flush pending updates and enter
>     write-through mode at the begin_commit() message, guaranteeing that
>     the disk file participant would know about all the updates by the
>     time vote_commit() was issued, regardless of the order in which the
>     participants received the messages."""
> 
>     def begin_txn(self, txn):
>         """Transaction is beginning; nothing special to be done in
>         most cases. A transactional cache might use this message to
>         reset itself.  A database connection might issue BEGIN TRAN
>         here."""
> 
>     def begin_savepoint(self, txn):
>         """Savepoint is beginning; flush dirty objects and enter
>         write-through mode, if applicable.  Note: this is not ZODB!
>         You will not get savepoint messages before a regular commit,
>         just because another savepoint has already occurred!"""
> 
>     def end_savepoint(self, txn):
>         """Savepoint is finished, it's safe to return to buffering
>         writes; a database connection would probably issue a
>         savepoint/checkpoint command here."""
> 
>     def revert(self, txn):
>         """Roll back to last savepoint, or raise
>         CannotRevertException.  Database connections whose underlying
>         DB doesn't support savepoints should definitely raise
>         CannotRevertException.  Resource managers that write data to other
>         participants should simply roll back state for all objects
>         changed since the last savepoint, whether written through to
>         the underlying storage or not.  Transactional caches may want
>         to reset on this message, also, depending on their precise
>         semantics. Note: this is not ZODB!  You will not get a
>         revert() before an abort_txn(), just because a savepoint has
>         occurred during the transaction!"""
> 
>     def begin_commit(self, txn):
>         """Transaction commit is beginning; flush dirty objects and
>         enter write-through mode, if applicable.  DB connections will
>         probably do nothing here.  Note: participants *must* continue
>         to accept writes until vote_commit() occurs, and *must*
>         accept repeated writes of the same objects!"""
> 
>     def vote_commit(self, txn):
>         """Raise an exception if commit isn't possible.  This will
>         mostly be used by resource managers that handle their own
>         storage, or the few DB connections that are capable of
>         multi-phase commit."""
> 
>     def commit_txn(self, txn):
>         """This message follows vote_commit, if no participants vetoed
>         the commit.  DB connections will probably issue COMMIT TRAN
>         here. Transactional caches might use this message to reset
>         themselves."""
> 
>     def abort_txn(self, txn):
>         """This message can be received at any time, and means the
>         entire transaction must be rolled back.  Transactional caches
>         might use this message to reset themselves."""
> 
> 
> 
> 
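
For reference, the commit lifecycle in the quoted proposal could be
sketched roughly as below. This is only one possible reading, not the
proposed implementation: the method and exception names come from the
proposal, but the internals (_send, _finish, the Recorder test
participant) are hypothetical, savepoints are omitted, and clearing
the metadata at transaction boundaries is an assumption, since that is
exactly the open question above.

```python
class TransactionInProgress(Exception):
    pass


class NoTransaction(Exception):
    pass


class Transaction:
    def __init__(self):
        self._participants = []
        self._active = False
        self._info = {}

    def subscribe(self, participant):
        self._participants.append(participant)
        if self._active:
            participant.begin_txn(self)

    def isActive(self):
        return self._active

    def setInfo(self, **args):
        self._info.update(args)

    def getInfo(self):
        return dict(self._info)

    def begin(self, **info):
        if self._active:
            raise TransactionInProgress()
        self._active = True
        self.setInfo(**info)
        self._send("begin_txn")

    def commit(self):
        if not self._active:
            raise NoTransaction()
        try:
            self._send("begin_commit")
            self._send("vote_commit")
        except Exception:
            # Any veto (or flush failure) aborts the whole transaction.
            self.abort()
            raise
        self._send("commit_txn")
        self._finish()

    def abort(self):
        if not self._active:
            raise NoTransaction()
        self._send("abort_txn")
        self._finish()

    def _send(self, message):
        # Hypothetical helper: deliver one message to every participant.
        for p in list(self._participants):
            getattr(p, message)(self)

    def _finish(self):
        self._active = False
        self._info = {}  # ASSUMPTION: metadata is cleared at boundaries


class Recorder:
    """Hypothetical logging participant that can veto in vote_commit."""

    def __init__(self, veto=False):
        self.veto = veto
        self.events = []

    def begin_txn(self, txn):
        self.events.append("begin_txn")

    def begin_commit(self, txn):
        self.events.append("begin_commit")

    def vote_commit(self, txn):
        self.events.append("vote_commit")
        if self.veto:
            raise RuntimeError("cannot commit")

    def commit_txn(self, txn):
        self.events.append("commit_txn")

    def abort_txn(self, txn):
        self.events.append("abort_txn")
```

A Recorder subscribed before begin() sees begin_txn, begin_commit,
vote_commit, commit_txn on a successful commit, and abort_txn in place
of commit_txn when it vetoes.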



-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org