[Persistence-sig]
Re: ACID, savepoints, and exceptions (was re: "Straw Man" transaction API)
Phillip J. Eby
pje@telecommunity.com
Wed, 21 Aug 2002 19:02:48 -0400
At 12:32 AM 08/20/2002 -0400, Jeremy Hylton wrote:
> >>>>> "PJE" == Phillip J Eby <pje@telecommunity.com> writes:
> PJE> In my API I've standardized on a 'CannotRevertException' when
> PJE> rollback to a savepoint is not possible, and added a
> PJE> 'NullSavepoint' object which can be returned by an object that
> PJE> has nothing to do on rollback.
>
>NullSavepoint is just an implementation convenience, right?
Yep.
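For illustration, here is a minimal sketch of how these two pieces might look. 'CannotRevertException' and 'NullSavepoint' are the names mentioned above; the rollback() method name and the IrreversibleSavepoint and ReadOnlyManager classes are assumptions made up for this example, not the actual PEAK API.

```python
class CannotRevertException(Exception):
    """Raised when rollback to a savepoint is not possible."""


class _NullSavepoint:
    """Savepoint for a participant that has nothing to undo."""

    def rollback(self):
        # Nothing changed, so rolling back is a no-op.
        pass


# A single shared instance is enough, since it carries no state.
NullSavepoint = _NullSavepoint()


class IrreversibleSavepoint:
    """Hypothetical savepoint for a participant that cannot undo its work."""

    def rollback(self):
        raise CannotRevertException("this participant cannot roll back")


class ReadOnlyManager:
    """Hypothetical participant that never modifies anything."""

    def savepoint(self):
        return NullSavepoint
```

A participant with nothing to do simply returns the shared NullSavepoint, so callers can invoke rollback() uniformly without special-casing it.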
> PJE> An open issue that needs to be addressed, however, is the
> PJE> question of rolling back more than once to the same savepoint.
> PJE> In some ways, it's a very handy capability, but I'm not sure
> PJE> which databases support this.
>
>Let me ask the question the other way: Of the databases that support
>savepoints, which ones don't support this?
An interesting question. The reason I'm iffy about it is that the ones I
looked at (Sybase, Oracle, and SleepyCat/BerkeleyDB) weren't very precise
in their docs, at least the docs I looked at. They simply didn't mention
what happens to a savepoint once you roll back to it. SleepyCat offers
nested transactions, which I *believe* are terminated upon rollback, just
like top-level transactions. So anything implemented on a SleepyCat
back-end might need to work around this issue.
> PJE> I'm therefore inclined to say we
> PJE> should explicitly say that a savepoint can be rolled back at
> PJE> most once (since some savepoints may not be able to be rolled
> PJE> back).
>
>I want savepoints that can be returned to multiple times. If a
>database supports savepoints at all, I don't see why it wouldn't
>support multiple rollbacks. (If it didn't, an adapter could
>just call savepoint() as part of finishing each rollback().) Multiple
>rollbacks is necessary to support nested transactions.
I don't think that rollback to the *same* savepoint is necessary, but I
suppose the point is moot, since even a DB that didn't allow multiple
rollbacks would logically support creating a second savepoint at the
location you got to after rolling back the first. It's a little more work
to implement in that case, but I think I agree with your logic.
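A rough sketch of that workaround, assuming a back-end whose savepoints can be rolled back only once. ToyTransaction, OneShotSavepoint, and ReusableSavepoint are hypothetical names invented for this example; the only idea taken from the discussion is that the adapter calls savepoint() again as part of finishing each rollback().

```python
class OneShotSavepoint:
    """Hypothetical back-end savepoint that can be rolled back only once."""

    def __init__(self, txn):
        self.txn = txn
        self.snapshot = list(txn.data)
        self.used = False

    def rollback(self):
        if self.used:
            raise RuntimeError("savepoint already rolled back")
        self.used = True
        self.txn.data[:] = self.snapshot


class ToyTransaction:
    """Hypothetical transaction whose entire state is one list."""

    def __init__(self):
        self.data = []

    def savepoint(self):
        return OneShotSavepoint(self)


class ReusableSavepoint:
    """Adapter that re-creates the one-shot savepoint after each rollback."""

    def __init__(self, txn):
        self.txn = txn
        self.current = txn.savepoint()

    def rollback(self):
        self.current.rollback()
        # The state is back at the original location, so a fresh
        # savepoint taken now marks the same point.
        self.current = self.txn.savepoint()


txn = ToyTransaction()
txn.data.append("a")
sp = ReusableSavepoint(txn)
txn.data.append("b")
sp.rollback()          # first rollback to the same point
txn.data.append("c")
sp.rollback()          # second rollback works as well
```

The extra savepoint() call per rollback is the "little more work" in question; whether that cost is acceptable depends on how expensive savepoints are on the back-end.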
But... there is a difference in implementation burden that applies
here. How many applications will use savepoints as part of their natural
flow, and is it too much to ask to have them do:
    while 1:
        sp = txn.savepoint()
        try:
            # do something that might fail...
        except:
            sp.rollback()
            continue
The only difference here, as far as I can see, is that the savepoint() call
is in the loop (in my suggested approach) instead of just above and outside
it (as it would be with reusable savepoints).
Perhaps there's something else you're using savepoints for that doesn't
look like this sort of loop, in which case it would be interesting to learn
about that use case.
> PJE> This suggests adding a
> PJE> 'canRollback()' method to the interface, such that a rollback
> PJE> aggregator can check that its aggregated savepoints can
> PJE> actually be rolled back, so that "CannotRevert" errors don't
> PJE> cause the transaction to be hosed.
>
>It's probably good to have some way to query this, although I feel
>like the predicate methods for testing features haven't worked out all
>that well in the ZODB3 storage api. What about that client code has
>access to would support the canRollback() method? It seems like it
>depends on which objects are participating in the transaction.
>
>I tend more towards an ask for forgiveness (AFF) than a look before
>you leap (LBYL). If savepoint() returned None when it wasn't possible
>to rollback, that would be good enough, no? The clients know, for
>their specific transaction, whether rollback is going to work. The
>savepoint() presumably hasn't caused too much extra work in those
>cases.
Okay. So what you're saying is, document that savepoint() returns an
IRollback or None, and None means you can't roll back to the
savepoint. And if any participant returns None for the savepoint() call,
the transaction must return None from its savepoint() call. I'm good with
that; my primary goal here is just to remove the ambiguity of what happens
when something can savepoint() but not rollback().
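To make that rule concrete, here is a small sketch under assumptions. The participant classes (ListManager, WriteOnlyLog), the AggregateRollback helper, and the Transaction class are hypothetical stand-ins written for this example; only the "IRollback or None" contract comes from the discussion above.

```python
class ListRollback:
    """Plays the IRollback role: restores a list to its saved contents."""

    def __init__(self, items):
        self.items = items
        self.snapshot = list(items)

    def rollback(self):
        self.items[:] = self.snapshot


class ListManager:
    """Hypothetical participant that can roll back."""

    def __init__(self):
        self.items = []

    def savepoint(self):
        return ListRollback(self.items)


class WriteOnlyLog:
    """Hypothetical participant that cannot roll back."""

    def savepoint(self):
        return None


class AggregateRollback:
    """Rolls back every participant's savepoint, newest first."""

    def __init__(self, rollbacks):
        self.rollbacks = rollbacks

    def rollback(self):
        for r in reversed(self.rollbacks):
            r.rollback()


class Transaction:
    def __init__(self, participants):
        self.participants = participants

    def savepoint(self):
        # Ask every participant, then give up if any of them said None.
        rollbacks = [p.savepoint() for p in self.participants]
        if any(r is None for r in rollbacks):
            return None
        return AggregateRollback(rollbacks)


dm = ListManager()
dm.items.append("kept")
sp = Transaction([dm]).savepoint()
dm.items.append("discarded")
if sp is not None:
    sp.rollback()

# One participant that cannot roll back makes the whole savepoint None.
no_sp = Transaction([dm, WriteOnlyLog()]).savepoint()
```

The client checks the return value once, right after savepoint(), instead of discovering at rollback time that the transaction cannot be reverted.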
> PJE> Another issue here is clean aborts. If an error is raised by a
> PJE> data manager during abort, what should the semantics be? Older
> PJE> ZODB transaction classes wrap every data manager abort call in
> PJE> a try-except that ensures that *all* the abort methods get
> PJE> called, even if several of them raise errors. The new ZODB4
> PJE> transaction API doesn't do this, and thus can fail to
> PJE> completely roll back a transaction.
>
>I tried to do as little as possible within the commit() implementation
>to deal with errors. I figured if an error occurs, the client had
>better abort the transaction explicitly. The documentation for ZODB3
>said that clients needed to do this, but the implementation didn't
>work that way.
Er, the paragraph I wrote above is about the abort() method; the word
"commit" isn't even in the paragraph. :) I'm fine with the idea of
requiring an explicit abort() by the application upon exception during
commit(). It's the fact that ZODB4 doesn't trap errors during *abort()*
that's an issue for me, relative to older ZODB versions.
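The older behavior can be sketched roughly as follows. This is an illustration under assumptions, not the actual ZODB code: abort_all, GoodManager, and FailingManager are hypothetical names, and collecting the errors into a list for the caller is just one possible policy.

```python
def abort_all(data_managers):
    """Call abort() on every data manager, even if some of them raise."""
    errors = []
    for dm in data_managers:
        try:
            dm.abort()
        except Exception as exc:
            # Remember the failure, but keep aborting the rest.
            errors.append((dm, exc))
    return errors


class GoodManager:
    """Hypothetical data manager whose abort() succeeds."""

    def __init__(self):
        self.aborted = False

    def abort(self):
        self.aborted = True


class FailingManager:
    """Hypothetical data manager whose abort() raises."""

    def abort(self):
        raise RuntimeError("abort failed")


managers = [GoodManager(), FailingManager(), GoodManager()]
errors = abort_all(managers)
```

Here the third manager is still aborted even though the second one raised, which is the property that gets lost when abort calls are not individually wrapped.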
When I get back from the Enterprise Architecture summit, I plan to redo
some things in my own "straw man" transactions for PEAK. I realized on the
trip up here that I haven't really thought through some of the
ramifications of Shane's "multi-pass commit" counter-proposal to my
"write-through cascade" architecture. For example, durable subscriptions
make less sense in the multi-pass commit model, because there are more
objects to call, more times, up to O(n^2) in the degenerate case, for
fairly large "n" (I expect to have dozens of data managers per app,
although relatively few will have active involvement in a given
transaction). I also need to think through how the re-pass protocol will
work, given the absence of durable subscriptions.
I have some hope that these re-thinks will make the API leaner and meaner
than I currently have it, while retaining "Zopeward
compatibility". Ideally, we should be able to each present our somewhat
different transaction models to the SIG, as a jumping-off point for future
discussion.
I have lowered my expectations somewhat, however, with respect to the SIG's
goal of a transaction API. Previously I hoped to use the to-be-decided API
as PEAK's core transaction API, but now I'm aspiring merely to have in PEAK
an API that can be adapted to that of the SIG. Or, if I turn out to be
really lucky, the PEAK API may merely end up being a slight superset
relative to the SIG API. Unfortunately, I have too much code in too many
projects which need the PEAK transaction API to exist already, and so I
need to move forward with *something*, even if I end up having to do some
refactoring later.
Luckily, however, my first draft at an actual PEAK implementation, both of
a standalone transaction service and as a transaction service layered over
the ZODB4 transaction API, verified for me that it's possible to do this
kind of layering, as long as the underlying transaction API is at least as
rich as that of ZODB4. And I'm guessing the SIG isn't going to endorse any
transaction model that isn't at least that rich. :)