[Persistence-sig] A simple Observation API

Tue, 30 Jul 2002 13:58:32 -0400

At 08:40 AM 7/30/02 -0400, Jeremy Hylton wrote:
>>>>>> "PJE" == Phillip J Eby <pje@telecommunity.com> writes:
>
>  PJE> Well, the example implementation I wrote took care of all of
>  PJE> that, quite elegantly I thought.  But for my purposes, it's
>  PJE> sufficient as long as _p_changed is set after the last
>  PJE> modification that occurs.  It's okay if it's also set after
>  PJE> previous modifications.  It just must be set after the last
>  PJE> modification, regardless of how many other times it's set.
>
>  PJE> This requirement on my part has strictly to do with data
>  PJE> managers that write to other data managers, in the context of
>  PJE> the transaction API I proposed.
>
>Can you explain how _p_changed is used outside of transaction control?
>I still don't understand how the timing of _p_changed affects things.
>

This has to do with the "write-through mode" phase between
"prepareToCommit()" and "voteOnCommit()" messages (whatever you call them).
During this phase, to support cascaded storage (one data manager writes to
another), all data managers must "write through" any changes that occur
*immediately*.  They can't wait for "prepareToCommit()", because they've
already received it.  Basically, when the object says, "I've changed"
(i.e. via "register" or "notify" or whatever you call it), the data manager
must write it out right then.

But, if the _p_changed flag is set *before* the change, the data manager
has no way to know what the change was and write it.  It can't wait for
"voteOnCommit()", because then the DM it writes to might have already
voted, for example.  It *must* know about the change as soon as the change
has occurred.  Thus, the change message must *follow* a change.  It's okay
if there are multiple change messages, as long as there's at least one
*after* a set of changes.
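
To make the ordering problem concrete, here's a rough Python sketch.  All of
the names in it (Store, WriteThroughDM, Obj, register, set_after, set_before)
are made up for illustration; none of this is the actual ZODB or transaction
API, just the shape of the idea.

```python
class Store:
    """Hypothetical stand-in for the downstream data manager or storage."""

    def __init__(self):
        self.written = []

    def write(self, state):
        self.written.append(state)


class WriteThroughDM:
    """Hypothetical data manager that writes to another one (cascaded)."""

    def __init__(self, downstream):
        self.downstream = downstream
        self.dirty = []
        self.prepared = False

    def register(self, obj):
        """Change notification from a persistent object."""
        if self.prepared:
            # Write-through mode: we already got prepareToCommit(), so the
            # only chance to pass the change along is right now.
            self.downstream.write(dict(obj.state))
        elif obj not in self.dirty:
            self.dirty.append(obj)

    def prepareToCommit(self):
        # Flush everything queued so far, then switch to write-through.
        for obj in self.dirty:
            self.downstream.write(dict(obj.state))
        self.dirty = []
        self.prepared = True


class Obj:
    """Hypothetical persistent object with two notification orders."""

    def __init__(self, dm):
        self.dm = dm
        self.state = {}

    def set_after(self, key, value):
        self.state[key] = value
        self.dm.register(self)      # notify *after* the change

    def set_before(self, key, value):
        self.dm.register(self)      # notify *before*: the DM sees old state
        self.state[key] = value
```

With set_after(), a change made during write-through mode reaches the
downstream store.  With set_before(), the DM writes the old state and never
hears about the new one, which is exactly the failure I'm worried about.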

Now, you may say that there are other ways to address dependencies between
participants than having "write-through mode" during the prepare->vote
phase.  And you're right.  ZPatterns certainly manages to work around this,
as does Steve Alexander's TransactionAgents.  TransactionAgents, however,
is actually a partial rewrite of the Zope transaction machinery, and there
are some holes in how ZPatterns addresses the issue as well.  (ZPatterns
addresses it by adding more objects to the transaction during the data
managers' "commit()" calls, which are roughly equivalent to the current
"prepare()" message concept.)

We could address this by having transaction participants declare their
dependencies to other participants, and have the transaction do a
topological sort, and send all messages in dependency order.  It could then
be an error to have a circular dependency, and data managers could raise an
error if they received an object change message once they were done with
the prepare() call.  It would make the Transaction API and implementation a
bit more complex, leave data managers about the same in complexity as they
would have been before, and it would mean that persistent objects wouldn't
need to worry about whether _p_changed was flagged before or after a change.
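
A minimal sketch of what that ordering step might look like, in Python.  The
function name message_order and the writes_to mapping are hypothetical, and
I'm assuming "dependency order" means a participant gets each message before
any participant it writes to.  It leans on the standard library's graphlib
module (Python 3.9 and later) for the sort and the cycle detection.

```python
from graphlib import CycleError, TopologicalSorter


def message_order(writes_to):
    """Return participants in the order they should receive messages.

    writes_to maps each participant to the set of participants it writes
    to (a hypothetical declaration made at registration time).  A writer
    is ordered before every participant it writes to.
    """
    sorter = TopologicalSorter()
    for writer, targets in writes_to.items():
        sorter.add(writer)
        for target in targets:
            # The target must come after the writer.
            sorter.add(target, writer)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes forming the detected cycle.
        raise ValueError(
            f"circular dependency among participants: {exc.args[1]}"
        ) from exc
```

For example, given {"index": {"cache"}, "cache": {"sql"}, "sql": set()}, the
result puts "index" before "cache" and "cache" before "sql", while a pair of
participants that write to each other raises ValueError.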

I went the direction I did, however, because it seemed easier to me to
require _p_changed to be set after the change than to make the transaction
manage a dependency graph.  Data managers will still have to keep track of
whether they've received a prepare() message, and do something special with
a change notification during that time, regardless of whether you manage
dependencies or have a "write-through" mode.

But, with explicit dependency management, DMs also have the extra overhead
of declaring their dependencies at registration, and they lose the ability
to "not know" who they depend on.  In other words, some
modularity/information hiding is lost if you can't have the data manager
delegate to functions or objects that know "how" to write the data, without
it having to know as well in order to do the registration.

Plus, had I proposed dependency management, I would now be defending
*that*, and I figured "_p_changed after" would be easier to justify.  :)
Perhaps I should have proposed dependency management instead, so that then
you could have said, "oh but we could solve that more easily if we just
made _p_changed be after instead of before", and then I would have said,
"Oh, of course, that's brilliant".  :)

All joking aside, I'm not married to either approach.  If you have
something that'll do it better than either way, or if I've somehow
overlooked a way in which this is already solved by the new ZODB4 API,
please let me know.