[Python-Dev] [ 654866 ] pickle and cPickle not equivalent
Patrick K. O'Brien
pobrien@orbtech.com
Wed, 18 Dec 2002 13:28:07 -0600
On Wednesday 18 December 2002 12:36 pm, Jim Fulton wrote:
> I think Patrick raises a reasonable issue, although it would require
> slowing things down a bit to fix it.
>
> Patrick, why do you feel you need this for a reliable persistence
> system? ZODB doesn't need this.
I don't necessarily *need* this, but it would be nice to have. Otherwise
I have to remember to always work around it. Here is a quick rundown on
my particular use case:
I'm working on a very simple persistence framework, called PyPerSyst
(http://sourceforge.net/projects/pypersyst), whose goal is to persist
any picklable object graph, support ACID requirements, and recover
safely from crashes. The difference between
PyPerSyst and ZODB is that PyPerSyst keeps everything in RAM (which is
how it keeps things simple).
The basic approach is to start by pickling an object graph (which we
call the System). Changes to the System are then made by sending a
Transaction class instance to the persistence Engine (which acts as a
kind of mediator). The Engine pickles the Transaction to a transaction
log, then executes the transaction against the System. If there is a
crash, the System is recovered from the pickle file and then all
Transactions in the log are loaded and re-executed against the System.
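The snapshot, log, and replay cycle described above can be sketched roughly as follows. This is only an illustration written against the `pickle` module of current Python 3, not PyPerSyst's actual code: the names `Deposit`, `Engine`, and `recover` are made up for the example, and a plain dict stands in for the System.

```python
import io
import pickle


class Deposit:
    """Hypothetical Transaction: adds an amount to a named account."""

    def __init__(self, account, amount):
        self.account = account
        self.amount = amount

    def execute(self, system):
        system[self.account] = system.get(self.account, 0) + self.amount


class Engine:
    """Hypothetical Engine: snapshots the System, then logs each Transaction."""

    def __init__(self, system, snapshot, log):
        self.system = system
        self.log = log
        pickle.dump(system, snapshot)  # initial snapshot of the System

    def execute(self, transaction):
        pickle.dump(transaction, self.log)  # write to the log first...
        self.log.flush()
        transaction.execute(self.system)    # ...then change the System


def recover(snapshot, log):
    """Rebuild the System from the snapshot, then replay the whole log."""
    snapshot.seek(0)
    system = pickle.load(snapshot)
    log.seek(0)
    while True:
        try:
            transaction = pickle.load(log)
        except EOFError:  # no more pickles in the log
            break
        transaction.execute(system)
    return system
```

With two in-memory files standing in for the snapshot and the log, `recover(snapshot, log)` should return a System equal to the Engine's live one after any sequence of `engine.execute(...)` calls.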
The actual pickling strategy is delegated by the Engine to one of
several Storage classes. One of the storages uses a single file (look
for `singlefile.py`) for the System snapshot as well as all
Transactions. This storage also allows Transactions to contain direct
references to objects in the System.
In order for this to work properly, the object references in the
Transactions must resolve to the same objects in the System when
Transactions are unpickled during recovery. In our first sample app (a
Banking app, no less) this didn't work properly because of the "bug" in
cPickle: recent bank account transactions would effectively disappear,
even though they appeared to be recovered properly.
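To make the requirement concrete (this shows the behavior the storage relies on, not the cPickle "bug" itself), here is a small sketch using the `pickle` module of current Python 3. It assumes the single-file storage writes everything through one Pickler and reads it back through one Unpickler, so that the memo is shared across calls. The function names are hypothetical, and plain dicts stand in for the System and for a Transaction.

```python
import io
import pickle


def roundtrip_shared(system, transaction):
    """Dump both objects with ONE Pickler and load them with ONE Unpickler."""
    buffer = io.BytesIO()
    pickler = pickle.Pickler(buffer)
    pickler.dump(system)       # the snapshot primes the pickler's memo
    pickler.dump(transaction)  # already-seen objects are written as memo references
    buffer.seek(0)
    unpickler = pickle.Unpickler(buffer)
    return unpickler.load(), unpickler.load()


def roundtrip_independent(system, transaction):
    """Dump and load each object separately, so no memo is shared."""
    buffer = io.BytesIO()
    pickle.dump(system, buffer)
    pickle.dump(transaction, buffer)
    buffer.seek(0)
    return pickle.load(buffer), pickle.load(buffer)


system = {"alice": {"balance": 100}}
transaction = {"account": system["alice"], "amount": 50}

shared_system, shared_txn = roundtrip_shared(system, transaction)
copied_system, copied_txn = roundtrip_independent(system, transaction)

# True: the reference resolves to the very object inside the recovered System.
print(shared_txn["account"] is shared_system["alice"])
# False: the Transaction carries its own copy of the account.
print(copied_txn["account"] is copied_system["alice"])
```

In the second case, re-executing the Transaction would update a detached copy of the account rather than the one in the System, which is the kind of silent loss described above.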
Because this is a framework that is supposed to work with any System and
any Transaction defined by the end-user, I can't easily guarantee that
they won't be referencing class instances that cPickle won't want to
memoize. So I either need to disallow this particular storage strategy,
or have it use only pickle (though I'd like the speed of cPickle), or
make sure cPickle works the same as pickle.
Did any of this make sense? Did I answer your question?
[The other thing I don't know is whether this "bug" that we are discussing
is the only example of its kind, or whether there are more to be
discovered. But we can cross that bridge when we get there.]
--
Patrick K. O'Brien
Orbtech http://www.orbtech.com/web/pobrien
-----------------------------------------------
"Your source for Python programming expertise."
-----------------------------------------------