[Python-bugs-list] [ python-Bugs-654866 ] pickle and cPickle not equivalent

noreply@sourceforge.net noreply@sourceforge.net
Tue, 17 Dec 2002 13:53:00 -0800


Bugs item #654866, was opened at 2002-12-16 22:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=654866&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Patrick K. O'Brien (pobrien)
Assigned to: Nobody/Anonymous (nobody)
Summary: pickle and cPickle not equivalent

Initial Comment:
Ignoring cosmetic differences, cPickle is not properly  
pickling references to items in its memo, while pickle  
does the right thing. Attached are two files, created by  
the same process, one using pickle, the other cPickle.  
 
I'll describe the important difference. First we pickle a 
python graph (Bank) that contains references to 
accounts. Then we pickle two transactions 
(DepositOrWithdrawal) that contain references to one of 
the accounts. The pickle version works fine and does a 
get of the account pickled with the Bank: 
 
(ccommands 
DepositOrWithdrawal 
p20 
g2 
Ntp21 
R(dp22 
g6 
F1039911613.244611 
sS'acnt' 
p23 
g13 
sS'amount' 
p24 
I555 
sbp25 
.g0 
(g20 
g2 
Ntp26 
R(dp27 
g6 
F1039911618.384868 
sg23 
g13 
sg24 
I555 
sbp28 
. 
 
But the cPickle version gets it wrong. For the first
transaction it emits a new Account instance whose state is a
get of the dictionary of the account in the Bank. Then, on
pickling the second transaction, it does a get of the
account instance from the first transaction:
 
(ccommands 
DepositOrWithdrawal 
p18 
g3 
NtRp19 
(dp20 
g7 
F1039910911.252528 
sS'acnt' 
p21 
(iaccount 
Account 
p22 
g13 
bsS'amount' 
p23 
I555 
sb.g1 
(g18 
g3 
NtRp24 
(dp25 
g7 
F1039910918.068189 
sg21 
g22 
sg23 
I555 
sb. 
 
Let me know if you need more details than this. 
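
For readers who want to inspect dumps like the two above, the standard
pickletools module can list the opcodes of a pickle. What follows is only
a minimal sketch: the account dictionary and the list holding it twice are
made-up stand-ins, not the Bank or Account classes from the report. In the
protocol 0 text format, PUT (written "p") stores an object in the memo and
GET (written "g") refers back to a memo entry.

```python
import pickle
import pickletools

# Hypothetical data: one dict referenced twice from the same list.
account = {"name": "acnt-1", "balance": 0}
shared = [account, account]

# Protocol 0 is the text protocol used in the dumps quoted above.
data = pickle.dumps(shared, protocol=0)

# Collect opcode names. The second reference to `account` should
# appear as a GET of the memo entry created by an earlier PUT,
# not as a second dict.
opcodes = [op.name for op, arg, pos in pickletools.genops(data)]

# Print a readable, line-by-line disassembly of the pickle.
pickletools.dis(data)
```

If the dict were pickled twice instead of being fetched from the memo, the
opcode list would contain two DICT entries rather than one.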

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-12-17 22:53

Message:
Logged In: YES 
user_id=21627

I have simplified the example even more, see t.py. Here is
the explanation:

cPickle only puts an object into the memo if its reference
count is greater than 1; otherwise there can't be multiple
references to the same object, and thus no backward
references. pickle has no such mechanism: it always holds
references to the objects in its various variables, so it
would never observe an object with a refcount of 1.

Of course, if you make multiple dump calls on the same
pickler object, objects skipped this way won't be shared
between the dumps.

This isn't normally an issue, since, in order to see the
same object again, you'd have to pickle its single
container again, and the container will be in the memo. This
only fails if you
a) pickle the object itself (as you do), or
b) insert the object into a different container after
pickling it, and pickle that container.

It appears that this behaviour has always been in cPickle.c.
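
The sharing that the reporter expects from several dump calls on one
pickler can be checked with an identity test after unpickling. This is a
sketch under stated assumptions, not the attached t.py or pickleError.py:
it is written for Python 3, where cPickle no longer exists as a separate
module, and it uses plain dicts (hypothetical names bank, account, deposit)
in place of the Bank, Account and DepositOrWithdrawal classes.

```python
import io
import pickle


def dump_all(objects):
    """Pickle several objects with one Pickler so they share one memo."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=2)
    for obj in objects:
        pickler.dump(obj)
    return buf.getvalue()


def load_all(data, count):
    """Unpickle `count` objects with one Unpickler (one shared memo)."""
    unpickler = pickle.Unpickler(io.BytesIO(data))
    return [unpickler.load() for _ in range(count)]


# Hypothetical stand-ins for the objects in the report.
account = {"name": "acnt-1", "balance": 0}
bank = {"accounts": [account]}
deposit = {"acnt": account, "amount": 555}

# One pickler, two dumps: the second dump can refer back to the
# account that was memoized while dumping the bank.
bank2, deposit2 = load_all(dump_all([bank, deposit]), 2)
shared = deposit2["acnt"] is bank2["accounts"][0]

# Separate picklers have separate memos, so the copies are unrelated.
bank3 = pickle.loads(pickle.dumps(bank))
deposit3 = pickle.loads(pickle.dumps(deposit))
independent = deposit3["acnt"] is not bank3["accounts"][0]

print("shared across dumps on one pickler:", shared)
print("independent with separate picklers:", independent)
```

A pickler that skipped the memo entry for the account during the first dump
would make the first check fail, which is the symptom described in this
report.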



----------------------------------------------------------------------

Comment By: Patrick K. O'Brien (pobrien)
Date: 2002-12-17 19:22

Message:
Logged In: YES 
user_id=179604

I said the file was attached, but didn't give the file name. Look 
for the pickleError.py file for code that demonstrates the 
problem. 

----------------------------------------------------------------------

Comment By: Patrick K. O'Brien (pobrien)
Date: 2002-12-17 19:20

Message:
Logged In: YES 
user_id=179604

Attached is a greatly reduced example that demonstrates the 
problem. Let me know if you need more than this. 

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-17 02:17

Message:
Logged In: YES 
user_id=6380

Hm. Rather than me trying to check out the code and guessing
what's what, could you perhaps construct a small
self-contained example that demonstrates the problem? I'd
also like to see some code that shows that the unpickled
value is in fact wrong -- cPickle and pickle routinely
create different pickles, but normally that's benign.
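
The point that differing pickles can be benign can be illustrated by
comparing unpickled values rather than pickle bytes. The sketch below is an
analogy only: it compares two pickle protocols on Python 3 rather than
pickle against cPickle, and the value being pickled is made up.

```python
import pickle

# Hypothetical value; any picklable object would do.
value = {"acnt": "acnt-1", "amount": 555, "history": [1, 2, 3]}

text_pickle = pickle.dumps(value, protocol=0)
binary_pickle = pickle.dumps(value, protocol=2)

# The byte strings differ, because the two protocols encode
# the same data in different ways...
bytes_differ = text_pickle != binary_pickle

# ...but both load back to a value equal to the original.
values_equal = (
    pickle.loads(text_pickle) == value
    and pickle.loads(binary_pickle) == value
)

print("bytes differ:", bytes_differ)
print("values equal:", values_equal)
```

Equality alone would not reveal lost sharing between objects; detecting
that needs an identity check with "is" on the unpickled objects.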

----------------------------------------------------------------------

Comment By: Patrick K. O'Brien (pobrien)
Date: 2002-12-17 01:39

Message:
Logged In: YES 
user_id=179604

New vs. old-style classes is the problem. If I change Account 
to be a new-style class the problem goes away. Hope that 
helps. 

----------------------------------------------------------------------

Comment By: Patrick K. O'Brien (pobrien)
Date: 2002-12-17 01:25

Message:
Logged In: YES 
user_id=179604

I forgot to mention that one of the classes that gets pickled is
the Clock class, which is in pypersyst.clock. Clock is a
new-style class, as is transaction.Transaction, the base class
for the transactions that appear in commands.py. Is there a
possible problem with mixing old and new-style classes in the
same pickle file? Then again, pickle does it fine, cPickle does
not. So we're back to where we started.

----------------------------------------------------------------------

Comment By: Patrick K. O'Brien (pobrien)
Date: 2002-12-17 00:13

Message:
Logged In: YES 
user_id=179604

The classes are part of a sample app that someone else wrote. 
Their lack of Python experience shows a bit. The system that 
is doing the pickling was written by me. It is part of the 
PyPerSyst project: 
 
http://sourceforge.net/projects/pypersyst 
 
You can check out the files from CVS. The storage.singlefile 
module is where the pickling takes place. I commented out the 
cPickle import and used pickle instead to find the problem. 
 
The particular classes being pickled are defined in the files in
the sandbox/pobrien/plyonsbank directory: bank.py is the main
app and contains the Bank class, account.py contains the
Account class, commands.py contains a variety of transaction
classes, and console.py is the interface.
 
The pypersyst package needs to be on the Python path, but 
the sandbox stuff doesn't. Hope that helps. 

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-16 23:42

Message:
Logged In: YES 
user_id=6380

I'd need the code for the classes before I can delve into this.

----------------------------------------------------------------------
