[Python-bugs-list] [ python-Bugs-650739 ] Binary pickle format depends on marshal

noreply@sourceforge.net
Mon, 09 Dec 2002 11:02:20 -0800


Bugs item #650739, was opened at 2002-12-09 03:22
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=650739&group_id=5470

Category: Python Library
>Group: Not a Bug
Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: L. Peter Deutsch (lpd)
Assigned to: Tim Peters (tim_one)
Summary: Binary pickle format depends on marshal

Initial Comment:
The documentation of the pickle module (section
3.14.1) says "The pickle serialization format is
guaranteed to be backwards compatible across Python
releases."  While this is easy to verify for the
non-binary format, the binary format calls
marshal.dumps and marshal.loads in quite a few places,
and is thus subject to the statement in the same
paragraph that "The marshal serialization format is not
guaranteed to be portable across Python versions."

It appears that mdumps and mloads (pickle.py's names for
marshal.dumps and marshal.loads) are used only for
converting integers to and from binary format. I
suggest that, instead of invoking them, the
pickle code include its own copies of these two simple
algorithms.
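
As a rough illustration of how small such a replacement could be, here is a minimal sketch using the standard struct module. It assumes the integer encodings involved are a 4-byte little-endian signed value and a 2-byte little-endian unsigned value. The helper names are hypothetical and do not come from pickle.py.

```python
import struct

# Hypothetical helpers; these names are illustrative and not taken from pickle.py.

def pack_int4(n):
    """Encode a signed 32-bit integer as 4 little-endian bytes."""
    return struct.pack("<i", n)

def unpack_int4(data):
    """Decode 4 little-endian bytes into a signed 32-bit integer."""
    return struct.unpack("<i", data)[0]

def unpack_uint2(data):
    """Decode 2 little-endian bytes into an unsigned 16-bit integer."""
    return struct.unpack("<H", data)[0]
```

For values that fit in 32 bits, pack_int4(n) should match marshal.dumps(n)[1:], that is, the marshal output with its one-byte type code removed.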

I haven't looked at cPickle, so I don't know whether it
calls the marshal code or implements the
algorithms itself. If it calls marshal, I suggest changing
it as well.


----------------------------------------------------------------------

Comment By: L. Peter Deutsch (lpd)
Date: 2002-12-09 13:05

Message:
Logged In: YES 
user_id=8861

Yes, code duplication is an evil. So are undocumented
dependencies. In this case, we simply have different
opinions as to which evil is lesser. :-)

I'm willing to let the bug be closed.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-12-09 12:50

Message:
Logged In: YES 
user_id=31435

There isn't a bug here, just your fear of a potential bug 
someday.  The few people who touch marshal are acutely 
aware of the issues here already -- that's why your fears 
haven't materialized in the last decade of living dangerously 
<wink>.

If you want to add comments to the module, you can upload 
a patch to the patch manager here, and I expect someone 
will apply it.  Code duplication is a clear and present evil, and 
would not be accepted.

----------------------------------------------------------------------

Comment By: L. Peter Deutsch (lpd)
Date: 2002-12-09 12:10

Message:
Logged In: YES 
user_id=8861

There is currently no documentation anywhere in the code to
alert the maintainer of the marshal module to the fact that
pickle has a design dependency on it. On the contrary, the
comment in marshal.c says "This is intended for writing and
reading compiled Python code only."

Hidden dependencies of this kind are an invitation to future
obscure problems. If the dependency is not removed, it
should at least be documented. If the code is not changed, I
strongly advocate adding the following comment to marshal.c:
"Note that the pickle module (pickle.py) uses the dumps and
loads functions from this module for writing and reading
integer values. If these functions are ever changed, their old
implementation must be copied to pickle.py."

I still think it would make a lot more sense to simply
duplicate the code. We're talking about 10-20 lines of
Python code here, added to nearly 1,000 lines of code in
pickle.py. And I would note that cPickle.c *already*
duplicates these algorithms rather than using the
implementation in marshal.

Would the Python team be willing to consider making this
change if I wrote and tested the code myself? If so, is
there a procedure I should follow for submitting the code,
or is posting the code as a follow-up comment here sufficient?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-12-09 11:53

Message:
Logged In: YES 
user_id=31435

Closing this as Invalid.  So long as the doc's promises are 
kept, it doesn't matter how the implementation accomplishes 
it.  If the marshal format changes in an incompatible way 
someday, then pickle will have to stop using marshal 
routines.  Changing that beforehand isn't needed or helpful.
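
One way to hold the implementation to the documented promise, whatever it uses internally, is a test that pins the expected bytes. The sketch below is only an assumption about how such a check might look, written against the modern pickle module. The function name is hypothetical, and the byte strings are the protocol 1 encodings of a few integers.

```python
import pickle

# Hypothetical regression check: pinned protocol 1 byte strings for integers.
PINNED = {
    5: b"K\x05.",               # 1-byte unsigned form
    1000: b"M\xe8\x03.",        # 2-byte little-endian unsigned form
    70000: b"Jp\x11\x01\x00.",  # 4-byte little-endian signed form
    -1: b"J\xff\xff\xff\xff.",  # negative values use the 4-byte form
}

def check_pinned_int_pickles():
    """Return True if each value round-trips through its pinned bytes."""
    for value, expected in PINNED.items():
        # The pickler must still emit exactly the pinned bytes...
        assert pickle.dumps(value, protocol=1) == expected
        # ...and the unpickler must still read them back.
        assert pickle.loads(expected) == value
    return True
```

If a future change to the underlying integer encoding altered these bytes, a check like this would fail right away instead of surfacing later as unreadable pickles.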

----------------------------------------------------------------------
