Mixing protocols in pickles?
tim.one at comcast.net
Fri Jan 23 07:06:04 CET 2004
[Erik Max Francis]
> Is there any prohibition against mixing different protocols within the
> same pickle?
No. The unpickling implementation has no idea which protocol is in use.
There's a ton of info about unpickling in 2.3's undocumented new
pickletools.py library module, BTW, which also contains a symbolic pickle
disassembler (which would be especially useful here).
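As a minimal sketch of why the unpickler doesn't care, the following writes
two pickles back to back into one in-memory stream, the first with protocol 0
and the second with protocol 1, loads them in order, and then disassembles the
first one with pickletools.dis. It assumes a current Python 3 rather than 2.3,
and the names tag, payload, and stream are illustrative, not taken from the
original program.

```python
import io
import pickle
import pickletools

tag = "BOTEC-DATA"                          # hypothetical tag string
payload = {"mass": 5.97e24, "moons": [1]}   # hypothetical data

# Write two pickles into the same stream using different protocols.
stream = io.BytesIO()
pickle.dump(tag, stream, protocol=0)
pickle.dump(payload, stream, protocol=1)

# Each load() call reads opcodes until it hits STOP; it is never told
# which protocol the writer used.
stream.seek(0)
loaded_tag = pickle.load(stream)
loaded_payload = pickle.load(stream)

# Symbolic disassembly of the protocol 0 pickle.
listing = io.StringIO()
pickletools.dis(pickle.dumps(tag, protocol=0), out=listing)
print(listing.getvalue())
```

Running the disassembler on the pickle that fails to load is usually the
quickest way to see which opcode, and hence which module import, triggers the
error.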
Try getting gzip out of your test program; pickles are exercised more
heavily than gzip. Try it with both pickle and cPickle; it might matter.
> So I thought it would be clever to write the tag string with
> protocol 0 so it would show up in a file viewer as plain text
> and then write the rest of the data with protocol 1
A protocol 1 string pickle just contains the raw string bytes; that's the
same stuff for printable ASCII as proto 0 produces, and is *more* readable
if you're using "funny characters" (a proto 0 string pickle replaces funny
characters with Python string escape sequences).
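The difference can be seen by comparing the raw pickle bytes. This is a sketch
under the assumption of a current Python 3, where str is Unicode, so the
opcodes differ from the 2.3 string opcodes, but the pattern is the same:
protocol 0 escapes characters it can't write directly, while protocol 1 stores
the encoded bytes as-is. The variable names are illustrative.

```python
import pickle

plain = "hello"
funny = "\u20ac"  # the euro sign, a "funny character" for this purpose

p0_plain = pickle.dumps(plain, protocol=0)
p1_plain = pickle.dumps(plain, protocol=1)
p0_funny = pickle.dumps(funny, protocol=0)
p1_funny = pickle.dumps(funny, protocol=1)

# Printable ASCII appears verbatim under both protocols.
print(p0_plain)
print(p1_plain)

# Protocol 0 writes an escape sequence; protocol 1 writes the raw
# (here UTF-8 encoded) bytes after a length prefix.
print(p0_funny)
print(p1_funny)
```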
> This works fine on Unix, but on Windows it generates the (utterly
> puzzling) error:
>   File "C:\My Documents\botec-0.1x1\botec.py", line 1666, in load
>     system = pickle.load(inputFile)
> ImportError: No Module named copy_reg
That's especially puzzling to me because that error message isn't one Python
produces: all messages of this form produced by the core spell it "module"
instead of "Module".
> So I guess there are a few questions: Why is the error being
> generated so obscure and seemingly incorrect?
Not enough info to say.
> Is it really the case that mixing multiple protocols within the same
> pickle isn't allowed,
No, the unpickler doesn't care (read pickletools.py for a full explanation
of why it doesn't care).
> or is this truly a bug
Somewhere, yes, but not necessarily in pickle.
> (or, say, a Windows-specific problem because protocol 0 requires that
> the pickle file be in text mode?)?
No. That protocol 0 used to be called "text mode" was a horrid mistake.
Pickle files should always be opened (for both reading and writing) in
binary mode, regardless of the pickle protocol. It so happens you can *get
away* with using text-mode files on Windows for (only) protocol 0 pickles,
but that's more by accident than design.
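A minimal sketch of the binary-mode advice, assuming a current Python 3 and a
throwaway temporary file (the path and the pickled values are illustrative):
the file is opened with "wb" and "rb" even though the first pickle uses
protocol 0.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), "example.pkl")

# Always open in binary mode for writing, whatever the protocol.
with open(path, "wb") as f:
    pickle.dump("tag", f, protocol=0)
    pickle.dump([1, 2, 3], f, protocol=1)

# And in binary mode for reading.
with open(path, "rb") as f:
    first = pickle.load(f)
    second = pickle.load(f)

print(first, second)
```

Binary mode keeps Windows from translating line-end bytes inside the pickle,
which is what corrupts binary-protocol data in a text-mode file.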