[Python-Dev] Pickle implementation questions

Tim Peters tim.peters at gmail.com
Fri Jun 30 22:24:10 CEST 2006


[Tim Peters]
>> I hope you've read PEP 307:

[Bruce Christensen]
> I have. Thanks to you and Guido for writing it! It's been a huge help.

You're welcome -- although we were paid for that, so thanks aren't needed ;-)

>> The implementation is more like:
>> [snip]

> Thanks! That helps a lot. PEP 307 and the pickle module docs describe the end
> result pretty well, but they don't always make it clear where things are
> implemented.

Well, "where" and "how" are implementation details.  Alas, those
aren't always clearly separated from the semantics (and since Guido &
I both like operational definitions, stuff we write is especially
prone to muddiness on such points).  The layers of backward
compatibility for now-out-of-favor gimmicks don't help either -- this
is akin to reading the Windows API docs, finding around six functions
that _sound_ relevant, and then painfully discovering, one at a time,
that none of them actually does what you hope :-)

> I'm trying to make sure that I'm getting the right interaction between
> object.__reduce(_ex)__, pickle, and copy_reg.

Alas, I'm sure I don't remember sufficient details anymore myself.

> One (hopefully) last question: is object.__reduce(_ex)__ really implemented in
> object?

Yes, although I think you're overlooking this bit of the "acts as if"
pseudo-implementation from my last note:

       elif proto < 2:
           return copy_reg._reduce_ex(self, proto)

That is, the `object` implementation left the proto < 2 cases coded in
Python.  You won't get to the (hoped to be) common path:

       else:
           # about 130 lines of C code exploiting proto 2

unless you ask for proto 2.
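
One way to see the split from the interpreter is to call __reduce_ex__
directly and look at which copy_reg helper the reduce value names.  This
is only a sketch: it assumes CPython, it relies on the private names
copy_reg._reconstructor and copy_reg.__newobj__ (implementation details,
not a documented API), and the import fallback to `copyreg` is there only
because the module was renamed in Python 3.

```python
# copy_reg was renamed copyreg in Python 3; try both so the sketch runs on either.
try:
    import copy_reg
except ImportError:
    import copyreg as copy_reg

obj = object()

# proto < 2: the reduce value comes from the Python-coded copy_reg._reduce_ex(),
# which names copy_reg._reconstructor as the callable that rebuilds the object.
old_style = obj.__reduce_ex__(0)

# proto 2: the C code path answers directly and names copy_reg.__newobj__ instead.
new_style = obj.__reduce_ex__(2)

print(old_style[0].__name__)   # _reconstructor
print(new_style[0].__name__)   # __newobj__
```

The first element of each tuple is the callable the unpickler will use,
so a different helper showing up for proto 2 is the visible sign that a
different code path produced the reduce value.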

> The tracebacks below would indicate that pickle directly implements the
> behavior that the specs say is implemented in object. However, that could be
> because frames from C code don't show up in tracebacks.

That's right, they don't, and the C `object` code calls back into
copy_reg in proto < 2 cases.

> I'm not familiar enough with CPython to know for sure.
>
> >>> import copy_reg
> >>> def bomb(*args, **kwargs):
> ...     raise Exception('KABOOM! %r %r' % (args, kwargs))
> ...
> >>> copy_reg._reduce_ex = bomb
> >>> import pickle
> >>> pickle.dumps(object())

You're defaulting to protocol 0 there, so, as above, the `object`
implementation actually calls copy_reg._reduce_ex(self, 0) in this
case.  Much the same if you do:

>>> pickle.dumps(object(), 1)

I think it's a misfeature of pickle that it defaults to the oldest
protocol instead of the newest, but not much to be done about that in
Python 2.

Do one of these instead and the traceback will go away:

>>> pickle.dumps(object(), 2)
>>> pickle.dumps(object(), -1)
>>> pickle.dumps(object(), pickle.HIGHEST_PROTOCOL)
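
As a small check that those spellings really select a newer protocol,
the sketch below compares the resulting pickles.  It assumes nothing
beyond the standard pickle module; the variable names are illustrative.

```python
import pickle

obj = object()

# A negative protocol number means "use the highest protocol available",
# so these two calls produce identical pickles.
newest = pickle.dumps(obj, -1)
same = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)

# Pickles written with protocol 2 or later start with the PROTO opcode
# (byte 0x80) followed by the protocol number.
proto2 = pickle.dumps(obj, 2)

print(newest == same)
print(repr(proto2[:2]))
```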

> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "C:\Python24\lib\pickle.py", line 1386, in dumps
>     Pickler(file, protocol, bin).dump(obj)
>   File "C:\Python24\lib\pickle.py", line 231, in dump
>     self.save(obj)
>   File "C:\Python24\lib\pickle.py", line 313, in save
>     rv = reduce(self.proto)
>   File "<stdin>", line 2, in bomb
> Exception: KABOOM! (<object object at 0x01E3C448>, 0) {}

It's _certainly_ an implementation accident that the `object` coding
happens to call back into `copy_reg` here.  There was no intent that
users be able to monkey-patch copy_reg and replace _reduce_ex().  It
was left coded in Python purely as a cost/benefit tradeoff.

> >>> class NewObj(object):
> ...     def __reduce__(self):
> ...             raise Exception("reducing NewObj")

In this case, it doesn't matter at all how `object` implements
__reduce__ or __reduce_ex__, because you're explicitly saying that
NewObj has its own __reduce__ method, and that overrides `object`'s
implementation.  IOW, you're getting exactly what you ask for in this
case, regardless of the pickle protocol specified:

> >>> import pickle
> >>> pickle.dumps(NewObj())

Ask for protocols 1 or 2 here, and you'll get the same traceback.  It
would be a bug if you didn't :-)

> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "C:\Python24\lib\pickle.py", line 1386, in dumps
>     Pickler(file, protocol, bin).dump(obj)
>   File "C:\Python24\lib\pickle.py", line 231, in dump
>     self.save(obj)
>   File "C:\Python24\lib\pickle.py", line 313, in save
>     rv = reduce(self.proto)
>   File "<stdin>", line 3, in __reduce__
> Exception: reducing NewObj
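
To confirm that the override wins under every protocol, a loop along
these lines should do.  It is a sketch rather than a test of any
particular CPython release; the `results` dict is just an illustrative
way to collect the outcome, and sys.exc_info() is used so the same code
runs on old and new interpreters.

```python
import pickle
import sys

class NewObj(object):
    # Overriding __reduce__ takes precedence over object's implementation,
    # so pickle ends up here whichever protocol is requested.
    def __reduce__(self):
        raise Exception("reducing NewObj")

results = {}
for proto in (0, 1, 2):
    try:
        pickle.dumps(NewObj(), proto)
        results[proto] = "pickled"
    except Exception:
        # Record the message of whatever exception escaped from pickle.
        results[proto] = str(sys.exc_info()[1])

print(results)
```

Every protocol should map to the "reducing NewObj" message, since none
of them gets as far as `object`'s own implementation.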

