[New-bugs-announce] [issue26695] pickle and _pickle accelerator have different behavior when unpickling an object with falsy __getstate__ return

Josh Rosenberg report at bugs.python.org
Tue Apr 5 11:01:46 EDT 2016

New submission from Josh Rosenberg:

According to a note on the pickle docs ( https://docs.python.org/3/library/pickle.html#object.__getstate__ ): "If __getstate__() returns a false value, the __setstate__() method will not be called upon unpickling."

The phrasing is a little odd, since according to the __setstate__ docs, classes without __setstate__ get a fallback behavior that just assigns the contents of the pickled state dict to the __dict__ of the object. To me, though, the note means that any falsy value should prevent both __setstate__ and that __setstate__-like fallback from running.

But this is not how it works. Both the C accelerator and Python code treat None specially (they don't pickle state at all if it's None), which prevents __setstate__ or the __setstate__-like fallback from being executed.

But if it's any other falsy value, the behaviors differ, and diverge from the docs. Specifically, on load of a pickle with a non-None falsy state (say, False itself, or 0, or () or []):

Without __setstate__:
Pure Python pickle: Does not execute the fallback code and behaves as expected (it just stored state it will never use), matching the spirit of the docs.
C accelerated _pickle: Fails on anything but the empty dict with "UnpicklingError: state is not a dictionary", violating the spirit of the docs.
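
A minimal sketch of the no-__setstate__ case, assuming CPython 3 with the default protocol. The class NoSetState and the helper load_outcome are made up for illustration; pickle._Unpickler is the private pure-Python unpickler defined in pickle.py, while pickle.Unpickler is the _pickle accelerator when it is available. The script only reports what each loader does, since the exact outcome may depend on the interpreter version.

```python
import io
import pickle


class NoSetState:
    """Hypothetical class: falsy, non-None state and no __setstate__."""

    def __getstate__(self):
        return ()


def load_outcome(unpickler_cls, data):
    """Return 'ok' if loading succeeds, else the exception class name."""
    try:
        unpickler_cls(io.BytesIO(data)).load()
    except Exception as exc:
        return type(exc).__name__
    return "ok"


data = pickle.dumps(NoSetState())
print("pure Python:", load_outcome(pickle._Unpickler, data))
print("accelerated:", load_outcome(pickle.Unpickler, data))
```

Per the report above, the pure Python line should print "ok" and the accelerated line should print "UnpicklingError".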

With __setstate__:
Both versions call __setstate__ even though the documentation explicitly says they will not.
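
The with-__setstate__ case can be checked along these lines. This is a sketch, assuming CPython 3 and the default protocol (protocols 0 and 1 go through a different reduce path that may drop falsy state). Recorder and the roundtrip helpers are hypothetical names, and pickle._Pickler / pickle._Unpickler are the private pure-Python classes from pickle.py.

```python
import io
import pickle


class Recorder:
    """Hypothetical class: logs every state handed to __setstate__."""

    state = None  # value __getstate__ will return
    calls = []    # states received by __setstate__

    def __getstate__(self):
        return Recorder.state

    def __setstate__(self, state):
        Recorder.calls.append(state)


def roundtrip_accelerated(obj):
    # pickle.dumps/loads use the _pickle accelerator when it is available.
    return pickle.loads(pickle.dumps(obj))


def roundtrip_pure_python(obj):
    # Force the pure-Python implementation via the private classes.
    buf = io.BytesIO()
    pickle._Pickler(buf).dump(obj)
    buf.seek(0)
    return pickle._Unpickler(buf).load()


def observed_calls(roundtrip, state):
    """Round-trip a Recorder whose __getstate__ returns `state`."""
    Recorder.state = state
    Recorder.calls = []
    roundtrip(Recorder())
    return list(Recorder.calls)


for state in (None, False, 0, (), []):
    print(repr(state),
          observed_calls(roundtrip_accelerated, state),
          observed_calls(roundtrip_pure_python, state))
```

For None, both lists should come back empty; for every other falsy value, both implementations should show a single __setstate__ call receiving that value.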

If nothing else, the docs should agree with the code, and the C and Python modules should agree with each other on behavior.

I would not be at all surprised if outside code depends on being able to pickle falsy state and have its __setstate__ receive the falsy state (if nothing else, when the state is a container or number, being empty or 0 would be reasonable; failing to call __setstate__ in that case would be surprising). So it's probably not a good idea to make the implementation match the docs.

My proposal would be that at pickle time, if the class lacks __setstate__, treat any falsy return value as None. This means:

1. pickles are smaller (no storing junk that the default __setstate__-like behavior can't use)
2. pickles are valid (no UnpicklingError from the default __setstate__-like behavior)
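
To illustrate the intended effect, here is a user-level approximation of the proposal, not a patch to pickle itself. The mixin NormalizeFalsyState and the classes built on it are hypothetical; the sketch assumes the default __reduce_ex__ returns a tuple whose third item is the state, and it uses pickletools.genops only to check whether a BUILD opcode was written.

```python
import pickle
import pickletools


class NormalizeFalsyState:
    """Hypothetical mixin: drop falsy state when there is no __setstate__."""

    def __reduce_ex__(self, protocol):
        rv = super().__reduce_ex__(protocol)
        has_state = isinstance(rv, tuple) and len(rv) >= 3
        if has_state and not rv[2] and not hasattr(self, "__setstate__"):
            # Treat any falsy state as None so no BUILD opcode is emitted.
            rv = rv[:2] + (None,) + rv[3:]
        return rv


class EmptyState(NormalizeFalsyState):
    """No __setstate__: the falsy state should be dropped."""

    def __getstate__(self):
        return []


class KeepsState(NormalizeFalsyState):
    """Defines __setstate__: the falsy state should still be delivered."""

    def __getstate__(self):
        return 0

    def __setstate__(self, state):
        self.restored = state


def opcode_names(data):
    """Names of the pickle opcodes present in `data`."""
    return {op.name for op, arg, pos in pickletools.genops(data)}


print("BUILD" in opcode_names(pickle.dumps(EmptyState())))
print("BUILD" in opcode_names(pickle.dumps(KeepsState())))
print(pickle.loads(pickle.dumps(KeepsState())).restored)
```

With the state normalized to None, the EmptyState pickle carries no state at all, so the default fallback has nothing to reject on load, while KeepsState still receives its falsy value.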

The docs would also have to change, to indicate that, if defined, __setstate__ will be called even if __getstate__ returned a falsy (but not None) value.

The downside is that the description of what happens is a little complex, since the behavior for non-None falsy values differs depending on the presence of a real __setstate__. The upside is that any code depending on the current behavior of falsy state being passed to __setstate__ keeps working, CPython and other interpreters will match in behavior, and classes without __setstate__ will have smaller pickles.

assignee: docs at python
components: Documentation
messages: 262908
nosy: docs at python, josh.r
priority: normal
severity: normal
status: open
title: pickle and _pickle accelerator have different behavior when unpickling an object with falsy __getstate__ return
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

Python tracker <report at bugs.python.org>
