[Python-Dev] Python 2.5.1c1 pickle problem
Raymond Hettinger
python at rcn.com
Thu Apr 12 09:14:50 CEST 2007
Ralf, your issue is arising because of revision 53655 which fixes SF 1615701.
Subclasses of builtins are pickled using obj.__reduce_ex__() which returns
a tuple with a _reconstructor function and a tuple of arguments to that
function.
That tuple of arguments includes the subclass name, the base class, and a state
which is computed as state=base(obj). So, in your case, state=dict(yourobject).
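To make the shape of that tuple concrete, here is a minimal sketch. It assumes a
current Python 3 interpreter, where the helper module is spelled copyreg (it was
copy_reg in 2.x), and it asks for pickle protocol 1 explicitly, since newer
protocols take a different route. MyDict is a hypothetical stand-in for your
class, not your actual code.

```python
import copyreg


class MyDict(dict):
    """Hypothetical dict subclass standing in for the one being pickled."""


obj = MyDict(a=1)

# Protocols 0 and 1 go through the _reconstructor path described above.
reduced = obj.__reduce_ex__(1)
func, args = reduced[0], reduced[1]
cls, base, state = args

print(func is copyreg._reconstructor)
print(cls, base, state)
print(type(state) is dict)  # state was built as base(obj), i.e. dict(obj)
```

The point to notice is the last element: the state is a plain dict produced by
calling the base class on the instance, so whatever dict(obj) does internally
is what the pickling code sees.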
Formerly, casting a dict subclass to a dict would use a fast internal copying
method which duplicated the hash table directly. The OP for SF 1615701 felt
strongly that this was buggy behavior because it would bypass the custom
__getitem__() method implemented by his subclass. An argument in favor of the
bugfix was that dict(m) or dict.update(m) shouldn't behave differently
depending on whether m was a dict subclass or another mapping-like object. For
the latter, writing dict(m) is semantically equivalent to writing
dict((k, m[k]) for k in m.keys()). This is where your __getitem__() call comes
from.
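The equivalence for mapping-like objects can be sketched as follows. This is
only an illustration, assuming a current Python 3 interpreter; RecordingMapping
is a made-up class that is deliberately not a dict subclass, so it shows the
keys/getitem path only and says nothing about how any particular release treats
dict subclasses.

```python
class RecordingMapping:
    """Hypothetical mapping-like object; it does not inherit from dict."""

    def __init__(self, data):
        self._data = dict(data)
        self.lookups = []  # keys requested through __getitem__

    def keys(self):
        return list(self._data)

    def __getitem__(self, key):
        self.lookups.append(key)
        # Transform the value so it is visible that __getitem__ was used.
        return self._data[key].upper()


m = RecordingMapping({"a": "x", "b": "y"})

via_constructor = dict(m)
lookups_by_constructor = list(m.lookups)

m.lookups.clear()
via_spelled_out = dict((k, m[k]) for k in m.keys())

print(via_constructor == via_spelled_out)
print(sorted(lookups_by_constructor))
```

Both spellings go through keys() and __getitem__(), so the transformed values
show up in the result either way.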
At first, I did not agree with the proposed bugfix because 1) it would
unnecessarily slow down several operations on dict subclasses and make them
less attractive; 2) it violated the Open-Closed Principle, under which the
subclass has no business knowing how the internals of the base class are
implemented (in this case, the subclass should not depend on whether dict(m)
is implemented in terms of keys/getitem, in terms of iteritems, or through
direct access to the hash table); and 3) there were clearer and more reliable
ways to implement the use cases suggested by the OP.
The matter was briefly discussed on SF and python-dev where the OP found
other proponents. It became clear that if left alone, the existing
implementation would continue to defy the expectations of some folks
subclassing builtins. Those expectations arise naturally from a mental model
of builtins behaving just like a fast version of the most natural pure-Python
equivalents.
I'm not sure what your code was doing where the bugfix would cause breakage.
If its __getitem__() override returned a meaningful value for each element
in obj.keys(), then it should have worked fine. Of course, if it was raising
an exception or triggering a side-effect, then one could argue that the bugfix
was working as intended by allowing the subclasser to affect how the base
class goes about its business.
Am leaving this open for others to discuss and decide. The old behavior was
surprising to some, but the revised behavior also appears to have some
unforeseen consequences.
Raymond
P.S. In addition to rev 53655, a number of similar changes were made to sets.