[Python-Dev] compatibility for C-accelerated types

Sat Oct 17 18:20:34 EDT 2015

A recent discussion in a tracker issue [1] brought up the matter of
compatibility between the pure Python implementation of OrderedDict
and the new C implementation.  In working on that port I stuck as
closely as possible to the Python implementation.  This meant some
parts of the code are a bit more complex than they would be otherwise.
(Serhiy has been kind enough to do some cleanup.)

Compatibility was one of the fundamental goals of the porting effort.
Not only does compatibility make sense but it's also specifically
required by PEP 399 [2]:

    Any new accelerated code must act as a drop-in replacement
    as close to the pure Python implementation as reasonable.
    Technical details of the VM providing the accelerated code
    are allowed to differ as necessary, e.g., a class being a type
    when implemented in C.

For the most part I have questions about what is "reasonable",
specifically in relation to OrderedDict.

I've already opened up a separate thread related to my main question:
type(obj) vs. obj.__class__. [3]  In the tracker issue, Serhiy pointed
out:

    There is no a difference. io, pickle, ElementTree, bz2, virtually
    all accelerator classes was created as replacements of pure
    Python implementations. All C implementations use
    Py_TYPE(self) for repr() and pickling. I think this deviation is
    common and acceptable.

In a review comment on the associated patch he said:

    Isn't type(self) is always the same as self.__class__ for pure
    Python class? If right, then this change doesn't have any effect.

To which he later replied:

    It is the same if you assigned the __class__ attribute, but can
    be different if set __class__ in the subclass declaration.

So it isn't clear whether that is a compatibility break or, if it is,
how significant it might be.
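
To make the distinction concrete, here is a minimal sketch in pure
Python.  The class and method names are made up for illustration and
are not taken from the OrderedDict code; the sketch only shows when
type(obj) and obj.__class__ can disagree.

```python
class Plain:
    """Hypothetical stand-in for a pure Python class."""

    def name_via_dunder_class(self):
        # What a pure Python implementation typically does.
        return self.__class__.__name__

    def name_via_type(self):
        # Roughly what Py_TYPE(self) gives a C implementation.
        return type(self).__name__


class Reassigned(Plain):
    pass


class Declared(Plain):
    # __class__ set in the subclass declaration: this shadows the
    # descriptor on object, so attribute lookup returns Plain while
    # the real type stays Declared.
    __class__ = Plain


a = Reassigned()
a.__class__ = Plain      # assignment on the instance changes the real type
b = Declared()

print(a.name_via_dunder_class(), a.name_via_type())   # Plain Plain
print(b.name_via_dunder_class(), b.name_via_type())   # Plain Declared
```

So the two spellings only diverge in the second case, which is the
corner case in question.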

Serhiy also noted that, as of 3.5 [4], you can no longer assign to
obj.__class__ for instances of subclasses of builtin (non-heap) types.
So that is another place where the two OrderedDict implementations
differ.  I expect there are a few others in dark corner cases.

On the tracker he notes another OrderedDict compatibility break:

    Backward compatibility related to __class__ assignment was
    already broken in C implementation. In 3.4 following code
    works:

    >>> from collections import *
    >>> class foo(OrderedDict):
    ...     def bark(self): return "spam"
    ...
    >>> class bar(OrderedDict):
    ...     pass
    ...
    >>> od = bar()
    >>> od.__class__ = foo
    >>> od.bark()
    'spam'

    In 3.5 it doesn't.
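
A small probe along these lines could be used to check a given
interpreter.  This is only a sketch: the helper name is hypothetical,
and I make no claim here about what it returns on any particular
version or implementation.

```python
from collections import OrderedDict


def class_assignment_works(base):
    """Return True if __class__ can be reassigned between two
    sibling subclasses of `base` (hypothetical helper)."""

    class First(base):
        pass

    class Second(base):
        def bark(self):
            return "spam"

    obj = First()
    try:
        obj.__class__ = Second
    except TypeError:
        # The interpreter refused the assignment.
        return False
    return type(obj) is Second and obj.bark() == "spam"


print("object:     ", class_assignment_works(object))
print("OrderedDict:", class_assignment_works(OrderedDict))
```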

As PEP 399 says, we should go as far "as reasonable" in the pursuit of
compatibility.  At the same time, I feel a not insignificant
responsibility for *any* incompatibility that comes from the C
implementation of OrderedDict.  The corner cases impacted by the above
compatibility concerns are borderline enough that I wanted to get some
feedback.  Thanks!

-eric

[1] http://bugs.python.org/issue25410
[2] https://www.python.org/dev/peps/pep-0399/
[3] https://mail.python.org/pipermail/python-dev/2015-October/141953.html
[4] http://bugs.python.org/issue24912