[pypy-dev] wrong precedence of __radd__ vs list __iadd__

Greg Price greg at quora.com
Thu Mar 10 01:13:46 CET 2011


The following program works in CPython, but fails in PyPy:

  class C(object):
      def __radd__(self, other):
          other.append(1)
          return other

  z = []
  z += C()
  print z  # should be [1]

In PyPy, this fails with "TypeError: 'C' object is not iterable".

The issue is that PyPy is allowing the list's inplace_add behavior to
take precedence over the C object's __radd__ method, while CPython
does the reverse.

A similar issue occurs in the following program:

  class C(object):
      def __rmul__(self, other):
          other *= 2
          return other
      def __index__(self):
          return 3

  print [1] * C() # should be [1,1]

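where PyPy instead prints [1,1,1], apparently because the list's own
multiply runs first and uses C's __index__() as the repeat count.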


In Python, if the LHS of a foo-augmented assignment defines an in-place
foo method, then that method is given the first opportunity to perform
the operation, followed by the LHS's foo method and then the RHS's
reverse-foo method.
http://docs.python.org/reference/datamodel.html#object.__iadd__
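
To make that documented order concrete, here is a small illustration
using only user-defined classes. The class names and the 'calls' list
are made up for this example; each method declines by returning
NotImplemented so that the next candidate gets its turn.

```python
calls = []  # records the order in which the methods are consulted


class Left(object):
    def __iadd__(self, other):
        calls.append("Left.__iadd__")
        return NotImplemented  # decline, so the next candidate is tried

    def __add__(self, other):
        calls.append("Left.__add__")
        return NotImplemented  # decline again


class Right(object):
    def __radd__(self, other):
        calls.append("Right.__radd__")
        return "handled by Right.__radd__"


x = Left()
x += Right()
# calls is now ["Left.__iadd__", "Left.__add__", "Right.__radd__"]
# and x is bound to the string returned by Right.__radd__.
```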

But: CPython's 'list' type does not have any numeric methods defined
at the C level, in-place or otherwise. See PyList_Type in
listobject.c, where tp_as_number is null. So an in-place addition
falls through to the RHS's nb_add, if present. For a class with
metaclass 'type' and an __radd__() method, that slot is slot_nb_add()
from typeobject.c, which calls __radd__().

So how does "z += [1,2]" work in CPython at all? Via a further wrinkle.
The meat of the implementation of INPLACE_ADD is PyNumber_InPlaceAdd()
in abstract.c. When all numeric methods fail to handle the addition,
this function falls back to sequence methods, looking for
sq_inplace_concat or sq_concat methods on the LHS. 'list' has these
methods, so its sq_inplace_concat method handles the operation.

Similarly, PyNumber_InPlaceMultiply() tries all numeric methods first,
before falling back to sq_inplace_repeat or sq_repeat.
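
The order described above can be modeled in a few lines of pure
Python. This is only a toy: the 'slots' dicts, ToyList, ToyC and
model_inplace_add() are names invented here to stand in for the C
slot tables, and the model ignores details of the real code such as
the priority given to an RHS whose type is a subclass of the LHS's.
Multiplication would follow the same shape with the repeat slots.

```python
def get_slots(obj):
    # Objects without a toy slot table behave as if every slot were NULL.
    return getattr(type(obj), "slots", {})


def model_inplace_add(v, w):
    vslots, wslots = get_slots(v), get_slots(w)
    # Step 1: numeric slots, in order: LHS in-place, LHS binary, RHS binary.
    for slots, name in ((vslots, "nb_inplace_add"),
                        (vslots, "nb_add"),
                        (wslots, "nb_add")):
        slot = slots.get(name)
        if slot is not None:
            result = slot(v, w)
            if result is not NotImplemented:
                return result
    # Step 2: only now fall back to the LHS's sequence slots.
    for name in ("sq_inplace_concat", "sq_concat"):
        slot = vslots.get(name)
        if slot is not None:
            return slot(v, w)
    raise TypeError("unsupported operand type(s) for +=")


class ToyList(object):
    """Stand-in for 'list': sequence slots only, no numeric slots."""
    def __init__(self, items=()):
        self.items = list(items)


def toylist_inplace_concat(v, w):
    v.items.extend(w)  # iterates over w, so a non-iterable w would fail here
    return v


ToyList.slots = {"sq_inplace_concat": toylist_inplace_concat}


class ToyC(object):
    """Stand-in for the class C above, which defines only __radd__."""


def toyc_nb_add(v, w):
    # Plays the role of slot_nb_add() calling __radd__() on the RHS.
    if isinstance(w, ToyC):
        v.items.append(1)
        return v
    return NotImplemented


ToyC.slots = {"nb_add": toyc_nb_add}

a = model_inplace_add(ToyList(), [1, 2])  # handled by sq_inplace_concat
b = model_inplace_add(ToyList(), ToyC())  # handled by the RHS's nb_add
```

In this model the RHS's numeric slot wins whenever it exists, and the
concatenation only runs when nothing numeric handled the operation.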

In PyPy, by contrast, it doesn't look like there's any logic for
falling back to a concatenate method if numeric methods fail. Instead,
if I'm reading correctly, the inplace_add__List_ANY() method in
pypy.objspace.std.listobject runs *before* any numeric methods on the
RHS are tried.


For the narrow case at hand, a sufficient hack for e.g. addition would
be to teach inplace_add__List_ANY() to look for an __radd__() first.
At a quick grep through CPython, full compatibility by that approach
would require similar hacks for bytes, array.array, collections.deque,
str, unicode, buffer, and bytearray; plus the users of structseq.c,
including type(sys.float_info), pwd.struct_passwd, grp.struct_group,
posix.stat_result, time.struct_time, and several others. Those types
in CPython all have sq_concat and not nb_add or nb_inplace_add, so
they will all permit an RHS's __radd__ to take precedence, but then
fall back on concatenation if no such method exists.

A more comprehensive approach would teach the generic dispatch code
about sequence methods falling after numeric methods in the dispatch
sequence. Then each sequence type would need to identify its sequence
methods for that code.

Perhaps a hybrid approach is best. Assume that no type with a
sq_concat also has a nb_add, and no type with a sq_repeat also has a
nb_multiply. (I believe this is true for all built-in, stdlib, and
pure-Python types.) Then whenever we define a method
{inplace_,}{add,mul}__Foo_ANY, for a sequence type Foo, it's enough
for that method to check for __r{add,mul}__ on the RHS. So we can
write a generic helper function and use that in each such method.
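
A rough pure-Python sketch of what such a helper might look like. This
is not PyPy code: call_reverse_first(), ToySeq and its methods are
hypothetical names, and the sketch has only been thought through for
the addition case with a user-defined class on the RHS.

```python
def call_reverse_first(lhs, rhs, reverse_name, fallback):
    """Give type(rhs)'s reverse method the first chance, else run fallback."""
    reverse = getattr(type(rhs), reverse_name, None)
    if reverse is not None:
        result = reverse(rhs, lhs)
        if result is not NotImplemented:
            return result
    # No usable reverse method on the RHS: do the sequence operation.
    return fallback(lhs, rhs)


class ToySeq(object):
    """Hypothetical sequence type standing in for Foo."""
    def __init__(self, items=()):
        self.items = list(items)

    def _extend(self, other):
        self.items.extend(other)
        return self

    def inplace_add(self, other):
        # Plays the role of inplace_add__Foo_ANY.
        return call_reverse_first(self, other, "__radd__", ToySeq._extend)


class C(object):
    def __radd__(self, other):
        other.items.append(1)
        return other


z = ToySeq()
z = z.inplace_add(C())                     # C.__radd__ takes precedence
y = ToySeq([0]).inplace_add([1, 2])        # list has no __radd__: concatenation
```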

Does that last approach sound reasonable? I'm happy to go and
implement it, but I'm open to other suggestions.


I've posted tests (which fail) at
  https://bitbucket.org/price/pypy-queue/changeset/9dd9c2a5116a

Greg


