[Python-Dev] Python-versus-CPython question for __mul__ dispatch

15 May 2015

Hi all,

While attempting to clean up some of the more squamous aspects of
numpy's operator dispatch code [1][2], I've encountered a situation
where the semantics we want and are using are possible according to
CPython-the-interpreter, but AFAICT ought not to be possible according
to Python-the-language, i.e., it's not clear to me whether it's
possible even in principle to implement an object that works the way
numpy.ndarray does in any other interpreter. Which makes me a bit
nervous, so I wanted to check if there was any ruling on this.

Specifically, the quirk we are relying on is this: in CPython, if you do

  [1, 2] * my_object

then my_object's __rmul__ gets called *before* list.__mul__,
*regardless* of the inheritance relationship between list and
type(my_object). This occurs as a side-effect of the weirdness
involved in having both tp_as_number->nb_multiply and
tp_as_sequence->sq_repeat in the C API -- when evaluating "a * b",
CPython tries a's nb_multiply, then b's nb_multiply, then a's
sq_repeat, then b's sq_repeat. Since list has an sq_repeat but not an
nb_multiply, this means that my_object's nb_multiply gets called
before any list method.
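
To make the lookup order above concrete, here is a rough toy model in
Python. It is only a sketch: ToyList, ToyArray and toy_multiply are
hypothetical names, the "slots" are ordinary class attributes standing
in for the C-level ones, and the subclass-priority rule for nb_multiply
is left out entirely.

```python
import operator


class ToyList:
    """Sequence-like toy type: fills in only sq_repeat, as list does."""

    def __init__(self, items):
        self.items = list(items)

    def sq_repeat(self, count):
        return ToyList(self.items * count)


class ToyArray:
    """Number-like toy type: fills in nb_multiply and also has __index__."""

    def __index__(self):
        return 2

    @staticmethod
    def nb_multiply(a, b):
        return "ToyArray.nb_multiply"


def toy_multiply(a, b):
    """Simplified model of the order described above (not CPython's code)."""
    slot_a = getattr(type(a), "nb_multiply", None)
    slot_b = getattr(type(b), "nb_multiply", None)
    if slot_b is slot_a:
        slot_b = None  # don't try the same slot twice

    # Step 1: a's nb_multiply, then b's nb_multiply.
    for slot in (slot_a, slot_b):
        if slot is not None:
            result = slot(a, b)
            if result is not NotImplemented:
                return result

    # Step 2: a's sq_repeat, then b's sq_repeat; the other operand
    # must be usable as an integer via __index__.
    if hasattr(type(a), "sq_repeat"):
        return a.sq_repeat(operator.index(b))
    if hasattr(type(b), "sq_repeat"):
        return b.sq_repeat(operator.index(a))
    raise TypeError("unsupported operand type(s) for *")


numeric_wins = toy_multiply(ToyList([1, 2]), ToyArray())
plain_repeat = toy_multiply(ToyList([1, 2]), 2)
```

In this model ToyArray's nb_multiply is reached before ToyList's
sq_repeat, even though ToyList is the left operand and ToyArray would
have been a perfectly acceptable repeat count.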

Here's an example demonstrating how weird this is. list.__mul__ wants
an integer, and by "integer" it means "any object with an __index__
method". So here's a class that list is happy to be multiplied by --
according to the ordinary rules for operator dispatch, in the example
below Indexable.__mul__ and __rmul__ shouldn't even get a look-in:

In [3]: class Indexable(object):
   ...:     def __index__(self):
   ...:         return 2
   ...:

In [4]: [1, 2] * Indexable()
Out[4]: [1, 2, 1, 2]

But, if I add an __rmul__ method, then this actually wins:

In [6]: class IndexableWithMul(object):
   ...:     def __index__(self):
   ...:         return 2
   ...:     def __mul__(self, other):
   ...:         return "indexable forward mul"
   ...:     def __rmul__(self, other):
   ...:         return "indexable reverse mul"

In [7]: [1, 2] * IndexableWithMul()
Out[7]: 'indexable reverse mul'

In [8]: IndexableWithMul() * [1, 2]
Out[8]: 'indexable forward mul'

NumPy arrays, of course, correctly define both an __index__ method
(which raises an error on general arrays but coerces to int for arrays
that contain exactly 1 integer) and an nb_multiply slot, which
accepts lists and performs elementwise multiplication:

In [9]: [1, 2] * np.array(2)
Out[9]: array([2, 4])

And that's all great! Just what we want. But the only reason this is
possible, AFAICT, is that CPython 'list' is a weird type with
undocumented behaviour that you can't actually define using pure
Python code.
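
For contrast, here is a minimal sketch of what happens with a
list-like class written in pure Python. PyList is a hypothetical
stand-in, not anything from numpy or the stdlib; its __mul__ accepts
anything with __index__, as list.__mul__ does. Under the ordinary
dispatch rules the left operand's __mul__ runs first and succeeds, so
the right operand's __rmul__ is never consulted:

```python
import operator


class PyList:
    """Hypothetical pure-Python stand-in for list."""

    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, other):
        # Accept any object with __index__, like list.__mul__ does.
        try:
            count = operator.index(other)
        except TypeError:
            return NotImplemented
        return PyList(self.items * count)


class IndexableWithMul:
    def __index__(self):
        return 2

    def __mul__(self, other):
        return "indexable forward mul"

    def __rmul__(self, other):
        return "indexable reverse mul"


# Pure-Python left operand: PyList.__mul__ handles it, __rmul__ is skipped.
pure_python_result = PyList([1, 2]) * IndexableWithMul()

# Built-in list on CPython: __rmul__ wins, as in the session above.
builtin_result = [1, 2] * IndexableWithMul()
```

Here pure_python_result is a PyList holding [1, 2, 1, 2], while on
CPython builtin_result is the string from __rmul__.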

Should I be worried?

-n

[1] https://github.com/numpy/numpy/pull/5864
[2] https://github.com/numpy/numpy/issues/5844

-- 
Nathaniel J. Smith -- http://vorpus.org
