On Thu, 2020-12-03 at 13:10 -0600, Sebastian Berg wrote:
Hi all,
I just sniped myself wondering about how correct dispatching for dunders works for independently derived subclasses.
Sorry, this was likely mostly noise... I had forgotten about the asymmetry between `__add__` and `__radd__` [1]. As the docs clearly explain, this asymmetry resolves the issue as long as all subclasses implement `__add__` (and do not inherit it).

When `__add__` is inherited (or `super()` is used), I think the "strict" approach I mentioned may still have a point (although probably only for `__add__`, not `__radd__`). But inheriting a dunder unmodified is probably not worth thinking too much about.

Cheers,
Sebastian

[1] In my defense, NumPy's protocol, which I was coming from, is "multiple dispatch", so it cannot distinguish between the forward and backward methods.

This is an extreme corner case where two subclasses may not know about each other, and furthermore cannot establish a hierarchy:
    class A(int):
        pass

    class B(int):
        def __add__(self, other):
            return "B"

        def __radd__(self, other):
            return "B"

    print(B() + A())  # prints "B"
    print(A() + B())  # prints 0 (does not dispatch to `B`)
In the above, `A` inherits from `int` which relies on the rule "subclasses before superclasses" to ensure that `B.__add__` is normally called. However, this rule cannot establish a priority between `A` and `B`, and while `A` can decide to do the same as `int`, it cannot be sure what to do with `B`.
The solution, or correct(?) behaviour, is likely also described somewhere in the Python docs; I got it from NumPy [1]:
"The recommendation is that [a dunder-implementation] of a class should generally `return NotImplemented` unless the inputs are instances of the same class or superclasses."
By inheriting from `int`, this is not what we do! The `int` implementation does not defer to `B` even though `B` is not a superclass of `A`.
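To make that concrete, here is a minimal sketch that calls `int.__add__` directly on the two classes from above. The extra class `C` is hypothetical and only there to show the one case where Python does try the reflected method first (the right operand's type is a subclass of the left operand's type and overrides `__radd__`):

```python
class A(int):
    pass


class B(int):
    def __add__(self, other):
        return "B"

    def __radd__(self, other):
        return "B"


class C(A):
    # Hypothetical subclass of A, added only for contrast with B.
    def __radd__(self, other):
        return "C"


# int.__add__ accepts any int instance, so it computes a result for B
# instead of returning NotImplemented.
direct = int.__add__(A(), B())
print(direct is NotImplemented)  # False

# B is not a subclass of A, so the left operand's (inherited) __add__ wins.
print(A() + B())

# C is a subclass of A and overrides __radd__, so it is tried first.
print(A() + C())
```

So the "subclasses before superclasses" rule helps `C`, but it has nothing to say about the siblings `A` and `B`.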
Now, you could fix this by replacing `int` with `strict_int`:
    class strict_int:
        def __add__(self, other):
            if not isinstance(self, type(other)):
                return NotImplemented
            return "int"

        def __radd__(self, other):
            if not isinstance(self, type(other)):
                return NotImplemented
            return "int"
or generally the `not isinstance(self, type(other))` pattern. In that case `B` can choose to support `A`:
    class A(strict_int):
        pass

    class B(strict_int):
        def __add__(self, other):
            return "B"

        def __radd__(self, other):
            return "B"

    # Both print "B", as `B` "supports" `A`
    print(B() + A())
    print(A() + B())
The other side effect is that at least one of the classes has to do this (explicitly support the other) to avoid an error:

    class A(strict_int):
        pass

    class B(strict_int):
        pass

    A() + B()  # raises TypeError
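Putting the cases side by side, as a self-contained sketch: `strict_int` is repeated from above so the snippet runs on its own, while the class `C` and the helper `try_add` are hypothetical names introduced just for this illustration.

```python
class strict_int:
    def __add__(self, other):
        if not isinstance(self, type(other)):
            return NotImplemented
        return "int"

    def __radd__(self, other):
        if not isinstance(self, type(other)):
            return NotImplemented
        return "int"


class A(strict_int):
    pass


class B(strict_int):
    pass


class C(strict_int):
    # C explicitly opts in to handling any other operand.
    def __add__(self, other):
        return "C"

    def __radd__(self, other):
        return "C"


def try_add(x, y):
    """Hypothetical helper: return x + y, or the string "TypeError"."""
    try:
        return x + y
    except TypeError:
        return "TypeError"


print(try_add(A(), B()))           # siblings, neither supports the other
print(try_add(A(), C()))           # A defers, C.__radd__ takes over
print(try_add(C(), A()))           # C.__add__ handles it directly
print(try_add(A(), strict_int()))  # a superclass operand is accepted
print(try_add(strict_int(), A()))  # forward defers, A's inherited __radd__ accepts
```

The first call is the only one that fails; everything else resolves to whichever class is a subclass of the other or has explicitly opted in.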
Now, I doubt Python could change how `int.__add__` defers, since that would change behaviour in a backward-incompatible way. But I am curious whether there is a reason why I do not recall ever reading the recommendation to use the pattern:
    class A:
        def __add__(self, other):
            if not isinstance(self, type(other)):
                return NotImplemented
            return "result"
Rather than:
    class A:
        def __add__(self, other):
            if not isinstance(other, A):
                return NotImplemented
            return "result"
The first one leads to a strict (error unless explicitly handled) behaviour for multiple independently derived subclasses. I admit the second pattern is much simpler to understand, so that might be a reason in itself. But I am curious whether I am missing an important reason why the first pattern is not the recommended one (e.g. in the "Numeric abstract base classes" docs [2]).
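A small sketch of how the two patterns differ for independently derived subclasses. All class names here (`StrictBase`, `LooseBase`, `S1`, `S2`, `L1`, `L2`) and the helper `try_add` are hypothetical and only serve the comparison:

```python
class StrictBase:
    # First pattern: accept `other` only if self is an instance of its type.
    def __add__(self, other):
        if not isinstance(self, type(other)):
            return NotImplemented
        return "result"


class LooseBase:
    # Second pattern: accept any instance of the base class.
    def __add__(self, other):
        if not isinstance(other, LooseBase):
            return NotImplemented
        return "result"


class S1(StrictBase):
    pass


class S2(StrictBase):
    pass


class L1(LooseBase):
    pass


class L2(LooseBase):
    pass


def try_add(x, y):
    """Return x + y, or the string "TypeError" if the addition is rejected."""
    try:
        return x + y
    except TypeError:
        return "TypeError"


print(try_add(L1(), L2()))          # loose: siblings silently get the base result
print(try_add(S1(), S2()))          # strict: siblings are rejected
print(try_add(S1(), StrictBase()))  # strict: subclass + base is accepted
print(try_add(StrictBase(), S1()))  # strict, forward only: base + subclass is rejected
```

With only `__add__` defined, the first pattern also rejects base + subclass; a matching `__radd__`, as in `strict_int`, covers that direction.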
Cheers,
Sebastian
[1] https://numpy.org/neps/nep-0013-ufunc-overrides.html#subclass-hierarchies
[2] https://docs.python.org/3/library/numbers.html?highlight=notimplemented#impl...