Mailman 3 Why do binary arithmetic operators care about differing method implementations but rich comparisons don't? - Python-Dev

Why do binary arithmetic operators care about differing method implementations but rich comparisons don't?

Brett Cannon

Sept. 27, 2020

1:38 p.m.

When you do a binary arithmetic operation, one of the things that dictates whether the left-hand side's __*__ method is called before the right-hand side's __r*__ method is if the left-hand side's __r*__ differs (there's also the fact __r*__ methods are not called if the types are the same). Presumably this is because you only care about giving precedence to the right-hand side when it would actually matter due to a difference in implementation (with the assumption that there isn't a specific need to get the right-hand side special dispensation to participate in the operation).

But with rich comparisons there doesn't seem to be an equivalent check for a difference in method implementation. Why is that? Is it because we don't want to assume that, if someone bothered to implement both __gt__ and __lt__, they are necessarily the inverse of each other the way __add__ and __radd__ are?
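The same-type rule mentioned in the question can be observed from pure Python. The sketch below is only an illustration: the `Money` class and the `record` helper are made up for this example and do not come from the thread. Every method returns `NotImplemented` so that the interpreter keeps looking for another candidate, and the log shows which methods it actually tried.

```python
calls = []


class Money:
    """Hypothetical class whose operator methods all decline to answer."""

    def __sub__(self, other):
        calls.append("__sub__")
        return NotImplemented

    def __rsub__(self, other):
        calls.append("__rsub__")
        return NotImplemented

    def __lt__(self, other):
        calls.append("__lt__")
        return NotImplemented

    def __gt__(self, other):
        calls.append("__gt__")
        return NotImplemented


def record(operation):
    """Run `operation`, swallow the expected TypeError, return the call log."""
    calls.clear()
    try:
        operation()
    except TypeError:
        pass
    return list(calls)


# Both operands have the same type in each case.
same_type_sub = record(lambda: Money() - Money())
same_type_lt = record(lambda: Money() < Money())

print(same_type_sub)  # ['__sub__']
print(same_type_lt)   # ['__lt__', '__gt__']
```

For subtraction the reflected method is never tried when both operands share a type, while the comparison still falls back to the swapped method.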


Guido van Rossum

September 2020

2:57 p.m.

Hm... IIRC the reason why we did this for `__r*__` is because the more derived class might want to return an instance of that class, and we can't assume that the less derived class knows how to create an instance of the more derived class (the `__init__` signatures might differ).

For comparisons the return value is usually a bool, and in that case the type of the return value is not a concern. But I guess for things like numpy arrays (where A<B returns an array of Booleans of the same shape) the same argument might apply.

I guess it's an oversight that we didn't think of this when we added rich comparisons in PEP 207, 20 years ago. That PEP is so old it doesn't even have a date! (Hi David Ascher! :-)

I think we could try to change it but it would require a very careful risk analysis.
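To make the point about derived classes and differing `__init__` signatures concrete, here is a small sketch. `Vector` and `LabeledVector` are hypothetical classes invented for this illustration. The base class cannot build a `LabeledVector` because it does not know about the extra `label` argument, so the subclass's reflected method is given the first chance.

```python
class Vector:
    """Hypothetical base class with a two-argument constructor."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        # The base class only knows how to build a plain Vector.
        return Vector(self.x + other.x, self.y + other.y)


class LabeledVector(Vector):
    """Hypothetical subclass whose constructor takes an extra argument."""

    def __init__(self, label, x, y):
        super().__init__(x, y)
        self.label = label

    def __radd__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return LabeledVector(self.label, other.x + self.x, other.y + self.y)


# The right operand is an instance of a subclass that provides its own
# __radd__, so that method is tried before Vector.__add__.
result = Vector(1, 2) + LabeledVector("wind", 3, 4)

# Calling the base method directly shows what would be lost otherwise.
plain = Vector.__add__(Vector(1, 2), LabeledVector("wind", 3, 4))

print(type(result).__name__, result.label, result.x, result.y)  # LabeledVector wind 4 6
print(type(plain).__name__)  # Vector
```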

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Brett Cannon

5:58 p.m.

On Sun, Sep 27, 2020 at 2:58 PM Guido van Rossum <guido@python.org> wrote:

...

Hm... IIRC the reason why we did this for `__r*__` is because the more derived class might want to return an instance of that class, and we can't assume that the less derived class knows how to create an instance of the more derived class (the `__init__` signatures might differ).

Yep, that's what the data model docs suggest (see the note at https://docs.python.org/3/reference/datamodel.html#object.__ror__). But the interesting bit is skipping the call of __r*__ when `lhs.__r*__ == rhs.__r*__` (as long as the derived class requirements are met). That's the difference that I'm really curious about compared to rich comparisons and their inverse which don't have this call avoidance. To help visualize all of this, you can see https://github.com/brettcannon/desugar/blob/066f16c00a2c78784bfb18eec31476df... for binary arithmetic operators compared to https://github.com/brettcannon/desugar/blob/066f16c00a2c78784bfb18eec31476df... for rich comparisons. [SNIP]
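The difference described here can be reproduced with a base class and two subclasses, one that inherits the reflected method unchanged and one that overrides it. The class names and the `first_call` helper are assumptions made for this sketch, not code from the desugar project.

```python
calls = []


def log(self, name):
    """Record which class's instance received which method call."""
    calls.append(f"{type(self).__name__}.{name}")


class Base:
    def __sub__(self, other):
        log(self, "__sub__")
        return 0

    def __rsub__(self, other):
        log(self, "__rsub__")
        return 0

    def __lt__(self, other):
        log(self, "__lt__")
        return True

    def __gt__(self, other):
        log(self, "__gt__")
        return True


class Inherits(Base):
    """Subclass that reuses every method from Base unchanged."""


class Overrides(Base):
    """Subclass that provides its own reflected subtraction."""

    def __rsub__(self, other):
        log(self, "__rsub__")
        return 0


def first_call(operation):
    """Run `operation` and report the first method the interpreter invoked."""
    calls.clear()
    operation()
    return calls[0]


print(first_call(lambda: Base() - Inherits()))   # Base.__sub__
print(first_call(lambda: Base() - Overrides()))  # Overrides.__rsub__
print(first_call(lambda: Base() < Inherits()))   # Inherits.__gt__
```

With arithmetic, the subclass only jumps the queue when its `__rsub__` differs from the base class's. With comparisons, being an instance of a subclass is enough, even though `__gt__` is inherited unchanged.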

...

I think we could try to change it but it would require a very careful risk analysis.

I'm not sure how critical it is to change. I'm sure there's some potential perf gain by avoiding the (potentially) unnecessary call, but I also don't know if people have implemented these functions in such a way that skipping the inverse operation on the right-hand side object would break something. Would abuse of the syntax make a difference (e.g. making `>` do something magical)? -Brett


_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7NZUCODE...
Code of Conduct: http://python.org/psf/codeofconduct/

Guido van Rossum

9:56 p.m.

On Sun, Sep 27, 2020 at 5:58 PM Brett Cannon <brett@python.org> wrote:

...

Ooh, interesting. (Aren't you missing a few checks for MISSING in the elif or else branches?)

Let me guess some more (I'm on a rare caffeine high since 9am so we'll see how this goes :-).

The idea is clearly that if lhs and rhs are the same class we don't bother calling `__r*__` (because if `__*__` didn't do it there's no reason that `__r*__` would be any different).

Are you sure you read things right, and `__r*__` is skipped when the `__r*__` methods are the same, and not only when the lhs and rhs classes are the same?

It does seem kind of a pointless optimization, since if the first call is successful we'll skip the second call anyway, and if it returns NotImplemented, well, if our assumption holds that `__r*__` is going to do the same, it's going to be an error anyway. I wonder if this was always there? Maybe we should study the git blame some more.

And why don't we do this for rich comparisons? Probably because the logic is completely separate. :-( And maybe when we did rich comparisons (nearly a decade after the original binary operator overloading IIRC) the optimization idea didn't occur to us, or maybe we realized that we'd be optimizing an error case. Or maybe because rich comparisons were trying to somehow model the earlier `__cmp__`?

...

I don't know, PEP 207 explicitly says the reflexivity assumptions are assumed. I guess I misunderstood your question for clarification as a suggestion to change. I feel this requires more careful thought than I can muster tonight.

...


Brett Cannon

12:03 p.m.

On Sun, Sep 27, 2020 at 9:56 PM Guido van Rossum <guido@python.org> wrote:


Ooh, interesting. (Aren't you missing a few checks for MISSING in the elif or else branches?)

Nope, I handle that generically in the `for` loop farther down that makes the actual calls. That one _MISSING check is because since it's just an instance of `object()` then that subclass check will always succeed. I should probably just define a custom singleton class to let me drop that one guard case.

...

Let me guess some more (I'm on a rare caffeine high since 9am so we'll see how this goes :-).

The idea is clearly that if lhs and rhs are the same class we don't bother calling `__r*__` (because if `__*__` didn't do it there's no reason that `__r*__` would be any different).

That was my assumption.

...

Are you sure you read things right, and `__r*__` is skipped when the `__r*__` methods are the same, and not only when the lhs and rhs classes are the same?

The test I wrote for this is at https://github.com/brettcannon/desugar/blob/066f16c00a2c78784bfb18eec31476df... and passes when run against CPython via the 'operator' module which just delegates to the syntax anyway. And the code that makes this happen is (I think) https://github.com/python/cpython/blob/6f8c8320e9eac9bc7a7f653b43506e75916ce... . BTW I have this all linked and written down in https://snarky.ca/unravelling-binary-arithmetic-operations-in-python/.
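As a rough aid to following the ordering discussed in this thread, here is a simplified pure-Python model of how `lhs - rhs` picks its candidates. It is a sketch written for this illustration, not the code from the desugar project or from CPython, and it ignores details such as descriptors and metaclasses. The names `sub`, `Base`, `Over` and `Same` are assumptions.

```python
_MISSING = object()  # sentinel meaning "this type does not define the method"


def sub(lhs, rhs):
    """Simplified model of `lhs - rhs` dispatch."""
    lhs_type, rhs_type = type(lhs), type(rhs)
    lhs_method = getattr(lhs_type, "__sub__", _MISSING)
    lhs_rmethod = getattr(lhs_type, "__rsub__", _MISSING)
    rhs_rmethod = getattr(rhs_type, "__rsub__", _MISSING)

    call_lhs = (lhs_method, lhs, rhs)
    call_rhs = (rhs_rmethod, rhs, lhs)

    if lhs_type is rhs_type:
        # Same type: the reflected method is never tried.
        candidates = [call_lhs]
    elif (
        rhs_rmethod is not _MISSING
        and issubclass(rhs_type, lhs_type)
        and lhs_rmethod != rhs_rmethod
    ):
        # Subclass on the right with a *different* reflected method goes first.
        candidates = [call_rhs, call_lhs]
    else:
        candidates = [call_lhs, call_rhs]

    for method, receiver, argument in candidates:
        if method is _MISSING:
            continue
        value = method(receiver, argument)
        if value is not NotImplemented:
            return value
    raise TypeError(
        f"unsupported operand type(s) for -: "
        f"{lhs_type.__name__!r} and {rhs_type.__name__!r}"
    )


class Base:
    def __sub__(self, other):
        return "Base.__sub__"

    def __rsub__(self, other):
        return "Base.__rsub__"


class Over(Base):
    def __rsub__(self, other):
        return "Over.__rsub__"


class Same(Base):
    pass


# The model agrees with the real operator on these cases.
print(sub(Base(), Over()), "|", Base() - Over())
print(sub(Base(), Same()), "|", Base() - Same())
print(sub(1, 2.5), "|", 1 - 2.5)
```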

...

It does seem kind of a pointless optimization, since if the first call is successful we'll skip the second call anyway, and if it returns NotImplemented, well, if our assumption holds that `__r*__` is going to do the same, it's going to be an error anyway. I wonder if this was always there? Maybe we should study the git blame some more.

Assuming I have the right line of C code, it looks like you introduced it in 2.2 with new-style classes, 19 years ago to the day. 😄 https://github.com/python/cpython/commit/4bb1e36eec19b30f2e582fceffa250e1598...

...

And why don't we do this for rich comparisons? Probably because the logic is completely separate. :-( And maybe when we did rich comparisons (nearly a decade after the original binary operator overloading IIRC) the optimization idea didn't occur to us, or maybe we realized that we'd be optimizing an error case. Or maybe because rich comparisons were trying to somehow model the earlier `__cmp__`?

My guess was that no one involved with rich comparisons honestly knew or remembered that this quirk existed for binary arithmetic operators.

...

...
[SNIP]

...
I think we could try to change it but it would require a very careful risk analysis.

I'm not sure how critical it is to change. I'm sure there's some potential perf gain by avoiding the (potentially) unnecessary call, but I also don't know if people have implemented these functions in such a way that skipping the inverse operation on the right-hand side object would break something. Would abuse of the syntax make a difference (e.g. making `>` do something magical)?

I don't know, PEP 207 explicitly says the reflexivity assumptions are assumed. I guess I misunderstood your question for clarification as a suggestion to change.

I'm just looking for historical context for a blog post is all. If we feel it's worth making the logic more uniform across operators then I think that's worth considering, but I am personally okay treating this difference as a historical quirk.

...

I feel this requires more careful thought than I can muster tonight.

😄 Yeah, this is definitely digging into the bowels of Python. -Brett


Guido van Rossum

9:14 p.m.

On Mon, Sep 28, 2020 at 12:03 PM Brett Cannon <brett@python.org> wrote:

...

Ah, that's much clearer than all the English words written so far here. :-) Let me go over this function (binary_op1()) for subtraction, the example from your blog.

One piece of magic is that there are no separate `__sub__` and `__rsub__` implementations at this level -- the `tp_as_number` struct just has a slot named `nb_subtract` that takes two objects and either subtracts them or returns NotImplemented.

This means that (**at this level**) there really is no point in calling `__rsub__` if the lhs and rhs have the same type, because it would literally just call the same `nb_subtract` function with the same arguments a second time.

And if the types are different but the functions in `nb_subtract` are still the same, again we'd be calling the same function with the same arguments twice.

The `nb_subtract` slot for Python classes dispatches to either `__sub__` or `__rsub__` in a complicated way. The code is SLOT1BINFULL in typeobject.c, which echoes binary_op1(): https://github.com/python/cpython/blob/b0dfc7581697f20385813582de7e92ba6ba01...

That's some macro!

Now, interestingly, this macro may call *both* `left.__sub__(right)` and `right.__rsub__(left)`. That is surprising, since there's also logic to call left's nb_subtract and right's nb_subtract in binary_op1(). What's up with that? Could we come up with an example where `a-b` makes more than two calls? For that to happen we'd have to trick binary_op1() into calling both. But I think that's impossible, because all Python classes have the same identical function in nb_subtract (the function is synthesized using SLOT1BIN -> SLOT1BINFULL), and in that case binary_op1() skips the second call (the two lines that Brett highlighted!). So we're good here.

But maybe here we have another explanation for why binary_op1() is careful to skip the second call. (The slot function duplicates this logic so it will only call `__sub__` in this case.)

Since rich comparison doesn't have this safeguard, can we trick *that* into making more than two calls? No, because the "reverse" logic (`self.__lt__(other)` -> `other.__gt__(self)` etc.) is only implemented once, in do_richcompare() in abstract.c. The slot function in typeobject.c (slot_tp_richcompare()) is totally tame.

So the difference goes back to the design at the C level -- the number slots don't have separate `__sub__` and `__rsub__` implementations (the C function in nb_subtract has no direct way of knowing if it was called on behalf of its first or second argument), and the complications derive from that. The rich comparison slot has a clear `op` flag that always tells it which operation was requested, and the implementation is much saner because of it.

So yes, in a sense the difference is because rich comparison is much newer than binary operators in Python -- binary operators are still constrained by the original design, which predates operator overloading in user code (i.e. `__sub__` and `__rsub__`). But it was not a matter of forgetting anything -- it was a matter of better design.

(Brett, maybe this warrants an update to your blog post?)
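The slot-level reasoning above can be modelled in a few lines of Python. This is a loose sketch of the control flow described for binary_op1(), not a translation of the C source: the `slot_table` dictionary stands in for looking up `nb_subtract` on a type, and `Meters`, `Feet` and the three slot functions are hypothetical names invented for the demonstration.

```python
from typing import Any, Callable, Dict, List, Tuple

SlotFunc = Callable[[Any, Any], Any]


def binary_op1(v: Any, w: Any, slot_table: Dict[type, SlotFunc]) -> Any:
    """Model of the slot-level dispatch: one slot function per type."""
    slotv = slot_table.get(type(v))
    slotw = None
    if type(w) is not type(v):
        slotw = slot_table.get(type(w))
        if slotw is slotv:
            # Same function with the same arguments: a second call is pointless.
            slotw = None
    if slotv is not None:
        if slotw is not None and issubclass(type(w), type(v)):
            # The more derived type on the right gets the first chance.
            result = slotw(v, w)
            if result is not NotImplemented:
                return result
            slotw = None
        result = slotv(v, w)
        if result is not NotImplemented:
            return result
    if slotw is not None:
        return slotw(v, w)
    return NotImplemented


call_log: List[str] = []


def shared_slot(v, w):
    call_log.append("shared_slot")
    return NotImplemented


def meters_slot(v, w):
    call_log.append("meters_slot")
    return NotImplemented


def feet_slot(v, w):
    call_log.append("feet_slot")
    return NotImplemented


class Meters:
    pass


class Feet:
    pass


def attempts(slot_table: Dict[type, SlotFunc]) -> Tuple[Any, List[str]]:
    """Run the model on Meters() and Feet() and return (result, calls made)."""
    call_log.clear()
    result = binary_op1(Meters(), Feet(), slot_table)
    return result, list(call_log)


# Both types share one slot function, as all Python classes do: one call.
print(attempts({Meters: shared_slot, Feet: shared_slot}))
# Distinct slot functions: each one is tried once.
print(attempts({Meters: meters_slot, Feet: feet_slot}))
```

Note that the slot function receives `(v, w)` in the same order either way, which is why it cannot tell on whose behalf it was called.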

Brett Cannon

11:52 a.m.

Thanks for the explanation! And I will look at updating my blog post.

Eric Wieser

8:41 a.m.

Since I don't see it linked anywhere here: this was discussed a few years ago at https://bugs.python.org/issue30140. Eric

