`__lcontains__` for letting the other class determine container membership when `__contains__` fails

Currently, the `in` operator (also known as `__contains__`) always uses the rightmost argument's implementation. For example,

    status = obj in "xylophone"

is similar to:

    status = "xylophone".__contains__(obj)

The current implementation of `__contains__` is similar to the way that `+` used to only look to the leftmost argument for implementation:

    total = 4 + obj
    total = int.__add__(4, obj)

However, these days, `__radd__` gives us the following:

    try:

We propose something similar for `__contains__`: that a new dunder/magic method `__lcontains__` be created and that the `in` operator be implemented similarly to the following:

    # IMPLEMENTATION OF

The proposed enhancement would be backwards compatible except in the event that a user already wrote a class having an `__lcontains__` method.

With our running example of the string “xylophone”, writers of user-defined classes would be able to decide whether their objects are elements of “xylophone” or not. Programmers would do this by writing an `__lcontains__` method.

As an example application, one might develop a tree in which each node represents a string (the strings being unique within the tree). A property of the tree might be that node `n` is a descendant of node `m` if and only if `n` is a sub-string of `m`. For example, the string "yell" is a descendant of "yellow". We might want the root node of the tree to be a special object, `root`, such that every string is in `root` and that `root` is in no string. That is, the code `root in "yellow"` should return `False`. If `__lcontains__` were implemented, then we could implement the node as follows:

Hm... with only a little bit of cooperation of the container class (e.g. xylophone), you could implement this yourself:

    class xylophone:
        def __contains__(self, item):
            if hasattr(item, '__lcontains__'):
                return item.__lcontains__(self)
            return False

On Tue, Nov 12, 2019 at 5:04 PM Samuel Muldoon <muldoonsamuel@gmail.com> wrote:
--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
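A minimal runnable sketch of the cooperative approach suggested above. `__lcontains__` is not a real Python special method; the container simply chooses to look for it. The class names `Xylophone` and `AlwaysMember` are hypothetical and exist only for illustration.

```python
class Xylophone:
    """Container that defers the membership decision to the item, if it opts in."""

    def __contains__(self, item):
        # Python itself knows nothing about `__lcontains__`;
        # this container looks for it voluntarily.
        if hasattr(item, "__lcontains__"):
            return item.__lcontains__(self)
        return False


class AlwaysMember:
    """Hypothetical item that claims membership in any cooperating container."""

    def __lcontains__(self, container):
        return True


member_result = AlwaysMember() in Xylophone()
plain_result = 42 in Xylophone()

print(member_result)  # True
print(plain_result)   # False
```

This only works for containers whose authors add the check themselves; it does nothing for `str` or other existing containers.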

On Nov 12, 2019, at 17:00, Samuel Muldoon <muldoonsamuel@gmail.com> wrote:
When was this? I’m pretty sure __radd__ was there in 1.x.
You’ve specified rules that are different from the one you gave for __radd__, and also different from the actual rules for __radd__. Is that intentional? If so, why?

To summarize the rules:

- If type(rhs) is a proper subclass of type(lhs), check rhs.__radd__ first and fall back to lhs.__add__.
- Otherwise, if they’re the same type, only check lhs.__add__.
- Otherwise, check lhs.__add__ first and fall back to rhs.__radd__.

In each case, the check uses special method lookup, not normal getattr. Also, fallback happens if lookup raises an AttributeError or the call returns NotImplemented; it does not happen if either one raises NotImplementedError. And finally, if the fallback fails in the same way, you get a TypeError.
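The three cases summarized above can be observed with a few toy classes. This is only an illustrative sketch; `Base`, `Derived`, and `Other` are hypothetical names, and each method returns a string naming itself so the dispatch order is visible.

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"

    def __radd__(self, other):
        return "Base.__radd__"


class Derived(Base):
    # Overrides the reflected method, so it is tried before Base.__add__.
    def __radd__(self, other):
        return "Derived.__radd__"


class Other:
    def __add__(self, other):
        return NotImplemented  # signals "try the other operand"

    def __radd__(self, other):
        return "Other.__radd__"


subclass_case = Base() + Derived()   # rhs is a proper subclass of lhs
same_type_case = Base() + Base()     # same type: only __add__ is checked
unrelated_case = Other() + Base()    # __add__ declines, fall back to rhs.__radd__

try:
    Other() + Other()                # same type: __radd__ is never tried
    same_type_decline = "no error"
except TypeError:
    same_type_decline = "TypeError"

print(subclass_case)      # Derived.__radd__
print(same_type_case)     # Base.__add__
print(unrelated_case)     # Base.__radd__
print(same_type_decline)  # TypeError
```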
As an example application, one might develop a tree in which each node represents a string (the strings being unique within the tree). A property of the tree might be that node `n` is a descendant of node `m` if and only if `n` is a sub-string of `m`. For example, the string "yell" is a descendant of "yellow". We might want the root node of the tree to be a special object, `root`, such that every string is in `root` and that `root` is in no string.
I don’t understand why you’d want this. If your tree is defined as substrings of a string, why isn’t your root the maximal string, instead of an empty string? Also, why does `node in "yellow"` work in the first place, when "yellow" is a str, not a Node? Also, any string is a substring of itself; do you actually want every Node to be a descendant of itself? (And, if so, is the root a descendant of itself or not?) And finally, doesn’t this mean the root of any tree contains every descendant of every possible tree, not just its own descendants?

Most of all, why can’t you implement your rule in Python today, without any new methods?

    class Node:
        def __contains__(self, other):
            if self.isroot:
                return True
            if other.isroot:
                return False
            return other.label in self.label

The only reason you need __radd__ is to handle interaction with different types, especially ones you don’t control. When you’re just building a single type, you can put all the logic in __add__. And the same thing ought to be true for __lcontains__.

Not understanding the point of this example makes it hard to evaluate how well the proposal solves it, but I don’t think it actually does.
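For reference, here is a self-contained version of the `Node` snippet above that can be run as-is. The constructor and its `label` and `isroot` parameters are assumptions added for illustration; the message only shows `__contains__`.

```python
class Node:
    def __init__(self, label="", isroot=False):
        # Assumed constructor: the original snippet does not show one.
        self.label = label
        self.isroot = isroot

    def __contains__(self, other):
        if self.isroot:
            return True       # everything is in the root
        if other.isroot:
            return False      # the root is in no ordinary node
        return other.label in self.label  # plain substring test


root = Node(isroot=True)
yellow = Node("yellow")
yell = Node("yell")

checks = {
    "yell in yellow": yell in yellow,
    "yellow in yell": yellow in yell,
    "yell in root": yell in root,
    "root in yellow": root in yellow,
}
for expression, value in checks.items():
    print(expression, value)
```

Note that this handles `Node in Node` only; `root in "yellow"` with a plain `str` on the right still goes to `str.__contains__`.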
Presumably the rhs’s __contains__ method exists and does not raise NotImplementedError, right? Then by your rules, RootNode.__lcontains__ would never get called. This is the reason for those complicated rules about proper subclasses, identical classes, and unrelated classes being handled differently by __radd__.

But even with those rules, your rhs isn’t even a Node, it’s a str. And str.__contains__ definitely exists and doesn’t raise NotImplementedError, and, as it’s an unrelated class, it will get called first, so you’ll just get a TypeError without ever having the chance to get your __lcontains__ called.

And that means that you can’t actually get the benefits without massively breaking backward compatibility. The only reason you can use __radd__ to make new types be addable to int is that int.__add__ doesn’t raise TypeError on unknown types, it returns NotImplemented. And the same for every other builtin, stdlib, and third-party numeric type. But every builtin, stdlib, and third-party container type raises TypeError from __contains__ on unknown types. So for __lcontains__ to be useful, they’d all have to be changed to return NotImplemented instead.

I think __lcontains__ (following the same rules as __radd__, and with the change to every existing __contains__, and probably at least two versions’ worth of __future__) could be useful, and if I were designing a new Python-like language I’d probably include it unless someone came up with a good reason not to. But adding it today would definitely be disruptive. So it needs a real killer use case that’s worth all that disruption.
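The contrast between `int.__add__` and `str.__contains__` described above can be checked directly. This sketch covers only those two builtins; `Thing` is a hypothetical class used as the unknown operand.

```python
class Thing:
    def __radd__(self, other):
        return "Thing.__radd__"


thing = Thing()

# int.__add__ declines politely, which is what lets __radd__ run.
direct_add = int.__add__(4, thing)
total = 4 + thing

# str.__contains__ raises instead of declining, so no fallback is possible.
try:
    thing in "xylophone"
    contains_outcome = "no error"
except TypeError:
    contains_outcome = "TypeError"

print(direct_add is NotImplemented)  # True
print(total)                         # Thing.__radd__
print(contains_outcome)              # TypeError
```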

Note that __lcontains__ (if it exists) would be called first, at least for different types. So maybe it would be easier than you think. But I still think it’s not needed.

On Tue, Nov 12, 2019 at 9:04 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
-- --Guido (mobile)

On Tue, Nov 12, 2019, at 20:00, Samuel Muldoon wrote:
*Currently, the `in` operator (also known as `__contains__`) always uses the rightmost argument's implementation.*
minor bikeshed: I've always considered the "r" in these to mean "reverse", not "right".

more serious issue: method pairs like this generally use the main version *first* and only use the reversed version if the main one returns NotImplemented (not, as your post goes on to say, raises NotImplementedError. if all candidates return NotImplemented, the operator raises TypeError). If this is to consider the left-hand operand the primary authority, it might be better to simply name the method __in__.
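The difference between returning NotImplemented and raising NotImplementedError can be seen with `+`. This is a small sketch; all class names and the `try_add` helper are hypothetical.

```python
class ReturnsNotImplemented:
    def __add__(self, other):
        return NotImplemented  # sentinel value: lets Python try the other operand


class RaisesNotImplementedError:
    def __add__(self, other):
        raise NotImplementedError("propagates; no fallback is attempted")


class Reflected:
    def __radd__(self, other):
        return "Reflected.__radd__"


def try_add(lhs, rhs):
    """Return the result of lhs + rhs, or the name of the exception raised."""
    try:
        return lhs + rhs
    except (TypeError, NotImplementedError) as exc:
        return type(exc).__name__


fallback_used = try_add(ReturnsNotImplemented(), Reflected())
exception_escapes = try_add(RaisesNotImplementedError(), Reflected())
nobody_handles = try_add(ReturnsNotImplemented(), object())

print(fallback_used)      # Reflected.__radd__
print(exception_escapes)  # NotImplementedError
print(nobody_handles)     # TypeError
```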

participants (4)
- Andrew Barnert
- Guido van Rossum
- Random832
- Samuel Muldoon