Add __eq__ to collections.abc.Sequence?

Is there any reason why collections.abc.[Sequence|MutableSequence] do not have "__eq__" for free? I was writing some tests just now and was caught by surprise there. (Checking the docs, if my custom MutableSequence also inherits from "collections.abc.Set" I will have working "__eq__" and "__ne__" methods, but also weirdly behaved "__le__" and cousins as a side effect - weird enough that I think I will just hand-write the "__eq__".) While theoretically adding "__eq__" and "__ne__" _could_ have backward compatibility issues, I think it is more probable it would actually fix some undetected bugs in a couple of projects, as the default __eq__ is an identity comparison (is). Anyway, maybe there is a reason it is not a given. Any thoughts?
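A minimal sketch of the surprise described above. The class names `Items` and `Bag` are made up for illustration; only the `collections.abc` base classes come from the standard library.

```python
from collections.abc import Sequence, Set


class Items(Sequence):
    """Minimal Sequence: only __getitem__ and __len__ are defined."""

    def __init__(self, *values):
        self._values = list(values)

    def __getitem__(self, index):
        return self._values[index]

    def __len__(self):
        return len(self._values)


class Bag(Set):
    """Minimal Set: __contains__, __iter__ and __len__ are defined."""

    def __init__(self, *values):
        self._values = list(dict.fromkeys(values))  # drop duplicates

    def __contains__(self, value):
        return value in self._values

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


a, b = Items(1, 2, 3), Items(1, 2, 3)
same_contents_equal = (a == b)        # False: falls back to identity
identity_equal = (a == a)             # True
set_equal = (Bag(1, 2) == Bag(2, 1))  # True: Set.__eq__ compares contents
```

The Sequence mixin methods cover iteration, containment, `index` and `count`, but equality is left to `object.__eq__`.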

I don't recall why this was done. It seems somewhat odd, since Set and Mapping in the same module do have __eq__. I don't care much for the default implementation though. (I don't understand why you would want to inherit from both Sequence and Set -- and certainly the resulting mongrel type would have to behave weirdly in order to conform to user expectations for both of its parents, regardless of what you do for __le__.) Traditionally we've been very reluctant to add new methods to existing ABCs, because of the implications for classes everywhere that inherit from these. On Tue, Jun 30, 2020 at 5:16 AM Joao S. O. Bueno <jsbueno@python.org.br> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Tue, 30 Jun 2020 at 11:57, Guido van Rossum <guido@python.org> wrote:
The idea of inheriting from Sequence and Set is that, since `Sequence` gives me __iter__, it has the complete requisites for `Set`, and then I get the `__eq__` for "free". On second thought, it is likely that `Set.__eq__` does not care about order - I was doing some interactive experimentation and did not check that: I stopped when I found out the __le__ (and related) methods created this way made no sense at all, exactly as you put it. I ended up writing an __eq__ - and in the process I found it is not _that_ straightforward, due to having to check subclass types when comparing. (Given base sequence A, child class B(A), class C(A) and class B1(B) - instances of B and B1 can be equal, but instances of B and C should always be different) - or in Python, inside __eq__ : if not issubclass(type(other), type(self)) and not issubclass(type(self), type(other)): return False
Yes - I can see that. That is why, even though I am writing this message, it is more in a questioning tone than the usual "I want this feature" normally used on this list. On the other hand, I can see no reason why these comparisons are not there, and can't think of many ways code would break if they were introduced (such code would have to be relying on the default "is" behavior of "object.__eq__").
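The subclass check from this message could be placed in a full `__eq__` roughly as follows. This is only a sketch: the classes A, B, C and B1 mirror the names used in the message, and the element-wise comparison after the type check is an assumption about what the rest of the method would do.

```python
from collections.abc import Sequence


class A(Sequence):
    def __init__(self, *values):
        self._values = list(values)

    def __getitem__(self, index):
        return self._values[index]

    def __len__(self):
        return len(self._values)

    def __eq__(self, other):
        # Only types on the same inheritance line may compare equal.
        if (not issubclass(type(other), type(self))
                and not issubclass(type(self), type(other))):
            return False
        # Assumed remainder: same length and pairwise-equal items.
        return len(self) == len(other) and all(
            x == y for x, y in zip(self, other)
        )


class B(A):
    pass


class C(A):
    pass


class B1(B):
    pass


b_vs_b1 = B(1, 2) == B1(1, 2)  # True: B1 is a subclass of B
b_vs_c = B(1, 2) == C(1, 2)    # False: siblings, neither subclasses the other
a_vs_c = A(1, 2) == C(1, 2)    # True: C is a subclass of A
```

Note that defining `__eq__` without `__hash__` makes instances of these classes unhashable.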

On Sat, 4 Jul 2020 at 16:51, Serhiy Storchaka <storchaka@gmail.com> wrote:
Ah - yes. Half the logic is already in the __eq__ semantics - thanks. Well, I am updating that in my code right now. Anyway, I am not seeing anyone opposing this going into col...abc.Sequence - maybe it is ok for a BPO? (Yes, there is this consideration I had, and also Guido, that it would create new behavior in classes that already exist, but I saw no way that could break code not specially crafted to break with this change.)

On Mon, Jul 6, 2020, at 01:47, Neil Girdhar wrote:
Anyone can in principle override __eq__ to throw an exception, but they're not "supposed to". The default behavior is that an object is only equal to itself (and floating point NaNs aren't equal to anything, including themselves, which isn't very useful), but in all cases the operation itself is valid and simply returns False, e.g. when the other operand is a different type, rather than treating it as any kind of error. Which of course means that, right now, a sequence that does not define its own __eq__ method is equal only to itself, rather than it being an error to try to compare it.
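A few interactive-style checks of the behavior described here. `Plain` is a throwaway class used only for illustration.

```python
class Plain:
    """No __eq__ defined, so object.__eq__ (identity) applies."""


p = Plain()
equal_to_itself = (p == p)       # True
equal_to_other = (p == Plain())  # False, not an error

nan = float("nan")
nan_equals_itself = (nan == nan)  # False

# Comparing unrelated types is valid and simply yields False.
mixed_types = ([1, 2] == "12")    # False
```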

On Tue, Jun 30, 2020, at 10:57, Guido van Rossum wrote:
One thing that may be worth considering is that tuples and lists with the same respective contents are not equal to each other [whereas sets and frozensets are]. I do think it might be worthwhile to have a "compare two sequences" [and possibly also "hash a sequence", to match the tuple hash without making a tuple] building block as a function somewhere, so people could relatively easily make their own [perhaps even something like "__eq__ = collections.sequence_eq"].
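A rough sketch of what such a building block might look like. `sequence_eq` does not exist in `collections`; here it is a hypothetical module-level function, and `Row` is an illustrative class that adopts it.

```python
from collections.abc import Sequence


def sequence_eq(self, other):
    """Hypothetical helper: element-wise equality of two sequences."""
    if not isinstance(other, Sequence):
        return NotImplemented
    return len(self) == len(other) and all(
        x == y for x, y in zip(self, other)
    )


class Row(Sequence):
    def __init__(self, *values):
        self._values = list(values)

    def __getitem__(self, index):
        return self._values[index]

    def __len__(self):
        return len(self._values)

    __eq__ = sequence_eq  # opt in explicitly


row_vs_row = Row(1, 2) == Row(1, 2)   # True
row_vs_list = Row(1, 2) == [1, 2]     # True: more permissive than list/tuple
list_vs_tuple = [1, 2] == (1, 2)      # False
set_vs_frozenset = {1, 2} == frozenset({1, 2})  # True
```

As written, this helper accepts any Sequence on the other side, so a `Row` compares equal to a list with the same items. That is exactly the cross-type permissiveness that list and tuple do not have, so a stricter type check may be preferable.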

On Wed, 1 Jul 2020 at 03:37, Random832 <random832@fastmail.com> wrote:
Maybe something like collections.mixins.ComparableSequence? A bundle class with "__eq__", "__ne__", "__le__" - behaving like they do for lists? Then "collections.mixins.HashableSequence" would also be a natural fit. If that is interesting enough, maybe more such mixins can be thought of to be added there? One thing that I've missed sometimes - and is complicated to implement - is to have the Sequences generated by collections.abc... behave properly with slices - a collections.mixins.SlicedSequence that would override `__delitem__`, `__setitem__` and `__getitem__` and handle slices could pair up with the "ComparableSequence" - people could use these "a la carte", and no backwards compatibility would be hurt.
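One way such a mixin might be sketched. `collections.mixins` does not exist; `ComparableSequence` below is a hypothetical stand-alone class, and `Version` is an illustrative user of it. `functools.total_ordering` fills in the remaining ordering methods from `__eq__` and `__lt__`.

```python
from collections.abc import Sequence
from functools import total_ordering


@total_ordering
class ComparableSequence:
    """Hypothetical mixin: list-like equality and lexicographic ordering."""

    def __eq__(self, other):
        if not isinstance(other, Sequence):
            return NotImplemented
        return len(self) == len(other) and all(
            x == y for x, y in zip(self, other)
        )

    def __lt__(self, other):
        if not isinstance(other, Sequence):
            return NotImplemented
        # The first differing pair decides; otherwise the shorter one is smaller.
        for x, y in zip(self, other):
            if x != y:
                return x < y
        return len(self) < len(other)


class Version(ComparableSequence, Sequence):
    def __init__(self, *parts):
        self._parts = list(parts)

    def __getitem__(self, index):
        return self._parts[index]

    def __len__(self):
        return len(self._parts)


equal = Version(1, 2) == Version(1, 2)       # True
smaller = Version(1, 2) < Version(1, 3)      # True
prefix = Version(1, 2) < Version(1, 2, 0)    # True
greater = Version(2) > Version(1, 9)         # True
```

Because the mixin is opt-in, existing Sequence subclasses keep their current identity-based equality.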

Can a Sequence be infinite? If so, an equality test of two nonterminating sequences would be a nonterminating operation. Do Sized and *Reversible* imply that a sequence terminates? Could __len__ return inf? Perhaps `Ordered` is a requisite condition for defining a comparator for Sequences. `OrderedSequence`? Are there unordered Sequences for which a default `__eq__` / `__cmp__` (et al.) would be wrong or inappropriate? On Fri, Jul 3, 2020, 2:08 AM Random832 <random832@fastmail.com> wrote:

On Fri, Jul 3, 2020, at 03:57, Wes Turner wrote:
Can a Sequence be infinite? If so, an equality test of two nonterminating sequences would be a nonterminating operation.
I think not - an infinite sequence would make len, contains, and reversed ill-defined (it also wouldn't allow certain kinds of slices)
Do Sized and *Reversible* imply that a sequence terminates? Could __len__ return inf?
__len__ must return an integer.
I don't think so [index as a mixin implies being ordered, I think]... the bigger problem is the one I mentioned earlier, that allowing comparison between sequences of different types is inconsistent with tuple and list.
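The point about `__len__` can be checked directly. `Endless` is a made-up class whose `__len__` tries to return infinity.

```python
class Endless:
    def __len__(self):
        return float("inf")  # a float, not an int


error = None
try:
    len(Endless())
except TypeError as exc:
    # len() refuses any result that is not an integer.
    error = exc
```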

On Sat, 4 Jul 2020 at 12:51, Random832 <random832@fastmail.com> wrote:
As far as types are concerned, the `__eq__` should worry about it - only Sequences where one is a subtype of the other should be able to compare equal (as happens with lists and tuples as well: subclasses of both will compare equal to base lists and tuples with the same content). The code for that is in my first reply to Guido, above: if not issubclass(type(other), type(self)) and not issubclass(type(self), type(other)): return False. I am actually using that in the file that motivated me to send the first e-mail here, as it makes sense in that project.

On Sat, Jul 4, 2020, at 12:30, Joao S. O. Bueno wrote:
As will two different subclasses of list. Should we try to replicate this behavior as well (i.e. determine the "base sequence type" somehow, so that two subclasses of the same one can be compared and others cannot)? This is something that we have to get right the first time; equality comparisons can't be made more permissive later since they simply return False rather than being an error... otherwise I might be advocating to "fix" list and tuple.
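The built-in behavior referred to here, shown with two throwaway list subclasses (`Stack` and `Queue` are illustrative names only):

```python
class Stack(list):
    pass


class Queue(list):
    pass


siblings_equal = Stack([1, 2]) == Queue([1, 2])  # True: both are lists
subclass_vs_base = Stack([1, 2]) == [1, 2]       # True
list_vs_tuple = [1, 2] == (1, 2)                 # False
```

So for the built-in types, sibling subclasses do compare equal, which the `issubclass` check quoted earlier in the thread would not allow.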

On Fri, 3 Jul 2020 at 03:07, Random832 <random832@fastmail.com> wrote:
Really - I think I had worked on this once, but the slicing support was via decorators.
And sliced __delitem__ may be difficult to implement efficiently without knowing the internals of the sequence type.
Indeed, as would __setitem__ inserting a sequence larger than the target slice (or shorter). There are lots of corner cases there that are hard to get right. But, yes, it would be so inefficient that it is probably better left for a 3rd party package than the stdlib. So - let's just drop these from the proposal and maybe check if Sequence.__eq__ is worth it.
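For reference, the read-only half of slice support is the easy part and can be sketched with `slice.indices`. `Window` is an illustrative class; the sliced `__setitem__` and `__delitem__` cases discussed above are deliberately left out.

```python
from collections.abc import Sequence


class Window(Sequence):
    def __init__(self, *values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() clamps start/stop/step to the current length.
            positions = range(*index.indices(len(self)))
            return type(self)(*(self._values[i] for i in positions))
        return self._values[index]


w = Window(0, 1, 2, 3, 4)
middle = list(w[1:4])      # [1, 2, 3]
backwards = list(w[::-2])  # [4, 2, 0]
```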

participants (6)
- Guido van Rossum
- Joao S. O. Bueno
- Neil Girdhar
- Random832
- Serhiy Storchaka
- Wes Turner