unittest of sequence equality

The following test fails because `seq1 == seq2` returns a (boolean) NumPy array whenever either seq is a NumPy array.
import unittest
import numpy as np
unittest.TestCase().assertSequenceEqual([1.,2.,3.], np.array([1.,2.,3.]))
I expected `unittest` to rely only on features of a `collections.abc.Sequence`, which, based on https://docs.python.org/3/glossary.html#term-sequence, I believe are satisfied by a NumPy array. Specifically, I see no requirement that a sequence implement `__eq__` at all, much less in any particular way.
In short: a test named `assertSequenceEqual` should, I would think, work for any sequence and therefore (based on the available documentation) should not depend on the class-specific implementation of `__eq__`.
Is that wrong?
Thank you, Alan Isaac

On Tue, Dec 22, 2020 at 10:54 AM Alan G. Isaac alan.isaac@gmail.com wrote:
In short: a test named `assertSequenceEqual` should, I would think, work for any sequence and therefore (based on the available documentation) should not depend on the class-specific implementation of __eq__.
Is that wrong?
Yes and no. :) I don't agree that `seq1 == seq2` should not be tried if the sequences support it, but the function does work on sequences that lack a definition of `__eq__` as you would expect (e.g. user-defined sequences where you just didn't want to bother). The fact that numpy chooses to implement __eq__ in such a way that its result would be surprising if used in an `if` guard I think is more a design choice/issue of numpy than a suggestion that you can't trust `==` in testing because it _can_ be something other than True/False.

On 22/12/2020 19:08, Brett Cannon wrote:
... The fact that numpy chooses to implement __eq__ in such a way that its result would be surprising if used in an `if` guard I think is more a design choice/issue of numpy than a suggestion that you can't trust `==` in testing because it _can_ be something other than True/False.
+1 In addition to NumPy's regularly surprising interpretation of operators, it is evident from Ivan Pozdeev's investigation (other branch) that part of the problem lies with bool(np.array) being an error. I can see why that might be sensible. You can have one or the other, but not both.
I wondered if Python had become stricter here after NumPy made its choices, but a little mining turns up:
"New in version 2.1. These are the so-called 'rich comparison' methods, and are called for comparison operators in preference to __cmp__() below. The correspondence between operator symbols and method names is as follows: `x<y` calls `x.__lt__(y)`, `x<=y` calls `x.__le__(y)`, `x==y` calls `x.__eq__(y)`, `x!=y` and `x<>y` call `x.__ne__(y)`, `x>y` calls `x.__gt__(y)`, and `x>=y` calls `x.__ge__(y)`. These methods can return any value, but if the comparison operator is used in a Boolean context, the return value should be interpretable as a Boolean value, else a TypeError will be raised. By convention, `0` is used for false and `1` for true."
https://docs.python.org/release/2.1/ref/customization.html
The combination of choices makes the result of a comparison, about which there is some freedom, not interpretable as a boolean value. We are warned that this should not be expected to work. Later docs (from v2.6) refer explicitly to calling bool() as a definition of "interpretable". bool() is there from v2.3.
Jeff Allen
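
For illustration, the combination described above can be reproduced without NumPy. The sketch below uses a hypothetical stand-in class, `ElementwiseSeq` (an assumption for this example, not NumPy's implementation), whose `__eq__` returns one flag per element and whose `__bool__` refuses to collapse several flags into one truth value.

```python
class ElementwiseSeq:
    """Hypothetical stand-in for an array type; not NumPy's implementation."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __eq__(self, other):
        # A rich comparison may return any object; here, one flag per element.
        return ElementwiseSeq(a == b for a, b in zip(self._items, other))

    def __bool__(self):
        # Refuse to reduce several flags to a single truth value.
        if len(self._items) > 1:
            raise ValueError("truth value of a multi-element sequence is ambiguous")
        return any(self._items)


# list.__eq__ does not know this type, so Python tries the reflected
# ElementwiseSeq.__eq__, and the result is not a bool.
result = [1.0, 2.0, 3.0] == ElementwiseSeq([1.0, 2.0, 3.0])
flags = list(result)

try:
    outcome = "truthy" if result else "falsy"
except ValueError:
    outcome = "raised ValueError"
```

The comparison itself succeeds; it is only the Boolean context (`if result`) that raises.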

Interesting. Did you look at the code? It is here (that's the `==` operator you're complaining about):
https://github.com/python/cpython/blob/6afb730e2a8bf0b472b4c3157bcf5b44aa7e6...
The code does already analyze the length of the sequence.
You are right that collections.abc.Sequence (or its ancestors other than object) does not implement `__eq__`, so it would seem that the `==` operator would have to be replaced with a simple loop:
```
for x, y in zip(seq1, seq2):
    if x is not y and x != y:
        break
else:
    return  # They are all equal
```
Making that change would probably slow things down. (Note that the odd check "x is not y and x != y" is needed to keep the previous behavior regarding NaN and other objects that aren't equal to themselves.)
One could also argue that the docstring warns about this issue:
```
For the purposes of this function, a valid ordered sequence type is one
which can be indexed, has a length, and has an equality operator.
```
IOW, I think this ship has actually sailed.
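
As a rough, self-contained sketch of the loop discussed in this message (the helper name `elementwise_equal` is hypothetical and is not part of `unittest`), the identity check can be seen to matter for NaN:

```python
def elementwise_equal(seq1, seq2):
    """Hypothetical helper: compare two sequences without evaluating seq1 == seq2."""
    if len(seq1) != len(seq2):
        return False
    for x, y in zip(seq1, seq2):
        # Identity is checked first, so an object that is unequal to itself
        # (such as NaN) still matches itself, as it does in list equality.
        if x is not y and x != y:
            return False
    return True


nan = float("nan")
same_nan = elementwise_equal([nan], (nan,))                    # same object on both sides
fresh_nan = elementwise_equal([float("nan")], [float("nan")])  # two distinct NaN objects
mixed_types = elementwise_equal([1.0, 2.0, 3.0], (1, 2, 3))    # list vs. tuple
```

Under these assumptions, the shared NaN object compares as equal while two separately created NaNs do not, which mirrors how `[nan] == [nan]` behaves for built-in lists.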
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/6Z43SU2R...
Code of Conduct: http://python.org/psf/codeofconduct/

On 22.12.2020 21:52, Alan G. Isaac wrote:
The following test fails because `seq1 == seq2` returns a (boolean) NumPy array whenever either seq is a NumPy array.
You sure about that? For me, bool(np.array) raises an exception:
In [12]: np.__version__
Out[12]: '1.19.4'

In [11]: if [False, False]==np.array([False, False]): print("foo")
<...>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Here, `seq1 == seq2` produces a boolean array (i.e., an array of boolean values). hth, Alan Isaac

Okay, I see that by "fails", you probably meant "raises this exception" rather than fails the usual way (i.e. raises an AssertionError).

In the light of this and https://github.com/python/cpython/blob/6afb730e2a8bf0b472b4c3157bcf5b44aa7e6... (linked to from https://mail.python.org/archives/list/python-dev@python.org/message/AQRLRVY7... )
I reckon that
*like the other code before it, `seq1 == seq2` should check for (TypeError, NotImplementedError) and fall back to by-element comparison in such a case.*
-- Regards, Ivan

On 22.12.2020 22:59, Ivan Pozdeev via Python-Dev wrote:
In the light of this and https://github.com/python/cpython/blob/6afb730e2a8bf0b472b4c3157bcf5b44aa7e6... (linked to from https://mail.python.org/archives/list/python-dev@python.org/message/AQRLRVY7... )
I reckon that
*like the other code before it, `seq1 == seq2` should check for (TypeError, NotImplementedError)
and fall back to by-element comparison in such a case.*
Or just bail out ("resist the temptation to guess") and tell the user to compare their weird types themselves.
-- Regards, Ivan

This comment completely misses the point. This "weird type" qualifies as a Sequence. (See collections.abc.) Alan Isaac
On 12/22/2020 3:09 PM, Ivan Pozdeev via Python-Dev wrote:
Or just bail out ("resist the temptation to guess") and tell the user to compare their weird types themselves.

On Tue, Dec 22, 2020 at 06:33:41PM -0500, Alan G. Isaac wrote:
This comment completely misses the point. This "weird type" qualifies as a Sequence. (See collections.abc.)
It's not weird because of the sequence abc, it's weird because of its treatment of equality, using the `==` operator as an element-wise operator instead of an object equality boolean operator.
Numpy is entitled to do this, but we're not obligated to take heroic measures to integrate numpy arrays with unittest methods. If we can do so easily, sure, let's fix it.
I think Ivan's suggestion that the assertSequenceEqual method fall back on element-by-element comparisons has some merit.

On Wed, Dec 23, 2020 at 1:06 AM Steven D'Aprano steve@pearwood.info wrote:
We're not obligated to take heroic measures to integrate numpy arrays with unittest methods. If we can do so easily, sure, let's fix it.
I think Ivan's suggestion that the assertSequenceEqual method fall back on element-by-element comparisons has some merit.
If there are other common types this helps with, sure. But for numpy, as pointed out elsewhere in this thread, it would still fail for numpy arrays of > 1 dimension.
Personally I think this is really an issue with the structure of unittest -- having a custom assertion for every possibility is intractable.
If you want to test numpy arrays, use the utilities provided by numpy.
- CHB

On 1/8/2021 2:50 PM, Chris Barker via Python-Dev wrote:
If there are other common types this helps with, sure. But for numpy, as pointed out elsewhere in this thread, it would still fail for numpy arrays of > 1 dimension.
Personally I think this is really an issue with the structure of unittest -- having a custom assertion for every possibility is intractable.
If you want to test numpy arrays, use the utilities provided by numpy.
This comment misses the key point, which is: `assertSequenceEqual` should not rely on behavior that is not ensured for typing.Sequence, but it currently does. The failure on a numpy array simply exposes this problem.
The array-dimension consideration is also a red herring. For example, `unittest.TestCase().assertSequenceEqual([1,2,3],(1,2,3))` passes but `unittest.TestCase().assertSequenceEqual([[1,2,3]],[(1,2,3)])` raises. This behavior remains unchallenged.
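
The two calls in this paragraph can be checked with the standard library alone. The following is only a small sketch; the variable name `nested_passed` is introduced here for illustration.

```python
import unittest

case = unittest.TestCase()

# A list and a tuple with equal elements: list == tuple is False, so the
# method falls through to its element-by-element comparison, which succeeds.
case.assertSequenceEqual([1, 2, 3], (1, 2, 3))

# One level down, the elements themselves are a list and a tuple, and
# [1, 2, 3] != (1, 2, 3), so the assertion fails.
try:
    case.assertSequenceEqual([[1, 2, 3]], [(1, 2, 3)])
except AssertionError:
    nested_passed = False
else:
    nested_passed = True
```

The type-insensitive comparison applies only to the outermost pair of sequences; nested elements are compared with their own `!=`.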
Alan Isaac

On Sat, Jan 09, 2021 at 07:56:24AM -0500, Alan G. Isaac wrote:
This comment misses the key point, which is: `assertSequenceEqual` should not rely on behavior that is not ensured for typing.Sequence, but it currently does. The failure on a numpy array simply exposes this problem.
You are making that as a definitive statement of fact, but it's not clear to me that this is actually true. There are at least two problems with your position:
(1) The Sequence ABC requires only the *presence* of certain methods, not their semantics. We're entitled to assume the obvious, implicit, sequence-like semantics. If a class implements the methods, but provides unexpected semantics, anything could happen.
(2) Equality is a fundamental operation that we are entitled to assume that *all* objects support. See above: we're entitled to assume the standard semantics for equality too. Objects which have unusual semantics for equality, such as float NANs, may behave in unexpected ways.
So I don't think that we are *required* to support unusual sequences like numpy.
On the other hand, I think that we can extend assertSequenceEqual to support numpy arrays quite easily. A quick glance at the source code:
https://github.com/python/cpython/blob/3.9/Lib/unittest/case.py
suggests that all we need do is catch a potential ValueError around the sequence equality test, and fall back on the element by element processing:
    try:
        if seq1 == seq2:
            return
    except ValueError:
        # Possibly a numpy array?
        pass
I don't think that this is a breaking change, and I think it should do what you expect.
I don't believe that we need to accept your reasoning regarding the Sequence ABC to accept this enhancement. One need only accept that although numpy's array equality semantics are non-standard and unhelpful, numpy is an important third-party library, and the cost of supporting sequences like numpy arrays is negligible.
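
To make the proposal concrete, here is a minimal sketch written as a subclass rather than a patch to `unittest` itself. `FallbackTestCase` and `ElementwiseSeq` are hypothetical names for this example; the stand-in class only imitates an element-wise `==` and is not NumPy.

```python
import unittest


class ElementwiseSeq:
    """Hypothetical stand-in for an array type with element-wise ==."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __eq__(self, other):
        return ElementwiseSeq(a == b for a, b in zip(self._items, other))

    def __bool__(self):
        if len(self._items) > 1:
            raise ValueError("truth value of a multi-element sequence is ambiguous")
        return any(self._items)


class FallbackTestCase(unittest.TestCase):
    """Hypothetical subclass; not the actual unittest implementation."""

    def assertSequenceEqual(self, seq1, seq2, msg=None, seq_type=None):
        try:
            equal = bool(seq1 == seq2)
        except ValueError:
            # The result of == could not be reduced to a single bool.
            equal = None
        if equal:
            return
        if equal is False:
            # Ordinary result: defer to the stock method for its diff output.
            return super().assertSequenceEqual(seq1, seq2, msg, seq_type)
        # Fallback: compare lengths, then items one by one.
        if len(seq1) != len(seq2):
            self.fail(msg or f"lengths differ: {len(seq1)} != {len(seq2)}")
        for index, (item1, item2) in enumerate(zip(seq1, seq2)):
            if item1 != item2:
                self.fail(msg or f"first differing element {index}: {item1!r} != {item2!r}")


case = FallbackTestCase()
case.assertSequenceEqual([1.0, 2.0, 3.0], ElementwiseSeq([1.0, 2.0, 3.0]))

try:
    case.assertSequenceEqual([1.0, 2.0, 3.0], ElementwiseSeq([1.0, 2.0, 4.0]))
except AssertionError:
    mismatch_detected = True
else:
    mismatch_detected = False
```

This only helps when the items themselves compare to plain booleans. If the items were again element-wise sequences (the multi-dimensional case raised earlier in the thread), `item1 != item2` would run into the same ambiguity.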

Numpy chose to violate the principle of equality by having __eq__ not return a bool. So a numpy type can't be used reliably outside of the numpy DSL.
-gps

On Tue, Dec 22, 2020 at 6:57 PM Alan G. Isaac alan.isaac@gmail.com wrote:
The following test fails because `seq1 == seq2` returns a (boolean) NumPy array whenever either seq is a NumPy array.
import unittest
import numpy as np
unittest.TestCase().assertSequenceEqual([1.,2.,3.], np.array([1.,2.,3.]))
I expected `unittest` to rely only on features of a `collections.abc.Sequence`, which based on https://docs.python.org/3/glossary.html#term-sequence, I believe are satisfied by a NumPy array.
If you know you might be dealing with NumPy arrays (as the import suggests), I think it's simply right to spell it as:
unittest.TestCase().assertTrue(np.array_equal([1., 2., 3.], np.array([1., 2., 3.])))
Or for pytest etc., simply:
assert np.array_equal([1., 2., 3.], np.array([1., 2., 3.]))

On Tue, 22 Dec 2020 19:32:15 +0000 David Mertz mertz@gnosis.cx wrote:
If you know you might be dealing with NumPy arrays (as the import suggests), I think it's simply right to spell it as:
unittest.TestCase().assertTrue(np.array_equal([1., 2., 3.], np.array([1., 2., 3.])))
Please don't suggest this, it will produce unhelpful error messages (do you like "False is not true" errors in CI builds?).
The better solution is to use the dedicated assertions in the `numpy.testing` package: https://numpy.org/doc/stable/reference/routines.testing.html
Regards
Antoine.
participants (10)
- Alan G. Isaac
- Antoine Pitrou
- Brett Cannon
- Chris Barker
- David Mertz
- Gregory P. Smith
- Guido van Rossum
- Ivan Pozdeev
- Jeff Allen
- Steven D'Aprano