Allow == and != to raise errors
Hey,

the array comparisons == and != never raise errors but instead simply return False for invalid comparisons. The main examples are arrays of non-matching dimensions, and object arrays with invalid element-wise comparisons:

In [1]: np.array([1,2,3]) == np.array([1,2])
Out[1]: False

In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2]
Out[2]: False

This seems wrong to me, and I am sure not just to me. I doubt any large project makes use of such comparisons, and I assume that most would prefer the shape mismatch to raise an error, so I would like to change it. But I am a bit unsure, especially about smaller projects. So to keep the transition a bit safer, I could imagine implementing a FutureWarning for these cases (and that would at least notify new users that what they are doing doesn't seem like the right thing).

So the question is: is such a change safe enough, or is there some good reason for the current behavior that I am missing?

Regards,

Sebastian

(There may be other issues with structured types that would continue returning False, I think, because neither side knows how to compare.)
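The transition sketched above could look roughly like the following. This is a hypothetical wrapper, not NumPy's actual implementation; the function name `eq_with_future_warning` is made up for illustration, and only the shape-mismatch case is handled:

```python
import warnings
import numpy as np

def eq_with_future_warning(a, b):
    # Hypothetical sketch of the proposed transition: keep returning False
    # for non-broadcastable shapes for now, but emit a FutureWarning
    # announcing that this will become an error.
    a, b = np.asarray(a), np.asarray(b)
    try:
        np.broadcast(a, b)  # raises ValueError if the shapes don't broadcast
    except ValueError:
        warnings.warn("elementwise == with non-broadcastable shapes "
                      "will raise an error in the future", FutureWarning)
        return False
    return np.equal(a, b)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mismatched = eq_with_future_warning([1, 2, 3], [1, 2])

print(mismatched)                                # False
print(caught[0].category is FutureWarning)       # True
print(eq_with_future_warning([1, 2], [1, 2]))    # [ True  True]
```

Existing code keeps getting False, while anyone watching warnings sees the announcement before the behaviour actually changes.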
I can see what you are getting at, but I would have to disagree. First of all, when a comparison between two mis-shaped arrays occurs, you get back a bona fide Python boolean, not a numpy array of bools. So if any action taken on the result of such a comparison assumed that the result was some sort of array, it would fail (yes, this does make it a bit difficult to trace back the source of the problem, but not impossible).

Second, no semantics are broken with this. Are the arrays equal or not? If they aren't broadcastable, then returning False for == and True for != makes perfect sense to me. At least, that is my take on it.

Cheers!
Ben Root

On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
I also don't like that idea, but I'm not able to come up with as good a reasoning as Benjamin. I don't see the advantage of this change, and I don't think the reason is good enough to justify breaking the interface. But I don't think we rely on this, so if the change goes in, it probably won't break stuff, or the breakages will be easily seen and repaired.

Fred

On Fri, Jul 12, 2013 at 9:13 AM, Benjamin Root <ben.root@ou.edu> wrote:
I thought Benjamin sounded pretty convincing, and since I never use this, I don't care. However, I (and, I'm pretty convinced, all statsmodels code) use equality comparison only element-wise. Getting a boolean back is an indicator of a bug, which is most of the time easy to trace back. There is an inconsistency in the behavior with the inequalities.
>>> np.array([1,2,3]) < np.array([1,2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape

>>> np.array([1,2,3]) <= np.array([1,2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape

>>> (np.array([1,2,3]) == np.array([1,2])).any()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'bool' object has no attribute 'any'
The last one could be misleading and difficult to catch.
>>> np.any(np.array([1,2,3]) == np.array([1,2]))
False
(numpy 1.5.1, since I'm playing rear guard)

Josef
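The silent failure mode here comes from the fact that the np.any function, unlike the .any() method, happily accepts a plain Python bool. A small sketch of the difference:

```python
import numpy as np

# np.any accepts a bare Python bool, so when `a == b` collapses to the
# scalar False (as it did for mismatched shapes), np.any(a == b) just
# returns False with no hint that nothing was compared element-wise.
print(np.any(False))   # False
print(np.any(True))    # True

# The method call, by contrast, fails loudly on a plain bool:
try:
    False.any()
except AttributeError as err:
    print("AttributeError:", err)
```

That is why the np.any spelling is the one that is misleading and difficult to catch.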
On Fri, 2013-07-12 at 19:29 -0400, josef.pktd@gmail.com wrote:
<snip>
I thought Benjamin sounds pretty convincing, and since I never use this, I don't care.
However, I (and I'm pretty convinced all statsmodels code) uses equality comparison only element wise. Getting a boolean back is an indicator for a bug, which is most of the time easy to trace back.
There is an inconsistency in the behavior with the inequalities.
Well, I guess I tend to think on the purity side of things. The comparisons currently mix container and element-wise comparison up. It seems to me that this can lead to bugs, though I suppose it is unlikely to really hit anyone. One thing that keeping the behaviour means is that the object array comparisons will stay a little buggy (you get False for the whole array when an element comparison gives an error). Though I admit that, for example, arrays inside containers make any equality for the container quirky, since arrays cannot define a truth value. But if there is concern that this really could break code, I won't try to press for it.

- Sebastian
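The point that "arrays cannot define a truth value" can be seen directly, and it is what makes container-style equality awkward once arrays are nested inside tuples, lists, or object arrays:

```python
import numpy as np

# A multi-element bool array refuses to collapse to a single truth value:
try:
    bool(np.array([1, 2]) == np.array([1, 2]))
except ValueError as err:
    print("ValueError:", err)  # "truth value ... is ambiguous"

# To get a single answer, you have to say which reduction you mean:
print((np.array([1, 2]) == np.array([1, 2])).all())  # True
print((np.array([1, 2]) == np.array([2, 2])).any())  # True
```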
Python itself doesn't raise an exception in such cases:

>>> (3,4) != (2, 3, 4)
True
>>> (3,4) == (2, 3, 4)
False

Should numpy behave differently?

Bruno.

2013/7/12 Frédéric Bastien <nouiz@nouiz.org>
On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet <bruno.piguet@gmail.com> wrote:
Python itself doesn't raise an exception in such cases :
>>> (3,4) != (2, 3, 4)
True
>>> (3,4) == (2, 3, 4)
False
Should numpy behave differently ?
The numpy equivalent to Python's scalar "==" is called array_equal, and that does indeed behave the same:

In [5]: np.array_equal([3, 4], [2, 3, 4])
Out[5]: False

But in numpy, the name "==" is shorthand for the ufunc np.equal, which raises an error:

In [8]: np.equal([3, 4], [2, 3, 4])
ValueError: operands could not be broadcast together with shapes (2) (3)

-n
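This split between the container-style helpers and the elementwise ufunc can be laid out side by side; np.array_equiv is the broadcast-aware sibling of np.array_equal:

```python
import numpy as np

# Container-style: one bool, never an error; shape mismatch is just False.
print(np.array_equal([3, 4], [2, 3, 4]))   # False
print(np.array_equal([3, 4], [3, 4]))      # True

# Broadcast-aware container-style comparison:
print(np.array_equiv([1, 1, 1], 1))        # True

# Elementwise ufunc: an array of bools for compatible shapes...
print(np.equal([3, 4], [3, 5]))            # [ True False]

# ...and a ValueError when the shapes cannot be broadcast together.
try:
    np.equal([3, 4], [2, 3, 4])
except ValueError as err:
    print("ValueError:", err)
```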
Just a question: should == behave like a ufunc, or like Python's == for tuples? I think all ndarray comparisons (==, !=, <=, ...) should behave the same. If they don't (as was said), making them consistent is good. What is the minimal change to make them behave the same? As I understand it, your proposal is to change == and != to behave like real ufuncs. But I'm not sure the minimal change is the best one: what will new users expect more, the ufunc or the Python behavior? Anyway, I see the advantage of simplifying the interface to something more consistent.

Also, if we make all comparisons behave like ufuncs, there is array_equal, as said, to get the Python behavior of ==; would it be useful to have equivalent functions for the other comparisons? Do they already exist?

thanks

Fred

On Mon, Jul 15, 2013 at 10:20 AM, Nathaniel Smith <njs@pobox.com> wrote:
2013/7/15 Frédéric Bastien <nouiz@nouiz.org>
Just a question, should == behave like a ufunc or like python == for tuple?
That's what I was also wondering. I see the advantage of consistency for newcomers. I'm not experienced enough to tell whether this is a problem for numerical practitioners. Maybe they wouldn't even imagine that "==" applied to arrays could do anything other than element-wise comparison? "Explicit is better than implicit": to me, np.equal(x, y) is more explicit than "x == y". But "Beautiful is better than ugly". Is np.equal(x, y) ugly?

Bruno.
On Mon, 2013-07-15 at 17:12 +0200, bruno Piguet wrote:
2013/7/15 Frédéric Bastien <nouiz@nouiz.org>

Just a question, should == behave like a ufunc or like python == for tuple?
That's what I was also wondering.
I am not sure I understand the question. Of course == should be (mostly?) identical to np.equal. Things like

arr[arr == 0] = -1

etc. are a common design pattern. Operations on arrays are element-wise by default; "falling back" to the Python tuple/container behaviour is a special case, and I do not see a good reason for it, except possibly backward compatibility.

Personally, I doubt anyone who seriously uses numpy relies on the

np.array([1, 2, 3]) == np.array([1, 2]) -> False

behaviour, and it seems a bit like a trap to me, because suddenly you get:

np.array([1, 2, 3]) == np.array([1]) -> np.array([True, False, False])

(Though in combination with np.all, it can make sense and is then identical to np.array_equiv/np.array_equal.)

- Sebastian
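The design pattern mentioned above, and the np.all equivalence, in a short sketch:

```python
import numpy as np

# Boolean-mask assignment relies on == being elementwise:
arr = np.array([0, 5, 0, 7])
arr[arr == 0] = -1
print(arr)                          # [-1  5 -1  7]

# Reducing an elementwise comparison with np.all recovers the
# container-style answer for same-shape arrays:
a, b = np.array([1, 2, 3]), np.array([1, 2, 3])
print(np.all(a == b))               # True
print(np.array_equal(a, b))         # True
```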
Thank you for your explanations. So, if the operator "==" applied to np.arrays is shorthand for the ufunc np.equal, it should definitely behave exactly like np.equal() and raise an error.

One side question about style: in case you would like to protect an "x == y" test with a try/except clause, wouldn't it feel more "natural" to write "np.equal(x, y)"?

Bruno.

2013/7/15 Nathaniel Smith <njs@pobox.com>
On Mon, 2013-07-15 at 15:09 +0200, bruno Piguet wrote:
Python itself doesn't raise an exception in such cases :
>>> (3,4) != (2, 3, 4)
True
>>> (3,4) == (2, 3, 4)
False
Should numpy behave differently ?
Yes, because Python tests whether the tuple is different, not whether the elements are:
>>> (3, 4) == (3, 4)
True
>>> np.array([3, 4]) == np.array([3, 4])
array([ True,  True], dtype=bool)
So doing the test "like python" *changes* the behaviour. - Sebastian
On Fri, Jul 12, 2013 at 2:13 PM, Benjamin Root <ben.root@ou.edu> wrote:
I can see what you are getting at, but I would have to disagree. First of all, when a comparison between two mis-shaped arrays occurs, you get back a bona fide Python boolean, not a numpy array of bools. So if any action taken on the result of such a comparison assumed that the result was some sort of array, it would fail (yes, this does make it a bit difficult to trace back the source of the problem, but not impossible).

Second, no semantics are broken with this. Are the arrays equal or not? If they aren't broadcastable, then returning False for == and True for != makes perfect sense to me. At least, that is my take on it.
But it does break semantics. Sure, it tells you that the arrays aren't equal -- but that's not the question you asked. "==" is not "are these arrays equal"; it's "is each pair of broadcast-aligned elements in these arrays equal", and these are totally different operations. It's unfortunate that "==" is a somewhat confusing name, but that's no reason to mix things up like this. "+" in Python sometimes means "add all elements" and sometimes means "concatenate", but no-one would argue that ndarray.__add__ should do the former when the arrays were broadcastable and the latter when they weren't. This is the same thing.

"Errors should never pass silently", "In the face of ambiguity, refuse the temptation to guess."

There's really no sensible interface here -- notice that '==' can return False but can never return True, and Josef gave an example of where it can silently produce misleading results. So to me it seems like a clear bug, but one of the sort that has a higher probability than usual that someone somewhere is depending on it... which makes it less clear what exactly to do with it.

I guess one option is to just start raising errors in the first RC and see whether anyone complains! But people don't seem to test the RCs enough to make this entirely reliable :-(.

-n
On Sat, Jul 13, 2013 at 9:14 AM, Nathaniel Smith <njs@pobox.com> wrote:
I'm now +1 on the exception that Sebastian proposed. I like consistency, and having a more straightforward mental model of the numpy behavior for elementwise operations, one that doesn't sometimes pretend to be "python" (when I'm doing array math), like this:
>>> [1,2,3] < [1,2]
False
>>> [1,2,3] > [1,2]
True
Josef
On Sat, 2013-07-13 at 11:28 -0400, josef.pktd@gmail.com wrote:
On Sat, Jul 13, 2013 at 9:14 AM, Nathaniel Smith <njs@pobox.com> wrote: <snip>
I'm now +1 on the exception that Sebastian proposed.
I like consistency, and having a more straightforward mental model of the numpy behavior for elementwise operations, that don't pretend sometimes to be "python" (when I'm doing array math), like this
I am not sure what the result of this discussion is. As far as I can see, Benjamin and Frédéric were opposed and overall it seemed pretty mixed, so unless you two have changed your minds, or say that it was just a small personal preference, I would drop it for now. I obviously think the current behaviour is inconsistent, if not outright buggy, and am really only afraid of possibly breaking code out there. Which is why I think maybe I should first add a FutureWarning if we decide on changing it. Regards, Sebastian
>>> [1,2,3] < [1,2]
False
>>> [1,2,3] > [1,2]
True
Josef
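The FutureWarning transition Sebastian proposes could look roughly like this (a minimal sketch, not NumPy's actual implementation; `broadcastable_eq` and the warning text are hypothetical, and `np.broadcast_shapes` is just a convenient shape-compatibility check):

```python
import warnings
import numpy as np

def broadcastable_eq(a, b):
    """Hypothetical sketch of the proposed transition: warn (instead of
    silently returning the scalar False) when the operands' shapes cannot
    be broadcast together, before eventually turning this into an error."""
    a, b = np.asarray(a), np.asarray(b)
    try:
        np.broadcast_shapes(a.shape, b.shape)
    except ValueError:
        warnings.warn(
            "elementwise == comparison failed and will raise an error "
            "in a future release", FutureWarning)
        return False  # the old behavior, preserved during the transition
    return a == b

print(broadcastable_eq([1, 2, 3], [1, 2, 3]))  # [ True  True  True]
```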
I'm mixed, because I see the value, but I'm not able to guess the consequences of the interface change.

So doing your FutureWarning would allow us to gather some data about this, and if it seems to cause too many problems, we could cancel the change.

Also, in the case that a few pieces of software depend on the old behaviour, this will cause a crash (except if they have a catch-all Exception clause), not a bad result.

I think it is always hard to predict the consequences of an interface change in NumPy. To help measure it, we could ask people to contribute to a collection of software that uses NumPy and has good test suites. We could test interface changes by running their test suites to get a guess of the impact of those changes. What do you think of that? I think it was already discussed on the mailing list, but not acted upon.

Fred

On Tue, Jul 23, 2013 at 10:29 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
<snip>
On Tue, Jul 23, 2013 at 4:10 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
I'm mixed, because I see the value, but I'm not able to guess the consequences of the interface change.
So doing your FutureWarning would allow us to gather some data about this, and if it seems to cause too many problems, we could cancel the change.
Also, in the case that a few pieces of software depend on the old behaviour, this will cause a crash (except if they have a catch-all Exception clause), not a bad result.
I think we have to be willing to fix bugs, even if we can't be sure what all the consequences are. Carefully of course, and with due consideration to possible compatibility consequences, but if we rejected every change that might have unforeseen effects then we'd have to stop accepting changes altogether. (And anyway the show-stopper regressions that make it into releases always seem to be the ones we didn't anticipate at all, so I doubt that being 50% more careful with obscure corner cases like this will have any measurable impact in our overall release-to-release compatibility.) So I'd consider Fred's comments above to be a vote for the change, in practice...
I think it is always hard to predict the consequences of an interface change in NumPy. To help measure it, we could ask people to contribute to a collection of software that uses NumPy and has good test suites. We could test interface changes by running their test suites to get a guess of the impact of those changes. What do you think of that? I think it was already discussed on the mailing list, but not acted upon.
Yeah, if we want to be careful then it never hurts to run other projects' test suites to flush out bugs :-). We don't do this systematically right now. Maybe we should stick some precompiled copies of scipy and other core numpy dependants up on a host somewhere and then pull them down and run their test suites as part of the Travis tests? We have maybe 10 minutes of CPU budget for tests still. -n
On Thu, Jul 25, 2013 at 7:48 AM, Nathaniel Smith <njs@pobox.com> wrote:
<snip>
Theano's tests will take too long, and I'm not sure that travis-ci is the right place for this. Doing it for each version of a PR would be too long for travis and would limit which projects we can test against.

What about making a vagrant VM that updates/installs the development version of NumPy, then reinstalls some predetermined versions of other projects and runs their tests? I started playing with vagrant VMs to help test different OS configurations for Theano. I haven't finished this, but it seems to do the job well. People just cd into a directory, run "vagrant up", and then everything is automatic; they just wait and read the output.

Other ideas? I know some other projects use jenkins. Would this be a better idea?

Fred
participants (6)

- Benjamin Root
- Bruno Piguet
- Frédéric Bastien
- josef.pktd@gmail.com
- Nathaniel Smith
- Sebastian Berg