Disallow ordering comparison to NaN

Another spin-off from the "[Python-Dev] PyObject_RichCompareBool identity shortcut" thread:

On Thu, Apr 28, 2011 at 10:01 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I think I would like to see a demonstration of this rather than just take your word for it.
One demonstration would be

    def bubble_sort(xs):
        while True:
            changed = False
            for i in range(len(xs) - 1):
                if not (xs[i] < xs[i + 1]):
                    changed = True
                    xs[i], xs[i + 1] = xs[i + 1], xs[i]
            if not changed:
                break

    bubble_sort([float('nan')] * 2)

Mike Graham wrote:
Thank you. Nevertheless, that does appear to be an easy fix:

    def bubble_sort(xs):
        while True:
            changed = False
            for i in range(len(xs) - 1):
                # don't use `not (xs[i] < xs[i + 1])` as that fails in the
                # presence of NANs
                if xs[i] >= xs[i + 1]:
                    changed = True
                    xs[i], xs[i + 1] = xs[i + 1], xs[i]
            if not changed:
                break

-- Steven
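To make the difference between the two tests concrete, here is a minimal sketch that caps the number of passes so the non-terminating case can be observed safely. The `max_passes` parameter and the helper names `negated_le` and `plain_gt` are assumptions added for illustration. The sketch also uses `<=` and `>` rather than the `<` and `>=` in the messages above, so that ties between ordinary equal values do not themselves cause repeated swaps.

```python
def bubble_sort(xs, out_of_order, max_passes=1000):
    """Bubble-sort xs in place; return False if max_passes was not enough."""
    for _ in range(max_passes):
        changed = False
        for i in range(len(xs) - 1):
            if out_of_order(xs[i], xs[i + 1]):
                changed = True
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
        if not changed:
            return True
    return False


def negated_le(a, b):
    # "not (a <= b)" is True when either operand is NaN,
    # so a pair of NaNs is swapped on every pass.
    return not (a <= b)


def plain_gt(a, b):
    # "a > b" is False when either operand is NaN,
    # so a pair of NaNs is never swapped.
    return a > b


nan = float('nan')
never_settles = bubble_sort([nan, nan], negated_le)  # gives up after max_passes
settles = bubble_sort([nan, nan], plain_gt)          # finishes on the first pass
```

Without the pass cap, the first call would loop forever, which is the failure mode being demonstrated in the thread.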

On Fri, Apr 29, 2011 at 3:30 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
There are different shades of "not working". In most cases, raising an exception is preferable to silently producing garbage or entering an infinite loop. NaNs are unordered and NaN < 0 makes as much sense as None < 0 or "abc" < 0. The latter operations raise an exception in py3k.

Alexander Belopolsky writes:
On Fri, Apr 29, 2011 at 3:30 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Sure, Python's behavior when asked to perform mathematical operations that do not admit a usable definition can be improved, to the benefit of people who write robust, high performance code. I appreciate your contribution to that discussion greatly. But I really doubt that raising here is going to save anybody's eggshells. The cure you suggest might be better than silent garbage or an infinite loop, but in production code you will still have to think carefully about preventing or handling the exception. Not to mention finding a way to produce NaNs in the first place. That's far from what I would call "perfectly ordinary code working as is".

On Thu, Apr 28, 2011 at 12:00 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Note this actually isn't an improvement--it merely takes a noticeable error and turns it into a data-polluter. (Sorting a sequence containing NaNs is obviously not a valid operation, which is the argument for OP's suggestion.) MG
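The "data-polluter" point can be illustrated with a small sketch. It assumes a bubble sort that swaps only on a strict `>` test (the function names here are hypothetical). The sort terminates, but a NaN in the middle acts as a barrier that ordinary values never cross, so the finite values end up out of order with no error reported.

```python
def bubble_sort_gt(values):
    """Return a bubble-sorted copy, swapping only when left > right."""
    xs = list(values)
    while True:
        changed = False
        for i in range(len(xs) - 1):
            if xs[i] > xs[i + 1]:
                changed = True
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
        if not changed:
            return xs


def is_ascending(xs):
    """True if every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


nan = float('nan')
result = bubble_sort_gt([3.0, nan, 1.0])

# Drop the NaN (the only value not equal to itself) and inspect the rest.
finite_part = [x for x in result if x == x]
looks_sorted = is_ascending(finite_part)
```

Here `finite_part` still has 3.0 ahead of 1.0, so the "sorted" output is silently wrong.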

I posted a patch implementing this proposal on the tracker: http://bugs.python.org/issue11949

Interestingly, the only substantive change that was needed to pass the test suite revealed a bug in the test logic. The tests in cmath_testcases.txt include testing for -0.0 results, but the processing in test_math.py ignores the difference between 0.0 and -0.0. For example, test_math will still pass if you make the following change:

--- a/Lib/test/cmath_testcases.txt
+++ b/Lib/test/cmath_testcases.txt
@@ -405,7 +405,7 @@
 -- zeros
 asin0000 asin 0.0 0.0 -> 0.0 0.0
 asin0001 asin 0.0 -0.0 -> 0.0 -0.0
-asin0002 asin -0.0 0.0 -> -0.0 0.0
+asin0002 asin -0.0 0.0 -> 0.0 0.0
 asin0003 asin -0.0 -0.0 -> -0.0 -0.0

On Thu, Apr 28, 2011 at 12:25 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I posted a patch implementing this proposal on the tracker:
Interesting indeed! I'd like to hear from the numpy folks about this. But isn't a similar change needed for Decimal? -- --Guido van Rossum (python.org/~guido)

On Thu, Apr 28, 2011 at 6:10 PM, Guido van Rossum <guido@python.org> wrote: ..
But isn't a similar change needed for Decimal?
I did not look into this, but decimal contexts allow for more compliant implementations because you can trap FP exceptions differently in different contexts. We don't have this luxury with floats.

On 4/28/11 5:10 PM, Guido van Rossum wrote:
I'm personally -1, though mostly on general conservative principles. I'm sure there is some piece of code that will break, but I don't know how significant it would be. I'm not sure that it solves a significant problem. I've never actually heard of anyone running into an infinite cycle due to NaNs, though a bit of Googling does suggest that it happens sometimes. I don't think it really moves us closer to IEEE-754 compliance. The standard states (section 7. "Exceptions") "The default response to an exception shall be to proceed without a trap." Python only intermittently turns INVALID operations into exceptions, mostly just (-1.0)**0.5 and integer conversion (0/0.0 and x%0.0 could be considered covered under the division by zero signal that *is* consistently turned into a Python exception). inf-inf, inf/inf, 0*inf, and inf%2.0, to give other examples of INVALID-signaling operations from the spec, all return a NaN without an exception. Given that we want to avoid exposing SIGFPE handlers for safety reasons, I think the status quo is a reasonable compromise interpretation of the spec. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Thu, 28 Apr 2011 20:38:13 -0500 Robert Kern <robert.kern@gmail.com> wrote:
Same as Robert. This does not seem very useful and may break existing code. It also opens the door for attacks against code which takes floats as input strings and parses them using the float() constructor. An attacker can pass "nan", which will be converted successfully and can later raise an exception at an arbitrary point. Applications will have to actively protect against this, which is an unnecessary nuisance. Regards Antoine.
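The kind of active protection described here might look like the following sketch, which rejects non-finite values at the parsing boundary. The function names `parse_finite_float` and `accepts` are hypothetical, chosen only for this illustration.

```python
import math


def parse_finite_float(text):
    """Parse text as a float, rejecting NaN and infinities up front."""
    value = float(text)  # float() itself accepts "nan", "inf" and "-inf"
    if not math.isfinite(value):
        raise ValueError("non-finite value not allowed: %r" % (text,))
    return value


def accepts(text):
    """Return True if text parses to a finite float."""
    try:
        parse_finite_float(text)
    except ValueError:
        return False
    return True
```

With a check like this, a "nan" supplied by an attacker is refused where it enters the program rather than surfacing later in some unrelated comparison.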

On Fri, Apr 29, 2011 at 3:00 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: ..
That's what I thought and contrary to what Robert said early in the thread. By default, decimal operations trap InvalidOperation, DivisionByZero, and Overflow: traps=[InvalidOperation, DivisionByZero, Overflow]) The advantage that decimal has over float is that the user can control what is trapped:
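The example that originally followed is not preserved here. As a minimal sketch of the idea, the decimal module lets a local context turn the InvalidOperation trap on or off. The function name `inf_times_zero` is an assumption for illustration, and Infinity * 0 is just one convenient invalid operation.

```python
from decimal import Decimal, InvalidOperation, localcontext


def inf_times_zero(trap_invalid):
    """Compute Infinity * 0 in a local context and report what happened."""
    with localcontext() as ctx:
        ctx.clear_flags()
        ctx.traps[InvalidOperation] = trap_invalid
        try:
            result = Decimal('Infinity') * Decimal(0)
        except InvalidOperation:
            return 'raised InvalidOperation'
        # Not trapped: a NaN is delivered and the status flag is set instead.
        return (result.is_nan(), bool(ctx.flags[InvalidOperation]))


trapped = inf_times_zero(True)
untrapped = inf_times_zero(False)
```

The untrapped case is close to the IEEE default handling: a result is delivered, execution continues, and a status flag records the signal.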

On 4/29/11 11:14 AM, Alexander Belopolsky wrote:
I have said nothing about decimal. I can requote the relevant portions of the IEEE-754 standard again, if you like. -- Robert Kern

On Fri, Apr 29, 2011 at 12:28 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
I have said nothing about decimal. I can requote the relevant portions of the IEEE-754 standard again, if you like.
Please do. I had a draft of IEEE-754 standard somewhere at some point, but not anymore. I rely on Kahan's notes at http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps .

On 4/29/11 11:39 AM, Alexander Belopolsky wrote:
(Section 7. "Exceptions") "The default response to an exception shall be to proceed without a trap." IEEE-854 has the same sentence. -- Robert Kern

On Fri, Apr 29, 2011 at 4:23 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
(Section 7. "Exceptions") "The default response to an exception shall be to proceed without a trap."
I cannot find this phrase in my copy of IEEE Std 754-2008. Instead, I see the following in section 7.1: "This clause also specifies default non-stop exception handling for exception signals, which is to deliver a default result, continue execution, and raise the corresponding status flag." As I mentioned before, Python does not have a mechanism that would allow it to simultaneously raise an exception and deliver the result. We have to choose one or the other. I think the choice made in the decimal module is a reasonable one: trap Overflow, DivisionByZero, and InvalidOperation while ignoring Underflow and Inexact. The choices made for float operations are more ad-hoc: DivisionByZero is always trapped:
Overflow is trapped in some cases:
and ignored in others:
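The interactive examples for these three statements are not preserved here. The following sketch shows one expression of each kind as CPython 3 behaves. The specific expressions and the `outcome` helper are assumptions chosen for illustration, not necessarily the ones in the original message.

```python
def outcome(thunk):
    """Return thunk()'s value, or the name of the exception it raises."""
    try:
        return thunk()
    except ArithmeticError as exc:
        return type(exc).__name__


division = outcome(lambda: 1.0 / 0.0)     # division by zero: trapped
power = outcome(lambda: 1e300 ** 2)       # overflow in **: trapped
product = outcome(lambda: 1e300 * 1e300)  # overflow in *: ignored, gives inf
```

So the same IEEE overflow condition raises OverflowError through `**` but quietly yields infinity through `*`.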
InvalidOperation is not handled consistently. Let me copy the relevant section of the standard and show Python's behavior for each case where the InvalidOperation exception is required by the standard:

"""
The invalid operation exception is signaled if and only if there is no usefully definable result. In these cases the operands are invalid for the operation to be performed.

For operations producing results in floating-point format, the default result of an operation that signals the invalid operation exception shall be a quiet NaN that should provide some diagnostic information (see 6.2). These operations are:

a) any general-computational or signaling-computational operation on a signaling NaN (see 6.2), except for some conversions (see 5.12)
"""

Python does not have support for sNaNs. It is possible to produce a float carrying an sNaN using struct.unpack, but the result behaves as a qNaN. InvalidOperation not trapped.

"""
b) multiplication: multiplication(0, ∞) or multiplication(∞, 0)
"""

>>> 0.0 * float('inf')
nan

InvalidOperation not trapped.

"""
c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet NaN; if c is a quiet NaN then it is implementation defined whether the invalid operation exception is signaled
"""

Not applicable. Python does not have a fusedMultiplyAdd (x * y + z) function.

"""
d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as: addition(+∞, −∞)
"""

>>> float('inf') + float('-inf')
nan

InvalidOperation not trapped.

"""
e) division: division(0, 0) or division(∞, ∞)
"""

InvalidOperation trapped, but misreported as DivisionByZero.

>>> float('inf') / float('inf')
nan

"""
f) remainder: remainder(x, y), when y is zero or x is infinite and neither is NaN
"""

InvalidOperation trapped, but misreported as DivisionByZero.

>>> float('inf') % 2.0
nan

InvalidOperation not trapped.

"""
g) squareRoot if the operand is less than zero
"""

InvalidOperation trapped.

"""
h) quantize when the result does not fit in the destination format or when one operand is finite and the other is infinite
"""

Not applicable.

"""
For operations producing no result in floating-point format, the operations that signal the invalid operation exception are:

i) conversion of a floating-point number to an integer format, when the source is NaN, infinity, or a value that would convert to an integer outside the range of the result format under the applicable rounding attribute.
"""

InvalidOperation trapped.

InvalidOperation trapped, but misclassified as OverflowError.

"""
j) comparison by way of unordered-signaling predicates listed in Table 5.2, when the operands are unordered
"""

This is the subject of my proposal.

>>> float('nan') < 0.0
False

InvalidOperation not trapped.

"""
k) logB(NaN), logB(∞), or logB(0) when logBFormat is an integer format (see 5.3.3).
"""

Not applicable.

Overall, it appears that in cases where InvalidOperation was anticipated, it was converted to some type of exception in Python. Exceptions to this rule seem to be an accident of implementation.
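For readers who want to check the survey above against their own interpreter, here is a small sketch that runs one expression per applicable clause and records whether it returned a value or raised. Several of the original examples are not preserved in the text, so the expressions for the trapped cases (for instance `math.sqrt(-1.0)` for squareRoot, and `int(nan)` and `int(inf)` for integer conversion) are assumptions chosen to match the stated verdicts.

```python
import math

inf = float('inf')
nan = float('nan')


def outcome(thunk):
    """Return repr of thunk()'s value, or the name of the exception raised."""
    try:
        return repr(thunk())
    except (ArithmeticError, ValueError) as exc:
        return type(exc).__name__


cases = {
    "b) 0.0 * inf": lambda: 0.0 * inf,
    "d) inf + -inf": lambda: inf + -inf,
    "e) 0.0 / 0.0": lambda: 0.0 / 0.0,
    "e) inf / inf": lambda: inf / inf,
    "f) 1.0 % 0.0": lambda: 1.0 % 0.0,
    "f) inf % 2.0": lambda: inf % 2.0,
    "g) math.sqrt(-1.0)": lambda: math.sqrt(-1.0),
    "i) int(nan)": lambda: int(nan),
    "i) int(inf)": lambda: int(inf),
    "j) nan < 0.0": lambda: nan < 0.0,
}

results = {label: outcome(thunk) for label, thunk in cases.items()}
for label, result in results.items():
    print(f"{label:20} -> {result}")
```

A result of `nan` or `False` means the invalid operation passed silently, and an exception name means it was trapped.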

On 4/29/11 4:30 PM, Alexander Belopolsky wrote:
Ah. I have the 1985 version.
Well, for comparisons at least, it seems to have been anticipated, and returning a value was intentional. From the comments documenting float_richcompare() in floatobject.c:

    /* Comparison is pretty much a nightmare. When comparing float to float,
     * we do it as straightforwardly (and long-windedly) as conceivable, so
     * that, e.g., Python x == y delivers the same result as the platform
     * C x == y when x and/or y is a NaN.
     ...

I'm not sure there's any evidence that the other behaviors have not been anticipated or are accidents of implementation. The ambiguous inf operations are documented and doctested in Lib/test/ieee754.txt.

-- Robert Kern

On Fri, Apr 29, 2011 at 6:11 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
Well, I may be overly pedantic, but this comment only mentions that the author considered == comparison and not ordering of NaNs. I guess we need to ask Tim Peters if he considered the fact that x < NaN is an invalid operation according to IEEE 754 while x == NaN is not. Tim?

On Fri, Apr 29, 2011 at 5:30 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote: ..
I made this argument several times and it went unchallenged, but I now realize that Python does have a mechanism that would allow it to simultaneously raise an exception and deliver the result. This is what warnings do. Since changing NaN < 0 to raise an error would have to be done by issuing a deprecation warning first, why can't we just issue an appropriate warning on invalid operations? Isn't this what numpy does in some cases?
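A rough sketch of the warning idea, using only the standard `warnings` module: the comparison still delivers its result, a warning records the invalid operation, and a caller who wants trapping can escalate the warning to an exception. The helper `warn_lt` and the choice of `RuntimeWarning` are assumptions made for illustration and are not part of any patch discussed here.

```python
import math
import warnings


def warn_lt(a, b):
    """Hypothetical helper: a < b, warning when either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        warnings.warn("ordering comparison involving NaN",
                      RuntimeWarning, stacklevel=2)
    return a < b


nan = float('nan')

# Warning recorded and result delivered, both at once.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = warn_lt(nan, 0.0)
warned = [w.category for w in caught]

# The same call, with the warning escalated to an exception by the caller.
with warnings.catch_warnings():
    warnings.simplefilter("error", RuntimeWarning)
    try:
        warn_lt(nan, 0.0)
        escalated = False
    except RuntimeWarning:
        escalated = True
```

The warnings filter plays the role of a per-caller trap setting: ignore, report, or raise.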

On 4/29/11 5:50 PM, Alexander Belopolsky wrote:
We have a configurable mechanism that lets you change between ignoring, warning, and raising an exception (and a few others).

    [~]
    |1> with np.errstate(invalid='raise'):
    ..>     np.array([np.inf]) / np.array([np.inf])
    ..>
    ---------------------------------------------------------------------------
    FloatingPointError                        Traceback (most recent call last)
    /Users/rkern/<ipython-input-1-d0b8f36f6dea> in <module>()
          1 with np.errstate(invalid='raise'):
    ----> 2     np.array([np.inf]) / np.array([np.inf])
          3
    FloatingPointError: invalid value encountered in divide

    [~]
    |2> with np.errstate(invalid='ignore'):
    ..>     np.array([np.inf]) / np.array([np.inf])
    ..>

    [~]
    |3> with np.errstate(invalid='warn'):
    ..>     np.array([np.inf]) / np.array([np.inf])
    ..>
    /Library/Frameworks/Python.framework/Versions/Current/bin/ipython:2: RuntimeWarning: invalid value encountered in divide

I think I could support issuing a warning. Beats the hell out of arguing over fine details of ancient standards intended for low-level languages and hardware. :-)

-- Robert Kern

On Fri, Apr 29, 2011 at 10:30 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Roughly, the current situation is that math module operations try to consistently follow IEEE 754 exceptions: an IEEE 754 overflow is converted to an OverflowError, while invalid-operation or divide-by-zero signals produce a Python ValueError. Basic arithmetic is another story: ** behaves more-or-less like the math module operations, but the arithmetic operations mainly produce nans or infinities, except that division by zero is trapped. IMO, the ideal (ignoring backwards compatibility) would be to have OverflowError / ZeroDivisionError / ValueError produced wherever IEEE754 says that overflow / divide-by-zero / invalid-operation should be signaled. Mark
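The mapping described for the math module can be checked with a short sketch. The three `math` calls are examples picked for illustration (one per IEEE signal), and the last line contrasts them with a plain arithmetic operator.

```python
import math


def outcome(thunk):
    """Return thunk()'s value, or the name of the exception it raises."""
    try:
        return thunk()
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__


overflow = outcome(lambda: math.exp(1000.0))   # IEEE overflow
invalid = outcome(lambda: math.sqrt(-1.0))     # IEEE invalid-operation
div_by_zero = outcome(lambda: math.log(0.0))   # IEEE divide-by-zero

# Basic arithmetic on the same kind of input returns a NaN instead of raising.
arithmetic = outcome(lambda: float('inf') - float('inf'))
```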

On Thu, Apr 28, 2011 at 10:01 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Hmm, a quick search of the tracker yielded issue7915 which demonstrates how presence of nans causes sort to leave a list unsorted. Using binary search on unsorted data leads to nonsensical results, but I don't seem to be able to produce infinite loops with python's bisect. Maybe I saw nan-caused infinite loops in some other libraries. I learned long ago to rid my data of NaNs before doing any type of comparison, so my recollection of the associated problems is admittedly vague. I'll try to come up with something, though. http://bugs.python.org/issue7915
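The habit of ridding data of NaNs before any comparison can be sketched as below. The function name `sorted_without_nans` is hypothetical, and dropping NaNs is only one possible policy.

```python
import bisect
import math


def sorted_without_nans(values):
    """Drop NaNs, then sort; the result is safe to binary-search."""
    return sorted(v for v in values if not math.isnan(v))


data = [3.0, float('nan'), 1.0, 2.0]
clean = sorted_without_nans(data)
position = bisect.bisect_left(clean, 2.0)  # index of 2.0 in the cleaned list
```

Once the NaNs are gone the remaining floats are totally ordered, so both `sorted` and `bisect` behave as expected.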

On Thu, Apr 28, 2011 at 4:52 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I'm -0 on this -- I really favor having NaNs behave like NaNs. Obviously this is a weird fit for Python, but so what? Python does its best never to give you NaNs. If you've done something to get a NaN it's because of a library bug or because you really wanted the NaNs and should know what you're doing.

On 4/28/11 10:02 AM, Alexander Belopolsky wrote:
Not quite, IIRC. I don't have it in front of me, but I do recall that it specifies how it behaves in two different situations:

1. Where you have a comparison function that returns the relationship between the two operands, IEEE-754 specifies that in addition to GT, LT, and EQ, you ought to include "unordered" to use when a NaN is involved.

2. Where you have comparison operators like <, ==, etc. that return bools, NaNs will return False for all comparisons.

They may specify whether or not FPE signals should be issued, I don't recall, but I suspect that if they are quiet NaNs, they won't issue a SIGFPE. Higher-level exceptions were not contemplated by IEEE-754, IIRC.

Python uses the < operator for sorting, not a comparison function, so its current behavior is perfectly in line with the IEEE-754 spec.

-- Robert Kern
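The first situation, a comparison function with a fourth "unordered" outcome, might be sketched like this. The function name `compare` and the string labels are assumptions for illustration, since Python has no such built-in for floats.

```python
import math


def compare(a, b):
    """Hypothetical four-way comparison: 'LT', 'EQ', 'GT' or 'UNORDERED'."""
    if math.isnan(a) or math.isnan(b):
        return 'UNORDERED'
    if a < b:
        return 'LT'
    if a > b:
        return 'GT'
    return 'EQ'


nan = float('nan')
relation = compare(nan, 0.0)

# The second situation: boolean operators, where these five are all False.
operator_results = [nan < 0.0, nan <= 0.0, nan == 0.0, nan >= 0.0, nan > 0.0]
```

A caller of `compare` can tell "unordered" apart from "not less than", which the boolean operators cannot express.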

On Thu, Apr 28, 2011 at 12:46 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
Python uses the < operator for sorting, not a comparison function, so its current behavior is perfectly in line with the IEEE-754 spec.
No, it is not. As I explained in the previous post, IEEE-754 prescribes different behavior for <, >, <=, and >= operations and != and ==. The former signal the INVALID exception while the latter don't. Python does not make this distinction.

On 4/28/11 12:00 PM, Alexander Belopolsky wrote:
But it also states that such signals should *not* trap by default. The only thing I can really fault Python for, compliance-wise, is that it will hide the FPE from being handled in user code because of the PyFPE_START_PROTECT/PyFPE_END_PROTECT macros that surround the actual C operation. The only way to get the FPE to handle it is to build and install fpectl, which is officially discouraged. -- Robert Kern

On Thu, Apr 28, 2011 at 4:52 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote: ..
Furthermore, IEEE 754 specifies exactly what I propose:

"""
IEEE 754 assigns values to all relational expressions involving NaN. In the syntax of C, the predicate x != y is True but all others, x < y, x <= y, x == y, x >= y and x > y, are False whenever x or y or both are NaN, and then all but x != y and x == y are INVALID operations too and must so signal.
"""
-- Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic by Prof. W. Kahan
http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps

The problem with faithfully implementing IEEE 754 in Python is that exceptions in the IEEE standard don't have the same meaning as in Python. IEEE 754 requires that a value is computed even when the operation signals an exception. The program can then decide whether to terminate computation or propagate the value. In Python, we have to choose between raising an exception and returning the value. We cannot have both.

It appears that in most cases the IEEE 754 "INVALID" exception is treated as a terminating exception by Python, and operations that signal INVALID in IEEE 754 raise an exception in Python. Therefore making <, >, etc. raise on NaN while keeping the status quo for != and == would bring Python floats closer to compliance with IEEE 754.

On 4/28/11 11:12 AM, Alexander Belopolsky wrote:
This is not true. In fact, most cases that issue an INVALID exception pass silently in Python. See my response to Guido elsewhere in this thread for a nearly complete list. -- Robert Kern

On Thu, Apr 28, 2011 at 5:12 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Note that this text refers to the obsolete IEEE 754-1985, not the current version of the standard. IEEE 754 isn't really much help here: the current version of the standard specifies (in section 5.11: Details of comparison predicates) *twenty-two* distinct comparison predicates. That includes, for example: 'compareSignalingGreater' which is a greater-than comparison that signals an invalid operation exception on a comparison involving NaNs. But it also includes: 'compareQuietGreater' which returns False for comparisons involving NaNs. And IEEE 754 has nothing to say about how the specified operations should be mapped to language constructs---that's out of scope for the specification. (It does happen to list plain '>' as one of the names for 'compareSignalingGreater', but I don't think it's realistic to try to read anything into that.) I'm -0 on the proposal: I don't think there's enough of a real problem here to justify the change. Mark
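The contrast between the two predicates named above can be sketched in Python. The function names simply respell the standard's predicate names, and raising `ValueError` to stand in for a trapped invalid-operation signal is an assumption made for illustration (under the standard's default handling, the signaling predicate would return False and set a status flag instead).

```python
import math


def compare_quiet_greater(x, y):
    """Quiet predicate: False for NaN operands, no signal."""
    return x > y  # Python's > on floats already behaves this way


def compare_signaling_greater(x, y):
    """Signaling predicate, with the signal modeled as a raised exception."""
    if math.isnan(x) or math.isnan(y):
        raise ValueError("invalid operation: ordering comparison with NaN")
    return x > y


def signals(x, y):
    """Return True if the signaling predicate raises for these operands."""
    try:
        compare_signaling_greater(x, y)
    except ValueError:
        return True
    return False


nan = float('nan')
quiet_result = compare_quiet_greater(nan, 0.0)
signaled = signals(nan, 0.0)
```

Which of the two a language binds to its `>` operator is the mapping question that the standard leaves open.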

On Thu, Apr 28, 2011 at 10:01 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I think I would like to see a demonstration of this rather than just take your word for it.
One demonstration would be def bubble_sort(xs): while True: changed = False for i in range(len(xs) - 1): if not (xs[i] < xs[i + 1]): changed = True xs[i], xs[i + 1] = xs[i + 1], xs[i] if not changed: break bubble_sort([float('nan)'] * 2)

Mike Graham wrote:
Thank you. Nevertheless, that does appear to be an easy fix: def bubble_sort(xs): while True: changed = False for i in range(len(xs) - 1): # don't use `not (xs[i] < xs[i + 1])` as that fails in the # presence of NANs if xs[i] >= xs[i + 1]: changed = True xs[i], xs[i + 1] = xs[i + 1], xs[i] if not changed: break -- Steven

On Fri, Apr 29, 2011 at 3:30 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
There are different shades of "not working". In most cases, raising an exception is preferable to silently producing garbage or entering an infinite loop. NaNs are unordered and NaN < 0 makes as much sense as None < 0 or "abc" < 0. The later operations raise an exception in py3k.

Alexander Belopolsky writes:
On Fri, Apr 29, 2011 at 3:30 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Sure, Python's behavior when asked to perform mathematical operations that do not admit a usable definition can be improved, to the benefit of people who write robust, high performance code. I appreciate your contribution to that discussion greatly. But I really doubt that raising here is going to save anybody's eggshells. The cure you suggest might be better than silent garbage or an infinite loop, but in production code you will still have to think carefully about preventing or handling the exception. Not to mention finding a way to produce NaNs in the first place. That's far from what I would call "perfectly ordinary code working as is".

On Thu, Apr 28, 2011 at 12:00 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Note this actually isn't an improvement--it merely takes a noticeable error and turns it into a data-polluter. (Sorting a sequence containing NaNs is obviously not a valid operation, which is the argument for OP's suggestion.) MG

I posted a patch implementing this proposal on the tracker: http://bugs.python.org/issue11949 Interestingly, the only substantive change that was needed to pass the test suit revealed a bug in the test logic. The tests in cmath_testcases.txt include testing for -0.0 results, but the processing in test_math.py ignores the difference between 0.0 and -0.0. For example, test_math will still pass if you make the following change: --- a/Lib/test/cmath_testcases.txt +++ b/Lib/test/cmath_testcases.txt @@ -405,7 +405,7 @@ -- zeros asin0000 asin 0.0 0.0 -> 0.0 0.0 asin0001 asin 0.0 -0.0 -> 0.0 -0.0 -asin0002 asin -0.0 0.0 -> -0.0 0.0 +asin0002 asin -0.0 0.0 -> 0.0 0.0 asin0003 asin -0.0 -0.0 -> -0.0 -0.0

On Thu, Apr 28, 2011 at 12:25 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I posted a patch implementing this proposal on the tracker:
Interesting indeed! I'd like to hear from the numpy folks about this. But isn't a similar change needed for Decimal? -- --Guido van Rossum (python.org/~guido)

On Thu, Apr 28, 2011 at 6:10 PM, Guido van Rossum <guido@python.org> wrote: ..
But isn't a similar change needed for Decimal?
I did not look into this, but decimal contexts allow for more compliant implementations because you can trap FP exceptions differently in different contexts. We don't have this luxury with floats.

On 4/28/11 5:10 PM, Guido van Rossum wrote:
I'm personally -1, though mostly on general conservative principles. I'm sure there is some piece of code that will break, but I don't know how significant it would be. I'm not sure that it solves a significant problem. I've never actually heard of anyone running into an infinite cycle due to NaNs, though a bit of Googling does suggest that it happens sometimes. I don't think it really moves us closer to IEEE-754 compliance. The standard states (section 7. "Exceptions") "The default response to an exception shall be to proceed without a trap." Python only intermittently turns INVALID operations into exceptions, mostly just (-1.0)**0.5 and integer conversion (0/0.0 and x%0.0 could be considered covered under the division by zero signal that *is* consistently turned into a Python exception). inf-inf, inf/inf, 0*inf, and inf%2.0, to give other examples of INVALID-signaling operations from the spec, all return a NaN without an exception. Given that we want to avoid exposing SIGFPE handlers for safety reasons, I think the status quo is a reasonable compromise interpretation of the spec. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Thu, 28 Apr 2011 20:38:13 -0500 Robert Kern <robert.kern@gmail.com> wrote:
Same as Robert. This does not seem very useful and may break existing code. It also opens the door for attacks against code which takes floats as input strings and parses them using the float() constructor. An attacker can pass "nan", which will be converted successfully and can later raise an exception at an arbitrary point. Applications will have to actively protect against this, which is an unnecessary nuisance. Regards Antoine.

On Fri, Apr 29, 2011 at 3:00 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: ..
That's what I thought and contrary to what Robert said early in the thread. By default, decimal operations trap InvalidOperation, DivisionByZero, and Overflow: traps=[InvalidOperation, DivisionByZero, Overflow]) The advantage that decimal has over float is that user can control what is trapped:

On 4/29/11 11:14 AM, Alexander Belopolsky wrote:
I have said nothing about decimal. I can requote the relevant portions of the IEEE-754 standard again, if you like. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Fri, Apr 29, 2011 at 12:28 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
I have said nothing about decimal. I can requote the relevant portions of the IEEE-754 standard again, if you like.
Please do. I had a draft of IEEE-754 standard somewhere at some point, but not anymore. I rely on Kahan's notes at http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps .

On 4/29/11 11:39 AM, Alexander Belopolsky wrote:
(Section 7. "Exceptions") "The default response to an exception shall be to proceed without a trap." IEEE-854 has the same sentence. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Fri, Apr 29, 2011 at 4:23 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
(Section 7. "Exceptions") "The default response to an exception shall be to proceed without a trap."
I cannot find this phrase in my copy of IEEE Std 754-2008. Instead, I see the following in section 7.1: "This clause also specifies default non-stop exception handling for exception signals, which is to deliver a default result, continue execution, and raise the corresponding status flag." As I mentioned before, Python does not have a mechanism that would allow to simultaneously raise an exception and deliver the result. We have to choose one or the other. I think the choice made in the decimal module is a reasonable one: trap Overflow, DivisionByZero, and InvalidOperation while ignoring Underflow and Inexact. The choices made for float operations are more ad-hoc: DivisionByZero is always trapped:
Overflow is trapped in some cases:
and ignored in others:
InvalidOperation is not handled consistently. Let me copy the relevant section of the standard and show Python's behavior for each case where InvalidOperation exception is required by the standard: """ The invalid operation exception is signaled if and only if there is no usefully definable result. In these cases the operands are invalid for the operation to be performed. For operations producing results in floating-point format, the default result of an operation that signals the invalid operation exception shall be a quiet NaN that should provide some diagnostic information (see 6.2). These operations are: a) any general-computational or signaling-computational operation on a signaling NaN (see 6.2), except for some conversions (see 5.12) """ Python does not have support for sNaNs. It is possible to produce a float carrying an sNaN using struct.unpack, but the result behaves a qNaN. InvalidOperation not trapped. """ b) multiplication: multiplication(0, ∞) or multiplication(∞, 0) """
0.0 * float('inf') nan
InvalidOperation not trapped. """ c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet NaN; if c is a quiet NaN then it is implementation defined whether the invalid operation exception is signaled """ Not applicable. Python does not have fusedMultiplyAdd (x * y + z) function. """ d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as: addition(+∞, −∞) """
float('inf') + float('-inf') nan
InvalidOperation not trapped. """ e) division: division(0, 0) or division(∞, ∞) """
InvalidOperation trapped, but misreported as DivisionByZero.
float('inf') / float('inf') nan
""" f) remainder: remainder(x, y), when y is zero or x is infinite and neither is NaN """
InvalidOperation trapped, but misreported as DivisionByZero.
float('inf') % 2.0 nan
InvalidOperation not trapped. """ g) squareRoot if the operand is less than zero """
InvalidOperation trapped. """ h) quantize when the result does not fit in the destination format or when one operand is finite and the other is infinite """ Not applicable. """ For operations producing no result in floating-point format, the operations that signal the invalid operation exception are: i) conversion of a floating-point number to an integer format, when the source is NaN, infinity, or a value that would convert to an integer outside the range of the result format under the applicable rounding attribute. """
InvalidOperation trapped.
InvalidOperation trapped, but misclassified as OverflowError. """ j) comparison by way of unordered-signaling predicates listed in Table 5.2, when the operands are unordered """ This is the subject of my proposal.
float('nan') < 0.0 False
InvalidOperation not trapped. """ k) logB(NaN), logB(∞), or logB(0) when logBFormat is an integer format (see 5.3.3). """ Not applicable. Overall, it appears that in cases where InvalidOperation was anticipated, it was converted to some type of exception in Python. Exceptions to this rule seem to be an accident of implementation.

On 4/29/11 4:30 PM, Alexander Belopolsky wrote:
Ah. I have the 1985 version.
Well, for comparisons at least, it seems to have been anticipated, and returning a value was intentional. From the comments documenting float_richcompare() in floatobject.c: /* Comparison is pretty much a nightmare. When comparing float to float, * we do it as straightforwardly (and long-windedly) as conceivable, so * that, e.g., Python x == y delivers the same result as the platform * C x == y when x and/or y is a NaN. ... I'm not sure there's any evidence that the other behaviors have not been anticipated or are accidents of implementation. The ambiguous inf operations are documented and doctested in Lib/test/ieee754.txt. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Fri, Apr 29, 2011 at 6:11 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
Well, I may be overly pedantic, but this comment only mentions that the author considered == comparison and not ordering of NaNs. I guess we need to ask Tim Peters if he considered the fact that x < NaN is an invalid operation according to IEEE 754 while x == NaN is not. Tim?

On Fri, Apr 29, 2011 at 5:30 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote: ..
I made this argument several times and it went unchallenged, but I now realize that Python does have a mechanism that allows an operation to simultaneously signal an exceptional condition and deliver the result. This is what warnings do. Since changing NaN < 0 to raise an error would have to be done by issuing a deprecation warning first, why can't we just issue an appropriate warning on invalid operations? Isn't this what numpy does in some cases?
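The warning idea can be sketched in pure Python. This is only an illustration under stated assumptions: the helper `less_than_with_warning` and the choice of `RuntimeWarning` are hypothetical and are not taken from the thread or from any proposed patch.

```python
import math
import warnings


def less_than_with_warning(x, y):
    """Return x < y, warning when the operands are unordered (hypothetical helper)."""
    if math.isnan(x) or math.isnan(y):
        warnings.warn("invalid operation: ordering comparison with NaN",
                      RuntimeWarning, stacklevel=2)
    return x < y


# The caller picks the policy, much like an IEEE 754 trap setting.
with warnings.catch_warnings():
    warnings.simplefilter("error", RuntimeWarning)   # escalate to an exception
    try:
        less_than_with_warning(float("nan"), 0.0)
    except RuntimeWarning as exc:
        print("raised:", exc)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)  # status quo: a silent False
    print(less_than_with_warning(float("nan"), 0.0))
```

The result is still delivered when the warning is not escalated, which is the property that a plain exception cannot offer.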

On 4/29/11 5:50 PM, Alexander Belopolsky wrote:
We have a configurable mechanism that lets you change between ignoring, warning, and raising an exception (and a few others).

[~]
|1> with np.errstate(invalid='raise'):
..>     np.array([np.inf]) / np.array([np.inf])
..>
---------------------------------------------------------------------------
FloatingPointError                        Traceback (most recent call last)
/Users/rkern/<ipython-input-1-d0b8f36f6dea> in <module>()
      1 with np.errstate(invalid='raise'):
----> 2     np.array([np.inf]) / np.array([np.inf])
      3
FloatingPointError: invalid value encountered in divide

[~]
|2> with np.errstate(invalid='ignore'):
..>     np.array([np.inf]) / np.array([np.inf])
..>

[~]
|3> with np.errstate(invalid='warn'):
..>     np.array([np.inf]) / np.array([np.inf])
..>
/Library/Frameworks/Python.framework/Versions/Current/bin/ipython:2: RuntimeWarning: invalid value encountered in divide

I think I could support issuing a warning. Beats the hell out of arguing over fine details of ancient standards intended for low-level languages and hardware. :-)

-- Robert Kern

On Fri, Apr 29, 2011 at 10:30 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Roughly, the current situation is that math module operations try to consistently follow IEEE 754 exceptions: an IEEE 754 overflow is converted to an OverflowError, while invalid-operation or divide-by-zero signals produce a Python ValueError. Basic arithmetic is another story: ** behaves more-or-less like the math module operations, but the arithmetic operations mainly produce nans or infinities, except that division by zero is trapped. IMO, the ideal (ignoring backwards compatibility) would be to have OverflowError / ZeroDivisionError / ValueError produced wherever IEEE754 says that overflow / divide-by-zero / invalid-operation should be signaled. Mark
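The split between math module functions and basic arithmetic can be observed with a few probes. A minimal sketch, assuming CPython 3 on an IEEE 754 platform; the helper `outcome` is hypothetical.

```python
import math


def outcome(operation):
    """Return the exception name if the callable raises, else repr of its result."""
    try:
        return repr(operation())
    except (OverflowError, ZeroDivisionError, ValueError) as exc:
        return type(exc).__name__


inf = float("inf")
probes = [
    ("math.exp(1000.0)", lambda: math.exp(1000.0)),  # IEEE overflow
    ("math.sqrt(-1.0)", lambda: math.sqrt(-1.0)),    # IEEE invalid-operation
    ("math.log(0.0)", lambda: math.log(0.0)),        # IEEE divide-by-zero
    ("2.0 ** 10000", lambda: 2.0 ** 10000),          # ** behaves like math
    ("1e308 * 10.0", lambda: 1e308 * 10.0),          # overflow passes silently
    ("inf - inf", lambda: inf - inf),                # invalid passes silently
    ("1.0 / 0.0", lambda: 1.0 / 0.0),                # division by zero is trapped
]

for label, operation in probes:
    print(label, "->", outcome(operation))
```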

On Thu, Apr 28, 2011 at 10:01 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Hmm, a quick search of the tracker yielded issue7915, which demonstrates how the presence of NaNs causes sort to leave a list unsorted. Using binary search on unsorted data leads to nonsensical results, but I don't seem to be able to produce infinite loops with Python's bisect. Maybe I saw NaN-caused infinite loops in some other libraries. I learned long ago to rid my data of NaNs before doing any type of comparison, so my recollection of the associated problems is admittedly vague. I'll try to come up with something, though. http://bugs.python.org/issue7915
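A small illustration of the sorting problem, in the spirit of issue7915 but not copied from it. The input list and the helper `sorted_without_nans` are made up for this sketch.

```python
import math

nan = float("nan")
data = [2.0, nan, 1.0]

# Every "<" involving NaN is False, so the sort never sees an adjacent pair
# that looks out of order and leaves 2.0 ahead of 1.0.
result = sorted(data)
print(result)


def sorted_without_nans(values):
    """Drop NaNs before comparing anything (hypothetical helper)."""
    return sorted(v for v in values if not math.isnan(v))


print(sorted_without_nans(data))
```

Filtering first is the "rid my data of NaNs" workaround described above; it avoids the problem rather than detecting it.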

On Thu, Apr 28, 2011 at 4:52 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I'm -0 on this -- I really favor having NaNs behave like NaNs. Obviously this is a weird fit for Python, but so what? Python does its best never to give you NaNs. If you've done something to get a NaN it's because of a library bug or because you really wanted the NaNs and should know what you're doing.

On 4/28/11 10:02 AM, Alexander Belopolsky wrote:
Not quite, IIRC. I don't have it in front of me, but I do recall that it specifies how it behaves in two different situations:

1. Where you have a comparison function that returns the relationship between the two operands, IEEE-754 specifies that in addition to GT, LT, and EQ, you ought to include "unordered" to use when a NaN is involved.
2. Where you have comparison operators like <, ==, etc. that return bools, NaNs will return False for all comparisons.

They may specify whether or not FPE signals should be issued, I don't recall, but I suspect that if they are quiet NaNs, they won't issue a SIGFPE. Higher-level exceptions were not contemplated by IEEE-754, IIRC. Python uses the < operator for sorting, not a comparison function, so its current behavior is perfectly in line with the IEEE-754 spec.

-- Robert Kern
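The first situation, a comparison function with a fourth "unordered" outcome, might look like the sketch below. The `Relation` enum and the `compare` function are hypothetical names chosen for illustration; Python's float type provides no such function.

```python
import math
from enum import Enum


class Relation(Enum):
    LESS = "LT"
    EQUAL = "EQ"
    GREATER = "GT"
    UNORDERED = "UN"


def compare(x, y):
    """Four-way comparison: reports UNORDERED when either operand is NaN."""
    if math.isnan(x) or math.isnan(y):
        return Relation.UNORDERED
    if x < y:
        return Relation.LESS
    if x > y:
        return Relation.GREATER
    return Relation.EQUAL


print(compare(1.0, 2.0))
print(compare(float("nan"), 0.0))
```

A boolean operator such as < has no room for the fourth outcome, which is why it has to either return False or signal.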

On Thu, Apr 28, 2011 at 12:46 PM, Robert Kern <robert.kern@gmail.com> wrote: ..
Python uses the < operator for sorting, not a comparison function, so its current behavior is perfectly in line with the IEEE-754 spec.
No, it is not. As I explained in the previous post, IEEE-754 prescribes different behavior for the <, >, <=, and >= operations than for != and ==. The former signal the INVALID exception while the latter don't. Python does not make this distinction.

On 4/28/11 12:00 PM, Alexander Belopolsky wrote:
But it also states that such signals should *not* trap by default. The only thing I can really fault Python for, compliance-wise, is that it will hide the FPE from being handled in user code because of the PyFPE_START_PROTECT/PyFPE_END_PROTECT macros that surround the actual C operation. The only way to get the FPE to handle it is to build and install fpectl, which is officially discouraged. -- Robert Kern

On Thu, Apr 28, 2011 at 4:52 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote: ..
Furthermore, IEEE 754 specifies exactly what I propose:

"""
IEEE 754 assigns values to all relational expressions involving NaN. In the syntax of C, the predicate x != y is True but all others, x < y, x <= y, x == y, x >= y and x > y, are False whenever x or y or both are NaN, and then all but x != y and x == y are INVALID operations too and must so signal.
"""
-- Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic by Prof. W. Kahan http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps

The problem with faithfully implementing IEEE 754 in Python is that exceptions in the IEEE standard don't have the same meaning as in Python. IEEE 754 requires that a value be computed even when the operation signals an exception. The program can then decide whether to terminate the computation or propagate the value. In Python, we have to choose between raising an exception and returning the value. We cannot have both.

It appears that in most cases the IEEE 754 "INVALID" exception is treated as a terminating exception by Python, and operations that signal INVALID in IEEE 754 raise an exception in Python. Therefore making <, >, etc. raise on NaN while keeping the status quo for != and == would bring Python floats closer to compliance with IEEE 754.
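To make the proposed behavior concrete, here is a rough sketch as a float subclass. It is an assumption-laden illustration only: the class name `OrderCheckedFloat` and the choice of `ValueError` are hypothetical, and the thread does not say which exception type a real change would raise.

```python
import math


class OrderCheckedFloat(float):
    """Hypothetical float whose ordering comparisons raise on NaN.

    == and != are inherited unchanged, matching the status quo.
    """

    def _check(self, other):
        other_is_nan = isinstance(other, float) and math.isnan(other)
        if math.isnan(self) or other_is_nan:
            raise ValueError("ordering comparison involving NaN")

    def __lt__(self, other):
        self._check(other)
        return float.__lt__(self, other)

    def __le__(self, other):
        self._check(other)
        return float.__le__(self, other)

    def __gt__(self, other):
        self._check(other)
        return float.__gt__(self, other)

    def __ge__(self, other):
        self._check(other)
        return float.__ge__(self, other)


nan = OrderCheckedFloat("nan")
print(nan == 0.0, nan != 0.0)  # equality tests stay quiet
try:
    nan < 0.0
except ValueError as exc:
    print("raised:", exc)
```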

On 4/28/11 11:12 AM, Alexander Belopolsky wrote:
This is not true. In fact, most cases that issue an INVALID exception are passed silently in Python. See my response to Guido elsewhere in this thread for a nearly complete list. -- Robert Kern

On Thu, Apr 28, 2011 at 5:12 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Note that this text refers to the obsolete IEEE 754-1985, not the current version of the standard. IEEE 754 isn't really much help here: the current version of the standard specifies (in section 5.11: Details of comparison predicates) *twenty-two* distinct comparison predicates. That includes, for example: 'compareSignalingGreater' which is a greater-than comparison that signals an invalid operation exception on a comparison involving NaNs. But it also includes: 'compareQuietGreater' which returns False for comparisons involving NaNs. And IEEE 754 has nothing to say about how the specified operations should be mapped to language constructs---that's out of scope for the specification. (It does happen to list plain '>' as one of the names for 'compareSignalingGreater', but I don't think it's realistic to try to read anything into that.) I'm -0 on the proposal: I don't think there's enough of a real problem here to justify the change. Mark
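For comparison, the standard library's decimal module already treats its ordering operators as signaling comparisons, with a per-context trap that decides between raising and returning False. The sketch below uses only documented decimal APIs; the two wrapper function names are made up for illustration.

```python
from decimal import Decimal, InvalidOperation, localcontext

nan = Decimal("NaN")


def less_trapped(x, y):
    """Ordering comparison with the InvalidOperation trap enabled."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = True
        return x < y


def less_untrapped(x, y):
    """Same comparison with the trap disabled: returns (result, flag_was_set)."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = False
        ctx.clear_flags()
        result = x < y
        return result, bool(ctx.flags[InvalidOperation])


print(less_untrapped(nan, Decimal(0)))  # a value is delivered and the flag is set
try:
    less_trapped(nan, Decimal(0))
except InvalidOperation:
    print("trapped: InvalidOperation")
print(nan == Decimal(0))                # equality with a quiet NaN does not signal
```

This is close to the IEEE 754 model described earlier in the thread, where a signal can be either recorded as a flag or turned into a trap.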
participants (12)
- Alexander Belopolsky
- Antoine Pitrou
- Guido van Rossum
- Mark Dickinson
- Mike Graham
- MRAB
- Nick Coghlan
- Rob Cliffe
- Robert Kern
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy