
[Changing the subject line, since we're way off the original topic] On Wed, Mar 24, 2010 at 7:04 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 2:50 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
If Python were to do something different then a naively translated algorithm from another language would fail. It's the behaviour that numerically-aware people expect, and I'd expect to get complaints from those people if it changed.
Numerically-aware people are likely to be aware of the differences in languages that they use.
Sure. But I'd still expect them to complain. :) Here's an interesting recent blog post on this subject, from the creator of Eiffel: http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civiliz... Mark

On Wed, Mar 24, 2010 at 3:26 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
Here's an interesting recent blog post on this subject, from the creator of Eiffel:
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civiliz...
It appears that Meyer's view has evolved over the years:
""" In this context it doesn't particularly shock me that operations on NaN should cause invariant violations. After all, isn't NaN supposed to mean "not a number"? If it's not a number, it doesn't have to satisfy the properties of numbers. """
"NaN and floating point exceptions" by Roger Browne quoting an ISE Eiffel mailing list post by Bertrand Meyer
http://teameiffel.blogspot.com/2006/04/nan-and-floating-point-exceptions.htm...
Compare this with the conclusion he reached in "Pillars:" "It is rather dangerous indeed to depart from the fundamental laws of mathematics. "
To bring the discussion back on topic for python-dev, I would argue that reflexivity should hold for hashable values. Having it otherwise would lead to unsurmountable problems when storing such values in sets or using them as keys in dictionaries. If x == x is False stays for x = float('nan'), I would argue that hash(float('nan')) should raise NotImplemented or ValueError.
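The dictionary-key concern raised above can be illustrated with a small sketch. It assumes CPython, where each call to float('nan') creates a new object; the names `table`, `same_object_found` and `fresh_nan_found` are made up for this illustration.

```python
nan = float("nan")
table = {nan: "stored"}

# Looking up the very same object succeeds: dict lookup checks identity
# before falling back to ==.
same_object_found = nan in table

# Looking up a freshly created NaN fails, because the two objects are
# neither identical nor equal.
fresh_nan_found = float("nan") in table

print(same_object_found, fresh_nan_found)
```

So a NaN key is retrievable only through a reference to the original object, which is the kind of surprise the paragraph above is worried about.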

On Mar 24, 2010, at 2:31 PM, Alexander Belopolsky wrote:
""" In this context it doesn't particularly shock me that operations on NaN should cause invariant violations. After all, isn't NaN supposed to mean "not a number"? If it's not a number, it doesn't have to satisfy the properties of numbers. """
This sound like a good, universal reply to "bug reports" about various containers/tools not working with NaNs :-) Bug report: "Container X or Function Y doesn't behave well when I give it a NaN as an input". Closed -- Won't fix: "It does not shock me that a NaN violates that tool or container's invariants."
To bring the discussion back on topic for python-dev, I would argue that reflexivity should hold for hashable values. Having it otherwise would lead to unsurmountable problems when storing such values in sets or using them as keys in dictionaries. If x == x is False stays for x = float('nan'), I would argue that hash(float('nan')) should raise NotImplemented or ValueError.
Hashability isn't the only place where you have a problem. Even unordered collections are affected:
    class ListBasedSet(collections.Set):
        ''' Alternate set implementation favoring space over speed
            and not requiring the set elements to be hashable. '''
        def __init__(self, iterable):
            self.elements = lst = []
            for value in iterable:
                if value not in lst:
                    lst.append(value)
        def __iter__(self):
            return iter(self.elements)
        def __contains__(self, value):
            return any(value == elem for elem in self.elements)
        def __len__(self):
            return len(self.elements)
    >>> n = float('Nan')
    >>> s = ListBasedSet([n])
    >>> n in s                  # unexpected result
    False
    >>> len(s)                  # expected result
    1
    >>> len(s & s)              # unexpected result
    0
If we want to be able to reason about our programs, then we need to rely on equality relations being reflexive, symmetric, and transitive. Otherwise, containers and functions can't even make minimal guarantees about what they do.
Anything else smells of a Douglas Hofstadter style or Bertrand Russell style logic bomb:
* this sentence is a lie
* this object isn't equal to itself
* this is a set containing sets that are not members of themselves
The property of NaN objects not being equal to themselves is more harmful than helpful. We should probably draw the line at well-defined numeric contexts such as the decimal module and stop trying to propagate NaN awareness throughout the entire object model.
Raymond
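For comparison, here is a rough sketch of how the ListBasedSet example behaves if its membership test checks identity before equality, as the built-in containers do. The class name IdentityAwareListSet is hypothetical and not part of the original post; the sketch uses collections.abc.Set, which is where the Set ABC lives in current Python versions.

```python
from collections.abc import Set


class IdentityAwareListSet(Set):
    """Hypothetical variant of ListBasedSet: membership is decided by
    'is' first and '==' second, so an element always matches itself."""

    def __init__(self, iterable=()):
        self.elements = lst = []
        for value in iterable:
            # Same identity-or-equality test as __contains__ below.
            if not any(value is elem or value == elem for elem in lst):
                lst.append(value)

    def __iter__(self):
        return iter(self.elements)

    def __contains__(self, value):
        return any(value is elem or value == elem for elem in self.elements)

    def __len__(self):
        return len(self.elements)


n = float("nan")
s = IdentityAwareListSet([n])
print(n in s, len(s), len(s & s))
```

With this change the set finds its own NaN element and `s & s` keeps it, although two separately created NaNs would still count as distinct elements.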

On Wed, Mar 24, 2010 at 6:21 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote: ..
If we want to be able to reason about our programs, then we need to rely on equality relations being reflexive, symmetric, and transitive. Otherwise, containers and functions can't even make minimal guarantees about what they do.
+1
.. We should probably draw the line at well-defined numeric contexts such as the decimal module and stop trying to propagate NaN awareness throughout the entire object model.
I am not sure what this means in practical terms. Should float('nan') == float('nan') return True or should float('nan') raise an exception to begin with? I would prefer the former.

On Wed, Mar 24, 2010 at 10:30 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 6:21 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote: ..
If we want to be able to reason about our programs, then we need to rely on equality relations being reflexive, symmetric, and transitive. Otherwise, containers and functions can't even make minimal guarantees about what they do.
+1
.. We should probably draw the line at well-defined numeric contexts such as the decimal module and stop trying to propagate NaN awareness throughout the entire object model.
I am not sure what this means in practical terms. Should float('nan') == float('nan') return True or should float('nan') raise an exception to begin with? I would prefer the former.
Neither is necessary, because Python doesn't actually use == as the equivalence relation for containment testing: the actual equivalence relation is: x equivalent to y iff id(x) == id(y) or x == y. This restores the missing reflexivity (besides being a useful optimization). Mark
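The equivalence relation described here can be written out as a small sketch. The helper name `container_equiv` is invented for illustration; the `in` checks on a list show the built-in behaviour it is meant to mirror, assuming CPython, where two float('nan') calls yield two distinct objects.

```python
def container_equiv(x, y):
    # Identity is checked first, equality second.
    return x is y or x == y


nan = float("nan")
other_nan = float("nan")

same_object = container_equiv(nan, nan)            # reflexive by identity
distinct_objects = container_equiv(nan, other_nan)  # neither identical nor equal

in_list_same = nan in [nan]
in_list_distinct = other_nan in [nan]

print(same_object, distinct_objects, in_list_same, in_list_distinct)
```

The relation is reflexive for any single object, but it says nothing about two NaN objects created separately.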

On Wed, Mar 24, 2010 at 6:31 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
Neither is necessary, because Python doesn't actually use == as the equivalence relation for containment testing: the actual equivalence relation is: x equivalent to y iff id(x) == id(y) or x == y. This restores the missing reflexivity (besides being a useful optimization).
No, it does not:
    >>> float('nan') in [float('nan')]
    False
It would if NaNs were always interned, but they are not.

Not to mention the following aberrations:
    >>> {x for x in [float('nan')] * 10}
    {nan}
    >>> {float(x) for x in ['nan'] * 10}
    {nan, nan, nan, nan, nan, nan, nan, nan, nan, nan}
    >>> {float('nan')} | {float('nan')}
    {nan, nan}
    >>> {float('nan')} & {float('nan')}
    set()
On Wed, Mar 24, 2010 at 6:36 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 6:31 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
Neither is necessary, because Python doesn't actually use == as the equivalence relation for containment testing: the actual equivalence relation is: x equivalent to y iff id(x) == id(y) or x == y. This restores the missing reflexivity (besides being a useful optimization).
No, it does not:
float('nan') in [float('nan')] False
It would if NaNs were always interned, but they are not.

On Wed, Mar 24, 2010 at 10:36 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 6:31 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
Neither is necessary, because Python doesn't actually use == as the equivalence relation for containment testing: the actual equivalence relation is: x equivalent to y iff id(x) == id(y) or x == y. This restores the missing reflexivity (besides being a useful optimization).
No, it does not:
float('nan') in [float('nan')] False
Sure, but just think of it as having two different nans there. (You could imagine thinking of the id of the nan as part of the payload.) There's no ideal solution here; IMO, the compromise that currently exists is an acceptable one. Mark

On Wed, Mar 24, 2010 at 6:47 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
There's no ideal solution here; IMO, the compromise that currently exists is an acceptable one.
I don't see a compromise. So far I failed to find a use case that benefits from NaN violating reflexivity.

On Wed, Mar 24, 2010 at 10:52 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 6:47 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
There's no ideal solution here; IMO, the compromise that currently exists is an acceptable one.
I don't see a compromise. So far I failed to find a use case that benefits from NaN violating reflexivity.
So if I understand correctly, you propose that float('nan') == float('nan') return True. Would you also suggest extending this behaviour to Decimal nans?

On Wed, Mar 24, 2010 at 11:11 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 7:02 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
So if I understand correctly, you propose that float('nan') == float('nan') return True. Would you also suggest extending this behaviour to Decimal nans?
yes
Okay. So now it seems to me that there are many decisions to make: should any Decimal nan compare equal to any other? What if the two nans have different payloads or signs? How about comparing a signaling nan with either an arbitrary quiet nan, or with the exact quiet nan that corresponds to the signaling nan? How do decimal nans compare with float nans? Should Decimal.compare(Decimal('nan'), Decimal('nan')) return 0 rather than nan? If not, how do you justify the difference between == and compare? If so, how do you justify the deviation from the standard on which the decimal module is based?
In answering all these questions, you effectively end up developing your own standard, and hoping that all the answers you chose are sensible, consistent, and useful.
Alternatively, we could do what we're currently doing: make use of *existing* standards to answer these questions, and rely on the expertise of the many who've thought about this in depth.
You say that you don't see any compromise: I say that there's value in adhering to (de facto and de jure) standards, and I see a compromise between standards adherence and Python pragmatics.
Mark
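As a point of reference for these questions, the following sketch shows how a quiet Decimal NaN behaves under the current decimal module. It only touches quiet NaNs and leaves signaling NaNs aside; the variable names are illustrative.

```python
from decimal import Decimal

x = Decimal("nan")

# == follows the float convention: a NaN is not equal to itself.
eq_self = (x == x)

# The standard's compare operation returns a NaN rather than 0.
compare_result = x.compare(x)

# compare_total uses the standard's total ordering, in which a NaN
# with the same sign and payload compares as equal (result 0).
total_result = x.compare_total(x)

print(eq_self, compare_result, total_result)
```

So the module already exposes more than one notion of comparison, and changing == alone would open a gap between it and compare.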

Am 24.03.2010 22:47, schrieb Mark Dickinson:
On Wed, Mar 24, 2010 at 10:36 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Mar 24, 2010 at 6:31 PM, Mark Dickinson <dickinsm@gmail.com> wrote: ..
Neither is necessary, because Python doesn't actually use == as the equivalence relation for containment testing: the actual equivalence relation is: x equivalent to y iff id(x) == id(y) or x == y. This restores the missing reflexivity (besides being a useful optimization).
No, it does not:
float('nan') in [float('nan')] False
Sure, but just think of it as having two different nans there. (You could imagine thinking of the id of the nan as part of the payload.)
That's interesting. Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now. Each nan comes from a different operation and therefore is a "different" non-number. Of course, float being an immutable type, there is some reason to expect that all values created by float('nan') should be identical, but after all, datetime is an immutable type as well, but you wouldn't expect datetime.now() in [datetime.now()] to be true. The only wart left is that you can't distinguish different nans by their string representation -- this could be remedied by making it ``"nan-%s" % id(self)``, but that looks a bit ugly to me. Georg

On 03/25/2010 07:54 AM, Georg Brandl wrote:
float('nan') in [float('nan')] False
Sure, but just think of it as having two different nans there. (You could imagine thinking of the id of the nan as part of the payload.)
That's interesting. Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now. Each nan comes from a different operation and therefore is a "different" non-number.
Infinites are "not equal" for a good reason, for example. 1/0 and 2/0 are both infinites, but one is "greater" than the other. Or (1/0)^(1/0), an infinite infinitely "bigger".
-- Jesus Cea

On Thu, Mar 25, 2010 at 12:36 PM, Jesus Cea <jcea@jcea.es> wrote:
On 03/25/2010 07:54 AM, Georg Brandl wrote:
> float('nan') in [float('nan')] False
Sure, but just think of it as having two different nans there. (You could imagine thinking of the id of the nan as part of the payload.)
That's interesting. Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now. Each nan comes from a different operation and therefore is a "different" non-number.
Infinites are "not equal" for a good reason, for example.
Well, that depends on your mathematical model. The underlying mathematical model for IEEE 754 floating-point is the doubly extended real line: that is, the set of all real numbers augmented with two extra elements "infinity" and "-infinity", with the obvious total order. This is made explicit in section 3.2 of the standard: "The mathematical structure underpinning the arithmetic in this standard is the extended reals, that is, the set of real numbers together with positive and negative infinity." This is the same model that one typically uses in a first course in calculus when studying limits of functions; it's an appropriate model for dealing with computer approximations to real numbers and continuous functions. So the model has precisely two infinities, and 1/0, 2/0 and (1/0)**(1/0) all give the same infinity. The decision to make 1/0 "infinity" rather than "-infinity" is admittedly a little arbitrary. For floating-point (but not for calculus!), it makes sense in the light of the decision to have both positive and negative floating-point zeros; 1/(-0) is -infinity, of course. Other models of "reals + one or more infinities" are possible, of course, but they're not relevant to IEEE 754 floating point. There's a case for using a floating-point model with a single infinity, especially for those who care more about algebraic functions (polynomials, rational functions) than transcendental ones; however, IEEE 754 doesn't make provisions for this. Mark
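A short sketch of the two-infinity model as seen from Python floats follows. One caveat: Python itself raises ZeroDivisionError for 1/0.0 instead of returning infinity, so the infinities below are built with float('inf'). The variable names are illustrative.

```python
import math

inf = float("inf")

# Different routes to "positive infinity" all land on the same value.
all_same_infinity = (inf == inf + 1 == inf * 2 == inf ** inf)

# The extended real line is totally ordered, with -inf and inf at the ends.
ordered = (-inf < -1e308 < 0.0 < 1e308 < inf)

# Negative zero exists and carries its sign, which is what makes
# 1/(-0) come out as -infinity in IEEE 754 arithmetic.
negative_zero_sign = math.copysign(1.0, -0.0)

# Python chooses to raise here rather than return an infinity.
try:
    1 / 0.0
    division_raised = False
except ZeroDivisionError:
    division_raised = True

print(all_same_infinity, ordered, negative_zero_sign, division_raised)
```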

On Thu, 25 Mar 2010 11:36:28 pm Jesus Cea wrote:
Infinites are "not equal" for a good reason, for example.
1/0 and 2/0 are both infinites, but one is "greater" than the other. Or (1/0)^(1/0), an infinite infinitely "bigger".
I think you're mistaken. In Python 3.1:
    >>> x = float('inf')
    >>> y = float('inf') + 1
    >>> x == y
    True
    >>> x is y
    False
In cardinal arithmetic, there are an infinity of different sized infinities, but in ordinal arithmetic there are only two: +inf and -inf, corresponding to the infinities on the real number line. (I hope that I'm not over-simplifying -- it's been more than a decade since I've needed to care about this.) But in any case, the IEEE standard doesn't deal with cardinals: it only uses two signed infinities. -- Steven D'Aprano

On 03/25/2010 03:19 PM, Steven D'Aprano wrote:
On Thu, 25 Mar 2010 11:36:28 pm Jesus Cea wrote:
Infinites are "not equal" for a good reason, for example.
1/0 and 2/0 are both infinites, but one is "greater" than the other. Or (1/0)^(1/0), an infinite infinitely "bigger".
I think you're mistaken. In Python 3.1:
I was referring to mathematical infinites and why inf != inf is sensible and a natural consequence of calculus and limits (mathematically). In any case, I am an engineer, not a mathematician or a language designer :). IANAL.
-- Jesus Cea

Georg Brandl wrote:
Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now.
Not entirely:
    x = float('NaN')
    y = x
    if x == y:
        ...
There it's hard to argue that the NaNs being compared result from different operations.
It does suggest a potential compromise, though: a single NaN object compares equal to itself, but different NaN objects are never equal (more or less what dict membership testing does now, but extended to all == comparisons).
Whether that's a *sane* compromise I'm not sure.
-- Greg

Am 25.03.2010 22:45, schrieb Greg Ewing:
Georg Brandl wrote:
Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now.
Not entirely:
    x = float('NaN')
    y = x
    if x == y:
        ...
There it's hard to argue that the NaNs being compared result from different operations.
It does suggest a potential compromise, though: a single NaN object compares equal to itself, but different NaN objects are never equal (more or less what dict membership testing does now, but extended to all == comparisons).
Whether that's a *sane* compromise I'm not sure.
FWIW, I like it. Georg

On Mar 25, 2010, at 4:21 PM, Georg Brandl wrote:
Am 25.03.2010 22:45, schrieb Greg Ewing:
Georg Brandl wrote:
Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now.
Not entirely:
    x = float('NaN')
    y = x
    if x == y:
        ...
There it's hard to argue that the NaNs being compared result from different operations.
It does suggest a potential compromise, though: a single NaN object compares equal to itself, but different NaN objects are never equal (more or less what dict membership testing does now, but extended to all == comparisons).
Whether that's a *sane* compromise I'm not sure.
FWIW, I like it.
Georg
+1 Raymond

On Fri, 26 Mar 2010 09:45:51 am Greg Ewing wrote:
Georg Brandl wrote:
Thinking of each value created by float('nan') as a different nan makes sense to my naive mind, and it also explains nicely the behavior present right now.
Not entirely:
    x = float('NaN')
    y = x
    if x == y:
        ...
There it's hard to argue that the NaNs being compared result from different operations.
But unlike us, the equality operator only has a pinhole view of the operands. It can't distinguish between your example and this:
    x = float('nan')
    y = some_complex_calculation(x)
    if x == y:
        ...
where y merely happens to end up with the same object as x by some quirk of implementation.
It does suggest a potential compromise, though: a single NaN object compares equal to itself, but different NaN objects are never equal (more or less what dict membership testing does now, but extended to all == comparisons).
Whether that's a *sane* compromise I'm not sure.
What do we do with Decimal? Aren't we committed to matching the Decimal standard, in which case aren't we committed to this?
    x = Decimal('nan')
    assert x != x
If that's the case, then float NANs and Decimal NANs will behave differently. I think that's a mistake.
For what it's worth, I'm -1 on allowing NANs to test equal to any NAN, including itself. However, I would be -0 on the following compromise:
* Allow NANs to test equal to themselves (identity testing).
* math module to grow a function ieee_equals(x, y) that keeps the current behaviour.
-- Steven D'Aprano

Steven D'Aprano wrote:
What do we do with Decimal? Aren't we committed to matching the Decimal standard,
It's been pointed out that the Decimal standard only defines some abstract operations, and doesn't mandate that they be mapped onto any particular language syntax. That gives us enough flexibility to make == do what we want and still claim compliance with the standard. BTW, does IEEE754 give us the same flexibility? If so, we may not have much of a problem in the first place. -- Greg

On 3/25/2010 9:35 PM, Greg Ewing wrote:
Steven D'Aprano wrote:
What do we do with Decimal? Aren't we committed to matching the Decimal standard,
It's been pointed out that the Decimal standard only defines some abstract operations, and doesn't mandate that they be mapped onto any particular language syntax. That gives us enough flexibility to make == do what we want and still claim compliance with the standard.
BTW, does IEEE754 give us the same flexibility? If so, we may not have much of a problem in the first place.
I propose that the abstract Decimal operation for addition be mapped to the syntax of operator - and that the abstract Decimal operation for subtraction be mapped to the syntax of operator +. Then people will have to actually read the manual to learn how to use the Decimal type in Python, rather than assuming that things might work the way they expect. This will lead to more robust and correct programs, because people will have read the manual. Or at least it seems that it should work that way... ˚͜ ˚ (Hmm, that might not render consistently for everyone, so I'll throw in a couple :) :) also.) Glenn

Steven D'Aprano writes:
But unlike us, the equality operator only has a pinhole view of the operands. It can't distinguish between your example and this:
    x = float('nan')
    y = some_complex_calculation(x)
    if x == y:
        ...
where y merely happens to end up with the same object as x by some quirk of implementation.
Note that Mark has already provided a related example of such a quirk (Decimal(-1).sqrt(), I think it was).

Mark Dickinson wrote:
Here's an interesting recent blog post on this subject, from the creator of Eiffel:
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civiliz...
Interesting. So the natural tweak that would arise from that perspective is for us to restore reflexivity by declaring that any given NaN is equal to *itself* but not to any other NaN (even one with the same payload).
With NaN (in general) not being interned, that would actually fit the idea of a NaN implicitly carrying the operation that created the NaN as part of its definition of equivalence.
So, I'm specifically putting that proposal on the table for both float and Decimal NaNs in Python:
"Not a Number" is not a single floating point value. Instead each instance is a distinct value representing the precise conditions that created it. Thus, two "NaN" values x and y will compare equal iff they are the exact same NaN object (i.e. "if isnan(x) then x == y iff x is y").
As stated above, such a change would allow us to restore reflexivity (eliminating a bunch of weirdness) while still honouring the idea of NaN being a set of values rather than a single value.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
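The proposed rule can be sketched as a plain function, since the behaviour of == on float cannot be changed from Python code. The name `proposed_eq` is hypothetical, and the sketch covers only float NaNs, not Decimal ones.

```python
import math


def proposed_eq(x, y):
    """Sketch of the proposed rule: if x is a NaN, then x equals y
    only when they are the same object; otherwise use ordinary ==."""
    if isinstance(x, float) and math.isnan(x):
        return x is y
    return x == y


nan = float("nan")

reflexive = proposed_eq(nan, nan)                 # same object
distinct_nans = proposed_eq(nan, float("nan"))    # two different objects
ordinary = proposed_eq(1.0, 1.0)                  # non-NaN values unchanged
mixed = proposed_eq(1.0, nan)                     # number vs NaN

print(reflexive, distinct_nans, ordinary, mixed)
```

Under this rule reflexivity holds for every float object, while two NaNs produced by separate operations remain unequal.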

On Thu, Mar 25, 2010 at 11:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Mark Dickinson wrote:
Here's an interesting recent blog post on this subject, from the creator of Eiffel:
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civiliz...
Interesting. So the natural tweak that would arise from that perspective is for us to restore reflexivity by declaring that any given NaN is equal to *itself* but not to any other NaN (even one with the same payload).
With NaN (in general) not being interned, that would actually fit the idea of a NaN implicitly carrying the operation that created the NaN as part of its definition of equivalence.
So, I'm specifically putting that proposal on the table for both float and Decimal NaNs in Python:
"Not a Number" is not a single floating point value. Instead each instance is a distinct value representing the precise conditions that created it. Thus, two "NaN" values x and y will compare equal iff they are the exact same NaN object (i.e. "if isnan(x) then x == y iff x is y".
In other words, this would make explicit, at the level of ==, what Python's already doing under the hood (e.g. in PyObject_RichCompareBool) for membership testing---at least for nans.
As stated above, such a change would allow us to restore reflexivity (eliminating a bunch of weirdness) while still honouring the idea of NaN being a set of values rather than a single value.
+0.2 from me. I could happily live with this change; but could also equally live with the existing weirdness. It's still a little odd for an immutable type to care about object identity, but I guess oddness comes with the floating-point territory. :) Mark

Mark Dickinson wrote:
+0.2 from me. I could happily live with this change; but could also equally live with the existing weirdness.
It's still a little odd for an immutable type to care about object identity, but I guess oddness comes with the floating-point territory. :)
The trick for me came in thinking of NaN as a set of values rather than a single value - at that point, the different id values just reflect the multitude of members of that set.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Mar 25, 2010 at 11:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
So, I'm specifically putting that proposal on the table for both float and Decimal NaNs in Python:
"Not a Number" is not a single floating point value. Instead each instance is a distinct value representing the precise conditions that created it. Thus, two "NaN" values x and y will compare equal iff they are the exact same NaN object (i.e. "if isnan(x) then x == y iff x is y".
I'd also suggest that the language make no guarantees about whether two distinct calls to float('nan') or Decimal('nan') (or any other function call returning a nan) return identical values or not, but leave implementations free to do what's convenient or efficient.
For example, with the current decimal module: Decimal('nan') returns a new nan each time, but Decimal(-1).sqrt() always returns the same nan object (assuming that InvalidOperation isn't trapped). I think it's fine to regard this as an implementation detail.
    Python 2.6.2 (r262:71600, Aug 26 2009, 09:40:44)
    [GCC 4.2.1 (SUSE Linux)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from decimal import *
    >>> getcontext().traps[InvalidOperation] = 0
    >>> x, y = Decimal('nan'), Decimal('nan')
    >>> id(x), id(y)
    (47309953516000, 47309930620880)
    >>> x, y = Decimal(-1).sqrt(), Decimal(-1).sqrt()
    >>> id(x), id(y)
    (9922272, 9922272)
Mark

On 03/25/2010 12:22 PM, Nick Coghlan wrote:
"Not a Number" is not a single floating point value. Instead each instance is a distinct value representing the precise conditions that created it. Thus, two "NaN" values x and y will compare equal iff they are the exact same NaN object (i.e. "if isnan(x) then x == y iff x is y".
As stated above, such a change would allow us to restore reflexivity (eliminating a bunch of weirdness) while still honouring the idea of NaN being a set of values rather than a single value.
Sounds good. But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
-- Jesus Cea

On Thu, Mar 25, 2010 at 12:39 PM, Jesus Cea <jcea@jcea.es> wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Well, if we are, then nobody seems to know what! See the Bertrand Meyer blog post that was linked to up-thread. Mark

Mark Dickinson <dickinsm <at> gmail.com> writes:
On Thu, Mar 25, 2010 at 12:39 PM, Jesus Cea <jcea <at> jcea.es> wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Well, if we are, then nobody seems to know what! See the Bertrand Meyer blog post that was linked to up-thread.
The missing part, IMO, is that allowing a given NaN value to compare equal to itself only pushes the problem up one level. Any single operation yielding a NaN will still be unequal to itself. That is, under what is being proposed, with a function func() returning the same result of some calculation:
    x = func()
    s1 = {x}
    print(x in s1)
    print(func() in s1)
This would print True and False, even though func() is performing the same calculation and thus logically returning the same NaN. I think the IEEE NaN represents the fact that you have a number of an undefined set, but it doesn't specify which.
The only way out, IMO, is to make *all* NaN comparisons yield False, but identity yield true. No interning necessary. At most, you could make the identity function return False for the different types of NaN.

Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
We don't have that limitation - by bringing id() into play for NaN equality tests we have a lot more bits to play with (effectively adding an extra 32 or 64 bits to the floating point value just to say "which NaN is this one?"). Doubling the size of your storage just to have multiple kinds of NaN would be insane, but in our case the extra storage is already in use - we would just be applying an additional interpretation to it.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Mar 25, 2010 at 7:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
Wait, what? I haven't been paying much attention, but this is backwards. There are multiple representations of NaN in the IEEE encoding; that's actually part of the problem with saying that NaN = NaN or NaN != NaN. If you want to ignore the "payload" in the NaN, then you're not just comparing bits any more. -- Curt Hagenlocher curt@hagenlocher.org

Le Thu, 25 Mar 2010 07:19:24 -0700, Curt Hagenlocher a écrit :
On Thu, Mar 25, 2010 at 7:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
Wait, what? I haven't been paying much attention, but this is backwards. There are multiple representations of NaN in the IEEE encoding; that's actually part of the problem with saying that NaN = NaN or NaN != NaN. If you want to ignore the "payload" in the NaN, then you're not just comparing bits any more.
This sounds a bit sophistic, if the (Python) user doesn't have access to the payload anyway. Antoine.

On Thu, Mar 25, 2010 at 2:26 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Le Thu, 25 Mar 2010 07:19:24 -0700, Curt Hagenlocher a écrit :
Wait, what? I haven't been paying much attention, but this is backwards. There are multiple representations of NaN in the IEEE encoding; that's actually part of the problem with saying that NaN = NaN or NaN != NaN. If you want to ignore the "payload" in the NaN, then you're not just comparing bits any more.
This sounds a bit sophistic, if the (Python) user doesn't have access to the payload anyway.
Well, you can get at the payload using the struct module, if you care enough. But yes, it's true that Python doesn't take much care with the payload: e.g., ideally, an operation on a nan (3.0 + nan, sqrt(nan), ...) should return exactly the same nan, to make sure that information in the payload is preserved. Python doesn't bother, for floats (though it does for decimal). Mark

On Thu, Mar 25, 2010 at 2:42 PM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Thu, Mar 25, 2010 at 2:26 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
This sounds a bit sophistic, if the (Python) user doesn't have access to the payload anyway.
Well, you can get at the payload using the struct module, if you care enough. But yes, it's true that Python doesn't take much care with the payload: e.g., ideally, an operation on a nan (3.0 + nan, sqrt(nan), ...) should return exactly the same nan, to make sure that information in the payload is preserved. Python doesn't bother, for floats (though it does for decimal).
Hmm. I take it back. I was being confused by the fact that sqrt(nan) returns a nan with a new identity; but it does apparently preserve the payload. An example:
    >>> from struct import pack, unpack
    >>> from math import sqrt
    >>> x = unpack('<d', pack('<Q', (2047 << 52) + 12345))[0]
    >>> y = sqrt(x)
    >>> bin(unpack('<Q', pack('<d', x))[0])
    '0b111111111110000000000000000000000000000000000000011000000111001'
    >>> bin(unpack('<Q', pack('<d', y))[0])
    '0b111111111111000000000000000000000000000000000000011000000111001'
Here you see that the payload has been preserved. The bit patterns aren't quite identical: the incoming nan was actually a signaling nan, which got silently (because neither Python nor C understands signaling nans) 'silenced' by setting bit 51. So the output is the corresponding quiet nan, with the same sign and payload. Mark
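The struct trick above can be wrapped in a few helpers. This is only a sketch: the function names are made up for illustration, and it assumes the platform stores Python floats as IEEE 754 binary64. It builds a quiet NaN (bit 51 already set) so that nothing needs to be silenced along the way.

```python
import math
import struct

QUIET_BIT = 1 << 51            # distinguishes quiet from signaling NaNs
PAYLOAD_MASK = (1 << 51) - 1   # the remaining fraction bits


def float_to_bits(value):
    """Return the 64-bit pattern of a float as an int."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def bits_to_float(bits):
    """Build a float from a 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def nan_payload(value):
    """Return the payload bits of a NaN (the fraction without the quiet bit)."""
    return float_to_bits(value) & PAYLOAD_MASK


def is_quiet(value):
    """Return True if the quiet bit is set in the pattern."""
    return bool(float_to_bits(value) & QUIET_BIT)


# A quiet NaN carrying the same payload as in the session above.
quiet_nan = bits_to_float((2047 << 52) | QUIET_BIT | 12345)
payload = nan_payload(quiet_nan)
```

Whether nan_payload(math.sqrt(quiet_nan)) also gives 12345 depends on the FPU and C library, as discussed in the thread, so the sketch does not assume it.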

On Thu, Mar 25, 2010 at 7:54 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Hmm. I take it back. I was being confused by the fact that sqrt(nan) returns a nan with a new identity; but it does apparently preserve the payload. An example:
I played with this some a few months ago, and both the FPU and the C libraries I tested will preserve the payload. I imagine Python just inherits their behavior. -- Curt Hagenlocher curt@hagenlocher.org

On Thu, Mar 25, 2010 at 3:01 PM, Curt Hagenlocher <curt@hagenlocher.org> wrote:
On Thu, Mar 25, 2010 at 7:54 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Hmm. I take it back. I was being confused by the fact that sqrt(nan) returns a nan with a new identity; but it does apparently preserve the payload. An example:
I played with this some a few months ago, and both the FPU and the C libraries I tested will preserve the payload. I imagine Python just inherits their behavior.
Pretty much, yes. I think we've also taken care to preserve payloads in functions that have been added to the math library as well (e.g., the gamma function). Not that that's particularly hard: it's just a matter of making sure to do "if (isnan(x)) return x;" rather than "if (isnan(x)) return standard_python_nan;". If that's not true, then there's a minor bug to be corrected. Mark
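The same "return x rather than a standard NaN" pattern can be sketched in Python. The cube_root function here is hypothetical and is not part of the math module; it only shows the early return that hands the caller's own NaN back.

```python
import math


def cube_root(x):
    """Real cube root that passes a NaN argument straight through."""
    if math.isnan(x):
        # Return the argument itself, not a fresh float("nan"),
        # so whatever payload it carries is kept.
        return x
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


original_nan = float("nan")
returned = cube_root(original_nan)
```

In pure Python the returned value is even the same object, which is stronger than what the C code gives (there only the bit pattern is kept).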

Curt Hagenlocher wrote:
Wait, what? I haven't been paying much attention, but this is backwards. There are multiple representations of NaN in the IEEE encoding;
I think Nick's point is that there aren't enough bits to give the result of every operation its own unique NaN. The payload of a NaN in typical hardware implementations is quite small, because it has to fit into the exponent field. -- Greg

I impulsively wrote:
The payload of a NaN in typical hardware implementations is quite small, because it has to fit into the exponent field.
...which turns out to be precisely wrong. Some day I'll learn to wait until somebody else in the thread has checked the facts for me before posting. :-) -- Greg

On Thu, Mar 25, 2010 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
I'm not so sure about this: standard 64-bit binary IEEE 754 doubles allow for 2**53-2 different nans (2**52-2 signaling nans, 2**52 quiet nans): anything with bit pattern (msb to lsb) x1111111 1111xxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx is an infinity or a nan, and there are only 2 infinities. Mark
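The counts quoted above can be checked with a little arithmetic, and the bit pattern rule is easy to turn into a classifier. This is a small sketch with made-up names, working on 64-bit integer patterns rather than on floats.

```python
EXPONENT_MASK = 0x7FF << 52      # the 11 exponent bits
FRACTION_MASK = (1 << 52) - 1    # the 52 fraction bits
QUIET_BIT = 1 << 51              # top fraction bit


def classify(bits):
    """Classify a 64-bit IEEE 754 double pattern."""
    if (bits & EXPONENT_MASK) != EXPONENT_MASK:
        return "finite"
    fraction = bits & FRACTION_MASK
    if fraction == 0:
        return "infinity"
    return "quiet nan" if fraction & QUIET_BIT else "signaling nan"


# Two signs times the possible fraction patterns in each class.
quiet_count = 2 * 2**51              # quiet bit set, other 51 bits free
signaling_count = 2 * (2**51 - 1)    # quiet bit clear, rest non-zero
total_nans = quiet_count + signaling_count
```

Here total_nans comes out as 2**53 - 2, matching the figure in the message.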

Mark Dickinson wrote:
On Thu, Mar 25, 2010 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something. Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
I'm not so sure about this: standard 64-bit binary IEEE 754 doubles allow for 2**53-2 different nans (2**52-2 signaling nans, 2**52 quiet nans): anything with bit pattern (msb to lsb)
x1111111 1111xxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
is an infinity or a nan, and there are only 2 infinities.
I stand corrected :) It still seems to me that the problems mostly arise when we're trying to get floats and Decimals to behave like Python *objects* (i.e. with reflexive equality) rather than like IEEE defined numbers. It's an extra element that isn't part of the problem the numeric standards are trying to solve. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Mar 25, 2010 at 3:05 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Mark Dickinson wrote:
On Thu, Mar 25, 2010 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something. Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
I'm not so sure about this: standard 64-bit binary IEEE 754 doubles allow for 2**53-2 different nans (2**52-2 signaling nans, 2**52 quiet nans): anything with bit pattern (msb to lsb)
x1111111 1111xxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
is an infinity or a nan, and there are only 2 infinities.
I stand corrected :)
It still seems to me that the problems mostly arise when we're trying to get floats and Decimals to behave like Python *objects* (i.e. with reflexive equality) rather than like IEEE defined numbers.
It's an extra element that isn't part of the problem the numeric standards are trying to solve.
Agreed. We don't have to be "missing something"; rather, the IEEE folks (quite understandably) almost certainly didn't anticipate this kind of usage. So I'll concede that it's reasonable to consider deviating from the standard in the light of this. Mark

On 3/25/2010 8:13 AM, Mark Dickinson wrote:
On Thu, Mar 25, 2010 at 3:05 PM, Nick Coghlan<ncoghlan@gmail.com> wrote:
Mark Dickinson wrote:
On Thu, Mar 25, 2010 at 2:08 PM, Nick Coghlan<ncoghlan@gmail.com> wrote:
Jesus Cea wrote:
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Yes, this is where their "implementable in a hardware circuit" focus comes in. They were primarily thinking of a floating point representation where the 32/64 bits are *it* - you can't have "multiple NaNs" because you don't have the bits available to describe them.
I'm not so sure about this: standard 64-bit binary IEEE 754 doubles allow for 2**53-2 different nans (2**52-2 signaling nans, 2**52 quiet nans): anything with bit pattern (msb to lsb)
x1111111 1111xxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
is an infinity or a nan, and there are only 2 infinities.
I stand corrected :)
It still seems to me that the problems mostly arise when we're trying to get floats and Decimals to behave like Python *objects* (i.e. with reflexive equality) rather than like IEEE defined numbers.
It's an extra element that isn't part of the problem the numeric standards are trying to solve.
Agreed. We don't have to be "missing something"; rather, the IEEE folks (quite understandably) almost certainly didn't anticipate this kind of usage. So I'll concede that it's reasonable to consider deviating from the standard in the light of this.
It is my understanding that even bit-for-bit identical NaN values will compare unequal according to IEEE 754 rules. I would have no problem with Python interning each encountered NaN value, to avoid having bit-for-bit identical NaN values with different Python IDs, but having them compare equal seems inappropriate. Glenn

On Thu, Mar 25, 2010 at 12:31 PM, Glenn Linderman <v+python@g.nevcal.com> wrote:
It is my understand that even bit-for-bit identical NaN values will compare unequal according to IEEE 754 rules.
I would have no problem with Python interning each encountered NaN value, to avoid having bit-for-bit identical NaN values with different Python IDs, but having them compare equal seems inappropriate.
Let's please not intern NaNs. Interning is for performance. If you have enough NaNs to affect your performance I think you have bigger worries! -- --Guido van Rossum (python.org/~guido)

On 3/25/2010 4:14 PM, Guido van Rossum wrote:
On Thu, Mar 25, 2010 at 12:31 PM, Glenn Linderman<v+python@g.nevcal.com> wrote:
It is my understand that even bit-for-bit identical NaN values will compare unequal according to IEEE 754 rules.
I would have no problem with Python interning each encountered NaN value, to avoid having bit-for-bit identical NaN values with different Python IDs, but having them compare equal seems inappropriate.
Let's please not intern NaNs. Interning is for performance. If you have enough NaNs to affect your performance I think you have bigger worries!
I'm OK with interning NaNs, and I'm OK with not interning NaNs. But I would much prefer to see Python conform to IEEE 754 for float, and to the Decimal standard for Decimal, where at all possible, including keeping NaNs not comparing equal to anything, themselves included. And retaining a mode where Decimal is standoffish and doesn't accept non-Decimal comparisons or arithmetic seems like a big win for program correctness to me. Glenn

On Thu, Mar 25, 2010 at 9:39 PM, Jesus Cea <jcea@jcea.es> wrote:
On 03/25/2010 12:22 PM, Nick Coghlan wrote:
"Not a Number" is not a single floating point value. Instead each instance is a distinct value representing the precise conditions that created it. Thus, two "NaN" values x and y will compare equal iff they are the exact same NaN object (i.e. "if isnan(x) then x == y iff x is y".
As stated above, such a change would allow us to restore reflexivity (eliminating a bunch of weirdness) while still honouring the idea of NaN being a set of values rather than a single value.
Sounds good.
But IEEE 754 was created by pretty clever guys and sure they had a reason for define things in the way they are. Probably we are missing something.
Yes, indeed. I don't claim to have a deep understanding myself, but up to now, every time I thought something in IEEE 754 was weird, it ended up being for good reasons. I think the fundamental missing point in this discussion about NaN is exception handling: a lot of quirky NaN behavior becomes much more natural once you take into account which operations are invalid under which conditions. Unless I am mistaken, Python itself does not support FPU exception handling. For example, the reason why x != x for a NaN x is that != (and ==) are about the only operations where you can have NaN as operands without risking raising an exception, and support for creating and detecting NaN in languages has arrived only quite recently (e.g. C99). Concerning the lack of rationale: a relatively short reference on FPU exceptions and NaN handling is from Kahan himself http://www.eecs.berkeley.edu/~wkahan/ieee754status/ieee754.ps David
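For a rough picture of where CPython stands on invalid operations, the following sketch records which of a few operations raise and which quietly return a NaN. The describe helper is made up for illustration; the behaviour shown is that of the float type and the math module as they are today.

```python
import math

inf = float("inf")


def describe(operation):
    """Run a zero-argument callable and summarise what happened."""
    try:
        result = operation()
    except (ValueError, ZeroDivisionError, OverflowError) as exc:
        return "raises " + type(exc).__name__
    if math.isnan(result):
        return "returns nan"
    return "returns " + repr(result)


outcomes = {
    "sqrt(-1)": describe(lambda: math.sqrt(-1.0)),
    "log(-1)": describe(lambda: math.log(-1.0)),
    "0.0 / 0.0": describe(lambda: 0.0 / 0.0),
    "inf - inf": describe(lambda: inf - inf),
    "inf * 0.0": describe(lambda: inf * 0.0),
}
```

The math functions and division raise, while arithmetic on infinities lets a NaN through, so Python is already somewhere between "always raise" and "always propagate".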

On 03/25/2010 04:07 PM, David Cournapeau wrote:
Yes, indeed. I don't claim to have a deep understanding myself, but up to now, every time I thought something in IEEE 754 was weird, it ended up being for good reasons.
I was wondering if we could bring the question to news:comp.arch newsgroup. They have the knowledge, and I know there are people from the IEEE 754 group lurking there. I only have read-only access, nevertheless. Another relevant group could be news:comp.arch.arithmetic, but I am not familiar with it. -- Jesus Cea Avion jcea@jcea.es - http://www.jcea.es/

On Mar 25, 2010, at 4:22 AM, Nick Coghlan wrote:
Mark Dickinson wrote:
Here's an interesting recent blog post on this subject, from the creator of Eiffel:
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civiliz...
Interesting. So the natural tweak that would arise from that perspective is for us to restore reflexivity by declaring that any given NaN is equal to *itself* but not to any other NaN (even one with the same payload).
With NaN (in general) not being interned, that would actually fit the idea of a NaN implicitly carrying the operation that created the NaN as part of its definition of equivalence.
So, I'm specifically putting that proposal on the table for both float and Decimal NaNs in Python:
"Not a Number" is not a single floating point value. Instead each instance is a distinct value representing the precise conditions that created it. Thus, two "NaN" values x and y will compare equal iff they are the exact same NaN object (i.e. "if isnan(x) then x == y iff x is y".
As stated above, such a change would allow us to restore reflexivity (eliminating a bunch of weirdness) while still honouring the idea of NaN being a set of values rather than a single value.
+1 Raymond
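To make the proposal concrete, here is a sketch of the rule "if isnan(x) then x == y iff x is y" written as a float subclass. ReflexiveFloat is a hypothetical name used only for illustration; the actual proposal is about changing float and Decimal themselves, not about adding a new type.

```python
import math


class ReflexiveFloat(float):
    """Float whose NaN instances are equal to themselves and nothing else."""

    def __eq__(self, other):
        if math.isnan(self):
            # A NaN equals another object only if it is that very object.
            return self is other
        return float.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    # Defining __eq__ would otherwise disable hashing.
    __hash__ = float.__hash__


x = ReflexiveFloat("nan")
y = ReflexiveFloat("nan")
reflexive = x == x          # same object
distinct = x == y           # two separate NaN objects
ordinary = ReflexiveFloat(1.5) == 1.5
```

Under this rule x == x holds while x == y does not, and non-NaN values compare exactly as plain floats do.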

On Thu, 25 Mar 2010 06:26:11 am Mark Dickinson wrote:
Here's an interesting recent blog post on this subject, from the creator of Eiffel:
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/
Sorry, but he lost me right at the beginning when he quoted someone else:
"there is no reason to believe that the result of one calculation with unclear value should match that of another calculation with unclear value"
and then argued:
"The exact same argument can be used to assert that the result should not be False: … there is no reason to believe that the result of one calculation with unclear value should not match that of another calculation with unclear value. Just as convincing! Both arguments complement each other: there is no compelling reason for demanding that the values be equal; and there is no compelling argument either to demand that they be different. If you ignore one of the two sides, you are biased."
This whole argument is invalid on at least three levels. I'll get the first two out of the way briefly.
#1: Bertrand starts by treating NANs as "unclear values", and concludes that we shouldn't prefer "two unclear values are different" as more compelling than "two unclear values are the same". But this is ridiculous -- if you ask me a pair of questions, and I answer "I'm not sure" to both of them, why would you assume that the right answer to both questions is actually the same?
#2: But in fact NANs aren't "unclear values", they are not values at all. The answer to "what is the non-complex logarithm of -1?" is not "I'm not sure" but "there is no such value". Bertrand spends an awful lot of time trying to demonstrate why the reflexivity of equality (every x is equal to itself) should apply to NANs as well as the other floats, but RoE is a property of equivalence relations, which does not (and should not) hold for "there is no such value". By analogy: the Lizard King of Russia does not exist; the Vampire Queen of New Orleans also does not exist. We don't conclude that the Lizard King and the Vampire Queen are therefore the same person.
#3: We could, if we wish, violate the IEEE standard and treat equality of NANs as an equivalence relation.
It's our language, we're free to follow whatever standards we like, and reflexivity of equality is a very useful axiom to have. Since it applies to all non-NAN floats (and virtually every object in Python, other than those with funny __eq__ methods), perhaps we should extend it to NANs as well?
I hope to convince you that the cure is worse than the disease. Since NANs are usually found in mathematical contexts, we should follow the IEEE standard even at the cost of rare anomalies in non-mathematical code containing NANs. Simply put: we should treat "two unclear values are different" as more compelling than "two unclear values are the same" as it leads to fewer, smaller, errors. Consider:
    log(-1) = NAN  # maths equality, not assignment
    log(-2) = NAN
If we allow NAN = NAN, then we permit the error:
    log(-1) = NAN = log(-2)
    therefore log(-1) = log(-2)
    and 1 = 2
But if we make NAN != NAN, then we get:
    log(-1) != log(-2)
and all of mathematics does not collapse into a pile of rubble. I think that is a fairly compelling reason to prefer inequality over equality.
One objection might be that while log(-1) and log(-2) should be considered different NANs, surely NANs should be equal to themselves?
    -1 = -1 implies log(-1) = log(-1)
But consider the practicalities: there are far more floats than available NAN payloads. We simply can't map every invalid calculation to a unique NAN, and therefore there *must* be cases like:
    log(-123.456789e-8) = log(-9.876e47)
    implies 123.456789e-8 = 9.876e47
So we mustn't consider NANs equal just because their payloads are equal. What about identity? Even if we don't dare allow this:
    x = log(-1)  # assignment
    y = log(-1)  # another NAN with the same payload
    assert x is not y
    assert x == y
surely we can allow this?
    assert x == x
But this is dangerous. Don't be fooled by the simplicity of the above example.
Just because you have two references to the same (as in identity) NAN, doesn't mean they represent "the same thing" or came from the same place:
    data = [1, 2, float('nan'), float('nan'), 3]
    x = harmonic_mean(data)
    y = 1 - geometric_mean(data)
It is an accident of implementation whether x and y happen to be the same object or not. Why should their inequality depend on such a fragile thing? In fact, identity of NANs is itself an implementation quirk of programming languages like Python: logically, NANs don't have identity at all. To put it another way: all ONEs are the same ONE, even if they come from different sources, are in different memory locations, or have different identities; but all NANs are different, even if they come from the same source, are in the same memory location, or have the same identity.
The fundamental problem here is that NANs are not values. If you treat them as if they were values, then you want reflexivity of equality. But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
-- Steven D'Aprano

At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
So, are you suggesting that maybe the Pythonic thing to do in that case would be to cause any operation on a NAN (including perhaps comparison) to fail, rather than allowing garbage to silently propagate? In other words, if NAN is only a signal that you have garbage, is there really any reason to keep it as an *object*, instead of simply raising an exception? Then, you could at least identify what calculation created the garbage, instead of it percolating up through other calculations. In low-level languages like C or Fortran, it obviously makes sense to represent NAN as a value, because there's no other way to represent it. But in a language with exceptions, is there a use case for it existing as a value?

On Fri, Mar 26, 2010 at 10:19 AM, P.J. Eby <pje@telecommunity.com> wrote:
At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
So, are you suggesting that maybe the Pythonic thing to do in that case would be to cause any operation on a NAN (including perhaps comparison) to fail, rather than allowing garbage to silently propagate?
NaN behavior being tightly linked to FPU exception handling, I think this is a good idea. One of the goals of NaN is to avoid a lot of testing in intermediate computations (for efficiency reasons), which may not really apply to Python. Generally, you want to detect errors/exceptional situations as early as possible, and if you use Python, you don't care about the potential slowdown caused by those checks. David

On Mar 25, 2010, at 7:19 PM, P.J. Eby wrote:
At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
So, are you suggesting that maybe the Pythonic thing to do in that case would be to cause any operation on a NAN (including perhaps comparison) to fail, rather than allowing garbage to silently propagate?
In other words, if NAN is only a signal that you have garbage, is there really any reason to keep it as an *object*, instead of simply raising an exception? Then, you could at least identify what calculation created the garbage, instead of it percolating up through other calculations.
In low-level languages like C or Fortran, it obviously makes sense to represent NAN as a value, because there's no other way to represent it. But in a language with exceptions, is there a use case for it existing as a value?
If a NaN object is allowed to exist, that is, a float operation that does not return a real number does not itself raise an exception immediately, then it will always be possible to get (seemingly) nonsensical behavior when it is used in containers that do not themselves "operate" on their elements. So even provided that performing any "operation" on a NaN object raises an exception, it would still be possible to add such an object to a list or tuple and have subsequent containment checks for that object return false. So this "solution" would simply narrow the problem posed, but not eliminate it.
None of the solutions posed seems ideal, in particular when they deviate from the standard in arbitrary ways someone deems "better". It's obvious to me that no ideal solution exists so long as you attempt to represent non-numeric values in a numeric type. So unless you simply eliminate NaNs (thus breaking the standard), you are going to confuse somebody. And I think having float deviate from the IEEE standard is ill-advised unless there is no alternative (i.e., the standard cannot be practically implemented), and breaking it will confuse people too (and probably the ones that know this domain).
I propose that the current behavior stand as is and that the documentation mention that NaN values are unordered, thus some float values may not behave intuitively wrt hashing, equality, etc.
The fact of the matter is that using floats as dict keys or set values or even just checking equality is much more complex in practice than you would expect. I mean even representing 1.1 is problematic ;^). Unless the float values you are using are constants, how would you practically use them as dict keys, or hash set members anyway? I'm not saying it can't be done, but is a hash table with float keys ever a data structure that someone on this list would recommend? If so good luck and Godspeed 8^) -Casey
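A short illustration of why computed floats make awkward dict keys, NaN or not. The first lookup misses because of ordinary rounding. The NaN lookups depend on whether the probe is the very object that was stored, since current CPython dicts check identity before equality. All names here are made up for the example.

```python
# Rounding: the computed key is not the float written as 0.3.
computed = 0.1 + 0.2
table = {0.3: "three tenths"}
computed_key_found = computed in table

# NaN keys: found only through the identity check.
nan = float("nan")
nan_table = {nan: "stored under a NaN"}
same_nan_found = nan in nan_table
other_nan_found = float("nan") in nan_table
```

Only same_nan_found is True; the other two lookups miss.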

On 26 Mar 2010, at 18:40 , Casey Duncan wrote:
On Mar 25, 2010, at 7:19 PM, P.J. Eby wrote:
At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
So, are you suggesting that maybe the Pythonic thing to do in that case would be to cause any operation on a NAN (including perhaps comparison) to fail, rather than allowing garbage to silently propagate?
In other words, if NAN is only a signal that you have garbage, is there really any reason to keep it as an *object*, instead of simply raising an exception? Then, you could at least identify what calculation created the garbage, instead of it percolating up through other calculations.
In low-level languages like C or Fortran, it obviously makes sense to represent NAN as a value, because there's no other way to represent it. But in a language with exceptions, is there a use case for it existing as a value?
If a NaN object is allowed to exist, that is a float operation that does not return a real number does not itself raise an exception immediately, then it will always be possible to get (seemingly) nonsensical behavior when it is used in containers that do not themselves "operate" on their elements.
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)? That way, there cannot be any nan-induced seemingly nonsensical behavior except within known scopes.
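The decimal module already works roughly this way, which may help picture the idea: a context can trap the InvalidOperation signal and raise, or leave it untrapped and return a NaN. The sketch below sets the trap explicitly in both directions instead of relying on whatever the default context happens to be.

```python
from decimal import Decimal, InvalidOperation, localcontext

inf = Decimal("Infinity")

# Trapping context: the invalid operation raises instead of yielding NaN.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = True
    try:
        inf - inf
        trapped = False
    except InvalidOperation:
        trapped = True

# Non-trapping context: the same operation quietly returns a NaN.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False
    quiet_result = inf - inf
```

Nothing comparable exists for float; this only shows what "NaNs within known scopes" looks like for Decimal.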

On Mar 26, 2010, at 3:16 PM, Xavier Morel wrote:
On 26 Mar 2010, at 18:40 , Casey Duncan wrote:
On Mar 25, 2010, at 7:19 PM, P.J. Eby wrote:
At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
So, are you suggesting that maybe the Pythonic thing to do in that case would be to cause any operation on a NAN (including perhaps comparison) to fail, rather than allowing garbage to silently propagate?
In other words, if NAN is only a signal that you have garbage, is there really any reason to keep it as an *object*, instead of simply raising an exception? Then, you could at least identify what calculation created the garbage, instead of it percolating up through other calculations.
In low-level languages like C or Fortran, it obviously makes sense to represent NAN as a value, because there's no other way to represent it. But in a language with exceptions, is there a use case for it existing as a value?
If a NaN object is allowed to exist, that is a float operation that does not return a real number does not itself raise an exception immediately, then it will always be possible to get (seemingly) nonsensical behavior when it is used in containers that do not themselves "operate" on their elements.
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)?
That way, there cannot be any nan-induced seemingly nonsensical behavior except within known scopes.
Having NaN creation raise an exception would undoubtedly break plenty of existing code that either expects and deals with NaNs itself or works accidentally because the NaNs do not cause harm. I don't sympathize much with the latter case since they are just hidden bugs probably, but the former makes it hard to justify raising exceptions for NaNs as the default behavior. But since I assume we're talking Python 3 here, maybe arguments containing the phrase "existing code" can be dutifully ignored, I dunno. -Casey

On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)?
-1 The numeric community uses NaNs as placeholders in vectorized calculations. People do use them and there's no point in breaking their code.
Of the ideas I've seen in this thread, only two look reasonable:
* Do nothing. This is attractive because it doesn't break anything.
* Have float.__eq__(x, y) return True whenever x and y are the same NaN object. This is attractive because it is a minimal change that provides a little protection for simple containers.
I support either of those options.
Raymond
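The second option can be pictured with a small sketch. The class name IdentityFirstFloat is hypothetical and exists only for illustration; the actual proposal concerns float.__eq__ itself, not a subclass.

```python
class IdentityFirstFloat(float):
    """Hypothetical float subclass: equality checks identity before value."""

    def __eq__(self, other):
        # The same object is always equal to itself, even if it is a NaN.
        if self is other:
            return True
        return float.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ clears the inherited __hash__, so restore it.
    __hash__ = float.__hash__


nan = IdentityFirstFloat("nan")
same_object_equal = nan == nan                              # reflexive for one object
distinct_objects_equal = nan == IdentityFirstFloat("nan")   # two NaN objects still differ
ordinary_equal = IdentityFirstFloat(1.5) == 1.5             # ordinary values are unaffected
```

Under this sketch only the "same object" case changes; two separately created NaNs still compare unequal, which is why the option is described as a minimal change.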

On Sat, Mar 27, 2010 at 8:16 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)?
-1 The numeric community uses NaNs as placeholders in vectorized calculations.
But is this relevant to Python itself? In Numpy, we indeed do use and support NaN, but we have much more control over what happens compared to Python float objects. We can control whether invalid operations raise an exception or not, we have had isnan/isfinite for a long time, and the fact that nan != nan has never been a real problem AFAIK. David

On 2010-03-27 00:32 , David Cournapeau wrote:
On Sat, Mar 27, 2010 at 8:16 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)?
-1 The numeric community uses NaNs as placeholders in vectorized calculations.
But is this relevant to python itself ? In Numpy, we indeed do use and support NaN, but we have much more control on what happens compared to python float objects. We can control whether invalid operations raises an exception or not, we had isnan/isfinite for a long time, and the fact that nan != nan has never been a real problem AFAIK.
Nonetheless, the closer our float arrays are to Python's float type, the happier I will be. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Sun, Mar 28, 2010 at 9:28 AM, Robert Kern <robert.kern@gmail.com> wrote:
On 2010-03-27 00:32 , David Cournapeau wrote:
On Sat, Mar 27, 2010 at 8:16 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)?
-1 The numeric community uses NaNs as placeholders in vectorized calculations.
But is this relevant to python itself ? In Numpy, we indeed do use and support NaN, but we have much more control on what happens compared to python float objects. We can control whether invalid operations raises an exception or not, we had isnan/isfinite for a long time, and the fact that nan != nan has never been a real problem AFAIK.
Nonetheless, the closer our float arrays are to Python's float type, the happier I will be.
Me too, but I don't see how to reconcile this with the intent of simplifying nan handling because they are not intuitive, which seems to be the goal of this discussion. David

On 2010-03-29 01:17 AM, David Cournapeau wrote:
On Sun, Mar 28, 2010 at 9:28 AM, Robert Kern <robert.kern@gmail.com> wrote:
On 2010-03-27 00:32 , David Cournapeau wrote:
On Sat, Mar 27, 2010 at 8:16 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
How about raising an exception instead of creating nans in the first place, except maybe within specific contexts (so that the IEEE-754 minded can get their nans working as they currently do)?
-1 The numeric community uses NaNs as placeholders in vectorized calculations.
But is this relevant to python itself ? In Numpy, we indeed do use and support NaN, but we have much more control on what happens compared to python float objects. We can control whether invalid operations raises an exception or not, we had isnan/isfinite for a long time, and the fact that nan != nan has never been a real problem AFAIK.
Nonetheless, the closer our float arrays are to Python's float type, the happier I will be.
Me too, but I don't see how to reconcile this with the intent of simplifying nan handling because they are not intuitive, which seems to be the goal of this discussion.
"Do nothing" is still on the table, I think. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Fri, Mar 26, 2010 at 11:16 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
Of the ideas I've seen in this thread, only two look reasonable: * Do nothing. This is attractive because it doesn't break anything. * Have float.__eq__(x, y) return True whenever x and y are the same NaN object. This is attractive because it is a minimal change that provides a little protection for simple containers. I support either of those options.
Yes; those are the only two options I've seen that seem workable. Of the two, I prefer the first (do nothing), but would be content with the second.
I'd be interested to know whether there's any real-life code that's suffering as a result of nan != nan. While the nan weirdnesses certainly exist, I'm having difficulty imagining them turning up in real code. Casey Duncan's point that there can't be many good uses for floats as dict keys or set members is compelling, though there may be type-agnostic applications that care (e.g., memoizing). Similarly, putting floats into a list must be very common, but I'd guess that checking whether a given float is in a list doesn't happen that often. I suspect that (nan+container)-related oddities turn up infrequently enough to make it not worth fixing.
By the way, for those suggesting that any operation producing a nan raise an exception instead: Python's math module actually does go out of its way to protect naive users from nans. You can't get a nan out of any of the math module functions without having put a nan in in the first place. Invalid operations like math.sqrt(-1) and math.log(-1) consistently raise ValueError rather than return a nan.
Ideally I'd like to see this behaviour extended to arithmetic as well, so that e.g., float('inf')/float('inf') raises instead of producing float('nan') (and similarly 1e300 * 1e300 raises OverflowError instead of producing an infinity), but there are backwards compatibility concerns. But even then, I'd still want it to be possible to produce nans deliberately when necessary, e.g., by directly calling float('nan'). Python also needs to be able to handle floating-point data generated from other sources; for this alone it should be at least able to read and write infinities and nans.
Mark
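The contrast between the math module and plain float arithmetic can be checked directly. This is a minimal sketch; the helper raises_value_error is a name made up for the example.

```python
import math


def raises_value_error(func, arg):
    """Return True if func(arg) raises ValueError, False otherwise."""
    try:
        func(arg)
    except ValueError:
        return True
    return False


# The math module refuses invalid inputs instead of returning a nan.
sqrt_raises = raises_value_error(math.sqrt, -1)
log_raises = raises_value_error(math.log, -1)

# Plain float arithmetic quietly produces nans and infinities.
inf = float("inf")
quotient = inf / inf      # a nan, no exception
product = 1e300 * 1e300   # an infinity, no exception
```

So a nan can still enter a program through ordinary arithmetic on infinities, even though the math functions themselves will not manufacture one.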

Mark Dickinson wrote:
On Fri, Mar 26, 2010 at 11:16 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
Of the ideas I've seen in this thread, only two look reasonable: * Do nothing. This is attractive because it doesn't break anything. * Have float.__eq__(x, y) return True whenever x and y are the same NaN object. This is attractive because it is a minimal change that provides a little protection for simple containers. I support either of those options.
Yes; those are the only two options I've seen that seem workable. Of the two, I prefer the first (do nothing), but would be content with second.
I've ended up in the same place as Mark: +1 on retaining the status quo (possibly with better warnings about the potential oddities of floating point values being placed in equality-based containers), +0 on changing NaN equality to check identity first in order to provide reflexivity under == for these two types. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Mar 26, 2010 at 17:16, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
Of the ideas I've seen in this thread, only two look reasonable: * Do nothing. This is attractive because it doesn't break anything. * Have float.__eq__(x, y) return True whenever x and y are the same NaN object. This is attractive because it is a minimal change that provides a little protection for simple containers. I support either of those options.
What's the flaw in using isnan()? -- Adam Olsen, aka Rhamphoryncus

On 2010-03-27 13:36 , Adam Olsen wrote:
On Fri, Mar 26, 2010 at 17:16, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
Of the ideas I've seen in this thread, only two look reasonable: * Do nothing. This is attractive because it doesn't break anything. * Have float.__eq__(x, y) return True whenever x and y are the same NaN object. This is attractive because it is a minimal change that provides a little protection for simple containers. I support either of those options.
What's the flaw in using isnan()?
There are implicit comparisons being done inside list.__contains__() and other such methods. They do not, and should not, know about isnan(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
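The implicit comparisons can be observed from the outside. In CPython the containment test for built-in containers tries identity before equality, so the outcome depends on whether the very same NaN object is being looked up. A short sketch:

```python
nan = float("nan")
other_nan = float("nan")   # a second, distinct NaN object
values = [1.0, 2.0, nan, 3.0]

# Ordinary floats are found by equality, NaN or no NaN in the list.
found_three = 3.0 in values

# The same NaN object is found through the identity check.
found_same_nan = nan in values
nan_index = values.index(nan)

# A different NaN object is not found, because nan == other_nan is False.
found_other_nan = other_nan in values

# Sets behave the same way for these two lookups.
in_set_same = nan in {nan}
in_set_other = other_nan in {nan}
```

None of this code calls isnan(); the container simply applies "is" and "==" to whatever elements it holds.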

On Sat, Mar 27, 2010 at 18:27, Robert Kern <robert.kern@gmail.com> wrote:
On 2010-03-27 13:36 , Adam Olsen wrote:
What's the flaw in using isnan()?
There are implicit comparisons being done inside list.__contains__() and other such methods. They do not, and should not, know about isnan().
Those methods should raise an exception. Conceptually, NaN should contaminate the result and make list.__contains__() return some "unsortable", but we don't want to bend the whole language backwards just for one obscure feature, especially when we have a much better approach most of the time (exceptions). The reason why NaN's current behaviour is so disturbing is that it increases the mental load of everybody dealing with floats. When you write new code or debug a program you have to ask yourself what might happen if a NaN is produced. When maintaining existing code you have to figure out if it's written a specific way to get NaN to work right, or if it's even a fluke that NaNs work right, even if it was never intended for NaNs or never sees them on developer machines. This is all the subtlety we work so hard to avoid normally, so why make an exception here? NaNs themselves have use cases, but their subtlety doesn't. -- Adam Olsen, aka Rhamphoryncus

On Sun, 28 Mar 2010 05:32:46 pm Adam Olsen wrote:
On Sat, Mar 27, 2010 at 18:27, Robert Kern <robert.kern@gmail.com> wrote:
On 2010-03-27 13:36 , Adam Olsen wrote:
What's the flaw in using isnan()?
There are implicit comparisons being done inside list.__contains__() and other such methods. They do not, and should not, know about isnan().
Those methods should raise an exception. Conceptually, NaN should contaminate the result and make list.__contains__() return some "unsortable", but we don't want to bend the whole language backwards just for one obscure feature, especially when we have a much better approach most of the time (exceptions).
I disagree -- if I ask:
3.0 in [1.0, 2.0, float('nan'), 3.0]
I should get True, not an exception. Comparing NANs for equality isn't an error.
+1 on leaving the behaviour alone -- the surprising behaviour people have pointed out with NANs in lists, dicts and sets occurs more often in theory than in practice. -- Steven D'Aprano

On Sun, Mar 28, 2010 at 17:55, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Steven D'Aprano wrote:
I disagree -- if I ask:
3.0 in [1.0, 2.0, float('nan'), 3.0]
I should get True, not an exception.
Yes, I don't think anyone would disagree that NaN should compare unequal to anything that isn't a NaN. Problems only arise when comparing two NaNs.
NaN includes real numbers. Although a NaN is originally produced for results that are not real numbers, further operations could produce a real number; we'd never know as NaN has no precision. Extending with complex numbers instead gives enough precision to show how this can happen. -- Adam Olsen, aka Rhamphoryncus
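One way to read the complex-number remark, as a sketch: the square root of -1 is not a real number, yet squaring it gives the real number -1. With floats the intermediate result can only be a nan, and the nan hides that the final answer would have been real.

```python
import cmath

# Float world: once the intermediate value is a nan, everything after is a nan.
nan = float("nan")
nan_squared = nan * nan

# Complex world: the intermediate value is kept, and squaring it is real again.
root = cmath.sqrt(-1)        # 1j
root_squared = root * root   # (-1+0j), i.e. the real number -1
```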

On Fri, 26 Mar 2010 12:19:06 pm P.J. Eby wrote:
At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
But they're not -- they're *signals* for "your calculation has gone screwy and the result you get is garbage", so to speak. You shouldn't even think of a specific NAN as a piece of specific garbage, but merely a label on the *kind* of garbage you've got (the payload): INF-INF is, in some sense, a different kind of error to log(-1). In the same way you might say "INF-INF could be any number at all, therefore we return NAN", you might say "since INF-INF could be anything, there's no reason to think that INF-INF == INF-INF."
So, are you suggesting that maybe the Pythonic thing to do in that case would be to cause any operation on a NAN (including perhaps comparison) to fail, rather than allowing garbage to silently propagate?
Certainly not. That defeats the whole purpose of NANs. I wish floating point calculations in Python would return NANs rather than raise the exceptions they do now. I can't speak for others, but in my experience NANs are a much nicer way to do maths-related programming. I've programmed with a system that supported NANs extensively (Apple's SANE, circa 1990), and I miss it so. Note also that NANs do not necessarily contaminate every expression or function call. The standard allows for them to "cancel out", so to speak, where it is mathematically justifiable:
>>> nan = float('nan')
>>> 1.0**nan
1.0
so you shouldn't assume that the presence of a NAN in a calculation is the kiss of death.
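A few more cases where a NAN operand does not reach the result, next to one where it does. This is a small sketch using only built-in float operations and math.hypot:

```python
import math

nan = float("nan")
inf = float("inf")

# Cases where the NAN "cancels out": the answer is the same for any operand.
one_to_nan = 1.0 ** nan               # 1 to any power is 1
nan_to_zero = nan ** 0.0              # anything to the power 0 is 1
hypot_inf_nan = math.hypot(inf, nan)  # an infinite side gives an infinite hypotenuse

# The usual case: the NAN propagates.
propagated = nan + 1.0
```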
In other words, if NAN is only a signal that you have garbage, is there really any reason to keep it as an *object*, instead of simply raising an exception? Then, you could at least identify what calculation created the garbage, instead of it percolating up through other calculations.
The standard distinguishes between signalling NANs and quiet NANs (which propagate as values). By default, signalling NANs are usually converted to quiet NANs, but the caller is supposed to be able to change that behaviour to a floating point signal which can be trapped. In Python, the equivalent would be an exception. -- Steven D'Aprano
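Python's float type does not expose this switch, but the decimal module does, so it can serve as an illustration of the quiet/signalling distinction and of trapping as an exception. The sketch below uses decimal rather than float, and the helper name evaluate is made up for the example.

```python
from decimal import Decimal, InvalidOperation, localcontext


def evaluate(operation, trap):
    """Run operation() in a local context, with the InvalidOperation trap on or off."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = trap
        try:
            return operation()
        except InvalidOperation:
            return "trapped"


def inf_minus_inf():
    return Decimal("Infinity") - Decimal("Infinity")


def quiet_plus_one():
    return Decimal("NaN") + 1


def signalling_plus_one():
    return Decimal("sNaN") + 1


untrapped_invalid = evaluate(inf_minus_inf, trap=False)         # a quiet NaN value
trapped_invalid = evaluate(inf_minus_inf, trap=True)            # exception instead
quiet_propagates = evaluate(quiet_plus_one, trap=True)          # quiet NaN passes through
signalling_trapped = evaluate(signalling_plus_one, trap=True)   # signalling NaN raises
signalling_quieted = evaluate(signalling_plus_one, trap=False)  # converted to a quiet NaN
```

The trap setting is scoped to the with block, so code outside it keeps whatever behaviour its own context specifies.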

On 03/26/2010 01:57 AM, Steven D'Aprano wrote:
In fact, identity of NANs is itself an implementation quirk of programming languages like Python: logically, NANs don't have identity at all.
To put it another way: all ONEs are the same ONE, even if they come from different sources, are in different memory locations, or have different identities; but all NANs are different, even if they come from the same source, are in the same memory location, or have the same identity.
+inf. Bravo!. :-) -- Jesus Cea Avion, jcea@jcea.es - http://www.jcea.es/

On Thu, Mar 25, 2010 at 18:57, Steven D'Aprano <steve@pearwood.info> wrote:
Simply put: we should treat "two unclear values are different" as more compelling than "two unclear values are the same" as it leads to fewer, smaller, errors. Consider:
log(-1) = NAN # maths equality, not assignment
log(-2) = NAN
If we allow NAN = NAN, then we permit the error:
log(-1) = NAN = log(-2)
therefore log(-1) = log(-2)
and 1 = 2
But if we make NAN != NAN, then we get:
log(-1) != log(-2)
and all of mathematics does not collapse into a pile of rubble. I think that is a fairly compelling reason to prefer inequality over equality.
One objection might be that while log(-1) and log(-2) should be considered different NANs, surely NANs should be equal to themselves?
-1 = -1 implies log(-1) = log(-1)
IMO, this just shows how ludicrous it is to compare NaNs. No matter what we do, it will imply some insane mathematical consequence and break some code. They are, after all, an error passed silently. Why is it that complex can raise an exception when sorted, forcing you to use a sane (and explicit) method, but for NaN it's okay to silently fail? -- Adam Olsen, aka Rhamphoryncus
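The asymmetry being complained about can be reproduced in a few lines. This is a sketch; try_sort is a helper name invented for the example.

```python
def try_sort(values):
    """Return the sorted list, or the TypeError raised while sorting."""
    try:
        return sorted(values)
    except TypeError as exc:
        return exc


# Complex numbers have no ordering, so sorting them fails loudly.
complex_result = try_sort([2 + 1j, 1 + 0j])

# A NaN compares False against everything, so sorting completes without
# complaint, but the resulting order is not a meaningful one.
nan_result = try_sort([3.0, float("nan"), 1.0])
```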

Steven D'Aprano wrote:
By analogy: the Lizard King of Russia does not exist; the Vampire Queen of New Orleans also does not exist. We don't therefore conclude that the Lizard King and the Vampire Queen are therefore the same person.
But it's equally invalid to say that they're *not* the same person, because it's meaningless to say *anything* about the properties of nonexistent people. -- Greg

participants (21)
- Adam Olsen
- Alexander Belopolsky
- Antoine Pitrou
- Casey Duncan
- Curt Hagenlocher
- David Cournapeau
- Georg Brandl
- Glenn Linderman
- Greg Ewing
- Guido van Rossum
- Jesus Cea
- Maciej Fijalkowski
- Mark Dickinson
- Nick Coghlan
- P.J. Eby
- Pierre B.
- Raymond Hettinger
- Robert Kern
- Stephen J. Turnbull
- Steven D'Aprano
- Xavier Morel