[Python-ideas] checking for identity before comparing built-in objects

Mon Oct 8 05:09:06 CEST 2012

On 08/10/2012 03:35, Ned Batchelder wrote:
> On 10/7/2012 9:51 PM, Guido van Rossum wrote:
>> On Sun, Oct 7, 2012 at 6:09 PM, Alexander Belopolsky
>> <alexander.belopolsky at gmail.com> wrote:
>>> On Sun, Oct 7, 2012 at 8:54 PM, Guido van Rossum <guido at python.org> 
>>> wrote:
>>>> Seriously, we can't change our position on this topic now without
>>>> making a lot of people seriously unhappy. IEEE 754 it is.
>>> I did not suggest a change.  I wrote: "I am not suggesting any
>>> language changes, but I think it will be
>>> useful to explain why float('nan') != float('nan') somewhere in the
>>> docs."  If there is a concise explanation for the choice of IEEE 754
>>> vs. Java, I think we should write it down and put an end to this
>>> debate.
>> Referencing Java here is absurd and I still consider this suggestion
>> as a troll. Python is not in any way based on Java.
>>
>> On the other hand referencing IEEE 754 makes all the sense in the
>> world, since every other aspect of Python float is based on IEEE 754
>> double whenever the underlying platform implements this standard --
>> and all modern CPUs do. I don't think there's anything else we need to
>> say.
>>
> I don't understand the reluctance to address a common conceptual 
> speed-bump in the docs.  After all, the tutorial has an entire chapter 
> (http://docs.python.org/tutorial/floatingpoint.html) that explains how 
> floats work, even though they work exactly as IEEE 754 says they should.
>
> A sentence in section 5.4 (Numeric Types) would help.  Something like, 
> "In accordance with the IEEE 754 standard, NaN's are not equal to any 
> value, even another NaN.  This is because NaN doesn't represent a 
> particular number, it represents an unknown result, and there is no 
> way to know if one unknown result is equal to another unknown result."
>
> --Ned.
>
I understand that the undefined result of a computation is not the same 
as the undefined result of another computation.
(E.g. one might represent positive infinity, another might represent 
underflow or loss of accuracy.)
But I can't help feeling (strongly) that the result of a computation 
should be equal to itself.
In other words, after
     x = float('nan')
     y = float('nan')
I would expect
     x != y
but
     x == x

After all, how much sense does this make (I got this in a quick test 
with Python 2.7.3):
>>> x = float('nan')
>>> x is x
True            # Well I guess you'd sorta expect this
>>> x == x
False           # You what?
>>> D = {1: x, 2: x}
>>> D[1] == D[2]
False           # I see, both NANs - hmph!
>>> [x] == [x]
True            # Oh yeh, it doesn't always work that way then?

Making equality non-reflexive feels utterly wrong to me, partly no doubt 
because of my mathematical background, and partly because of the difficulty 
of implementing container objects, algorithms and God knows what else 
when you have to remember that some of the objects they deal with 
may not be equal to themselves.  In particular, the difference between my 
last two examples ( D[1]!=D[2] but [x]==[x] ) looks impossible to 
justify, except by saying that for historical reasons the designers of 
lists and the designers of dictionaries made different - but entirely 
reasonable - assumptions about the equality relation, and (perhaps) about 
whether identity implies equality.  (How do you explain to a Python 
learner that it doesn't, pathological code examples aside?)
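
The contrast can be reproduced in a few lines. The sketch below assumes CPython, where the built-in containers appear to check element identity before falling back to ==, while D[1]==D[2] is simply a direct float comparison. The helper name container_checks is made up for illustration only.

```python
def container_checks(value):
    """Compare a value with itself directly and inside built-in containers."""
    return {
        "direct": value == value,                     # plain == on the object
        "list_eq": [value] == [value],                # list comparison
        "tuple_eq": (value,) == (value,),             # tuple comparison
        "in_list": value in [value],                  # membership test
        "dict_values_eq": {1: value} == {1: value},   # dict comparison
    }

nan = float('nan')
same_object = container_checks(nan)

# Two distinct NaN objects: no identity shortcut can apply here.
distinct_lists_equal = [float('nan')] == [float('nan')]
```

With a single NaN object only the "direct" entry comes out False; the container entries are True because the same object sits on both sides. With two separately created NaNs the list comparison is False.
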
Couldn't each NaN, when generated, contain something that identified it 
uniquely, so that different NaNs would always compare as not equal, but 
any given NaN would compare equal to itself?
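
A rough sketch of the semantics being asked for, using object identity as the "something that identifies it uniquely". The class name IdentityNaN is hypothetical; this is a pure-Python illustration of the idea, not how float behaves and not what IEEE 754 specifies.

```python
import math

class IdentityNaN(float):
    """Hypothetical float subclass: a NaN equals itself but no other NaN."""

    def __eq__(self, other):
        if math.isnan(self):
            # A NaN is equal only to the very same object.
            return self is other
        # Non-NaN values keep ordinary float equality.
        return float.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ would otherwise make the class unhashable in Python 3.
    __hash__ = float.__hash__

x = IdentityNaN('nan')
y = IdentityNaN('nan')
```

Here x == x holds while x != y, which is the behaviour described above; ordinary values such as IdentityNaN(1.5) still compare as plain floats do.
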
Rob Cliffe