[Python-ideas] checking for identity before comparing built-in objects
Rob Cliffe
rob.cliffe at btinternet.com
Mon Oct 8 05:09:06 CEST 2012
On 08/10/2012 03:35, Ned Batchelder wrote:
> On 10/7/2012 9:51 PM, Guido van Rossum wrote:
>> On Sun, Oct 7, 2012 at 6:09 PM, Alexander Belopolsky
>> <alexander.belopolsky at gmail.com> wrote:
>>> On Sun, Oct 7, 2012 at 8:54 PM, Guido van Rossum <guido at python.org>
>>> wrote:
>>>> Seriously, we can't change our position on this topic now without
>>>> making a lot of people seriously unhappy. IEEE 754 it is.
>>> I did not suggest a change. I wrote: "I am not suggesting any
>>> language changes, but I think it will be
>>> useful to explain why float('nan') != float('nan') somewhere in the
>>> docs." If there is a concise explanation for the choice of IEEE 754
>>> vs. Java, I think we should write it down and put an end to this
>>> debate.
>> Referencing Java here is absurd and I still consider this suggestion
>> as a troll. Python is not in any way based on Java.
>>
>> On the other hand referencing IEEE 754 makes all the sense in the
>> world, since every other aspect of Python float is based on IEEE 754
>> double whenever the underlying platform implements this standard --
>> and all modern CPUs do. I don't think there's anything else we need to
>> say.
>>
> I don't understand the reluctance to address a common conceptual
> speed-bump in the docs. After all, the tutorial has an entire chapter
> (http://docs.python.org/tutorial/floatingpoint.html) that explains how
> floats work, even though they work exactly as IEEE 754 says they should.
>
> A sentence in section 5.4 (Numeric Types) would help. Something like,
> "In accordance with the IEEE 754 standard, NaN's are not equal to any
> value, even another NaN. This is because NaN doesn't represent a
> particular number, it represents an unknown result, and there is no
> way to know if one unknown result is equal to another unknown result."
>
> --Ned.
I understand that the undefined result of a computation is not the same
as the undefined result of another computation.
(E.g. one might represent positive infinity, another might represent
underflow or loss of accuracy.)
But I can't help feeling (strongly) that the result of a computation
should be equal to itself.
In other words, after

    x = float('nan')
    y = float('nan')

I would expect

    x != y

but

    x == x

After all, how much sense does this make (I got this in a quick test
with Python 2.7.3):

    >>> x=float('nan')
    >>> x is x
    True   # Well I guess you'd sorta expect this
    >>> x==x
    False  # You what?
    >>> D = {1:x, 2:x}
    >>> D[1]==D[2]
    False  # I see, both NANs - hmph!
    >>> [x]==[x]
    True   # Oh yeh, it doesn't always work that way then?

Making equality non-reflexive feels utterly wrong to me, partly no doubt
because of my mathematical background, partly because of the difficulty
in implementing container objects and algorithms and God knows what else
when you have to remember that some of the objects they may deal with
may not be equal to themselves. In particular, the difference between my
last two examples (D[1] != D[2] but [x] == [x]) looks impossible to
justify except by saying that, for historical reasons, the designers of
lists and the designers of dictionaries made different (but entirely
reasonable) assumptions about the equality relation and, perhaps, about
whether identity implies equality. How do you explain to a Python
learner that it doesn't, pathological code examples aside?
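
For what it's worth, here is a minimal sketch that may help separate the
cases, assuming CPython 3 (the variable names are only illustrative).
D[1]==D[2] pulls two floats out of the dictionary and compares them
directly, whereas [x]==[x] goes through the container's own comparison,
which checks identity before falling back to ==. Comparing two whole
dictionaries appears to behave like the list case:

```python
x = float('nan')
y = float('nan')

# Direct float comparison follows IEEE 754: a NaN is not equal to itself.
direct = (x == x)

# Built-in containers compare elements with "is" first, then "==",
# so the same NaN object inside two containers makes them compare equal.
same_in_list = ([x] == [x])
same_in_dict = ({1: x} == {1: x})
membership = (x in [x])

# Two distinct NaN objects get no identity shortcut.
distinct_in_list = ([x] == [y])

print(direct, same_in_list, same_in_dict, membership, distinct_in_list)
```

So the split seems to be less lists versus dictionaries than comparing
floats directly versus comparing containers that hold them.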
Couldn't each NAN when generated contain something that identified it
uniquely, so that different NANs would always compare as not equal, but
any given NAN would compare equal to itself?
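
As a rough sketch of that idea, object identity could serve as the
"something that identifies it uniquely". The class below is purely
hypothetical (ReflexiveFloat is not part of Python and nothing like it
has been proposed here in this form); it only illustrates the behaviour
I have in mind, written for Python 3:

```python
import math


class ReflexiveFloat(float):
    """Hypothetical float subclass: a NaN equals itself and nothing else."""

    def __eq__(self, other):
        if math.isnan(self):
            # Identity stands in for a per-NaN unique tag.
            return self is other
        # Ordinary values keep the usual float comparison.
        return float.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ would otherwise make the class unhashable.
    __hash__ = float.__hash__


x = ReflexiveFloat('nan')
y = ReflexiveFloat('nan')
print(x == x, x == y, x != y, ReflexiveFloat(1.5) == 1.5)
```

With this, x == x holds while x != y, which is the behaviour described
above. It deliberately departs from IEEE 754, so it is an illustration
rather than a recommendation.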
Rob Cliffe